Thursday, June 19, 2025

ECEIA25, Day 2

 Day 1 - Day 2 - Day 3 - Day 4

Day 2 of the conference! We begin, as always, with the safety briefing on how to find the exits, in Swedish and English. Sweden is *very* safety conscious!

The keynote today is from Ana Marušić of the University of Split School of Medicine, a member of the COPE Council. She will be speaking on "Challenges to maintaining the integrity of academic publishing".

Ana Marušić first introduces COPE and notes that now universities can be members of COPE, not just journals. And she notes that they are *not* the academic integrity "police", although they are often seen as such.

publicationethics.org/

She then notes that people think that being published in a peer-reviewed journal means that the results are true. But many new challenges have arisen: data and image integrity questions, plagiarism, missing attributions, authorship issues, manipulation of the peer-review process, paper mills, authorship for sale, citations for sale, recycled images/data, duplicate submissions, fake editor acceptance letters, identity theft, unverifiable authors, tortured phrases, use of LLMs, special issue "cartels", citation hacking, H-index boosting, affiliation buying, fictitious names, AI-generated images/text/peer review...

This is overwhelming!

Ana Marušić has chosen three areas to focus on: paper mills, pre-prints, and AI.

1) Paper mills don't just produce papers any more; pretty much everything in the publication process is for sale. STM has started "United2Act against paper mills": stm-assoc.org/what-we-do/strat

The list of terminology that can be subsumed under the term "paper mills" is breathtakingly long!

Even just trusting the identity of authors, reviewers, and editors entails many problems, despite identification methods such as ORCID or watchlists such as Retraction Watch and PubPeer.

Ana Marušić notes that it is now possible to obtain a persistent identifier for tracking and sharing research projects: raid.org/

Post-publication corrections: the number of retractions is growing. There are also corrections, errata, expressions of concern, corrected and re-published articles, retracted and re-published articles, and retracted papers with pervasive errors or unsubstantiated or irreproducible data.

Ana Marušić continues with

2) pre-prints. They blossomed during COVID. PubMed has a good way of marking pre-prints. There is much discussion about whether pre-prints meet science's responsibility to society. But only about 7% of pre-prints posted to bioRxiv and medRxiv received any comments.

She has a long list of rules about using pre-prints. The most important, I think, is being cautious about referencing pre-prints that were posted but never subsequently published in a peer-reviewed journal. 

3) AI-generated content

As we know, AI is everywhere. By January 18, 2023, there were already 4 papers listing ChatGPT as an author, but a month later one of them had this removed. She then shows some publications with AI slop in the abstracts.

How do we deal with this? We can find paper mills, but detecting AI use is very hard. People trust authors who declare the use of AI *less* than those who do not declare using AI.

A study reported that more than half of users considered the feedback from ChatGPT better and more beneficial than feedback from a human. But this study was funded by Zuckerberg...

STM has a white paper about the use of AI in scholarly publishing (stm-assoc.org/new-white-paper-); COPE's position is at publicationethics.org/guidance

There are so many guidelines that there is now "The Collection of Open Science Integrity Guides (COSIG)": zenodo.org/records/15588204

Next up: a panel on "A holistic view of institutional policies on integrity and ethics in education and research", moderated by Irene Glendinning.

The panelists are:
* Isidoros (Dorian) Karatzas, Head of the Research Ethics and Integrity Sector, European Commission (EC), DG Research & Innovation
* Signe Mežinska, associate professor of bioethics at the University of Latvia
* Jonas Åkerman, associate professor of philosophy and Research Integrity and Ethics Coordinator at Stockholm University, author of the Swedish Research Council's guidance document on research integrity and ethics
* Laura Flynn, Head of Partnerships division, Quality and Qualifications Ireland (QQI)

Jonas notes that policies by themselves are not enough. They have to be embedded in a system.

Signe notes that people find that there are too many guidelines, and now they all have to be updated to reflect AI issues.

Laura notes that Ireland (and England) now have laws that make offering contract cheating services to students a criminal offense.

Dorian recently retired. During his time with the EU he set up the system of ethics checks for research appraisals [Transparency note: I am one of the many ethics experts participating in these checks]. He notes that artificial intelligence has sharply compressed the time that can be invested in looking at the ethical aspects, as everything is happening so fast. The coming revisions to the EU AI Act will focus on training and education. He uses an analogy: you can't run superfast trains on old tracks, you have to change the tracks. We have moved to "quantum ethics".

I asked Laura about the contract cheating law in Ireland, and Irene reported on England: there is a law in both countries outlawing advertising contract cheating, but it has no teeth. The Irish agency does not have a legal department or an investigation unit.

Dorian notes that many university courses on research methods and ethics are considered boring. We have to work hard at making discussions of research integrity exciting. Irene notes that positive language is very necessary, and that we need buy-in from the top management of an institution.

A participant asks about including ethics in a merit system. Jonas notes that this is hard, similar to the issues around open science: we are not helped by making bad science more freely available. He admits that he does not know how this could be done.

After lunch, Mary Davis (Academic Integrity Lead, Oxford Brookes University) presents on "Developing and improving an accessible and inclusive course to teach ethical decision-making with AI using Universal Design for Learning".

She notes a tension: while there is a need for education about AI, there is also a need for inclusion, as AI can increase the digital divide and amplify gaps between student groups. There is thus a continuous need to prioritise inclusion in academic integrity.

She has developed an inclusive course on teaching ethical decision-making with AI. She first analyzed 500 student declarations about using AI, then used the "Universal Design for Learning" framework in developing the course.

She uses a traffic-light metaphor to distinguish appropriate use, at-risk practice, and inappropriate use. I personally feel that this is too few categories. Even Mike Perkins' five-level AI Assessment Scale probably needs a few more variations.

She notes that she has to keep changing the course, even moving certain aspects (such as literature search) from inappropriate use to at-risk use.

Next up:
Martine Pellerin, University of Alberta, with "Fostering Academic Integrity in the GenAI Era: A Macro-Competency Framework for Ethical AI Engagement in Higher Education"

She notes that traditional policies are inadequate to deal with AI, and that there are serious flaws in fear-based approaches to academic integrity.

She argues that students need to be included in setting up an ethical AI framework. This makes them co-designers of their learning experience. 

Next presentation:
Lorna Waddington and Michelle Sutherland-Alle with "GenAI and Academic Integrity: Rethinking Ethical Perspectives"

* Ethical neutrality is an illusion
* Built-in filters in GenAI mean unseen gatekeeping

They looked at tracing patterns of erasure through genocide and indigenous studies.

How do you identify the "presence of absence" in records? This is especially visible for indigenous practices that are simply missing. When they asked the systems why this information was missing, the systems acknowledged that they indeed lacked it, but still could not include such knowledge in their answers to questions.

AI doesn't just shape what we know - it shapes what we don't know! Systematic omissions are becoming institutionalised knowledge gaps.

Commercial interests are shaping academic content, and this is found in *all* GenAI systems (they looked at 6). DeepSeek "admitted" that it was not able to speak about certain issues. Gemini calls this "content avoidance".

Lorna asked a system to OCR a PDF about Nazi Germany; although it started, once it came to certain words it quit and said that it was sorry, but it was unable to OCR the document. She was unable to use the word "genocide" to generate an image, and has had other material from her research on genocide rejected by GenAI systems.

So we have to think about teaching students history while realizing that certain voices are not being heard and certain topics are being omitted by GenAI systems.

And finally for this session, Mads Goddiksen (presenting), Mikkel Willum Johansen, Christine Clavien, I. Anna S. Olsson, and Orsolya Varga from the University of Copenhagen with "High cost for limited effect? Using software to detect plagiarism and unauthorized AI use in higher education"

Buying text-matching software (TMS) promises to prevent plagiarism, to enable detection of plagiarism, and now also to offer AI detection. There are concerns that TMS use leads to plagiarism anxiety, phobia, or paranoia.

They investigated TMS-related worries: how common are these worries among European students, why do they arise, and how do students react? They interviewed 36 students in 3 EU countries and conducted a survey among 3100 students in 6 EU countries.

Students shift their focus from writing well to not being flagged by the TMS. This includes refraining from citing relevant sources in order to reduce similarity scores, or overciting. This happens because institutions misuse the percentages reported by the TMS.

Should institutions abandon such software? I shouted "yes" :) But they suggest better training for teachers, administrators, and students in what the software actually does.

After coffee we now have
"AI Ethics Courses in Higher Education: A Global Analysis" presented by Nafisa Anjum from the Heriot-Watt University in Dubai and Leon Regi John from the University of Wollogong in Dubai.

They looked at 358 academic programs worldwide in 2024 to see if they were offering AI ethics courses.

Problem: University web sites are notoriously bad at presenting information and keeping it up to date.

Canada had the most courses, but of course universities without an English-language curriculum description page could not be evaluated, so the sample was not representative.

They found 161 programs that addressed ethical aspects somewhere, but only 4 programs integrated AI ethics across science, technology, and the humanities.

There needs to be structured AI ethics education at all levels. 

Next up: "Empirical Study of Generative Artificial Intelligence Use in a Programming Task in Undergraduate Studies" presented by Alla Anohina-Naumeca and Ilze Birzniece from the Riga Technical University in Latvia.

In the context of a course on "Fundamentals of Artificial Intelligence", bachelor's students were asked to program a two-person game with perfect information. The students were permitted to use GenAI, but they had to defend their solutions orally. Students had a specific form for declaring which tools were used, and they were asked to color any text and code generated by GenAI and to include all prompts in an appendix.
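[An aside from me: a "two-person game with perfect information" is the classic setting for the minimax algorithm, which is presumably the sort of solution expected. Here is a minimal sketch in Python, purely for illustration; the actual game assigned was not specified in the talk, so I use simple Nim as a stand-in:]

```python
# Illustrative only: minimax for a tiny two-person game with perfect
# information -- simple Nim, where players alternately take 1-3 stones
# and whoever takes the last stone wins. Not the actual course task.

from functools import lru_cache

@lru_cache(maxsize=None)
def minimax(stones: int, maximizing: bool) -> int:
    """Return +1 if the maximizing player can force a win, else -1."""
    if stones == 0:
        # The previous player took the last stone and won, so the
        # player whose turn it is now has lost.
        return -1 if maximizing else 1
    outcomes = [minimax(stones - take, not maximizing)
                for take in (1, 2, 3) if take <= stones]
    return max(outcomes) if maximizing else min(outcomes)

def best_move(stones: int) -> int:
    """Pick the take (1-3) that is best for the player to move."""
    return max((take for take in (1, 2, 3) if take <= stones),
               key=lambda take: minimax(stones - take, False))

if __name__ == "__main__":
    for n in range(1, 9):
        result = "win" if minimax(n, True) == 1 else "loss"
        print(f"{n} stones: take {best_move(n)}, optimal play is a {result}")
```

With GenAI permitted, exactly this kind of textbook solution is trivially generated, which is presumably why the oral defense was required.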

The research goal was to understand how and why students use GenAI for solving a programming task.

They started with 104 teams, but only 100 participated in the research; 92 decided to use GenAI. One team used four tools, three teams used three tools, and 16 teams used two tools. Various GenAI tools were used.

Interesting reasons were given for tool use or non-use and for the usefulness of the tools, but I can't type that fast :)

The students didn't do a very good job of reporting the GenAI-generated code or recording the prompts. From what they did declare, the research group isolated the tasks that were assigned to GenAI and classified them along Bloom's taxonomy. The students' approach is: create first, then understand later, which is the opposite of traditional instruction!

Yovav Eshet had to present via Zoom, as there are currently no flights from Israel to anywhere. He spoke on "Plagiarism in unforeseen contexts".

He notes that GenAI systems generate all sorts of nonsense. He extends a definition of plagiarism to include GenAI.

He defines "unforseen contexts" to have something to do with COVID, but I really don't understand.

He took 25,864 assignments that had been submitted via Moodle from 42 of the 62 Israeli higher education institutions and ran them through plagiarism detection software. The government says that a score between 70 and 100 means the work is authentic [WHY do governments get into such nonsense??].

He found more plagiarism during COVID-19 than before or after. There was less plagiarism at highly-ranked institutions. [I question whether this is really plagiarism detection and not just text similarity!]

The system used was the software Originality, purchased by the Israeli government for all institutions.

The final session today starts with Roger Larsen and Vegar Andreas Bergum on "Cheating has changed - we need to redefine the solution: A novel method to authenticate original student writing", a presentation from the Norwegian company norvalid.

They note an important difference at level 3 of the AI Assessment Scale: whether the student originates the work and AI assists, or AI generates the work and the student assists. They are trying to implement a safeguarding tool for AI-assisted work that is also supposed to detect ghostwriting. [Color me sceptical.]

Part of their score is the ability of a student to reproduce text they purportedly wrote [I thought this type of Cloze test had been discredited?]. The other part of their score involves extracting linguistic features from previous student writing [but students' writing improves over the course of their education?]. They note that a new verified sample is needed every year.
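[Aside: norvalid's method is proprietary, so the following is only my hedged sketch of what "extracting linguistic features" usually means in stylometric authorship verification, not their implementation. All feature choices and names here are my own invention:]

```python
# Hypothetical sketch of stylometric authorship verification, the general
# technique behind such tools: extract style features from a verified
# writing sample and compare a new submission against them.

import re
from collections import Counter

FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "is",
                  "it", "for", "with", "as", "but", "not", "on")

def features(text: str) -> list[float]:
    """Map a text to a simple stylometric feature vector."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n = max(len(words), 1)
    counts = Counter(words)
    return ([len(words) / max(len(sentences), 1),       # mean sentence length
             sum(map(len, words)) / n,                  # mean word length
             len(counts) / n]                           # type-token ratio
            + [counts[w] / n for w in FUNCTION_WORDS])  # function-word rates

def style_distance(a: str, b: str) -> float:
    """Euclidean distance between feature vectors; lower = more similar."""
    return sum((x - y) ** 2 for x, y in zip(features(a), features(b))) ** 0.5

# A tool would flag a submission whose distance from the verified sample
# exceeds some calibrated threshold.
verified = "Text that the student verifiably wrote in a supervised setting."
submission = "The newly submitted essay text to be checked."
print(f"style distance: {style_distance(verified, submission):.3f}")
```

A profile like this presumably drifts as a student's writing matures, which would explain why they say a new verified sample is needed every year.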

Next up: Eman AbuKhous and Meruyert Demeuova from the University of Europe for Applied Sciences in Dubai on "The Role of Women in Shaping Integrity-Driven Higher Education - A comparative study of Germany and the UAE"

These two countries, Germany and the UAE, are very different with respect to gender equality. In the UAE women hold leadership positions, but their role is culturally constrained. In Germany there are rigid hierarchies.

[I didn't understand the differences that were listed on a slide.]
