
Monday, March 9, 2009

Umeå Study of Plagiarism Detection Software

I found an article in a student newspaper in Lund, Sweden, complaining that the university's choice of plagiarism detection software was bad because it found only 25% of the plagiarisms in a 2006 study by the University of Umeå.

After some unsuccessful Googling, I entered a few words from the article and quickly found them in the abstracts of two reports from 2006:
  • Anna Nordström and Susanne Sjöberg: UTVÄRDERING AV URKUND, ETT VERKTYG FÖR PLAGIATKONTROLL (Evaluation of Urkund, a Tool for Plagiarism Checking). October 2006. Report 16 (yes, the names of the reports seem to be a sick joke of the IT department on this sociology unit)
  • Anna Nordström: UTVÄRDERING AV GENUINETEXT, ETT VERKTYG FÖR PLAGIATKONTROLL (Evaluation of GenuineText, a Tool for Plagiarism Checking). October 2006. Report 17
The first is a report commissioned from CERUM by an organisational unit of the University of Umeå about a test of the Swedish plagiarism detection software Urkund. CERUM is the Center for Regional Studies at the University of Umeå.

The university purchased a one-year license for Urkund for four departments at the school and had CERUM look at the usability, effectiveness, and moral problems associated with the use of the software. They interviewed teachers and students before and after use and describe the issues they found very clearly. Among them:
  • Teachers found the system easy to use, as students send their papers to an email address at Urkund. Urkund then forwards the paper to the teachers and later sends them a report on anything found. This does, however, pose a problem, as the students' email addresses are used as the subject line and were often difficult to connect to real names.
  • Teachers felt that the system was effective.
  • Using the system didn't save any time, but they felt that it was important for them to deal with plagiarism.
  • Teachers and students alike felt that using the system worked as a deterrent to plagiarism.
  • Students were generally happy with the software and had no moral or ethical problems, as they felt that everyone was being handled equally, something that is very important for Swedes.
  • There was no mention made of the copyright problems. Urkund keeps a copy of all papers unless the students answer their acknowledgement email and request their paper not be put in a database.
  • Neither teachers nor students felt that there was a problem in their relationship with each other based on the use of this system.
In order to test the effectiveness of the system, CERUM asked the four departments to give them some common texts from textbooks, online resources, and magazines from their field, as well as from the Internet. Urkund says that it checks the Internet, many publications (including the Swedish National Encyclopedia), and of course its own database.

The researchers added material of their own in order to have 20 sources for each department. They then constructed test material (it is not clear from the report whether they put everything into one text or made several papers, as the report says different things in different chapters) and ran it through Urkund. Urkund found only 18 of the 80 sources, a rather sorry result.

While the investigation was being done, it was discovered that a handful of teachers at the school were using the Genuine Text system. This Swedish system, which according to its web page is used in Sweden, Denmark, Russia, and some parts of Africa, only searches the Internet and its own database. It is a web-based system with three ways of submitting: students upload a file, teachers upload files, or they copy and paste material into a field. It offers statistics for administrators and plagiarism reports that are suitable for submitting to the appropriate disciplinary bodies. It is not clear from the report whether students have a way of opting out of having their material stored in the database.

CERUM was commissioned to have a look at this as well, and they interviewed the teachers and students, who gave responses similar to those of the Urkund subjects. Interestingly enough, although Genuine Text does not check publication databases, it managed to find 22 of the 80 plagiarisms. Only 14 of the plagiarisms were from the Internet, and Genuine Text found 7 of them.
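To put the figures from the two reports side by side, the detection rates can be worked out directly. Here is a quick sketch in Python that uses only the counts quoted above. The non-Internet figure for Genuine Text is my own derivation by subtraction, and it assumes the 14 Internet sources are among the 80.

```python
def detection_rate(found, total):
    """Return the share of planted sources that were detected, in percent."""
    return 100.0 * found / total

# Counts as quoted from the two CERUM reports.
urkund_rate = detection_rate(18, 80)
genuine_text_rate = detection_rate(22, 80)
genuine_text_internet_rate = detection_rate(7, 14)

# Derived by subtraction. Assumption: the 14 Internet sources are part of the 80.
genuine_text_other_rate = detection_rate(22 - 7, 80 - 14)

print(f"Urkund, all sources:              {urkund_rate:.1f}%")
print(f"Genuine Text, all sources:        {genuine_text_rate:.1f}%")
print(f"Genuine Text, Internet sources:   {genuine_text_internet_rate:.1f}%")
print(f"Genuine Text, all other sources:  {genuine_text_other_rate:.1f}%")
```

This gives 22.5% for Urkund and 27.5% for Genuine Text overall, with Genuine Text finding half of the Internet sources.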

Both reports lead me to the following conclusions:
  • The software proved just as ineffective in these tests as in mine. It finds about half of the Internet sources and very little else.
  • Teachers are so happy to have something found that they believe it to be effective.
  • The major use of plagiarism detection software is in deterrence. I found many forums in which students wondered how good the systems really were and what their chances of being found out were. There were tips being given on changing words around so as to confuse the system. Unfortunately, this works for many software systems, although Google can often deal with changed word order.
  • No one considers copyright (or patent!) issues in student works. Universities and companies keep copies without the explicit permission that EU copyright law would require.
  • If the students are informed properly (orally and in writing) about the use of the software, they are happy about it being used.
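The word-shuffling trick mentioned in the conclusions above is easy to illustrate. The following is a toy sketch in Python, not a description of how Urkund, Genuine Text, or Google actually work. It assumes one matcher that compares runs of three consecutive words and another that compares only the set of words used. The function names and the example sentences are made up for illustration.

```python
import re

def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"\w+", text.lower())

def trigrams(text):
    """Return the set of all runs of three consecutive words."""
    w = words(text)
    return {tuple(w[i:i + 3]) for i in range(len(w) - 2)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def phrase_overlap(a, b):
    """Hypothetical matcher that is sensitive to word order."""
    return jaccard(trigrams(a), trigrams(b))

def word_overlap(a, b):
    """Hypothetical matcher that ignores word order entirely."""
    return jaccard(set(words(a)), set(words(b)))

source = "Students often copy whole paragraphs from online encyclopedias without citing them."
shuffled = "Without citing them, whole paragraphs from online encyclopedias students often copy."

print(f"phrase overlap: {phrase_overlap(source, shuffled):.2f}")
print(f"word overlap:   {word_overlap(source, shuffled):.2f}")
```

The two sentences use exactly the same words, so the order-blind comparison sees them as identical, while the phrase-based comparison loses more than half of its matches just from moving clauses around.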
I am attempting to contact the authors, who are no longer with the department, to see if I can find further information about the study.