Showing posts with label turnitin. Show all posts

Tuesday, October 13, 2020

Plagiarism Detection Software: Publication, Mergers, News

Finally found some time for a post!

First off: The TeSToP working group (of which I am a participant) at the European Network for Academic Integrity has finally published its test of support tools for plagiarism detection. It looks at the results from various angles such as effectiveness on various European languages, one source or multi-source plagiarism, and amount of rewriting done.

Foltýnek, T., Dlabolová, D., Anohina-Naumeca, A. et al. Testing of support tools for plagiarism detection. Int J Educ Technol High Educ 17, 46 (2020). https://doi.org/10.1186/s41239-020-00192-4

Abstract:
There is a general belief that software must be able to easily do things that humans find difficult. Since finding sources for plagiarism in a text is not an easy task, there is a wide-spread expectation that it must be simple for software to determine if a text is plagiarized or not. Software cannot determine plagiarism, but it can work as a support tool for identifying some text similarity that may constitute plagiarism. But how well do the various systems work? This paper reports on a collaborative test of 15 web-based text-matching systems that can be used when plagiarism is suspected. It was conducted by researchers from seven countries using test material in eight different languages, evaluating the effectiveness of the systems on single-source and multi-source documents. A usability examination was also performed. The sobering results show that although some systems can indeed help identify some plagiarized content, they clearly do not find all plagiarism and at times also identify non-plagiarized material as problematic.

So just a few months later these two press releases show up:

  • Turnitin announced in June 2020 that they have purchased the company Unicheck. Both systems participated in the TeSToP test.  
  • Urkund and PlagScan, two more systems that were in the TeSToP test, announced a merger in September 2020: They will now be known as Ouriginal, and will be combining the plagiarism detection results of Urkund with the author metrics of PlagScan. 

These four systems just happened to be the best ones in combined coverage and usability, although none of the systems are perfect, averaging 2.5 ± 0.3 on a scale of 0 to 5. We plan on retesting in 3 years, so it will be very interesting to see how these combined systems fare then.

In other news, the proceedings of the "Plagiarism Across Europe and Beyond 2020" (PAEB2020) conference, which ended up being held online instead of in Dubai, are now ready and available for download. PAEB2021 will be held in Vienna, September 22-24, 2021, COVID-19 permitting.

And in very sad news, academic integrity researcher Tracey Bretag from Australia passed away in October 2020. Jonathan Bailey has written an excellent obituary on his blog Plagiarism Today. I am glad that I was able to meet her many times and experience her great ideas and energy. It was a pleasure to contribute to her Handbook of Academic Integrity. She will be sorely missed.

Saturday, June 21, 2014

6IIPC - Conference

I previously reported about the pre-conference of the Sixth International Integrity and Plagiarism Conference in Newcastle upon Tyne. I will now discuss some of the talks that I was able to attend.
There were four keynotes at the conference:
  • Toni Sant from Wikimedia UK spoke about student online research, aka using the Wikipedia. I was astounded at how many educators in the room were not very familiar with various aspects of how the Wikipedia is researched and written. Toni suggested that teachers have their students write articles for the Wikipedia – I strongly objected to that in the discussion, as the subsequent deletion of articles that are not encyclopaedic will frustrate the students.
  • Tricia Bertram Gallant, the academic integrity officer at UCSD, gave a fantastic talk about integrity for the "Real World." She pointed out that people cheat, period. We have to quit pretending that we are only interested in academic integrity, that is, integrity that is only valid in school. Instead, we need to reframe our thinking and focus on building integrity for the real world and not just for school. Our students cheat and plagiarize because they are human; we need to help them obtain skills in acting in an ethical manner in any situation, not just academic ones.
  • Samantha Grant presented parts of her documentary about Jayson Blair, a New York Times journalist fired for plagiarism, called A Fragile Trust. She and Teddi Fishman from the International Center for Academic Integrity then discussed questions that arose from the film. Samantha is now producing a game for journalists called Decisions on Deadline that presents ethical dilemmas for students to solve. The Society of Professional Journalists even has a hotline that journalists can call when they need to speak to someone anonymously.
  • Dan Ariely gave us the honest truth about dishonesty via video conference: We lie. We don't steal if given the opportunity, but if we think we can get away with something, we lie through our teeth, according to the many studies he has conducted. He suggests that we as educators need to teach our students about temptation and how to deal with it.
In between the keynotes there were nine paper sessions of five papers or workshops each. Unfortunately, the program had some glitches, such as both papers about finding plagiarism in Arabic being scheduled in parallel with each other or three workshops in the area of embedding institutional policy and practice offered at the same time.
One talk was especially amusing: Rui Silva-Sousa from Portugal spoke about whistleblowers on plagiarism and the moral grey area. That is, he was speaking about GuttenPlag Wiki or VroniPlag Wiki, among others. He notes that there is currently a moral panic with respect to plagiarism. The general population perceives an increase of plagiarism among politicians on the basis of media coverage. This legitimizes the culture of control and people will now more than ever report wrongdoing, especially for egoistic reasons, on the part of people who are now in the public eye. He tried to explain the motivation of the researchers documenting plagiarism, and decided they are somewhere between weird mobbers and serious scientists. They must be acting on ethical egoism and through their making the cases public, can cause excessively harsh results in the life of the person who plagiarized. He felt that knowing the names of the whistleblowers would make it easier to judge the morality of their work. 
I noted in the discussion that he was completely ignoring the person whose work was plagiarized, and that a thesis was plagiarized regardless of who speaks up about this fact. During a discussion over lunch we cleared up some misconceptions, the usual ones such as VroniPlag Wiki not only documenting politicians, and such. He admitted to not having looked at the sites that closely. I do wish that people would observe carefully before coming up with wild theories.
Mike Reddy, who teaches Games Development at the University of South Wales, gave a session on putting the "play" into plagiarism. We were to develop a game concerning some aspect of academic integrity within the hour. Our group didn't do too badly: we came up with a game we called "Freeloader", similar to Spoof, for 5–6 players (the size of a typical student project group). Each person has three coins and behind their backs chooses how many coins to hide in their fist and put out into the middle, representing their contribution to the project. Each person starts out with three peanuts/candies/whatever. Each person guesses how many coins in total are now in the middle; no two guesses can be the same. All fists are opened and the coins counted – if you guess right, you get a candy from everyone else in the group. If you run out of candies, you get a dog's chance (one last round). If there are only two people left, the number of candies you have to surrender upon being wrong is increased by one each round, so that a winner is found quickly who will get the top grade (i.e. a stash of candies).
Phil Newton from the University of Swansea gave a workshop about paper mills and custom-writing companies. He showed live demonstrations of things that are available for sale. In a nutshell: If we are asking for it (such as research diaries or multiple revisions), there is someone out there willing to sell it, and the less time there is left to complete it, the more expensive it is. We got into groups and tried to come up with ideas that focus more on the learning and less on producing items that can be easily ghosted. The ideas ranged from only giving examinations, using peers to police, flipped classrooms, thinking positively, using progression portfolios, decreasing the price of doing the right thing, and increasing the fear factor: if we catch you, it will hurt. In all, we didn't come up with THE solution, but it was good to commiserate with others about the problem.

It was great to meet old friends and meet new people interested in plagiarism, although it was sad to have to miss so many sessions. The conference was co-sponsored by Turnitin and ICAI, so of course many of the talks dealt with Turnitin. It was rather shocking to see how many newish users were so sure that the so-called "similarity index" that Turnitin reports is the true value of "plagiarism" in a paper. Some schools even define Turnitin similarity index levels for determining the sanction to be meted out. However, people with more experience using the system often temper their words. They understand that the number does not mean anything, really, and that the software is just a tool. Even Turnitin has started to speak of itself as a text-matching software in some instances. I suggested to one of the Turnitin top brass that they ditch the number and focus on what their system does best: find matching text strings (and not plagiarism!). Turnitin has just recently been acquired by a venture capital company, so they have some money to invest in making the product better. I hope that the focus will be on the usability and the reports and not on suggesting that they find more plagiarism. The decision as to whether something is to be considered plagiarism or not must rest firmly with the instructor and the institution, not with a software package.

Jonathan Bailey has blogged extensively on Day One - Day Two - Day Three of the conference. 

Sunday, March 23, 2014

Link collection

A few links, submitted by a reader:
  • An article in the Guardian "How can universities stop students cheating online?"
    The author of the article believes all the marketing nonsense that MOOC-offering organizations can detect the students' identities by their typing and a web-cam picture. Hogwash. I can feed any film of myself into a stream and make it look like I am currently being watched. Even if they can identify my typing speed, they can't see the person standing across from me or the paper I just purchased online. That's why in Germany we proctor exams for online courses by having the students show up somewhere where a proctor will be watching over what they are writing.
  • A student at a Christian university blogs about plagiarism. The article quotes statistics from a site that I caught in 2010 (and in 2008) taking money from students to "check their papers", then submitting them to Turnitin, and doctoring the report to make it look like it was from their software. Turnitin put a honeypot paper in their database in 2010 and we checked all of the software systems with this paper. Only this system returned "100% plagiarism".
    The student blogger repeats myths such as "Professors use turnitin.com to screen student work before it is graded. If the work is original, it passes, but if it is derived from another paper, the teacher is made aware of it." This is just not true. Even if they call it an "originality score", one cannot ever prove originality. One can only prove plagiarism by finding a close source that was previously published.
  • From India: Two PhD guides found guilty of plagiarism: "Two professors from Zoology department working at an Ahmednagar-based college affiliated to the University of Pune, have been stripped off their status as PhD guides and two increments have been stopped, after they were found guilty of plagiarism."
  • Rodney Smith, in a bid to become president of the University of the Bahamas, tried to explain away the plagiarism in a speech he gave in 2005 while president of New York University. He was forced to resign over that incident.  It was a small mistake, he says, it was the writer of the speech who was at fault, it was the press' fault, etc.

Thursday, March 27, 2008

Ruling in Student Suit against Turnitin

The United States District Court for the Eastern District of Virginia has pronounced judgement in the case of four students who sued iParadigms, the company selling the Turnitin plagiarism detection service, for violation of copyright. The opinion can be read online at http://www.nacua.org/documents/AV_v_iParadigms.pdf. Two blog articles brought this to my attention, Plagiarism Today and ©ollectanea.

The minors filed suit against iParadigms alleging copyright infringement, because they are required to give iParadigms the use of their own papers as part of their school's fight against plagiarism.

iParadigms filed a countersuit accusing the students of all sorts of hacking violations, as one of the four had used an account to submit his paper that was not his school account, but one he found on the Internet, and other details. They had, for example, agreed to the terms of service with no modifications allowed, then put a modification at the top of their papers forbidding Turnitin from keeping a copy of their papers.

The way Turnitin works is that a school contracts with Turnitin for plagiarism detection services, although there is no guarantee that Turnitin can indeed find all plagiarisms - and my test of the service in 2007 was rather sobering, as they were only able to determine correctly whether there was plagiarism or not in half of the 20 test cases.

The teachers set up a submission account with Turnitin, and the students are required to set up their own account and submit their papers to Turnitin, which sends an "originality report" to the teacher with the paper and keeps a copy for its own database. The students are required to agree to a "Clickwrap Agreement" giving Turnitin permission to so use their papers. There is no opt-out option, that is, the student cannot decide if they want Turnitin to keep a copy of their papers or not. This is a decision the school and/or the teacher makes for the student.

There is also a "Usage Policy" that is linked to, but which does not have to be agreed to explicitly, meaning that you have to pay their legal fees if anything you do in using the site violates third party rights or breaches the contract they just forced you to "sign".

The judge decided, rightly so in my opinion, that the students have no case against Turnitin, but rather against the school, which is forcing them to use this service in this manner or get a zero mark for the assignment. But the cynical remark quoted from another case ("[i]f parents do not like the rules imposed by those schools, they can seek redress in school boards or legislatures; they can send their children to private schools or home school them; or they can simply move") completely disregards that many poorer families have no other choice than to use the public schools where they reside.

The judge also ruled that Turnitin's use of their original papers was "fair use". I quite disagree with this. The judge noted that the papers were just stored in "digital code", so they were not published as papers. I see fair use as being using a portion of a text, not the text in its entirety. And the expression used in the original paper is exactly the same if it is written on a typewriter, or stored in ASCII code. If they were only storing a hash code, this might be different, although that could be construed as a derivative work.

I object to a student being forced to give Turnitin their own original expressions for Turnitin's use for making money. The judge noted that they can still do what they want to with their works - but that is not the point for me. Digital media changes intellectual property, and it is time that the courts begin to understand this. Turnitin is making money off of the honest students' work because there are so many cheaters and teachers don't want to do the plagiarism detection themselves, but leave it to a company that promises more than it can be expected to deliver.

The judge states: "iParadigms' use in no way diminishes the incentive for creativity on the part of students. On the contrary, iParadigms' use protects the creativity and originality of student works by detecting any efforts at plagiarism by other students". Why on earth a judge would take advertising copy to be the truth is beyond me. It is impossible to detect "any efforts at plagiarism", one can only get lucky and detect some attempts.

Continuing, the judge found no basis for iParadigms' countersuits. The "Usage Policy" is not binding, as they could not prove that the students saw it. The accusations of hacking were found to be unfounded. The students did agree to the "Clickwrap Agreement" - this is why the charge against iParadigms is unfounded.

I do not understand why iParadigms cannot respect the intellectual property rights of students and give them the choice of deciding if they want Turnitin to keep a copy or not. I hope the students file suit against their respective schools. The schools need to learn that their job is to teach students about plagiarism and to themselves check for plagiarisms, not turn the responsibility for this over to a third-party. Perhaps they need to start thinking about new ways of assessing what students have learned, instead of asking for the same old papers over and over again.

An interesting point to the fair use analysis is the "transformative" use for the greater good. This is, for example, what Google and Co. do when they index your site. They make copies for the greater good. Of course, there is an opt-out: you can set a robots.txt file, and well-mannered search machines will keep out. But the result of this is that there is a much broader fair use possible as seen by the court than many copyright owners think they have under current laws. It will be interesting to see download sites for music and films using exactly this fair use defense - a transformative use for the greater good - the next time the music companies sue them.
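To make the robots.txt opt-out concrete, here is a minimal sketch of how a well-mannered crawler might check a site's rules before fetching a page, using Python's standard `urllib.robotparser`. The rules, the bot name `ExampleBot`, the `example.org` URLs and the helper `may_crawl` are all made up for illustration; real search engines have their own implementations.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every crawler is asked to stay out of /papers/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /papers/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler asks before fetching each page.
print(may_crawl(ROBOTS_TXT, "ExampleBot", "https://example.org/papers/thesis.html"))
print(may_crawl(ROBOTS_TXT, "ExampleBot", "https://example.org/index.html"))
```

Note that this only works for crawlers that bother to ask: robots.txt is a request, not an enforcement mechanism.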

Friday, March 30, 2007

Students sue iParadigms LLC over alleged copyright violation with turnitin

The Washington Post reports on a case of four high-school students, two in Fairfax County, Virginia, and two in Arizona, who are taking iParadigms LLC to court over copyright violations. (Also blogged on RealTechNews in English and Golem in German.)

iParadigms LLC sells the plagiarism detection service "turnitin", which retains a copy of all papers submitted for use in future plagiarism detection. I obtained a legal opinion in 2004 from the Intellectual Property Helpdesk of the EU which noted that such a service is illegal according to EU copyright law (which is, technically, not copyright but Urheberrecht or droit d'auteur, author's rights in the French tradition) as it forces the creator of a work to do something with the work against his or her will.

In the US, copyright is the right to make copies and can actually be sold to a person, natural or legal. The students first registered their papers with a copyright authority before submitting them with explicit instructions not to store the papers, which was ignored by iParadigms.

The company, of course, insists that it is not violating the students' rights, so this is a perfect situation for a legal test: straight-A students objected to being considered plagiarists, to having to prove their innocence instead of being assumed innocent until proven guilty.

Students at other schools such as McGill and Mount St. Vincent University in Canada have successfully protested the use of turnitin without getting a court opinion. This case may eventually make its way to the Supreme Court so that this bit of copyright law can be determined. The students are to be lauded for their courage in exercising their rights in this question.

Wednesday, December 13, 2006

In Criticism of turnitin

I have been telling all the journalists that call me for some time now that I don't believe in solving social problems - in this case, plagiarism at university - with software. It gives the illusion of a solution without actually doing anything about solving the basic problem, namely, that many students do not have any idea how to do research or how to write up something.

I found this blast against turnitin, the multi-million-dollar plagiarism detection service that conveniently keeps copies of all of the material offered to it for testing, so that it has more good stuff for its database. Scroll down to the scenarios - these are not just made up, I had a similar story reported to me just the other day.

A student had turned in a pre-version of his paper to his professor, who checked it against turnitin - no problem, seemed original. When the paper was finished, he turned it in to the office. Official policy was to run a check before giving the paper to the professors to read and grade - and guess what, the paper now registered as a "high-probability of plagiarism" - because it was very similar to the first draft stored in the database.
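The effect is easy to reproduce with any text-matching routine. The following is only a rough sketch using `difflib.SequenceMatcher` from the Python standard library; it is not how turnitin computes its numbers, and the sample sentences and the helper `similarity_ratio` are invented for illustration.

```python
from difflib import SequenceMatcher


def similarity_ratio(a: str, b: str) -> float:
    """Return a value between 0 and 1 for how much of the two texts matches, word by word."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


# Invented example: a stored first draft, the finished paper, and an unrelated text.
draft = "Plagiarism detection software finds matching text but cannot judge intent"
final = ("Plagiarism detection software finds matching text "
         "but it cannot judge the intent of an author")
unrelated = "The conference was held online this year"

# The finished paper matches its own earlier draft far more than an unrelated text.
print(similarity_ratio(draft, final))
print(similarity_ratio(draft, unrelated))
```

A high number here says only that the two texts share a lot of wording. Whether that is plagiarism, or just an author's own earlier draft, is something a person has to decide.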

I don't want to repeat the article here - just get over there and read the article yourself, following all the links to convince yourself that this is not just an angry competitor writing.