Tuesday, October 13, 2020

Plagiarism Detection Software: Publication, Mergers, News

Finally found some time for a post!

First off: The TeSToP working group (of which I am a member) at the European Network for Academic Integrity has finally published its test of support tools for plagiarism detection. It looks at the results from various angles, such as effectiveness on various European languages, single-source versus multi-source plagiarism, and the amount of rewriting done.

Foltýnek, T., Dlabolová, D., Anohina-Naumeca, A. et al. Testing of support tools for plagiarism detection. Int J Educ Technol High Educ 17, 46 (2020). https://doi.org/10.1186/s41239-020-00192-4

Abstract:
There is a general belief that software must be able to easily do things that humans find difficult. Since finding sources for plagiarism in a text is not an easy task, there is a wide-spread expectation that it must be simple for software to determine if a text is plagiarized or not. Software cannot determine plagiarism, but it can work as a support tool for identifying some text similarity that may constitute plagiarism. But how well do the various systems work? This paper reports on a collaborative test of 15 web-based text-matching systems that can be used when plagiarism is suspected. It was conducted by researchers from seven countries using test material in eight different languages, evaluating the effectiveness of the systems on single-source and multi-source documents. A usability examination was also performed. The sobering results show that although some systems can indeed help identify some plagiarized content, they clearly do not find all plagiarism and at times also identify non-plagiarized material as problematic.

So just a few months later these two press releases show up:

  • Turnitin announced in June 2020 that they have purchased the company Unicheck. Both systems participated in the TeSToP test.  
  • Urkund and PlagScan, two more systems that were in the TeSToP test, announced a merger in September 2020: They will now be known as Ouriginal, and will be combining the plagiarism detection results of Urkund with the author metrics of PlagScan. 

These four systems just happened to be the best ones in combined coverage and usability, although none of the systems is perfect, averaging 2.5 ± 0.3 on a scale of 0 to 5. We plan on retesting in three years, so it will be very interesting to see how these combined systems fare then.

In other news, the proceedings of the "Plagiarism Across Europe and Beyond 2020" (PAEB2020) conference, which ended up being held online instead of in Dubai, are now ready and available for download. PAEB2021 will be held in Vienna, September 22-24, 2021, COVID-19 permitting.

And in very sad news, academic integrity researcher Tracey Bretag from Australia passed away in October 2020. Jonathan Bailey has written an excellent obituary on his blog Plagiarism Today. I am glad that I was able to meet her many times and experience her great ideas and energy. It was a pleasure to contribute to her Handbook of Academic Integrity. She will be sorely missed.