Friday, October 12, 2012

Stumping Plagiarism Software

A correspondent shared with me an email exchange he had with Ephorus, the Dutch plagiarism detection software company. His school apparently pays good money to license the Ephorus system for general use.

Although Ephorus had given a student's paper a clean bill of health, the professor was not satisfied, so she sat down and googled. She found that over 30 % of the paper was plagiarized from online sources!

They wrote to Ephorus to ask how this could be. The answer is rather shocking: the texts aren't identical, you see. The punctuation was changed, and the student paper often had two blanks where the source only had one. Ephorus wrote:
The erroneous punctuation has implications for the effectiveness of the plagiarism scan. we [sic] will examine how large the effects are and what we can do about it.
Um, guys? If your system can be tricked by inserting an extra blank after every second or third word, we might just as well flip a coin to determine whether a paper is plagiarized. This does, however, confirm that false negatives are a big problem with Ephorus. In our test using the doctoral thesis of former German defense minister Karl-Theodor zu Guttenberg, in which the GuttenPlag Wiki determined that 63 % of the lines on 94 % of the pages were plagiarized, Ephorus reported only 5 % plagiarism.
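To make the point concrete, here is a minimal sketch of how a comparison can be made insensitive to extra blanks and altered punctuation. It is not how Ephorus works (I have no knowledge of their internals); it simply illustrates the well-known idea of normalizing text and comparing word n-grams ("shingles"). The function names `normalize`, `shingles`, and `overlap` and the sample sentences are made up for illustration, and only ASCII punctuation is stripped.

```python
import string


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation, and split on any run of whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def shingles(words: list[str], n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of all n-word sequences in the token list."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap(suspect: str, source: str, n: int = 3) -> float:
    """Fraction of the suspect's n-word shingles that also occur in the source."""
    suspect_shingles = shingles(normalize(suspect), n)
    if not suspect_shingles:
        return 0.0
    source_shingles = shingles(normalize(source), n)
    return len(suspect_shingles & source_shingles) / len(suspect_shingles)


# Hypothetical sample: same words, but doubled blanks and changed punctuation.
source = "The quick brown fox jumps over the lazy dog, then runs away."
suspect = "The quick  brown fox; jumps over  the lazy dog then runs  away"

print(suspect == source)          # exact comparison is fooled
print(overlap(suspect, source))   # normalized comparison is not
```

With this kind of normalization, the doubled blanks and swapped punctuation make no difference to the score, which is why it is surprising that such trivial changes could throw off a commercial system.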

