arXiv:2002.04279 [cs.DL]
Monday, February 24, 2020
Testing of Support Tools for Plagiarism Detection
Thursday, December 17, 2009
Microsoft Admits Plagiarizing Code
Is this the current trend? Plagiarize, and then apologize if caught? I hope not.
Thursday, September 25, 2008
Three words suffice!
The system did not find our source, the Süddeutsche Zeitung, but the Swiss Tagesanzeiger. Putting just these three words into Google proved something I have been saying all along: three to five words suffice.

As it happens, Henrik Bork is the author of both identical articles. He sold one in March and one in April. The ethics of this is another discussion. But using plagiarism detection software (PDS) is so time-consuming that one really just needs to pick out phrases like this while reading and put them into a search engine. Full stop.
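The search-engine check described above can be sketched in a few lines of Python. This is only a minimal illustration, not a tool used in the tests: the function name `phrase_query_url`, the default base URL and the placeholder words are assumptions made for the example. It wraps three to five words in quotation marks so that the search engine looks for the exact phrase.

```python
from urllib.parse import urlencode


def phrase_query_url(words, base="https://www.google.com/search"):
    """Build an exact-phrase search URL from three to five words.

    `base` is an assumed default; any search engine that takes the
    query in a `q` parameter could be substituted.
    """
    if not 3 <= len(words) <= 5:
        raise ValueError("pick three to five words")
    phrase = " ".join(words)
    # Quotation marks ask the search engine for the exact phrase.
    return base + "?" + urlencode({"q": f'"{phrase}"'})


# Placeholder words; in practice, take a striking phrase from the paper.
print(phrase_query_url(["first", "second", "third"]))
```

The printed URL can be pasted into a browser. Choosing which phrase to test is still up to the reader of the paper.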
Wednesday, September 3, 2008
Plagiarism Detection Software Test 2008
- turnitin
- Ephorus
- Plagiarism-Finder
- Docoloc
- Urkund
- StrikePlagiarism
- TextGuard
- CopyScape
- WCopyFind
- CatchItFirst
- SafeAssign
- ArticleChecker
- JPlag
- PaperSeek
- YAPLAF
- AntiPlag
- PlagAware
- PlagiatCheck
- PlagiarismDetector
If you have plagiarism detection software you would like to have tested, please leave a link here or contact me. We will publish our results on September 30, 2008.
Saturday, February 2, 2008
A Legal Twist on Plagiarism Detection
This gives plagiarism detection an interesting twist: software that runs locally does not normally have its own database - so it basically is just doing the search machine searches for you, in which case you might as well be doing the testing yourself. It is conceivable that a university might start a papers database of the locally submitted papers, but that will only be of marginal use, as copying from the Internet would not be found.
I have heard that locally installed plagiarism detection software has trouble negotiating licenses with large search machine companies for fast, repeated searches. So maybe what we need is some sort of Plagiarism Workbench that helps teachers do their searches themselves, recording what they tested when and helping them do documentation.
But it seems there is no substitute for doing one's own searching. Since we are, one hopes, actually reading all the papers and not just assigning random grades, we might as well do a quick check after reading on a few paragraphs. As I have often shown: 3-5 nouns suffice.
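What the record-keeping part of such a Plagiarism Workbench might look like can be sketched roughly as follows. No such tool is described here in any detail, so every name in the sketch (`SearchRecord`, `Workbench`, the field names, the CSV export) is a hypothetical choice for illustration. It only records which phrase was tested for which paper and when, and exports the log for documentation.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import datetime, timezone


@dataclass
class SearchRecord:
    paper: str         # which paper was checked
    phrase: str        # the words put into the search engine
    checked_at: str    # ISO timestamp of the check
    source_found: str  # URL of a matching source, "" if none was found


class Workbench:
    """Hypothetical log of manual searches, kept for documentation."""

    def __init__(self):
        self.records = []

    def record(self, paper, phrase, source_found="", when=None):
        # Default to the current time in UTC if no timestamp is given.
        when = when or datetime.now(timezone.utc)
        rec = SearchRecord(
            paper, phrase, when.isoformat(timespec="seconds"), source_found
        )
        self.records.append(rec)
        return rec

    def to_csv(self):
        # Export the whole log as CSV text, one row per search.
        buf = io.StringIO()
        names = [f.name for f in fields(SearchRecord)]
        writer = csv.DictWriter(buf, fieldnames=names)
        writer.writeheader()
        for rec in self.records:
            writer.writerow(asdict(rec))
        return buf.getvalue()


wb = Workbench()
wb.record("paper-01", "first second third")
print(wb.to_csv())
```

The searching itself stays with the teacher. The sketch only covers the "what was tested when" bookkeeping.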
Tuesday, January 8, 2008
BitScan
- Viking - a trivial shake-and-paste plagiarism of one source: the source and only the source was found
- Döner - a complicated three-source plagiarism with Wikipedia: two mirrors of Wikipedia found, no other sources
- Jelinek - another three-source plagiarism with an automatic translation: one of the three sources found as the only source
- Djembe - an impossible (for machines) machine translation: nothing found
- Lettau - an easy plagiarism of the German Wikipedia (his publication list also appears 1:1 in the English-language Wikipedia): nothing found
- Blogs - a plagiarism from a pdf: nothing found
- Atwood - a trivial plagiarism from Amazon: found, along with some copies
Thursday, October 4, 2007
Another Plagiarism Detection Test
It was interesting to see that they tested some of the same systems we did, often having similar experiences with the systems although they only tested a few cases using computing-related texts. They did, however, have a very fine-grained points system looking at things such as legal issues and the technical basis for the server systems and the presence of licenses for using search machines such as Google or Yahoo.
Not surprisingly, Turnitin comes out on top. Why do I say "not surprisingly"? Well, JISC seems tightly entwined with NorthumbriaLearning, and the latter are the European resellers for Turnitin as well as a resource center for teachers. I am not quite clear on how close these two organizations are.
JISC did give this survey to an outside person to conduct, and had an academic advisory board look at the evaluation questions and suggest products to test. But the appendix entry on turnitin is a glowing sales document that avoids all of the issues with Turnitin (such as being overeager to store copies of papers in their database), whereas the others are more apt to have problems noted - problems that we, too, had in many cases.
The survey is still a very valuable collection of data - all the more so because they used questionnaires to elicit more data (or more refusals to give information) from the various companies. I am just curious as to how independent the study really is.
Update October 5, 2007: William Murray from NorthumbriaLearning has sent me this clarification of the relationship JISC/NL. Thanks, William, glad to post it!
"The relationship between JISC (the government funded Joint Information Systems Committee in the UK) and NL needs explaining. The confusion occurs because all JISC services are branded with JISC in front of them. We run JISC-PAS not JISC!
Turnitin won a national tender in 2002 put out by JISC to run a national detection service in the UK and Northumbria University (our original parent company) won a second national tender for the advisory service JISC-PAS (Plagiarism Advisory Service) that supports it.
We (Northumbria Learning) have been managing JISC-PAS and reselling Turnitin ever since with JISC’s endorsement. JISC wanted an independent survey to reaffirm (or otherwise) their support for their original choice of detection solution in 2002. NCC Group Ltd were chosen because they are independent of NL and JISC-PAS.
Within JISC-PAS our primary aim is to encourage holistic change within institutions through better information literacy, better course design, better research practice and better teaching of core skills. We happen to think that solutions like Turnitin provide the ‘ah-ha’ moment (Jude Carroll’s term not mine) that focuses the minds of all concerned. In my view detection is a change agent for better practice (I taught informatics at Northumbria University for ten years so I think this is a good thing. I would have loved to be able to use Turnitin, our class sizes were huge 300+ in some cases which made consistency in marking a nightmare). But specifically to address your points:
JISC are not entwined with Northumbria Learning, we run the JISC-PAS and Turnitin service on behalf of JISC.
NCC group ran an independent survey
NCC group allowed *ALL* providers to vet *ALL* the information in their report and agree it as factually correct before publication.
All providers were given the opportunity to improve their scores prior to publication
The extent to which they contributed ‘sales’ information was entirely up to the companies concerned.
Its aim was to identify which system could be deployed enterprise wide, with high volumes of through put and used on a national scale *in the
Having a central database was one of the reasons Turnitin was selected by JISC. This is why (in this context) it was not a flaw."
Thursday, September 27, 2007
Test of Plagiarism Detection Software
We held a press conference this afternoon, cutting over to the new version of the plagiarism portal and the E-Learning unit on plagiarism detection ("Fremde Federn Finden", in German) at the start of the conference. We had 5 reporters in attendance and many who requested virtual press materials. The online magazine "Spiegel Online" had requested that we write a summary article for them, so we just cut out sleep for a few days in order to get it done.
We have had a lot of interest from reporters and of course from the companies tested. If we learn of other systems, we will be glad to test them as we have time (which will be spare time, as the financing for this project runs out tomorrow), although the results might not be comparable, as the Internet is constantly changing.
Here is a copy of the ranking page:
Ranking
Excellent Systems
No system was ranked as excellent - but there have been many people who attended plagiarism detection seminars who scored 100% on the same tests!
Good Systems
Nr. 1 : Ephorus
Acceptable Systems
Nr. 2 : Docoloc
Nr. 3 : Urkund, Copyscape (premium), PlagAware
Nr. 6 : Copyscape (free)
Nr. 7 : TextGuard
Nr. 8 : turnitin, ArticleChecker
Nr. 10 : picapica
Unacceptable Systems
Nr. 11 : DocCop
Nr. 12 : iPlagiarismCheck, StrikePlagiarism
Nr. 14 : CatchItFirst
Friday, June 29, 2007
Test of Plagiarism Detection Software
For the repeat of the experiment I have 10 more papers and will be conducting tests over the summer of the following products from various countries:
- CopyCatch Investigator - UK - http://www.copycatchgold.com/index.html
- Doc Cop - Australian - http://www.doccop.com/
- Docoloc - German - http://www.docoloc.de/
- Ephorus - Dutch - http://www.ephorus.nl/
- Eve2 - Unknown, no address given - http://www.canexus.com/
- MyDropBox/ SafeAssignment - English - http://www.mydropbox.com/
- Plagiarism Finder - German - http://www.plagiarism-finder.de
- StrikePlagiarism - Polish - http://strikeplagiarism.com/
- TextGuard - German - http://www.TextGuard.de
- turnitin - USA - http://turnitin.com/static/index.html (iThenticate seems to be the same software but targeted to businesses and not schools - http://www.ithenticate.com/static/home.html)
- Urkund - Swedish - http://www.urkund.com/
The results will be published online in September 2007.
Tuesday, June 5, 2007
Plagiarism increased four-fold in Sweden
They cite a report put out by the Högskoleverket, the government university agency, which will be published next month. The report finds more plagiarism in term paper writing than cheating on exams. Under the Swedish system, students who are caught cheating or plagiarizing are brought before a board, the disciplinnämnden, which decides if punishment should be meted out. Punishment is suspension from school for a period of up to 6 months - usually pronounced just before exam time, so that the delinquent cannot take some exams.
Taking exams in Sweden is vital - if you pass 75% of the credits of your first year at college, you can get funding for the next year, and so on. So there is quite an incentive to get those 75% credits.
The university uses so-called plagiarism detection software for checking term papers; the article does not mention which one. Out of 15 400 papers submitted last year, 36 led to suspensions; in 2003 only 10 suspensions were pronounced. Last year's figure is a rate of 0.23 % - far, far below what teachers report when they hand-check term papers. Reports are accumulating that point to figures more in the 10-30% range.
The report continues that 3 of the suspended students took the university to court - and won, getting their suspensions rescinded. That looks to me like an 8% false positive rate in the software. Perhaps they need to look hard at their software, or find other methods - like using search machines - for assessing this problem.
You can't solve social problems with software - and most certainly not with software that is this bad.
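The two percentages quoted in this post can be rechecked from the reported counts. The short calculation below uses only the figures given above; the variable names are chosen for the example. Note that it computes plain ratios: reading the second one as a false positive rate is the interpretation made in the post.

```python
# Counts as reported in the post.
papers_submitted = 15_400
suspensions = 36
overturned_in_court = 3

# Share of submitted papers that led to a suspension.
suspension_rate = suspensions / papers_submitted

# Share of suspensions that were later rescinded in court.
overturned_share = overturned_in_court / suspensions

print(f"{suspension_rate:.2%}")   # 0.23%
print(f"{overturned_share:.1%}")  # 8.3%
```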
Sunday, March 4, 2007
Docoloc
I find it troubling, though, that software that purports to fight plagiarism itself uses a layout that is a blatant plagiarism of Google's layout...
Tuesday, December 26, 2006
Plagiarism "finding" software
But the article is still quite euphoric about using software. Sigh. My tests in 2004 were not encouraging - often, you could just flip a coin and be just as right about whether a paper was plagiarized or not. But many companies now scream "We are NEW! We are IMPROVED!", so I am forced to spend my summer term's research allowance (all of 4 hours a week, out of 18, off teaching to do research) repeating the tests. Stay tuned for the results after the summer break 2007.
Until then: just use a search machine and your brain. You will get better results.