Tuesday, March 8, 2011

It's Google's Fault

I was rather flabbergasted this morning when Spiegel called me for a comment on the press release issued by the first and second examiners of Karl-Theodor zu Guttenberg's doctoral thesis. Instead of letting the press frenzy die down, they fanned the flames. Here's a fine bit from the middle:
"While discussing the thesis one should keep in mind that it was not common to check dissertations [for plagiarism] with technical tools in 2006 and it is still not common today. In addition, the technical tools available in 2006 were barely able to detect plagiarism. Plagiarism software [sic] as well as other methods were not as highly developed as they are today. Even Google did not have the finely tuned searching methods that it has today. In particular, software that works with legal texts is still being developed. In the interests of all participants there will surely be technical examinations done on dissertations as well before [they are read by the examiners]."
(The entire text can be found at the FAZ site in German.)
After a good laugh I recovered enough to look up my first paper on plagiarism detection that was published in a real journal: "Kein Kavaliersdelikt: Wie man Plagiate entdeckt und was dagegen getan werden muß." Forschung & Lehre, 6/2003, S. 307-308. (Not a trivial offense - how to discover plagiarism and what to do about it). This magazine is the monthly publication of the association of university professors, Hochschulverband, and is widely read in German universities. And this was published way before zu Guttenberg handed in his thesis.

Google worked fine in 2003 for discovering plagiarism. Why, I even found plagiarism before the Internet was born! You notice the shifts in style while reading a text.

This casts a rather bad light on the situation in Bayreuth. Did they really not know that other universities were using the Internet and software? Sure, the software doesn't pick up much. But let's assume they took a good sample, say 10 pages. With the current tally showing plagiarism on 75% of all pages, about 7 of these 10 pages would contain plagiarism. Now, the software is only partially useful and finds only about 60-70% of the plagiarism, which would still flag 4 or 5 of those pages. So let's be conservative and assume that only 3 of the 10 pages tested came up plagiarized - and plagiarized big time. Shouldn't that ring a bell and trigger a more intensive investigation?
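The back-of-the-envelope numbers above can be checked with a minimal Python sketch. The function names are made up for illustration, and the model is a simplification: it assumes each sampled page independently contains plagiarism with probability 0.75, and that the software flags such a page with probability 0.6 to 0.7.

```python
def expected_flagged_pages(sample_pages, plagiarized_share, detection_rate):
    """Expected number of sampled pages that the software flags."""
    return sample_pages * plagiarized_share * detection_rate


def prob_at_least_one_flag(sample_pages, plagiarized_share, detection_rate):
    """Chance that at least one sampled page is flagged.

    Assumes pages are independent of each other, which is a simplification.
    """
    p_flag = plagiarized_share * detection_rate
    return 1 - (1 - p_flag) ** sample_pages


# Figures from the post: 10 sampled pages, 75% of pages plagiarized,
# software that finds about 60-70% of the plagiarism.
for rate in (0.6, 0.7):
    expected = expected_flagged_pages(10, 0.75, rate)
    chance = prob_at_least_one_flag(10, 0.75, rate)
    print(f"detection rate {rate:.0%}: about {expected:.2f} pages flagged, "
          f"chance of at least one flag {chance:.4f}")
```

Under these assumptions, a 10-page sample would be expected to turn up between four and five flagged pages, so the figure of 3 used above is on the cautious side, and a sample with no hits at all would be very unlikely.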

The comments on the newspaper web pages are caustic at best. The press release did a great disservice to science at large, demonstrating that this corner of the scientific world has remained blissfully unaware of what is going on around it.

I used a metaphor in a letter this morning: the entire Guttenberg affair has lifted the corner of the rug under which German academics have been sweeping their academic misconduct for ages. It is time to pull the rug out from under them, smack it clean and hose it down.


  1. I always thought that all sorts of "plagiarism detection" tools (software or a dedicated "plagiarism detection office") were standard. I guess I was wrong. This will surely change in the future, I hope.

  2. I had a very similar notion when reading the press release: Are these colleagues ignorant or just blissfully unaware? I understand the part about not expecting anything because we trust our grad students, and rightfully so because we would not get anything done if we were suspicious all the time ... But not getting suspicious when reading it ... in YOUR SPECIAL FIELD OF EXPERTISE? Hell, a much younger professor from Bremen got suspicious even before having read the complete thesis text! So much for your reputation as a scholar ...

    Initially I calmed down a bit when I thought 'maybe Google was indeed not that good then' ..., yeah right. I have used search engines since the mid-90s and so far I have nearly always found what I was looking for ... boy was that easy for KTzG. My worst student works harder to hide plagiarism (or avoid it) ... and now I am beginning to believe that Germany indeed has a huge problem with plagiarism.

    Poor Germany ...

  3. As you are teaching at a Fachhochschule I doubt whether you have any experience checking dissertations of 400 pages or more.

    Normally with a text of this length you will get so many "false-positive" results that the outcome is completely useless.

    And to be fair to the Bayreuth professors, you should mention that without Google Books (5 years ago!) Google was almost useless for dissertations in law, as you normally wouldn't find useful sources on the web.

  4. Actually, I do. We have a system of cooperative dissertations running. I was also officially brought in as a third "Gutachter" for a thesis that was suspected of plagiarism, which I was able to confirm.

  5. You are absolutely right, the excuse is just ridiculous. However, I would not expect a PhD advisor to routinely check dissertations using Google. In my field (computer science), professors usually work with the PhD candidates over the course of several years, so it is usually possible to say whether the candidate was actually doing the research or not (apart from the fact that at least some of the contents will have been previously published in a journal or at conferences, where there is also a quality check in place).
    In addition, an advisor should read the dissertation thoroughly, and therefore be able to discover changes in the writing style.
    Personally, I would use Google only if a specific reason for suspicion arises.

