Friday, January 30, 2015

A Patchwork Thesis

In 2008, a graduate of a German Fachhochschule (University of Applied Sciences, abbreviated FH) submitted a dissertation to the Tomáš-Baťa-Universität in Zlín, in the Czech Republic, a good 800 km from his place of residence. At that time he was in his mid-40s and had been working as a public official for the past 18 years.

One may wonder why a mid-career public official would go so far afield to obtain a doctorate, when there are excellent universities near his hometown. In Germany at that time, a Diplom from a Fachhochschule was not sufficient for admission to doctoral work. Often extra coursework was required, or a Diplom or Master's at a university even had to be completed before beginning work on a doctorate.

Today, the Diplom is no longer offered, but instead Master's degrees from both universities and FHs are acceptable for beginning work on a doctorate at a university. Then as now, however, doctorates earned in other EU countries can be used back home, so there is quite some interest in obtaining degrees outside of Germany.

Zlín offers a four-year doctoral program in Management and Economics that charges 1,600 €/year in tuition and requires submission of a written dissertation, which is generally published online in the Zlín Digital Library. One of these dissertations, the 100-page one submitted by the German public servant, is now documented as case #140 on the VroniPlag Wiki site.

The "barcode" representation for the manual VroniPlag Wiki documentation for the case looks quite a bit like a patchwork quilt. This barcode is often misunderstood as being the result of a software-based plagiarism investigation. Nothing could be further from the truth: All discovery and documentation is done manually with the help of small software tools for various tasks, and all documentation is reviewed by a second researcher.

The barcode uses five different colors:
  • white is for pages that have not yet been investigated, or for which nothing has yet been found;
  • bright red is for pages that contain text parallels on over 75 % of the lines of text on a page. The line counting is not automatic, but must be done and reviewed by two researchers;
  • dark red is for pages that have text parallels on between 50 and 75 % of the lines;
  • black is used for pages that have text parallels, but that make up 50 % or less of the page;
  • blue is used for pages that are excluded from consideration. These are normally the title pages, the table of contents, the literature list and any appendices.
For this dissertation, there are some additional blue bands: Towards the end of the thesis there is a list of abbreviations and one of figures and tables that are sandwiched in between pages of content.
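The color rules above can be written down as a small lookup, which may help make the thresholds concrete. This is only a sketch of my reading of the rules, not a VroniPlag Wiki tool: the function name and parameters are made up for illustration, the line counts are assumed to have been determined and reviewed by hand, and the treatment of pages at exactly 50 % or 75 % follows the wording of the list.

```python
def barcode_color(lines_with_parallels, total_lines, excluded=False):
    """Return the barcode color for one page (hypothetical helper).

    Both line counts are assumed to come from manual counting that a
    second researcher has reviewed; nothing here detects plagiarism.
    """
    if excluded:
        # Title pages, table of contents, literature list, appendices, etc.
        return "blue"
    if total_lines <= 0:
        raise ValueError("a page under consideration needs at least one line")
    if lines_with_parallels == 0:
        # Not yet investigated, or nothing found so far.
        return "white"
    share = lines_with_parallels / total_lines
    if share > 0.75:
        return "bright red"   # parallels on over 75 % of the lines
    if share > 0.50:
        return "dark red"     # parallels on between 50 and 75 % of the lines
    return "black"            # parallels on 50 % or less of the page


if __name__ == "__main__":
    # Example pages with 40 lines of text each.
    for marked in (0, 10, 20, 30, 31):
        print(marked, "of 40 lines:", barcode_color(marked, 40))
```

The point of the sketch is that the only "software" part is the final coloring step; everything that feeds into it is manual work.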
[Msc 2008]
The two blue bands after the first few pages are quite interesting. The first one, extending from page 13 to page 21, is taken verbatim from the Catechism of the Catholic Church. The second one, running from page 26 to page 33, is verbatim from a European Commission document. Discounting these pages, as each one has a brief reference given when the copy begins, there is only a total of 65 pages of content in this dissertation.
The patchwork continues when looking at the individual pages, as there are also problems on many of those pages. For example, extensive swatches of text are taken verbatim from seven Wikipedia articles without reference. Pages 23–25, which deal with some topics in German history, are lifted entirely from Wikipedia with only minor adjustments. Fragments are taken from a journal article that appeared in the Academy of Management Review. There are occasional references to the article given in the text, but it is not made clear that pages 56–63 are almost entirely from this article, and taken verbatim. Page 58 includes an interesting copy & paste error: the printed version of the article has a footnote from the previous page continued at the bottom of the left-hand column. In the dissertation, this text can be found sandwiched in between the text from the left-hand column and the text from the right-hand one. The sentences thus make no sense whatsoever.

On page 72, the Daimler-Benz sustainability report is copied with the "we" pronouns changed to "they" or "their" or "Daimler".

Pages 77–79 are taken verbatim from a discussion paper for the European Sustainable Development Network Conference 2008. A copy & paste error on page 79 caused quotation marks from the original to be reproduced as | or —.

Whenever the writing shifts from proper English sentences to word-for-word literal translations from German, the thesis becomes quite unreadable. I quote from page 50:
Germany applies in 2001 above a surface of 357,020 sq. kms, the population around catches 82,330,000 million people (2000: 82,260,000). Of it 40,326,000 persons (49.1%) were gainfully employed in 2000. In 2001 there were 2.4% of the employed persons in the agriculture, forestry and fishery, 22.0 % in the producing trade without the building trade, 6.7 % in the building trade, 25.4 % in trade, guest's trade and traffic, 15.2% in the area of financing, renting and services for companies and 28.3% in the sector of public and private service providers (cf. "Germany in figures" in 2002). While trying to explain how many of these persons have been employed in small and middle companies or to define the boundary between them and large companies in Germany one pushes fast to his borders, because there are not enough actual statistical facts offered from the Statistical Federal Office16. Merely on data delivered by the Institute Of Middle Class Research17
[Note: The institution referred to in the last sentence is the Institut für Mittelstandsforschung in the original. It actually researches small and medium-sized enterprises, not the middle class.]

The University of Zlín publishes the reviews by the thesis examiners online, a commendable gesture. Two excerpts are documented in the Findings section on the VroniPlag Wiki:
  • Review 1 (17.11.2008): "Author considers that CSR [Corporate Social Responsibility] is suitable way how to change current managerial thinking which describe from different point of views, e. g. historical progress in religious aspect."
  • Review 2 (17.11.2008): "The dissertation is written very cultivate, digest at the high academic level."
I beg to differ. I don't find this patchwork of other people's words to be either at a high academic level or acceptable scholarship. Above all, it is a mystery to me that people are not aware that when they publish their works in a digital library, they are available world-wide for discussion. Does no one read the theses critically before publication?

The University of Zlín has been informed of the situation and has been sent this report containing all the documentation produced manually by VroniPlag Wiki about the dissertation. The university promptly acknowledged the receipt of the documentation by email. Many other universities, sadly, don't manage to do even that.

Saturday, December 20, 2014

Christmas Links

I seem to be getting more and more links I can't adequately deal with, but which I don't want to withhold from readers. So here is some Christmas reading:
  • The "Neuroskeptic" blog of Discover Magazine has a piece about The Strange Case of “Publication Integrity and Ethics” which details a number of integrity and ethics questions around the supposed new journal.
  • The Times Higher Education has a piece on post-publication peer-review that describes more of the chilling consequences that occur when lawyers meddle with scientific inquiry. Physics professor Philip Moriarty is quoted with: “If you are publicly funded and you put your research into the public domain but no one can criticise you for it without facing legal proceedings, that seems to me to be a very badly damaged system.” Exactly.
  • Retraction Watch obtained a $400,000 grant to set up a retractions database! This is great news. I hope that the database can be used to calculate a Retraction Index, that is, how many retractions a journal has per article published, and perhaps how long it took for the retractions to take place after the journal was first informed.
  • Bernd Kramer recently published a book in German about obtaining a doctorate in Germany without doing the work ("Der schnellste Weg zum Doktortitel. Warum selbst recherchieren, warum selbst schreiben, wenn's auch anders geht?"). The cover is a horrible stock photo, but the book makes quite interesting reading. Kramer gave an interview in Deutschlandradio in November 2014 about it.
  • Reports of fake peer reviews are increasing. Vox has an article about 110 papers retracted in the past two years on account of faking peer reviews. Retraction Watch reported on SAGE publishers retracting 60 papers from just one journal for this reason. The Minister of Education in Taiwan, Wei-ling Chiang, had been added to some of these papers as a co-author (he says without his knowledge). He stepped down because of the scandal in July 2014, according to IEEE Spectrum.
  • Taipei Times reported in August of 2013 that Andrew Yang, the former Taiwanese Minister of National Defense, was forced to resign in a plagiarism scandal a few days after taking office. He had published a book in 2007 that friends had ghostwritten for him. They had, however, plagiarized large parts of the book.
  • The University of Nevada in Las Vegas fired an English professor for "serial plagiarism." The student newspaper, The Rebel Yell, also reports on the case.
  • At the end of November 2014, the Vice Chancellor of Delhi University in India was jailed and released on account of plagiarism.
  • There is a nasty case of plagiarism reported from early 2014 at the Chicago State University. The dissertation of the Senior Vice President and Provost of the university was being investigated, and the university confirmed to the press that they were doing so. She sued the university for violating privacy laws, stating that she did not plagiarize [1]. There exist documentations of plagiarism in her dissertation in a blog ([2] - [3] - [4] - [5] - [6]). Despite the documentation, the University of Illinois, Chicago has ruled that her dissertation is not a plagiarism ([7]). The Chicago Tribune had three plagiarism experts (Tricia Bertram Gallant, Teddi Fishman, and Daniel Wueste) look at the thesis ([8]). All three find the thesis problematic. The question is, are the students to be held to a different standard than the person who is enforcing that academic standard? A thorny question.

Monday, December 1, 2014

Diverse links

Here are some links that need documenting:
There will be more, I'm afraid, to come. 

A visit to the Academy

The Berlin-Brandenburg Academy of Sciences and Humanities invited me to speak this past week at a non-public meeting about plagiarism detection software for the working group Zitat und Paraphrase (Quotation and Paraphrase). I was a bit leery of speaking there, as some of the members of the group have publicly demonstrated a quite problematic interpretation of plagiarism as far as it concerned the dissertation of one particular person (see [1] - [2] - [3] for detailed online articles in German about this particular case, and a recent German essay [4] that compares this plagiarism case with one from the early 90s).

Since I do enjoy a good discussion, I agreed to speak. Unfortunately, the meeting was not open to the public, so I am only able to repeat the points of my presentation here, not the ensuing discussion. As it turned out, there were not many members of the group there, and none of the vociferous members I had been expecting.

I first made it exceedingly clear that VroniPlag Wiki is not a machine or software of any sort, but an academic community in which I take part. After discussing Teddi Fishman's definition of plagiarism, which I would amend by adding "without properly attributing the work" to point 3 and removing point 5 entirely, I gave a few examples of some of the different forms of plagiarism. These were followed by screenshots of a few plagiarism detection systems that have complicated reports or report essentially meaningless numbers.

One important point that is often overlooked when using such systems is that they all suffer from both false positives and false negatives: This is an inherent problem with attempting to determine plagiarism using software. Quotations are difficult to detect reliably, especially if they are only indented; literature references should of course be similar to references used in other papers; and some systems begin to mark anything longer than 6 or 7 words as text similarity. All of these can be the source of a false positive, in addition to simple programming errors, which I have also seen. The other side of the coin is the false negatives, and they are quite simple to understand: If the software does not have access to a source, it will not be able to determine that it is indeed a source. Translated text, for example, is next to impossible to identify with software, as is non-digitized content.
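To make the two kinds of error concrete, here is a toy sketch of a word-sequence matcher. It is not how any particular plagiarism detection system works; the function names, the window of 6 words, and all of the example strings (including the made-up literature reference) are assumptions for illustration only.

```python
def word_ngrams(text, n=6):
    """Return the set of all runs of n consecutive lowercased words."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_ngrams(suspect, source, n=6):
    """Return the word n-grams that occur in both texts."""
    return word_ngrams(suspect, n) & word_ngrams(source, n)


# False positive: two papers legitimately cite the same (made-up) reference,
# so the reference itself shows up as "text similarity".
reference = ("Author A 2010 An Example Title About Small And Medium "
             "Sized Enterprises Journal Of Examples")
paper_a = "as argued in " + reference
paper_b = "see also " + reference
false_positive = shared_ngrams(paper_a, paper_b)

# False negative: a literal translation shares no word sequence with its
# source, so a matcher of this kind reports nothing at all.
source_de = "Die Abgrenzung zwischen kleinen und großen Unternehmen ist schwierig"
translated = "the boundary between small and large companies is difficult to define"
false_negative = shared_ngrams(translated, source_de)

if __name__ == "__main__":
    print("matches on the shared reference:", len(false_positive))
    print("matches on the translation:", len(false_negative))
```

A source that is not in the comparison corpus at all behaves like the translation case: there is simply nothing for the software to match against, which is why a clean report proves nothing.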

I then discussed the small, general tools that can be used to manually detect and document plagiarism. After a few examples of documented plagiarism from historic cases and from current cases at VroniPlag Wiki, I closed by asking some ethical questions that I include in my forthcoming chapter on plagiarism detection software for the "Handbook of Academic Integrity":
  • Is it necessary to find all the plagiarism in a text?
  • Is it ethical for a university to use plagiarism detection software?
  • Is it ethical for a university to use plagiarism detection software as a formative device?
  • Is it ethical for a university to offer plagiarism detection software for teachers to use?
  • Is it ethical for a university to offer plagiarism detection software for researchers to use?
We had a good discussion afterwards. Unfortunately, there was no time to linger on and talk further over a cup of coffee. I do hope that those who were present can serve as multipliers, explaining to their peers that there is no magic silver bullet software for finding plagiarism, just a number of useful tools, large and small, that all incur a cost of time and effort to use.

[1] Causa Schavan (n.d.) Articles about "Zitat und Paraphrase." [Blog]. Retrieved December 1, 2014, from
[2] Erbloggtes. (n.d.) Articles about "Zitat und Paraphrase." [Blog]. Retrieved December 1, 2014, from
[3] Dannemann, G. (2013, March 3). Die Ex-Ministerin und ihre Unterstützer: Schavanzentrisches Weltbild. Retrieved December 1, 2014, from
[4] Ebert, T. (2014). Sag mir, wie hältst Du es mit dem Plagiat? Von Elisabeth Ströker zu Annette Schavan. Merkur, 68(12), 1070–1080.

Thursday, November 20, 2014

French journalism school executive suspended during plagiarism investigation

The Guardian reports that Agnès Chauveau, an executive from a journalism school in Paris, has been suspended for plagiarizing in columns that she published for the French-language web site Le Huffington Post.

The columns in question have been updated with a notice that the references have now been fixed:
Update: This post is a republication of a column written and read each Sunday on France Culture. Some references were already missing from the spoken version. They were added here as soon as these errors were pointed out, so that the quotations and sources appear more clearly. [Note: translated from the French original.]
The Institute of Political Sciences has launched an inquiry and suspended her during the inquiry.

Chauveau is said to have lifted material from various online and printed publications for her weekly radio show, then re-used the texts for her online column. Chauveau is quoted as having said that she had “forgotten to cite certain papers, but never on purpose”, and insisted: “I’ve rectified this each time there’s been a problem.” According to the Guardian, she is quoted as not having had the time “to cite all of her sources on the radio.”

In French media, there are articles in Liberation (with the quotations in French: «J’oublie de citer certains papiers mais ce n’est jamais volontaire et je rectifierai chaque fois que ça pose problème.» Elle a aussi expliqué qu’elle n’avait «pas le temps de citer à l’antenne toutes [ses] sources».) and Le Monde.

Saturday, November 15, 2014

Münster tackles plagiarism problem head-on

The medical colloquium for advanced medical students in Münster, Germany, invited me to speak about plagiarism there on Nov. 15, 2014. VroniPlag Wiki has identified 23 medical doctoral dissertations from the University of Münster to date that have extensive text parallels that could constitute plagiarism, including one thesis with plagiarism on 100% of the pages. This was widely reported on in the local media, so instead of just inviting me, they decided to open the seminar to all members of the university, and invited alumni and the general public to attend as well.

Imagine 120-130 people in a typical medical school lecture theater with steep seating on a Friday afternoon at 4 pm. I was glad there was so much interest in the topic, and that the dean of the medical school, Wilhelm Schmitz, participated actively in the discussion.

After introducing the topic and noting that Münster has had plagiarisms documented in their school of law (Jam - Psc - Tr - Mb), in the political science department (Ahe), and a book published by a retired computer science professor withdrawn for extensive plagiarism from the Wikipedia (FAZ article), I pointed out that Münster had a case of a duplicate dissertation in 2011. At that time the dean had spoken of a singularity. Now, with 23 additional dissertations documented, it is clear that this is a systemic problem, not 23 additional singularities.

I spoke a bit about the history of doctoral degrees, drawing on the work of Ulrich Rasche (Geschichte der Promotion in absentia. Eine Studie zum Modernisierungsprozess der deutschen Universitäten im 18. und 19. Jahrhundert. In: R. D. Schwinges (Ed.) Examen, Titel, Promotionen – Akademisches und staatliches Qualifikationswesen vom 13. bis zum 21. Jahrhundert. Basel: Schwabe, pp. 275–352, 2007; Mommsen, Marx und May: Der Doktorhandel der deutschen Universitäten im 19. Jahrhundert und was wir daraus lernen sollten. In: Forschung & Lehre, No. 3, pp. 196–199, 2013) and then briefly presented Bernd Kramer's theory about why medical doctors in Germany are so in love with their titles. This has to do with the history of the field, Kramer postulates.

Early on the clergy was also occupied with health matters. Pope Alexander III proclaimed in 1163 at the Council of Tours that the clergy was not to sully their hands with blood. Two professions sprang up to fill the void, the academic internal medicine scholars and the practical surgeons. More and more "specialists" and quacks sprang up touting their sure-fire cures for what ails you. The academic doctors were often personal physicians to the aristocracy, making house calls at the castle. Even though they were just a special sort of servant, they were learned doctors of medicine.

When Otto von Bismarck introduced free health insurance in Germany at the end of the 19th century, Kramer theorizes, there was a sudden change. The free health care was available only if you went to a medical doctor with a diploma, not to any of the various quacks. The "unwashed workers" now came to the doctor's surgery, and the doctors needed something to make them feel special. That something, according to Kramer, was the doctoral degree, a symbol held dear that needed to be obtained at all costs. For the general public, the difference between a quack and a real doctor was that the latter had a doctorate from a university. So it came to be taken as a sign of quality. Kramer goes on to note that the law profession soon picked this up as well.

So the main reason for getting a doctorate in medicine in Germany was to have that symbol of quality on the nameplate, not an interest in research. The quality of many of the dissertations leaves much to be desired. Even the Wissenschaftsrat, normally a very reserved body, lashed out at the medical profession in 2004, but this was generally ignored. A discussion arose in 2009 when Ulrike Beisiegel, at that time the Ombud for good scientific practice for the German research funding organization and currently the president of the University of Göttingen, published an article (p. 488-9) about "Türschildforschung", research for the name plates.

She met with a lot of resistance in the medical field, including a flaming defense (p. 582-583) of the current medical practice by Dieter Bitter-Suermann, the president of the medical school in Hanover and the chair of the German medical school association. He focused there on the quantity of dissertations accepted and stressed how important it was that students start to understand research as early as possible.

I then gave some examples of the plagiarism in Münster. In my experience, people will talk about cases of plagiarism only on the basis of what they have read in newspapers; they seldom make the effort to actually look at the documentation that is available online. Seeing the examples was shocking for the audience, as it was utterly clear that what they were seeing was unacceptable. When I got to the data falsification that was found by accident, a sort of "collateral damage" while documenting plagiarism, the anger in the audience was palpable. Text is often seen as not so important, but making up data is a major violation of research ethics.

The closing of my talk was about possibilities for changing the situation, moving to avoidance of plagiarism and inculcation of good scientific practice. The medical school in Münster is already moving to include obligatory courses in scientific writing and the scientific process for their medical students, and they are examining all the dissertations accepted in the past few years with plagiarism software, although I explained to them that due to false negatives they will not find all the plagiarisms. Dean Schmitz noted that it was indeed a lot of hard work to interpret the results, but that it was necessary to take care of this now and then to see how to avoid plagiarisms being accepted in the future.

He then opened the floor for questions, and a lively, hour-long discussion ensued. We touched on the role of the advisors, on how to properly reuse descriptions of methods, and on the question of having a dual doctorate program: MDs for all, PhDs for those interested in research. Doctoral candidates asked how to go about avoiding plagiarism and what they needed to reference, and also wondered who these VroniPlag Wiki people are, anyway.

An important point came up in connection with the scandal over plagiarism in habilitations in Freiburg that recently moved Handelsblatt to print the front page headline "Dr. med. plagiat". I noted that Münster does not oblige their researchers to print their habilitations and they do not even have to deposit a copy in the library. I had tried unsuccessfully to obtain a copy of one that was referenced in a dissertation. The dean was surprised - he thought that they, too, had to be published. He promised to look into the regulations for habilitations and to insist on them, too, being publicly available. Even if it is a cumulative habilitation with a number of published journal articles bound together with an explanatory text, it has to be possible for any researcher to see which journal articles were used.

The session closed at 6pm, but a long line formed up front with people who had personal questions. One was on how to deal with their advisors publishing their own work, one was on where to find more information on scientific writing. The speaker of the students' group was concerned that I was making the University of Münster look bad by only showing examples from Münster. I assured him that I had chosen Münster examples for this talk only; I normally have other examples that I use. But since I had such a wide selection of plagiarized medical text from Münster, it was natural to use them. A radio journalist hung around to interview the dean, me, and a number of students. (Link)

On a final note, I feel that the universities need to be utterly transparent about how they deal with cases of plagiarism. The informer and the accused need to be heard by the investigating committee. They need to be informed about how the investigation is proceeding, and that needs to be timely. It is unacceptable for a plagiarism investigation to take more than a year (some are currently entering their fourth year, probably because the universities in question hoped that the problem would go away if they ignored it). Especially when the plagiarism documentation was raised publicly, as is the case both in printed book reviews and in online documentations of text parallels, the university needs to publicly announce the results.

Since the doctorate was granted in public, it must also be publicly announced when it has been rescinded. That means naming the person. If they published a plagiarism, they have to accept the consequences. If the degree is kept and the grade lowered, or an expression of concern written, this needs to be made public as well. The text parallels are visible to all, as both dissertation and source are published. The reasons why this is acceptable need to be made clear: perhaps the plagiarism was the other way around, the supposed source may have been published first. If the reason for not rescinding a doctorate is that the advisor told the doctoral student to do so, all the more reason for it to be made clear that this advisor has problems with good scientific practice. Science does not thrive in secrecy.

The introduction of lawyers into the process of determining bad scientific practice does not help, either. The publisher of the plagiarism should respond to the accusations, explaining why the texts are the way they are, not send lawyers to find possible problems in the process. The university grants doctorates, and the university can take them away again. And the government should quit putting the doctorate on identification papers; that alone would do a world of good.

Wednesday, November 12, 2014

Chinese students in Australia use ghosting service

The Sydney Morning Herald and Western Australia Today are reporting on a Sydney company called MyMaster that is offering ghostwriting services to Chinese students enrolled in Australian universities. I've collected the links and the first paragraphs of the articles here. It is excellent to see such widespread reporting on academic misconduct.
  • WA's Curtin University caught in NSW 'essay writing' scandal
    "Western Australia's Curtin University has been caught up in a cash-for-results scandal involving thousands of students who paid a Sydney company up to $1000 each to write essays and assignments for them, as well as sit online tests." The article has links to other articles on grade changing scandals.
  • Students enlist MyMaster website to write essays, assignments
    "Thousands of students have enlisted a Sydney company to write essays and assignments for them as well as sit online tests, paying up to $1000 for the service. Their desire to succeed threatens the credibility and international standing of some of our most prestigious institutions."
  • Students buying assignments online could be charged with fraud
    "Students who pay essay writing services to complete their university assignments are not only breaching university plagiarism protocols but could also be charged with fraudulent conduct under NSW [New South Wales] legislation, legal experts say."
  • Yingying Dou: The mastermind behind the University essay writing machine
    "At the helm of the company embroiled in a large-scale academic cheating scandal is a Chinese-born businesswoman named Yingying Dou. The enterprising 30-year-old, who also goes by 'Serena', has used her accounting degree to build a lucrative ghostwriting service, called MyMaster, aimed at Chinese international students."
  • Yingying Dou takes the day off as students and tutors tell of others who cheat
    "Tutors and students at Yingcredible Tutoring, the coaching college run by the mastermind of essay-selling website MyMaster, Yingying Dou, have spoken of the widespread practice of international students paying for university essays as they struggle with language barriers."
  • Universities in damage control after widespread cheating revealed
    "NSW universities are in damage control following a Fairfax Media investigation that revealed hundreds of students across the state were engaging the services of an online essay writing business.
    On Wednesday, the Herald exposed an online business called MyMaster, run out of Sydney's Chinatown, that had provided more than 900 assignments to students from almost every university in NSW, turning over at least $160,000 in 2014."
The site has now been taken offline.
Thanks to Sven for spotting these articles!