Saturday, November 30, 2013

Peer Review, Impact Factors, and the Decline of Science

[I've had interesting tabs open for weeks waiting for time to report on them. Sorry for the delay. -dww]

The Economist reported on October 19, 2013 (pp. 21-24) that there is "Trouble at the lab". Indeed. And trouble has been brewing for quite some time without a single identifiable culprit or an easy way to solve the problem. The trouble involves predatory publishing, the irreproducibility of scientific results, and the use of quantitative data as a proxy for judging quality.

University administrations, search and tenure committees, governments, funding agencies, and other bodies need some way of judging people they don't know in order to decide whether to offer them jobs, promotions, or funding. This often boils down to counting the number of publications, or the impact factors of the journals in which their articles are published. Coupled with the publishing crisis, in which journal subscription prices are exploding, this makes for an unhealthy mix.

Predatory publishers promise quick publication in good-sounding "international" journals, using the Open Access "golden road" to extract fees from authors. They promise peer review, but if they review at all, they seem to look only at the formatting. Established publishers, trying to keep up their profits, have incorporated more and more journals into their portfolios without keeping a watchful eye on quality control.

Enter John Bohannon. In October 2013 Bohannon published an article in Science, "Who's Afraid of Peer Review?". He details a sting operation that he conducted between January and August 2013, submitting 304 versions of a fabricated paper with extremely obvious deficiencies to journals that he chose both from Lund University's Directory of Open Access Journals (DOAJ) and from Jeffrey Beall's list of predatory publishers.

Bohannon has put his data online, showing that 82% of the journals chosen from Beall's list accepted the fabricated paper, as did 45% of the journals on the DOAJ list. Predictably, DOAJ is not amused and has accused Bohannon of, among other things, racism, because he chose African-sounding names for the authors (1 - 2).

In August 2013, Nature journalist Richard van Noorden detailed a scheme called "citation stacking", in which a group of publishers colludes to cite extensively from each other's journals in order to avoid being sanctioned for coercive citation. Coercive citation was described in Science in 2012 by Allen W. Wilhite and Eric A. Fong as a process by which authors are instructed to cite a publisher's own journals in order to increase the so-called impact factor. Van Noorden's article focused on a group of Brazilian journals, so he, too, was accused of racism. This is unfortunate, as it detracts from a very serious problem.

We find ourselves today in a rapidly expanding world, with scientific research being conducted in many different places and much money being invested in producing results. People need publications, but have little time for doing peer review, a job that is generally unpaid and performed as a service to the community. Universities in countries without a tradition of rigorous scientific practice have researchers who need publications, and there are people out to make money any way they can. Researchers competing for scarce jobs in countries that are trying to spend less on science and education than they have in the past are also sometimes tempted to follow the path of least resistance and publish with such journals. And some are simply unaware that they have selected a publication that merely sounds like a well-respected one, as Beall has noted.

I don't have a solution to offer, other than boycotting the use of quantitative data about publications and making people aware of the scams going on. We need to get serious about peer review, embracing concepts such as open pre- and post-publication peer review, in order to bring more rigor into the publication process. I realize that people have been complaining about the decline of science since at least Charles Babbage (Reflections on the Decline of Science in England, And on Some of Its Causes, 1830). But we are in grave danger of letting bad science get the upper hand.

And what happens to those who try to point out some of the dicier parts of science? Nature just published another article by van Noorden, together with Ed Yong and Heidi Ledford, Research ethics: 3 ways to blow the whistle.

Update 2013-12-01: fixed a typo

At least they are being honest about their dishonesty

Jonathan Bailey, a consultant for iParadigms (the company that markets Turnitin) and author of the "Plagiarism Today" blog, noted in a recent article that the "free" plagiarism detection software Viper moves the papers it checks to its paper mill subsidiary about 9 months after a paper has been submitted for checking. They market primarily to students, so they are harvesting papers that students have either written themselves or perhaps plagiarized, papers submitted by users who did not read the fine print.

They do make this clear on their site, but only at the very bottom of the download page, on a page linked as "How does Viper use my essay?":
Aside from that, 9 months after your scan, we will automatically add it to our student database and it will be published on one of our study sites to allow other students to use it as an example of how to write a good essay.
Right.

We tested the system in 2010 at the HTW Berlin, where it not only came in last in effectiveness, but we also observed that there was an essay mill at the same street address, with a telephone number just one digit on from the Viper number. We called and tried to obtain more information, but the number only gave us an email address to contact, and our emails there were returned as undeliverable. We also observed that the email address we had used to register with the system was soon receiving regular emails advertising essay-writing services.
I find it highly dishonest for a company to be so blatantly offering to write papers for students. The reason for attending university is to learn how to do research, structure information, think things through, and write about the experience. People who purchase and submit ghostwritten papers are cheating themselves.

I suppose we should be happy that they at least state publicly what they do with the papers submitted. As Bailey points out, however, they also encourage universities to use the system; but since teachers do not hold the copyright to papers written by their students, they would have to violate the terms of service in order to use it. What a tangled mess ...

Wednesday, November 6, 2013

University of Gießen sees no misconduct in plagiarism case

The University of Gießen in Germany announced on Tuesday that it sees no scientific misconduct or plagiarism on the part of Frank-Walter Steinmeier (previous posts here 1 - 2). In just under six weeks they announced a result, although older cases at the same university remain unresolved. Apparently, they did not conduct an independent investigation, but only examined the accusation that had been published by a marketing professor on the basis of his automatic plagiarism detection software.

The press release focuses on two aspects of the case:
  1. The issue of self-plagiarism. There was a good bit of text parallel to previously published works, both by the author alone and by the author with a co-author. The committee did not see any misconduct, as the co-author was named a few times in the thesis and thanked in the foreword.
  2. The use of verbatim text without quotation marks. Many fragments of text in the dissertation are taken verbatim from other sources. The source is named in a footnote, but it is not made clear that the text is practically a verbatim copy, as can be seen in the current VroniPlag Wiki documentation. This is often referred to as a pawn sacrifice in the plagiarism literature. The university states that today this would be considered a problem, but that it only constitutes misconduct when the author claims authorship of others' ideas (Urheberschaft für fremde Ideen) with the intent to deceive. The problem, as always, is that a reader cannot know the intent of the author while reading the text.
Press reports can be found on Spiegel online, Zeit online, and a number of other venues. A long discussion in German about the procedures followed in Gießen can be found on the Erbloggtes blog.

Sunday, November 3, 2013

Dr. Z fights corruption and plagiarism in Russia

[I'm posting some oldish news that needs documenting -- dww]
The German daily newspaper Die Welt published an article by Julia Smirnova on August 23, 2013 about "Dr. Z.", a scientist who is fighting corruption and plagiarism in Russia. A Swedish blogger reported on one of his revelations in February 2013, giving a Russia Today report from February 22 as his source, but unfortunately no link to such an article can be found.

Andrej Zajakin, according to Smirnova, is a physicist who has lived in Spain for the past six years, doing research at the University of Santiago de Compostela. Using the pseudonym "Dr. Z.", he has been publishing extensive documentation of corruption in Russia. In February the Russian blogger Aleksej Navalnyj revealed that the politician Vladimir Pechtin had lied to voters by stating that he did not own foreign property when he actually owned a house in Florida. Pechtin was forced to step down. Navalnyj gave Zajakin as his source.

Zajakin is one of the persons behind Dissernet, a site that documents corruption and plagiarism in dissertations. The site is in Russian, but according to Smirnova it has documented plagiarism in over 100 dissertations, including those of Russian politicians such as Pavel Astachov (children's rights commissioner), Olga Batalina (member of parliament), Vladimir Burmatov (member of parliament), Vladimir Gruzdev (governor of Tula Oblast), and Oleg Kowalyov (governor of Ryazan Oblast). The plagiarism documentations are linked from the page http://www.dissernet.org/expertise/ and use a type of documentation similar to that found at the German VroniPlag Wiki:
Plagiarism documentation at http://www.dissernet.org/expertise/kozlovaa2005.htm
The German online portal Spiegel online also published an interview (in German) with Andrej Rostovzev, one of the Dissernet scientists, in April 2013, in which he explains that this site is not organized as a wiki, but only permits vetted individuals to contribute to the effort. I don't speak Russian, but I would be happy to offer guest blogging privileges to anyone who would like to report on the progress being made by this group.

Guest commentary: Plagiarism Probabilities

[I have offered guest commentary privileges to anyone interested in posting longer pieces than the comments section will accept. This is the first such comment. -- dww ]
Plagiarism Probabilities
by Gerhard Hindemith
In response to the Copy, Shake, & Paste article "Automatic plagiarism accusation?" on the documentation of plagiarism in the dissertation of Frank-Walter Steinmeier, I would like to make the following comments:
An intermediate version of the computer-generated report is available online: http://web.archive.org/web/20131012050153/http://www.profnet.de/dokumente/2013/8048r.pdf. The latest update, including most VroniPlag Wiki finds, can be found here: http://www.profnet.de/dokumente/2013/8048r.pdf
Surely the author of the report would claim that most commentators criticizing it did not understand it correctly. The report claims a plagiarism probability (Einzelplagiatswahrscheinlichkeit) for each fragment; if that is low, say 1%, the report's claim is not that the documented fragment in fact constitutes plagiarism, but rather that it would be plagiarism in only one of 100 comparable cases. With that in mind, one should disregard these "low probability fragments" for most purposes (and it is not quite clear why they are included in a report aimed at the public in the first place). The system then goes on to calculate an overall plagiarism probability (Gesamtplagiatswahrscheinlichkeit), which supposedly (none of this is explained in the report, unfortunately) is the probability that the entire thesis should be considered plagiarism. With this understanding, the report makes some conceptual sense, but that does not really help much. I think the report should never have been published, even with these clarifications understood. Here are my reasons:
  1. If the outcome of this report really is only a plagiarism probability below 100% (it was around 60% in the beginning), publishing it seems quite unethical, because at that point there was a 2/5 chance that the dissertation was perfectly OK (if we believe the system; see further down whether that is a good idea).
     
  2. The professor made confusing claims with respect to how these probabilities should be interpreted. Here, for instance (http://www.n24.de/n24/Nachrichten/Politik/d/3599146/-wir-finden-noch-mehr-bei-frank-walter-steinmeier-.html), he claims that 400 passages in Steinmeier's thesis are not OK, that in fact Steinmeier forgot the quotation marks in 400 instances ("Und bei Herrn Steinmeier kam eben heraus, dass 400 Textstellen nicht in Ordnung waren. Konkret heißt das: Er hat bei 400 Stellen die Anführungszeichen vergessen."). Now, this is much more than the claim that the system found certain (low) probabilities of plagiarism. It seems that the professor swings back and forth between quite bold claims that attract attention (400 instances of forgotten quotation marks) and, when challenged with concrete examples, cautious remarks about the correct interpretation of the findings (the system only finds "plagiarism indicators", often with low probabilities attached, not certain plagiarism).
     
  3. As I said, conceptually I can imagine that the probability set-up could make sense, but in practice the details matter a lot, not only for the technically minded insider, but also for the reader trying to make sense of such a report and its so prominently placed probabilities. The following questions would need to be answered in order to ensure that one can interpret the findings of the report:

    3a) What is the definition of "plagiarism" for a single fragment? Is every small citation mistake considered plagiarism, or only more severe cases of verbatim copied text of a certain length? Is self-plagiarism included? One needs to know this to understand what a plagiarism probability of, say, 20% actually means.

    3b) Equally, one would need to know the definition of "plagiarism" for the entire dissertation. This could be as strict as "at least one citation error in the entire thesis" or as forgiving as "the plagiarism is so severe that even a medical dissertation in Germany would be rescinded". Without this definition, the overall probability figure is meaningless.

    3c) Also important for the interpretation of the probabilities would be an explanation of how they are conditional on the choice of potentially plagiarized sources, whether they are independent of text length, and what confidence intervals accompany their estimation.
     
  4. I have very strong doubts that the probability calculator has been developed on a methodologically sound basis. But maybe I am wrong. In order to check this, the following questions would have to be answered:

    4a) How has the probability calculator been built at the fragment level? Given that an automatic system surely cannot be taught to detect plagiarism directly, it would pick up certain factors that point towards plagiarism (such as identical text and its length, lack of quotation marks, lack of a reference, etc.) and estimate the plagiarism probability on the basis of those factors. The question is then: on the basis of what pool of text fragments with known plagiarism status has the tool been calibrated? Surely this pool would have to be fairly diverse, covering different plagiarism types, citation styles, and subject areas, and hence quite large, to achieve statistical validity of the calibration results. How has it been ensured that this pool includes a representative proportion of fragments that are not considered plagiarism? What is the discriminatory power of the resulting probability calculator?

    4b) According to what logic have the fragment-level plagiarism probabilities been combined to form the overall plagiarism probability? How has the overall probability been calibrated, and on the basis of what pool of dissertations with known plagiarism status? How has the fragment-to-fragment correlation been accounted for that is surely induced by a consistent writing style throughout the thesis? (For example, if quotations are marked throughout the thesis by italics, or by a different text size or color, this might not be picked up by the system, and the plagiarism probability at the fragment level would be consistently overestimated.) What is the discriminatory power of the overall probability calculator?

    4c) In the case of the Schavan thesis, a similar report was generated, apparently following the same methodology (see http://www.profnet.de/dokumente/2013/11357profnet.pdf). That report gives an overall plagiarism probability of 100%. How can an automatic system reach absolute certainty, particularly for a dissertation in which the passages of verbatim copied text are very limited? This 100% value makes me suspect that the system has in fact never been calibrated, and that the probabilities given have no empirical basis, but are rather heuristic constructs that merely point in a general direction. If that is the case, one has to ask why such probabilities were calculated in the first place, with apparent precision down to a single percentage point. (A sketch after this list shows how such figures can drift towards certainty.)
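To make question 4b concrete, here is a minimal sketch in Python. The combination rule shown is purely hypothetical (the report discloses no method): it assumes the fragment probabilities are independent and computes the probability that at least one fragment is plagiarism.

```python
# Hypothetical combination rule, NOT the method of the actual report
# (which is undocumented): assume fragment probabilities p_i are
# independent and compute P(at least one plagiarized fragment).

def overall_probability(fragment_probs):
    """Return 1 - product of (1 - p_i): chance of at least one plagiarism."""
    p_all_clean = 1.0
    for p in fragment_probs:
        p_all_clean *= 1.0 - p
    return 1.0 - p_all_clean

# 400 fragments, each with only a 1% individual plagiarism probability:
print(round(overall_probability([0.01] * 400), 4))  # 0.9821
```

Under independence, 400 individually negligible findings push the overall figure to 98%. If the flags are instead strongly correlated – say, one consistent citation habit triggers all of them – the true overall probability stays near the 1% of a single fragment, and the headline figure is meaningless.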
I leave it at that. If there were reasonable answers to all of these questions, I would be amazed, and the probability calculator would constitute a very interesting diagnostic tool and certainly a scientific advance (the author of the report should then definitely publish his research). But given the complexities of plagiarism detection (as opposed to text-parallel detection), I suspect that the discriminatory power of an empirically built tool would be disappointingly low and its calibration extremely involved and costly. I do not believe that the report on Steinmeier's thesis was generated by such a tool: the documentation provided (close to none) gives me no reason to believe that it was, and there are several indications that it most likely was not.
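As a footnote to question 4a: "discriminatory power" can be made operational. Here is a hedged sketch, again in Python, with an entirely invented calibration pool and invented score distributions, measuring it as the probability that the scorer ranks a plagiarized fragment above a clean one (the area under the ROC curve, AUC):

```python
import random

def auc(labels, scores):
    """Probability that a random plagiarized fragment scores higher
    than a random clean one (ties count half); 0.5 means useless."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented pool: 200 plagiarized and 800 clean fragments, scored by a
# hypothetical detector whose score distributions overlap considerably.
random.seed(0)
labels = [1] * 200 + [0] * 800
scores = [random.gauss(0.6, 0.2) if l else random.gauss(0.4, 0.2)
          for l in labels]
print(f"AUC = {auc(labels, scores):.2f}")  # roughly 0.76: mediocre
```

Without a measurement of this kind on a large, diverse, labeled pool, a published "plagiarism probability" is just a number.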

Saturday, November 2, 2013

Automatic plagiarism accusation?

In October 2013, Germany saw another dissertation plagiarism case involving a politician. Frank-Walter Steinmeier, former Foreign Minister and leader of the opposition in the previous parliament, submitted a thesis in 1991 to the law faculty of the University of Gießen. Entitled Bürger ohne Obdach (Citizens without Shelter), the 395-page thesis deals with the legal aspects of homelessness. The weekly magazine Focus reported on Sept. 29, 2013, in an article not linked here because Focus participates in the consortium claiming intellectual property in links and snippets, that a marketing professor who has been trying for many years to sell his software system for detecting plagiarism had found extensive plagiarism in the dissertation, and that Steinmeier had called the accusation absurd.

The daily newspaper taz soon dug up the news that Focus had paid the marketing professor to investigate the theses of politicians, supposedly without targeting a specific individual. Focus had also asked a law professor to evaluate the accusation. He had stated that most of the passages were unproblematic, for example a seven-word name of a book listed in a footnote, but that three portions were more extensive. The report was published online: 279 rather incomprehensible pages of text in tiny print, colored text, and some so-called "total plagiarism probability" scores. The press immediately whipped itself into a frenzy, presumably because they believed that a computer program would be much more accurate at finding plagiarism than people. After all, the software reported numbers on every page! But since no one quite understood what the numbers actually meant, extensive discussion ensued.

A number of blog commentators and online media picked the report apart (among them Spiegel Online, Causaschavan, Archivalia, Erbloggtes, HajoFunke). The marketing professor was interviewed many times, giving conflicting statements: In Deutschlandfunk he stated that he had examined the report before publishing it; in an interview with Main-Netz he is reported as having said
 [...] bei der Überprüfung durch unsere Software gab das System bei Steinmeiers Arbeit einfach »Rot« an und verschickte einen automatisch generierten Prüfbericht an die Universität.
(when our software checked Steinmeier's thesis, the system simply registered "red" and sent an automatically generated report to the university)
If the statement that the report was automatically sent is true, it demonstrates quite vividly the grave danger posed by using plagiarism detection software without understanding its reports. While false negatives are bothersome – a plagiarized text is not flagged – a false positive can be devastating for the author of the text, as an accusation of plagiarism will still hang in the air even if it later turns out that the accusation cannot be substantiated.
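The asymmetry can be made concrete with a small Bayes calculation; the sensitivity, specificity, and prevalence figures below are invented for illustration. When genuine plagiarism is rare among the texts checked, most flags raised by even a fairly accurate detector are false accusations:

```python
# All three numbers are assumptions for illustration only.
sensitivity = 0.95  # P(flagged | plagiarism)
specificity = 0.95  # P(not flagged | no plagiarism)
prevalence = 0.02   # share of genuinely plagiarized texts checked

p_flagged = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
ppv = sensitivity * prevalence / p_flagged
print(f"P(plagiarism | flagged) = {ppv:.2f}")  # 0.28
```

With these numbers, barely more than a quarter of the flagged texts are actually plagiarized, which is why every flag must be checked by a human before any accusation is made.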

Perhaps this case demonstrates the problem with automatic accusations made solely on the basis of some number generated by a software system. The algorithms by which the number is derived are usually not published and are thus unverifiable. The numbers in general do not mean anything until they have been checked – every single one – by an experienced teacher or researcher to determine whether the result is at all meaningful. As I have said repeatedly: a software system cannot determine plagiarism, only a human can.

A software system can, however, find indications that there might be plagiarism, although it would be helpful if it did not report so many irrelevant "hits". VroniPlag Wiki began documenting the three more extensive text parallels in Steinmeier's thesis that were reported by the system and soon found more plagiarism from these sources, as well as additional sources that were not listed in the automatic report. When the extent of the text copying was determined to be severe enough, the case was published online as VroniPlag Wiki case #57. Many of the fragments documented are so-called "pawn sacrifices": the source is given, but no indication is made that a copy or near copy of the source text was used.

Does this vindicate the computer-generated report? (The automatic report produced by the marketing professor appears to be constantly updated as new sources are found by VroniPlag Wiki; the original version is unfortunately no longer available online.) Hardly. If there is plagiarism in a thesis, a software system that reports plagiarism on every page will, of course, be right in a way, even if most of the flagged pages are false positives.

Thus, if plagiarism detection software is used by an institution, no accusation should be made until the report has been checked in detail by a person who understands what the results actually mean. Schools that define "plagiarism" as any score from a plagiarism detection system above a specified threshold should rethink their policy. No sort of automatic plagiarism accusation should be tolerated.

Update: For guest commentary on the mathematics of the plagiarism probabilities, see the next article.