More duplicate papers being published?

Corie Lok

Thursday, 24 Jan 2008 19:54 UTC

Nature this week has published a commentary by two researchers who say there may be a growing problem with the publication of duplicate papers in different journals, either through plagiarism or self-plagiarism (similar papers with the same author).

Mounir Errami and Harold Garner of the University of Texas Southwestern Medical Center report how they searched and analyzed more than 62,000 abstracts indexed in Medline and found that 1.35 percent of them were duplicates with the same authors (the authors give caveats for this analysis, so please read the commentary). They say that with the rapid growth in the number of journals and papers, publishers and database curators have not kept up with the detection of duplicate papers. They call on journals to use new software tools to better identify duplicate papers and on the community to expose people who are clearly not following established publishing policies.

What do you think? Have you seen more duplicate papers published in your field?

Do you think this is a real problem?

Are there legitimate reasons to publish similar versions of the same paper in different journals?

What else can be done to prevent this problem from growing?

Before diving into the conversation, please have a close read of the commentary. The authors there give plenty of caveats for their Medline analysis, which I didn’t summarize fully in this post.

  • Replies

    • Corie,

      I summarized my view on this topic in a blog post. Briefly, I believe that this is a real problem, although I don’t believe there are more duplicate papers than 10 years ago. This problem will only go away if grants and jobs are no longer given to those with the longest publication record. Alternatives? Ask applicants to select their best 3, 5, or 10 papers.

    • In examining the Medline database, the authors were limited to looking for duplicates by title and abstract. They note that the lack of full-text availability makes it harder to detect sophisticated duplications, to determine whether the later versions credited the earlier versions, and to verify whether the two articles are in fact duplicates. Most plagiarism detection software programs require full text to be effective. (See Stein et al.’s overview, Sorokina et al., and Yang and Callan)

      It seems that the new “public access” mandate recently signed into law (which will require NIH-funded investigators to deposit the full text of their articles in PubMed Central within 12 months of publication) should make detection and verification of plagiarism slightly easier (at least for plagiarism by NIH funded authors).

      Errami and Garner also note that “as many as one-third of the manually verified duplicate abstracts in Déjà vu sharing at least one author are also published less than five months after the original”. This suggests that detection of duplicate submissions prior to publication may not be possible. However, if the author has already been identified as a potential duplicate publisher and if, as suggested by the article, authors tend to be repeat offenders, then perhaps there should be additional checks on the manuscript prior to publication. In the absence of publishers sharing information about which manuscripts are under consideration, it’s unclear (to me, at least) what these checks could be.

      Errami and Garner note that one rarely sees simultaneous (or nearly simultaneous) publication of duplicates with different authors. They claim that this is “undoubtedly due to the fact that it is usually difficult to re-use someone else’s work before it appears in print — unless the duplicating author also happens to have been a referee of the original”. Mining preprint servers such as Nature Precedings and ArXiv, which make the full text of preprints available, may make it easier for journals to detect non-self-plagiarism prior to publication. This may also be an incentive for original authors to post manuscripts on preprint servers concurrently with journal submissions.

      Finally, the authors ask “What of the examples of text directly translated with no reference or credit to the original article? Is this justified or acceptable? And is such behaviour more widespread for review-type articles for which greater dissemination may be justified?” It is my understanding that copyright law addressed the question of translations and other derivative works long ago, requiring them to contain a significant amount of original material in order to qualify for copyright protection. (See the US Copyright Office on derivative works) Surely copyright infringement is neither justified nor acceptable (unless it is fair use), even for review-type articles.

    • This isn’t a reply but a request for help. I was told that my comment on this paper had been approved by the moderator but, despite my two e-mails asking for help, I cannot find my comment!?!
      I am interested since I’d like to write a critical article on an Italian website, www.ilpungolo.com.

      Thanks

      Sergio Stagnaro MD

    • I have read the article by Mounir Errami and Harold Garner with interest and I recommend all interested in this thread to read it too. I believe this is an area of increasing concern. Easy access to electronic copy makes it relatively easy to snip that handy phrase that encapsulates a thought. From there it is easy to succumb to the temptation (and I too have been tempted) to use that useful introductory paragraph. If you are prepared to take a paragraph why not the whole section and so on… Undergraduates regularly expropriate whole articles from Wikipedia and are not scared by anti-plagiarism software as they know it is not routinely applied. Today’s undergraduates will become tomorrow’s researchers and hence I see the problem only getting worse.

      As a referee I have identified duplicate or severely overlapping content while reviewing papers in the past (for reasonably high profile/impact factor journals). I do not search for duplication routinely but, as someone who is used to refereeing papers in particular niche areas, I received both papers in one instance and in another I had read an on-line pre-pub before receiving the duplicate. The authors will not be named as that would break referee confidentiality but they were from well known institutions in the developed world.

      What was the common factor (apart from the paper!)? The authors were relatively junior new appointments. Younger academics seem to feel themselves under a lot of pressure to publish. In my department I believe that my younger colleagues are much more sensitive to impact factor than is possibly healthy when they consider where to publish an article.

      How to prevent plagiarism? The journal editors must refuse plagiaristic pieces and explain why. Identification of duplication post-publication should lead to high-profile withdrawal of the papers by the journal that was misused. Plagiarism should be identified as a disciplinary offense in the employment contract of academic and research staff.

    • Please also see this blog just posted by Prof Peter Suber.

    • There is a highly relevant discussion going on over at the “Ask the Nature editor” forum. Here is my comment about duplication, which includes a link to the Nature journals’ policies and procedures. There are lots of other interesting points raised in that discussion, also.

    • A study in 2002 by Gaines and Braumoeller (“Actions Do Speak Louder than Words: Deterring Plagiarism with the Use of Plagiarism-Detection Software”) concludes that telling college students that the instructors were using plagiarism detection software was a significantly stronger deterrent for students than just warning them not to plagiarize. (Originally at PSOnline but now unavailable without a subscription; a brief summary of their findings is here). It would be interesting to find out if this same effect occurs with journals.

      CrossRef has announced a program called CrossCheck (via DigitalKoans), which allows participating publishers to check for plagiarism prior to publication. The project is focused on creating a closed database of full text (post-publication only?) content for use by automated plagiarism detection programs. Eight publishers are participating in the project, which was initiated at the 2006 CrossRef annual meeting. Their initial report says to look for an update at the 2007 CrossRef annual meeting, but the agenda for that meeting doesn’t mention CrossCheck. There is a press release describing the launch of the product in August 2007.

      If the eight publishers participating in CrossCheck inform their authors that they are using the service, one could see if they have fewer incidences of plagiarism. Presumably Errami and Garner can segment their data by journal or by publisher, and the results would be very interesting.

    • The following comment should not be taken as an endorsement to self-plagiarize.

      If someone submits their same (or overtly similar) paper to more than one journal and we presume that the articles go through independent review processes before publication, does this lend extra credence to the findings since, in essence, it will have been twice reviewed? I realize there is a false bit of logic to this, but I could see that this would be a potential argument made by a practitioner of this type of double-dipping.

      To what extent will someone read the same article twice? Or is it that the authors recognize that publication in two distinct journals – while within the same scope of research – provides access to a broader audience? Finally, does the proliferation of journal titles (along with the perception that publication quantity is a reasonable substitute for quality) only increase the chances of this kind of plagiarism occurring?

      Craig

      p.s. For the record I like the idea of a faculty candidate (or someone up for promotion) having to pick their top 3-5 papers to submit to the committee for their promotion/job consideration. They would then, basically, have to defend those articles much like a PhD defense.

    • So far as the Nature journals are concerned, Craig, simultaneous submission is explicitly against our policies (see the link in my earlier reply in this thread). Authors are asked to send us copies of any mss they have submitted elsewhere at any stage of the consideration process of their paper by a Nature journal. We also ask for URLs of any preprints in servers, etc.
      If two very similar papers are actually published simultaneously by the same authors, and one of them is in a Nature journal, the editor will check out the received/accepted dates and follow up with the authors. If misconduct has occurred, we will take action: an editorial correction to a paper is linked on nature.com and in external databases such as PubMed. We might even withdraw a paper if the circumstances warranted it.
      I do think that the vast majority of authors are honest, and that the system has to be designed with that premise (but with checks and balances, of course).

    • As Brian also mentioned, I too once received two manuscripts at the same time for review. The manuscripts were exactly the same (except for the manuscript and reference format), had been submitted to two high-impact journals of different publishing groups, and were from a young investigator at a reputed laboratory in the developed world. I also agree that, instead of emphasizing numbers, it would be a good idea to ask for a career’s best 5 or 10 publications when it comes to grant approval, which would significantly reduce the pressure on young investigators.
