• Gobbledygook by Martin Fenner

    Martin Fenner's blog on scientific publishing in the internet age.

    • Problem
      You want to regularly go through the papers published in the most important journals in your research field.

      Solution
      Subscribe to the journal table of contents (TOC) RSS feed. Almost all journals now provide their TOC as an RSS feed that is updated with every new issue. RSS is a standard web format used to publish frequently updated works. A journal article RSS feed usually contains one item for every article, each with a title, authors, an abstract and a link to the fulltext article. To subscribe to the RSS feed of the journal TOC, look for the RSS icon on the table of contents page. Links to the RSS feeds of some popular scientific journals are:

      Although most web browsers (e.g. Internet Explorer 7, Firefox or Safari) will understand RSS feeds (so you can just click on the links provided above), you should use a dedicated RSS reader if you subscribe to more than a few RSS feeds. There are web-based RSS readers (Google Reader and Bloglines are popular choices) and dedicated programs for every platform (e.g. FeedDemon for Windows or NetNewsWire for Macintosh).

      Dedicated RSS readers have two important features: they keep track of the journal articles you have already read, and they allow you to mark interesting articles for later use: reading the fulltext article (online or after printing the PDF), and storing the article in your reference manager of choice.

      RSS readers are also available for mobile devices such as the iPhone and are great for quickly going through a journal table of contents on the way to work.
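
      To make the feed structure described above concrete, here is a minimal sketch of what an RSS reader does with a journal TOC feed. It assumes an RSS 2.0 feed that carries authors in Dublin Core `dc:creator` elements; many journal feeds use other layouts (RSS 1.0, Atom), so the element names would need adjusting. The sample feed, its URLs and the function names `parse_toc` and `unread` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-article feed, embedded so the example runs offline.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example Journal: Current Issue</title>
    <item>
      <title>First example article</title>
      <link>https://journal.example.org/articles/1</link>
      <dc:creator>A. Author</dc:creator>
      <description>Abstract of the first article.</description>
    </item>
    <item>
      <title>Second example article</title>
      <link>https://journal.example.org/articles/2</link>
      <dc:creator>B. Author</dc:creator>
      <description>Abstract of the second article.</description>
    </item>
  </channel>
</rss>"""

# Namespace map for the Dublin Core elements used for author names.
DC = {"dc": "http://purl.org/dc/elements/1.1/"}


def parse_toc(feed_xml):
    """Return one dict per <item>: title, authors, abstract, link."""
    root = ET.fromstring(feed_xml)
    articles = []
    for item in root.iter("item"):
        articles.append({
            "title": item.findtext("title", default=""),
            "authors": [c.text for c in item.findall("dc:creator", DC)],
            "abstract": item.findtext("description", default=""),
            "link": item.findtext("link", default=""),
        })
    return articles


def unread(articles, seen_links):
    """Keep only articles whose link has not been seen before."""
    return [a for a in articles if a["link"] not in seen_links]


if __name__ == "__main__":
    seen = {"https://journal.example.org/articles/1"}
    for article in unread(parse_toc(SAMPLE_FEED), seen):
        print(article["title"], "-", ", ".join(article["authors"]))
```

      A real reader would fetch the feed over HTTP and store the set of seen links between sessions; both are left out here to keep the sketch self-contained.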

      Discussion
      Regular reading of journal tables of contents in your field (browsing) is still an important way to keep up with the literature, even though the use of online databases to find specific articles (searching) has become more common in recent years.

      Some people prefer to flip through the printed journal when the latest issue arrives. But there is a delay between electronic publication and arrival of the printed journal, and most individuals can’t afford to personally subscribe to more than a few journals. And looking at the printed copy subscribed to by the department or library is often no longer practical on a regular basis.

      Receiving the journal TOC by email is a popular alternative, but has several disadvantages:

      • Receiving the TOC by email requires a few extra steps, including providing your email address, and often signing up for a (free) account with the journal
      • Organizing the TOC emails with your email program (e.g. moving to appropriate subfolders) requires extra work
      • Marking an interesting article for later reading requires extra work, because the TOC is sent in one big email message
      • Sharing the TOC with coworkers is more difficult than with RSS feeds

      Because RSS is a universal computer-readable format, receiving the journal TOC can easily be extended. One example would be the integration of the journal RSS feeds into reference managers. CiteULike has this feature (e.g. the most recent issue of Nature), but I hope that more reference managers will do the same in the future.

      Although almost all journals now provide RSS feeds to their TOC, how they do it might differ. Not every journal RSS feed uses the DOI – now the preferred way to link to a journal article. There are also small differences in what information is provided in the RSS feed.
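
      Because feeds differ in where (and whether) they carry the DOI, a tool that consumes them has to look in more than one place. The following is a rough sketch of that idea, assuming the DOI appears either in an identifier field or inside the article link. The regular expression is a common approximation of DOI syntax, not an official grammar, and the example values are hypothetical.

```python
import re

# Approximate pattern: "10.", a registrant code of 4-9 digits, "/", then a suffix.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+')


def find_doi(*fields):
    """Return the first DOI found in the given feed fields, or None."""
    for field in fields:
        match = DOI_PATTERN.search(field or "")
        if match:
            # Strip punctuation that often trails a DOI in running text.
            return match.group(0).rstrip(".,;")
    return None


if __name__ == "__main__":
    # Hypothetical feed items: (identifier field, link field).
    items = [
        ("doi:10.1234/example.2009.001", "https://journal.example.org/articles/1"),
        (None, "http://dx.doi.org/10.1234/example.2009.002"),
        (None, "https://journal.example.org/articles/3"),
    ]
    for identifier, link in items:
        print(find_doi(identifier, link))
```

      Items for which no DOI can be found (the third one here) still have to be matched by title or URL, which is exactly the kind of inconsistency that makes feed integration harder than it should be.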

      Further Reading

      This blog post was inspired by a recent discussion about the digital divide among scientists.

    • The term digital divide usually describes the troubling gap between those who use computers and the Internet and those who do not (Wikipedia). Many if not most scientists are experienced users of computers and the internet, and use email or public databases such as PubMed on a daily basis. But few scientists regularly use Web 2.0 tools, which would include both general tools such as Twitter, FriendFeed or Facebook, as well as tools specifically targeted at scientists (and this would of course include Nature Network).

      Regular readers of this blog know that I am fascinated by technology, especially if this technology makes it easier to publish scientific papers. And like others I sometimes get carried away (Google Wave is a good recent example). Even among those scientists open to blogs, wikis, etc., not everybody wants to follow every technology trend. This could simply be because that would take too much time, but most people probably just don’t care that much about technology.

      So what can we do about this digital divide among scientists? Science is often very specialized, and sometimes only a few people participate in a discussion about a particular topic. Tim O’Reilly has coined the term alpha geek for people that are the first to use new technologies, and there certainly is a place for science alpha geeks. But Science Online is about science communication, and communication tools that are used by only a handful of people usually don’t fulfill their purpose.

      One easy solution would be to simply wait 10-20 years until most senior scientists are digital natives (those that have grown up with digital technology such as computers, the Internet or mobile phones), but that seems to be an awfully long time for something this important.

      We could build better tools. Good tools simply work and don’t need a lot of explanations. For me Papers is such a tool, but strictly speaking it is not really Web 2.0, because it has no collaboration features. Google Wave could be another example, but only the next few months will tell. What makes a good Web 2.0 tool for scientists? Most importantly, that the tool solves an important everyday problem. Equally important, that there aren’t high hurdles in using this tool in terms of cost and learning curve. Another hurdle: some Web 2.0 tools only start to become useful once they have signed up a large enough number of users.

      But we also need to do more to communicate the usefulness of online tools for scientists. The original definition of the digital divide has a negative meaning and everybody probably agrees that we should at least try to overcome this divide. Although there certainly is also a digital divide among scientists, the general perception is probably not that those scientists that are not Web 2.0-savvy are at a disadvantage. We should have a much closer look at the tools that are currently available, define the scenarios where they can be useful, and focus on that. We talk too much about the details, technical or otherwise. One example: most scientists probably want to have an idea of when an online reference manager can be more helpful than the tools they currently use, rather than discuss the subtle differences between the very similar CiteULike, Connotea, and 2collab. Part of the problem is that people want to make money with their Web 2.0 tools for scientists, but forget that collaboration is more important than competition when the market still has to grow and is currently probably too small for viable business opportunities.

      This makes closing the digital divide among scientists very much a science education exercise, and I think that science librarians should play a central role in this. Not surprisingly, a seminar last week by our local science librarian in our department and a blog post by science librarian John Dupuis (and the FriendFeed discussion around his blog post) were the inspiration for this post (another FriendFeed discussion started by Bora Zivkovic made me write the post today instead of going to bed early).

      Update 06/15/09:
      One good strategy to overcome the digital divide among scientists would be a Science 2.0 Cookbook. Similar in format to the O’Reilly Cookbook series for programming problems, the Science 2.0 cookbook would use the format problem/solution/discussion to provide a solution for problems like How do I share references with my coworkers in the lab? This could be started as a Wiki project.

    • Why do we go to conferences?

      Sunday, 07 Jun 2009

      I just returned from the American Society of Clinical Oncology (ASCO) meeting in Orlando, with approximately 30,000 participants one of the largest oncology conferences. Like other conferences of this size, the experience can be overwhelming, but thankfully the organizers are getting better every year at using technology that helps in finding the most interesting sessions. Most sessions are made available as video podcasts or online presentations. There is currently still a delay of 24 hours, but I wouldn’t be surprised to see the sessions streamed live as video over the internet in coming years. This year’s ASCO also had the first official Twitter meet-up, although there still was relatively little Twitter activity compared to other conferences.

      Last week we announced Science Online London, the follow-up conference of last year’s Science Blogging 2008: London, to take place August 22 at the Royal Institution. I am helping to organize the conference, and I’m finding myself in the middle of discussions about session topics, speakers and the right format to present and discuss science blogging, wikis, and other science-related activities happening online.

      These two events started me thinking about the reasons I go to conferences. After all, traveling to conferences is not only expensive, but can also be exhausting, and too much airline travel is certainly not good for the environment (places like Dopplr can calculate your carbon profile). Here are a few points that I think make a conference a good conference worth attending in person:

      Conferences should present new and exciting information which cannot be presented differently
      Oral presentations at conferences are usually the first public presentation of interesting research findings before they appear as a published paper a little (or much) later. What we don’t want to see is the presentation of the same old data that we have already seen the year before. Good educational sessions and keynote lectures find informative or entertaining ways to present their information, and again should not be simply a repeat performance (unless the audience is completely different).

      Most conferences encourage the presentation of unpublished work, but speakers are often careful in doing so for a variety of reasons, e.g. fear of getting scooped, fear of problems with journal submissions or fear of problems with patentable work. This fear can make conference presentations rather boring, as presenters might hold back the really exciting stuff until these results are at least accepted for publication.

      Cold Spring Harbor Laboratory tries to solve this problem with policies that essentially make their meetings non-public. During the last week we have seen an intensive discussion (e.g. On the challenges of conference blogging) about whether or not the sessions in a CSHL meeting can be communicated publicly by participating scientists via blogs or Twitter. I think that there is nothing wrong with small conferences being non-public, and that the same rules should apply to science bloggers as apply to journalists. But the conference organizers should clearly state their policies regarding blogging (including Twitter, FriendFeed and other microblogging tools).

      Conferences should enable as much active participation of as many participants as possible
      If we just want to listen to a presentation, we could do that without going to a conference, thanks to video streaming, SlideShare and other ways of presenting online. Sometimes a paper is even published on the same day as the plenary session, as happened recently with the JUPITER trial.

      Smaller sessions that leave enough room for discussions, unconferences and poster sessions are all good formats to encourage active participation. It should be a goal at least for smaller conferences that every attendee has had a chance for active participation in one way or another. But active participation in a session is much more focussed than discussions in coffee breaks between sessions or afterwards in the bar.

      Conferences should facilitate personal networking
      Meeting someone in person is very different from interacting online via email, Twitter or a social network. For many people this is the real reason to go to a conference. Or as Henry Gee puts it (slightly out of context), “the most important part is to hang around in bars.” Smaller conferences (e.g. 100-150 people), enough time for coffee and lunch breaks between sessions, and social activities around conferences (from science tours to skiing) all facilitate networking.

      I’m looking forward to going to a very special conference in five weeks: Science Foo Camp. And let’s see whether we can put together an interesting Science Online London conference. Please suggest and discuss session topics and speakers in the Nature Network Forum, FriendFeed group or via email until June 19. We are still looking for interesting session ideas, speaker suggestions and other suggestions to make this an exciting conference.

    • Google Wave - don't forget the scientists

      Thursday, 28 May 2009

      Google Wave is a new tool to communicate online and collaborate and was announced today at the Google I/O conference. Google Wave is not only a product, but also an open protocol that anyone can use to build his own wave server.

      Google Wave is already very interesting by itself, but can also be extended further:

      • by robots that automate common tasks and run on the server, and
      • by gadgets that allow new ways of user interactions and run on the client.

      This all sounds rather geeky, but why should a scientist care about Google Wave? Part of the job of every scientist is to communicate and collaborate, and email is by far the most widely used tool to do that. Email has many shortcomings, some of which can be overcome by blogs, wikis, and a constantly growing number of other Web 2.0 tools from Twitter to FriendFeed. But Google Wave goes one step further. The basic idea of a wave is a document (and this can be everything from text to pictures) combined with the discussion about that document, and that is a very natural design for many scientific communications.

      Google Wave will be publicly available later this year. I hope that by that time it will also have the first extensions designed specifically for scientists, e.g. for

      • references with embedded metadata and discussions about these references
      • molecular structures and other scientific data types
      • scientific manuscripts in progress (Google Wave has nice tools for collaborative document editing)
      • lab notebooks (again because of the wiki-like editing features)

      Google Wave could turn into serious competition for The Life Scientists Room at FriendFeed. And it is a great topic to discuss further at Science Foo Camp 2009.

      Update 5/29/09: You can now watch the Google I/O presentation. And Ricardo Vidal also blogged about Google Wave from a scientist perspective (Using the (Google) Wave to surf the streams).

    • OAI-PMH: Interview with Tony Hammond

      Monday, 25 May 2009

      Most of us find, store and sometimes read scientific papers electronically. Although abstracts and fulltext papers are usually available as web pages in HTML format, PDF is clearly the preferred format for storing and printing papers.

      But publishing scientific papers in electronic form obviously requires more than providing the content in HTML or PDF format. We want to find the papers we are interested in on the journal homepage or in a digital library (e.g. PubMed), and for this we need metadata about the paper. The metadata could simply be a digital object identifier (DOI), but the metadata could also contain important information required to find a paper in a search strategy (e.g. authors, title, publication date or keywords).

      As Duncan Hull et al. noted in a PLoS Computational Biology paper last year (Defrosting the digital library: bibliographic tools for the next generation web), metadata are often disconnected from the data, and there are no universally agreed standards to represent these metadata.

      But why should we as scientists care about the technologies used to publish and distribute a paper? We shouldn’t forget that these technologies could allow new and innovative ways to find and read scientific papers. One simple example: storing the metadata in the PDF file (using XMP) could make it much easier to import a large collection of PDF files into a reference manager.
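
      As a rough illustration of the XMP idea, the sketch below pulls title, authors and identifier out of an XMP packet embedded in a file. It assumes the packet is stored uncompressed and uses Dublin Core properties; real PDFs may compress the metadata stream or use other schemas, in which case a proper PDF library is needed. The sample bytes are a hypothetical stand-in for a PDF, and `read_xmp` is an illustrative name.

```python
import re
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical stand-in for the bytes of a PDF with an embedded XMP packet.
SAMPLE_PDF_BYTES = b"""%PDF-1.4
...other PDF objects...
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title><rdf:Alt><rdf:li xml:lang="x-default">An example article</rdf:li></rdf:Alt></dc:title>
      <dc:creator><rdf:Seq><rdf:li>A. Author</rdf:li><rdf:li>B. Author</rdf:li></rdf:Seq></dc:creator>
      <dc:identifier>doi:10.1234/example.2009.001</dc:identifier>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
%%EOF
"""


def read_xmp(pdf_bytes):
    """Return basic bibliographic fields from the first XMP packet, or None."""
    match = re.search(rb"<x:xmpmeta.*?</x:xmpmeta>", pdf_bytes, re.DOTALL)
    if match is None:
        return None  # no uncompressed XMP packet in this file
    root = ET.fromstring(match.group(0))
    return {
        "title": root.findtext(".//dc:title/rdf:Alt/rdf:li", default="", namespaces=NS),
        "creators": [li.text for li in root.findall(".//dc:creator/rdf:Seq/rdf:li", NS)],
        "identifier": root.findtext(".//dc:identifier", default="", namespaces=NS),
    }


if __name__ == "__main__":
    print(read_xmp(SAMPLE_PDF_BYTES))
```

      A reference manager could run something like this over a folder of PDFs and fall back to manual entry only for files without embedded metadata.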

      One such initiative to provide the metadata of a scientific paper is OAI-PMH. I asked Tony Hammond from Nature.com a few questions about this newly supported protocol, as well as some more general questions about metadata provided by the Nature Publishing Group journals.
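
      For readers who have not seen OAI-PMH before: a harvester sends a plain HTTP request with a `verb` parameter and receives XML back. The sketch below builds a `ListRecords` request for Dublin Core metadata (`oai_dc`) and parses a small sample response. The base URL, the sample record and the function names are hypothetical; the actual endpoint of any given publisher is not assumed here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL, optionally limited by datestamp."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical response, trimmed to a single record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:article-1</identifier>
        <datestamp>2009-05-20</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example article</dc:title>
          <dc:creator>A. Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(response_xml):
    """Return identifier, title and creators for each harvested record."""
    root = ET.fromstring(response_xml)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "title": record.findtext(".//dc:title", default="", namespaces=NS),
            "creators": [c.text for c in record.findall(".//dc:creator", NS)],
        })
    return records


if __name__ == "__main__":
    print(list_records_url(BASE_URL, from_date="2009-05-01"))
    print(parse_records(SAMPLE_RESPONSE))
```

      A complete harvester would also follow the `resumptionToken` that servers return when a result list is split across several responses; that is omitted here.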

      continue reading this post
    • Reference managers are essential tools to read and write scholarly papers. In the last few years we have seen both a number of new reference managers (most of them web-based) and a trend for the established reference managers to gain social networking features. More choice is great, but it also creates confusion about the right tool to use. I have talked about reference managers before, but in this slideshow I look at the features that I find important.

      And there are at least two features that I like, but haven’t really seen implemented in a reference manager:

      • Integration of an RSS reader for journal table of contents (TOC). Currently I use a standard RSS reader, and it requires too many steps to get interesting references from a TOC into my reference manager.
      • Tracking the post-publication discussion. I want my reference manager to link to the papers that cite a particular reference (I currently use Scopus for that) and link to Faculty of 1000 or ResearchBlogging.org comments on that paper.

      In the last slide I wonder whether there is a) one perfect reference manager, b) one perfect reference manager for my particular needs, or c) I will always need more than one reference manager and have to move references back and forth between them. Currently I’m at c), using mostly Papers, Endnote and Connotea. But Mendeley, Zotero, Refworks and Endnote are moving in a direction where they try to cover all requirements.

    • Scientific papers are submitted to a journal as word processor files, usually in Microsoft Word format. After the paper is accepted for publication, the journal takes the manuscript and translates the text into a format that is better suited for publication online and/or in print. XML and the NLM DTD – a set of XML schema modules – have evolved as the standard data format for this purpose. Files in the NLM DTD format can in turn be translated into HTML and/or PDF for publication. The NLM DTD format is also used to transfer journal articles from publishers to archives (e.g. PubMed Central) and for long-term archiving.
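
      The translation step from XML to HTML can be pictured with a toy example. The sketch below converts a heavily simplified article, using a handful of element names from the NLM journal article tag set (`article-title`, `abstract`, `sec`, `p`), into HTML. Production workflows typically use XSLT and handle far more elements; the sample article and the function name `article_to_html` are illustrative only.

```python
from html import escape
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified article in NLM-style markup.
SAMPLE_ARTICLE = """<article>
  <front>
    <article-meta>
      <title-group><article-title>An example article</article-title></title-group>
      <abstract><p>One-paragraph abstract.</p></abstract>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>First paragraph of the body.</p>
    </sec>
  </body>
</article>"""


def article_to_html(article_xml):
    """Render title, abstract and top-level sections as simple HTML."""
    root = ET.fromstring(article_xml)
    parts = []
    title = root.findtext("front/article-meta/title-group/article-title", default="")
    parts.append(f"<h1>{escape(title)}</h1>")
    # Abstract paragraphs are set in italics to separate them from the body.
    for p in root.findall("front/article-meta/abstract/p"):
        parts.append(f"<p><em>{escape(''.join(p.itertext()))}</em></p>")
    for sec in root.findall("body/sec"):
        parts.append(f"<h2>{escape(sec.findtext('title', default=''))}</h2>")
        for p in sec.findall("p"):
            parts.append(f"<p>{escape(''.join(p.itertext()))}</p>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(article_to_html(SAMPLE_ARTICLE))
```

      The point of the intermediate XML is visible even in this toy: the same source could just as well be rendered to PDF or deposited in an archive without touching the original manuscript again.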

      eXtyles is a tool that facilitates the translation between these different document formats, and in the process also helps to clean up broken references and other errors in the manuscript. As paper authors usually don’t see much of what happens to their manuscript after submission, I thought I’d ask Elizabeth Blake and Bruce Rosenblum from Inera (the company behind eXtyles) a few questions.

      continue reading this post
    • Faculty of 1000: Interview with Richard Grant

      Tuesday, 28 Apr 2009

      Richard Grant, who needs no introduction here on Nature Network, has just moved to London to start a new job as information architect for Faculty of 1000. I took this opportunity to ask Richard a few questions not only about Faculty of 1000, but also about his role in the company and future plans for the service that they have in mind.

      continue reading this post
    • I’ve recently asked a few questions about author identifiers for scientists. Here are the results (based on 48 responses). The results are also available as .xls file.

    • Popularity of online reference managers

      Sunday, 19 Apr 2009

      Now that we have a number of online reference managers to choose from, I thought it would be interesting to look at their popularity – both in absolute numbers of visitors and in the changes during the last 12 months. Online tools such as Compete allow everybody to do just that, and their basic functions are free to use. I’ve picked unique visitors, but there are of course other statistics to look at, including total number of visits.

      Click on the graph for more detailed statistics.

      CiteULike is the most popular online reference manager, and it is obvious that the announcement by Springer to sponsor them last August has helped their site traffic. Only CiteULike and Labmeeting show a significant increase in unique visitors in the last 6 months.

      The statistics are more complicated for tools that include both a desktop client and online database (Endnote, Mendeley, Zotero) and these numbers should be interpreted with caution. I’ve included RefWorks in both graphs for better comparison. It is probably safe to say that both Endnoteweb and the online version of Mendeley are not as popular as the online only reference managers in the first graph. This could either mean that online only tools are far more popular than desktop applications (which I doubt) or that most references are still primarily stored in desktop programs and not shared online, something that Eva Amsen already described last year (How to get scientists to adopt web 2.0 technologies). To put these numbers into perspective: ncbi.nlm.nih.gov (the home of PubMed and other NCBI databases) sees about 2.5 million unique visitors a month.

