It is possible that I am about to preach to the choir, but I am going to come right out and say it anyway. I hate PubMed. I hate it with a burning passion. For a site that is as vital to scientific progress as PubMed is, their search engine is shamefully bad. It’s embarrassingly, frustratingly, painfully bad.
I have spent an absurd amount of time on PubMed recently and can say in no uncertain terms that it is making my dissertation writing way more painful than it needs to be. I can hold a paper in my hands, search for two authors’ last names and have PubMed come up with nothing. My friend searched for microRNAs and her virus of interest. The search engine (can I even call it an engine? It’s more like a tricycle) came up with papers dating back to 1997. I am pretty sure no one knew about microRNAs in 1997. Yet another friend was only able to find publications about his compound of choice after empirically defining one of its functions in the cell… which is when he found out this information had been available all along. He couldn’t pull up the relevant papers without searching specifically for the compound and that one effect on the cell.
Science cannot proceed at a decent clip if researchers cannot find the most basic necessary information. Has this study already been done? What else has this author published? What papers are related to the one I am reading?
I would now like to mercilessly butcher a quote from John Wilbanks from a talk he gave at the recent Publishing in the New Millennium forum. It went something like “It’s unacceptable that the web is better suited to searching for pizza than it is for furthering scientific research,” or something to that effect. Hopefully, he can correct me. But his point stands. It is way easier to search successfully for restaurants and pizza deliveries than it is for papers relating to neurotrophins and herpes, as thrilling as that topic may be.
Why is PubMed so behind the times? Why? How does it even work? Does it search only the abstract? Does it also search the body of the papers that are available online? Why does it get so massively confused by an author’s initials and last name together, in one search? Why can’t it alert me when papers relevant to my work are published? When is it going to get better? Is there any chance this might happen before my dissertation is due? Because frankly, it’s driving me more bats than the dissertation itself.
I’ve tried PubMed a couple of times, but it just leaves me confused. Fortunately, we have access to Web of Science, which is still imperfect, but at least I understand it.
I must be getting old though. I remember the days when the Science Citation Index was a multi-volume series of books. The youth of today etc. etc.
PubMed is far from perfect, but it is still my first choice. You can save your searches as RSS feeds that are updated regularly. This feature has helped me find a few papers in my field. I wrote about Google Scholar and some other alternatives in this blog post.
But I agree with you that a lot can be improved. For your particular problem unique author IDs would probably help. That feature is high on my wish list for PubMed.
I use PubCrawler (not as exciting as it sounds) for email alerts. For searches, I often resort to ‘Googling’ and adding ‘pdf’ after the search term. I agree, PubMed leaves a lot to be desired.
The Discovery Initiative: NCBI’s databases are highly integrated. For example, a PubChem record may have links to records for chemically similar compounds, to a protein structure that was crystallized along with the chemical, to relevant journal articles, etc. These records, in turn, are linked to other related records. While this extensive network of links provides users with vast opportunities for exploration and for making the kinds of connections that underlie the discovery process, many users do not go beyond retrieving the basic results from a search query. The Discovery Initiative aims to improve the presentation of results so that users are more readily drawn to these related data that could lead to serendipitous discoveries.
Source:- Nature.com
The Discovery Initiative was brought to my attention last year (personal communication) by
David J. Lipman, M.D.
Director,
National Center for Biotechnology Information
@ Anna,
You might just be in luck thanks to voicing your frustrations online !!
I brought this post to the attention of Dr Lipman who I’ve just heard back from.
He’s authorized me to post here on his behalf. (Thanks Dr Lipman)
Although the current engine works well for some users and some queries, I understand Anna’s frustration and we are in the midst of a number of changes that will make PubMed work better for her and many other users.
We will be adding a number of other “sensors” which will run in parallel with the default search. From monitoring results of enhancements we’ve added to some of our other Entrez databases.
A number of these complaints are fair and we’ll be doing our best to address them. With the large number of users we have, it will be clear what areas we’ll be improving and what areas will need more work.
Part of the problem is that none of these searches are trivial, and NCBI doesn’t have the resources that Google or other search engines have to work on better algorithms and data integration. About ten years ago PubMed was the leader of the pack when it came to search engines: automatic mapping of queries to a controlled vocabulary (MeSH), ability to limit searches by semantic field (author, journal, abstract) and so on. A lot of this requires support from the journals, though. Martin mentions a unique author ID, but unless that code is used by all journals there is no way for the Library of Medicine folks to make use of it.
There’s a lot of good stuff going on at the SKR Group, and you usually get feedback on every suggestion you send towards the help desk. Wouldn’t bet on any changes anytime soon, though ;-)
Only abstracts are indexed, to the best of my knowledge, although that might have changed with PubMed Central. Make sure to look at the ‘details’ page when a query doesn’t return the expected results, it’s frequently due to some term expansion you did not expect. Subscribing to search results has already been mentioned, although I prefer the HubMed interface for this. And if you want a bit more control over your query without having to fiddle with all the square-bracket tags, Slim might be worth looking into.
Happy Easter and all that ;)
Sorry for double posting but here are Dr Lipman’s comments again but in full.
Hi Graham,
Although the current engine works well for some users and some queries, I understand Anna’s frustration and we are in the midst of a number of changes that will make PubMed work better for her and many other users. One type of query she is doing – essentially a form of targeted search/citation matching – will be handled much better within the next couple of weeks. We’re putting in a CitationSensor approach that will run the default search (but one which is itself somewhat improved) but have a separate set of heuristics for picking up Anna’s type of query.
Especially for someone writing a thesis or paper, they are often simply trying to find a particular paper, perhaps using author names, or terms from a title, or even just cutting & pasting a reference from another online paper. So this will be a big improvement for them.
We will be adding a number of other “sensors” which will run in parallel with the default search. From monitoring results of enhancements we’ve added to some of our other Entrez databases, it’s clear that users will be taking advantage of this.
A number of these complaints are fair and we’ll be doing our best to address them. With the large number of users we have, it will be clear what areas we’ll be improving and what areas will need more work. And wish her good luck on her defense!
I personally feel Web of Science is far and away the best literature searching tool. You can save searches to have the results emailed or RSS’ed (or exported to Endnote), and of course they have the excellent citation factors. It’s also incredibly easy to use. To my knowledge, it blows PubMed away with the number of journals that are indexed.
Thus far on this thread, it seems that many use a variety of tools as one might expect. That’s great.
To put things into a much broader perspective, in general ‘web 2.0’ terms, PLEASE check out this vast directory.
With so much going on online these days in scientific terms, it’s simply unfeasible for us humans to aggregate and navigate all this knowledge/data without robotic assistance.
Anyone for Semantic Web?
May one throw Web 3.0 and Health Librarians: an introduction by Cho and Giustini 2008 et al into the mix/this thread.
Go easy on this – it be only a two day old baby….
Hi from Google scholar blog,
As a biomedical librarian, I didn’t realize that some scientists felt this way about the premier search tool in medicine. I’d like to take on the challenge of helping you out, taking your questions, and maybe sharing some pointers about how to use this powerful database. Chat me up!
http://weblogs.elearning.ubc.ca/googlescholar
I agree, searching with PubMed can be very painful and there are no good solutions around. One alternative which is sometimes helpful and seems to be on the right way is GoPubMed.
Wow. Thank you, everyone, for your suggestions of alternative search engines. I had never heard of most of them, irritatingly enough. I will be sure to try them in the coming weeks.
Martin – I didn’t know I could do that with the PubMed RSS feed! That’s great.
As far as searching on PubMed is concerned… Here goes.
I searched for my PI and a drug I knew was a key player in one of the papers I was trying to locate. I searched for “Schaffer PA K252a”. PubMed came up with zero hits. None. Google Scholar brought up the paper in the top ten hits, happily enough. My issues with Google Scholar are less with the efficiency of the search than with the presentation of the search results. The entire paper title, nor the entire list of authors is displayed, which I find inconvenient. There is also too much scrolling – too much blank space on the page. Additionally, the links are rather haphazard – some link to html pages, others directly to PDF files, which I may or may not want to be downloading. While I know that the extension is shown in the address the link is directing to, I find it difficult to discern. I spend the same amount of time looking for the hit I am interested in on Scholar as I do trying to come up with the correct search terms on PubMed (a skill I have yet to master).
Another PubMed question/issue – initials confuse PubMed, as do searches for the last names of multiple authors. What is the best way to search?
Hi again Anna,
I’ve done a post about searching for authors. Let me know if this is helpful :)
Dean
http://weblogs.elearning.ubc.ca/googlescholar/archives/045602.html
Reading over my last comment, I realize I should edit better. I meant to say that Scholar does NOT bring up the entire paper title, which I dislike.
If you know exactly what you are looking for a combination of two ‘uncommon’ author names usually does the trick, although Pubmed frequently gets confused by first/last author names. An uncommon name and title word tends to yield best results when I have the paper next to me and just need the reference (using Papers).
Using your example, I’ve found this paper, not sure it’s the one you were looking for though. Switch from ‘Abstract’ to ‘Medline’—this is the information PubMed has available for querying purposes, along with the field specifiers. If it’s not in there (like K252a) you’ll come up with zero hits; a broader search (Schaffer PA[au] drug) worked in this case.
And yes, sometimes the easiest way is to hop over to Google Scholar, find the document, then plug the PMC or PMID number into PubMed.
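For anyone who would rather script this than click through the site, here is a rough sketch of how the broader query above (Schaffer PA[au] drug) could be turned into a request URL for NCBI's E-utilities ESearch service. It only builds the URL and sends nothing. The function name `esearch_url` and the `retmax` default are my own choices for illustration, and the endpoint address is the one I believe NCBI currently documents, so treat it as an assumption and check their help pages.

```python
from urllib.parse import urlencode

# Assumed ESearch endpoint; verify against the NCBI E-utilities documentation.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, retmax=20):
    """Return an ESearch URL for a PubMed query string (no request is sent)."""
    # urlencode percent-encodes the square brackets of field tags like [au]
    # and turns spaces into '+', so the query survives inside a URL.
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


# The broader search that worked in this thread.
url = esearch_url("Schaffer PA[au] drug")
print(url)
```

Pasting the printed URL into a browser should return a list of matching PMIDs, which can then be plugged back into PubMed as described above.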
Hi Anna,
I see Oliver has posted some excellent suggestions.
Would you be willing to answer a few quick questions kind of like an interview for me? I could post them on the blog, and it might get some interest?
With best wishes (good luck on the dissertation)
Dean
I’m a medical librarian, and PubMed can function as a great way to locate biomedical literature, but it helps to have a few tricks up your sleeve. If you have a medical librarian handy, it might be worthwhile to spend an hour with him or her learning a few of these. I’m willing to correspond via email, etc. with anyone who would like some tips on their specific search, especially if you don’t have one of my expert colleagues handy – contact me at rachel r walden at gmail dot com.
Oliver – Thanks so much for the suggestions. I have a question though – what is the advantage of searching for the PubMed ID in PubMed once you locate it on Google? Won’t it just link you to the PubMed entry? Or does it not do that all of the time?
Dean – I would be happy to answer any questions you may have! Lord knows I have enough opinions to go around, for better or for worse.
Rachel – Your offer is wonderful! Thanks so much. At the moment, I have neither a medical librarian to bother nor the time to do it in. I hope I make it through the next month on the rudimentary knowledge I do have.
Harumph. In my day we just went to the library, or asked a friendly neighborhood oracle. When I was a graduate student I spent a lot of time in the palaeontology department at the Natural History Museum. I was working on fossil bovids (don’t ask) and the World Authority on this subject worked two doors down the hall. If I had a vague query about a paper, I’d go and ask the World Authority, who’d inevitably recall – purely from memory, this – the reference. The author, title, journal, the lot. This chap was retired in a wave of staff cuts and never replaced. And just try to search PubMed for palaeontology…
Harumph.
Trawling round the Streets for Papers in 2007 Goldacre et al 2007 MP3 (minutes ~ 2 to 4 are of relevance to this discussion).
Anna, good luck, and feel free to send me an email any time if you get stuck.
Anna,
I am another medical librarian who has read these comments with interest and would like to offer two additional points of advice.
As your search is on genomics topics including microRNA, the first source I’d suggest you try to search is ENTREZ, a different point of access to National Library of Medicine databases.
ENTREZ is a cross-reference source for all the National Library of Medicine databases, including MEDLINE (PubMed) but many more such as OMIM, gene libraries etc. It is worth your time to try it if you have been unsatisfied with results from the PubMed database alone.
The other idea is, when you search PubMed are you using the MeSH (Medical Subject Headings List)? MeSH is available online in PubMed at the left-hand side of the main search page. MeSH is a list of 300,000 highly specific medical terms which expert searchers use to extract the ‘best’ subject headings to use when searching PubMed (which after all, is a database with 17,000,000 records covering a period of 34+ years!).
MeSH terms are what librarians refer to as ‘indexing terms’. When a new article is sent to NLM to be input into MEDLINE, an indexer reads the full-text of that article, then selects 10-14 MeSH terms to apply to that unique citation electronically. Then, when you or I search for those terms, that article pops up on the retrieval list. If you are just going to the main search box of PubMed and putting in terms, then you are coming up with retrievals in a “keyword” field… that is why people sometimes get millions of retrievals. Consider this example: a search in PubMed on “breast cancer” is a keyword search. A search done using MeSH terms for “breast carcinoma” will pull up many more-relevant citations than the keyword search.
Also available from the MeSH page in PubMed is a group of what librarians call ‘clinical subheadings’. These groups of words are general or generic qualifiers invented to be used in conjunction with MeSH terms. Here is an example: MeSH term ‘warfarin’ combined with ‘adverse effects’ would be a highly-specific search strategy.
A search constructed using MeSH for “breast carcinoma” with clinical subheadings “genetics”, “epidemiology” and “drug therapy” is going to pull up a group of highly-relevant citations (and exclude much of what you don’t want to see anyway).
Selecting appropriate MeSH terms, and then attaching relevant clinical subheadings is how librarians search… and how we teach others to search in an academic medical library setting. Once people understand how the information-architecture of MEDLINE is constructed, it can become quite an elegant way to search in a mathematical or algorithmic sense.
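For those who prefer to type the query rather than pick it from the MeSH page, the heading-plus-subheading strategy above can be sketched as a small query builder. This is only an illustration: `mesh_query` is a made-up helper name, and the heading text is copied straight from the examples above, so the exact preferred form listed in the MeSH database may differ. The `"heading/subheading"[MeSH]` pattern is, as far as I know, the form PubMed accepts in the search box.

```python
def mesh_query(heading, subheadings=()):
    """Build a PubMed query for one MeSH heading, optionally narrowed by subheadings."""
    if not subheadings:
        # No qualifier: search the heading on its own.
        return f'"{heading}"[MeSH]'
    # One heading/subheading pair per qualifier.
    parts = [f'"{heading}/{sub}"[MeSH]' for sub in subheadings]
    if len(parts) == 1:
        return parts[0]
    # Several qualifiers: any of them may match, so join with OR.
    return "(" + " OR ".join(parts) + ")"


# Heading text taken from the examples above; check the MeSH database
# for the exact preferred heading before relying on it.
print(mesh_query("warfarin", ["adverse effects"]))
print(mesh_query("breast carcinoma", ["genetics", "epidemiology", "drug therapy"]))
```

Joining the subheadings with OR is a design choice on my part; use AND instead if you only want citations indexed under all of the qualifiers at once.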
Please feel free to email me or call me if you would like to walk through this. I also am a blogger like you… my blog is targeted for medical students, residents, researchers and physicians who are interested in trends in evidence based medicine literature, sources, search strategies and library 2.0/web 2.0/medicine 2.0. It is on Wordpress.
K Crea
Even though you aren’t yelling out loud, we PubMed fans hear you.
Here are some Quick Tips for Using PubMed:
1. Use “Single Citation Matcher” if you are looking for a known article. The button is under PubMed Services about halfway down the lefthand side of the PubMed search screen. Many search options are available: Author (autofill), Journal, Date, vol/issue/page number, words in the article title. Click on “help” by the author search box to find out how to search for articles by more than one author. By the way, “Single Citation Matcher” is a misnomer. You can find as many citations as there are that match your query.
2. Use MeSH to find subject headings if you are having problems finding articles on a particular topic. See where it says “Search PubMed” at the top of the PubMed search screen? Click the down arrow next to PubMed, select MeSH (stands for Medical Subject Headings) and do your search. Using microRNA as an example, you will see that it is a MeSH heading (there are others listed that may be more what you are looking for). If you click on the microRNA hyperlink, you will find subheadings you can use (administration and dosage, or chemistry, or isolation and purification are just a few subheadings available). Click on the box by the main heading if that’s all you are interested in, or on just those subheadings you want. Then click on the down arrow where it says “Send to” on the second gray line. Select “Search box with AND” and click. That opens a window in the middle of your screen. You have to click on “Search PubMed” below that window for the search to run.
3. Use the History tab to see the searches you’ve done and to combine searches.
4. Use the “Send to” box for other purposes (printing, emailing, placing citations on your Clipboard and then emailing them).
5. Use Limits to limit your searches (by language, type of publication, date added to database, just to name a few).
6. Use “My NCBI” to set up searches to be run on a regular basis and RSS’d or emailed to you.
Compared to a bicycle, PubMed is more like a Lamborghini than a tricycle. It has many features that really can’t be appreciated unless you take some time to become familiar with it. Once you’ve taken the time, you will be like many others around the world who have become dependent on it and who become upset when it’s down, even if it’s only for a few hours. To become familiar with all of PubMed’s features, do the Tutorials or click on FAQs (they are on the left hand side of the PubMed screen), or take a PubMed class (here is the schedule and they are free).
Sincerely,
Another Medical Librarian
Hello Anna!
There might be a solution for you: a personalized medical metasearch engine.
Give it a try!
http://scienceroll.com/2008/03/28/do-you-hate-pubmed-here-is-the-solution/
Berci Meskó
As a fairly basic user of PubMed I can only provide a couple of limited, but I hope helpful comments (BTW, the other comments have been fantastic).
It sounds like you are using PubMed to try to find papers that you know exist for a reference list or something similar. I do this in one of two ways.
For a reference list use EndNote and connect to PubMed for your search. This allows you to enter up to three search terms in various categories ie Author, Title, Year etc which can narrow down things a lot.
Alternatively, if you know the page number the paper starts on (“I am holding the paper in my hands”) type in an author’s surname (no initials) and the page number of the first page of the paper. It really limits your results.
Hi Anna,
RE: “At the moment, I have neither a medical librarian to bother nor the time to do it in.”
I see you are a PhD student affiliated with Harvard Medical School. I would be very surprised if you do not have available to you a medical, biology, chemistry, or biosciences librarian who would be more than willing to schedule an hour one-on-one session to show you some more advanced searching tips that would address your needs. I encourage you to call, email, or IM your library’s reference desk.
An hour or two spent learning more about PubMed can save you countless hours searching later on.
Good luck on your dissertation.
Tia
Anna and others,
“At the moment, I have neither a medical librarian to bother nor the time to do it in.”
How much time did you waste trying to do your searches? It would be a better use of your time to take a class on using PubMed and doing literature research rather than thinking you know how to do it already. If you took a class you would also learn that Google Scholar doesn’t have nearly as much content as PubMed and is woefully behind.
Your rant has already made it to the library listservs and one librarian already mentioned how sad it is that you and others like you won’t take the time to learn.
“Students spend so much time learning all the parts of their trade, but when it comes to finding literature they aren’t willing to spend one minute learning how to search. It doesn’t even occur to them that they need to. This researcher admits to wasting hours trying to find information and then turns down help from two medical librarians and says she doesn’t have time to seek the help of a medical librarian to learn how to search.”
Your comments about PubMed are important. By consulting with your local medical librarian, you will save yourself a lot of time. I hope that you also take the time to contact the National Library of Medicine to relay your comments/frustrations. User feedback is really useful to them. You may contact them at: http://apps.nlm.nih.gov/mainweb/siebel/nlm/index.cfm
PubMed is not easy to use for expert searching which is what you need when working on a dissertation. 1. Search using SUBJECT HEADINGS AND NOT KEYWORDS. Click on the “MESH” database to find subject headings. “Micrornas” is the official medical subject heading. If you read the scope note, it tells you that this term was first used in 2003. PRIOR INDEXING included “antisense rna” and “untranslated rna”. If you combined these 3 terms with an “OR”, the broadest search would get you 32,936 citations. If you use micro-rnas as a keyword you get only 110 citations, or microrna as a keyword = only 2534 citations. The following article was written by two physicians who took the time to write an article about searching so that you would save time. JAMA, 271(14):1103-8, Apr. 13, 1994, “Understanding & using the medical subject headings (MeSH) vocabulary to perform literature searches”, Lowe, HJ, Barnett, O.
2. When searching authors, use the “Single citation matcher” on the left side of the PubMed page and put in the LEAST amount of information to get your results. Your subject “micrornas” is written many different ways and you would have to guess the EXACT way that the researcher listed it in his paper to get a hit. Micro-rna is listed as singular or plural, with or without the dash, sometimes only as mirna etc. So when searching for authors, using keywords from the title can be a problem if you don’t use the EXACT WORDS that the author used. When you are searching for a subject, if you use the official subject heading, it doesn’t matter how the author listed the term, you will get all references on that topic. This is why searching by using subject headings is so important to a comprehensive literature search.
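To make the two points above concrete, here is a rough sketch of how the subject-heading search and the keyword-variant search could each be assembled with OR. The helper name `any_of` is invented for illustration, the heading text is copied from point 1 rather than checked against the MeSH database, and I am assuming the `[MeSH]` and `[tiab]` (title/abstract) field tags behave as PubMed's help describes. No citation counts are claimed here, since they change as the database grows.

```python
def any_of(terms, tag=None):
    """Join search terms with OR, optionally adding the same field tag to each."""
    suffix = f"[{tag}]" if tag else ""
    return "(" + " OR ".join(f'"{term}"{suffix}' for term in terms) + ")"


# Point 1: the current heading plus the two headings used for prior indexing.
subject = any_of(["micrornas", "antisense rna", "untranslated rna"], tag="MeSH")

# Point 2: keyword spellings an author might have used in a title or abstract.
variants = any_of(
    ["microrna", "micrornas", "micro-rna", "micro-rnas", "mirna"], tag="tiab"
)

print(subject)
print(variants)
```

The subject query only needs the three headings, while the keyword query has to list every spelling by hand, which is the whole argument for subject headings.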
Carol J.
Thank you everyone, for all the advice and attention. I think my point may have gotten a little buried in my rant.
I don’t think I should have to be, or enlist the services of, a medical librarian in order to do a simple search on a literature search engine. PubMed should be an intuitive search engine such as Google, or others. I don’t know of many researchers, either MDs or PhDs, who have had extensive training in computer science or search algorithms. I am going to go out on a limb and say that I am representative of many other biomedical researchers in my struggles with PubMed. I am trained in Cell Biology and Virology. PubMed should be tuned to my needs and my skill set. I should not have to tune to it. Harsh as it may sound, PubMed is most useful for biomedical professionals, not for medical librarians or for computer scientists. Yes, if I devoted an afternoon or more to learning the system I dare say I would become proficient, but my question stands – why should I have to?
Are there lessons here for both librarians and researchers? Medical librarians are talking about this issue on their MEDLIB listserv – take a gander.
My feeling is that cell biology and virology searching is particularly tricky on PubMed and on the other Entrez tools. For example, I know that certain virus-related searches (Anna’s area) can only be keyworded. Could this be the reason why Anna is struggling with PubMed and why she feels that literature reviews are particularly challenging there?
It must be extremely difficult to cumulate the literature where you have to work around lack of MeSH and keyword variants.
http://weblogs.elearning.ubc.ca/googlescholar/archives/045699.html
PubMed’s actually pretty clever at recognising author names, as long as you get them the right way round (Surname Initials). Put them in quotes and put [AU] at the end if you want to be sure.
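A quick sketch of that rule (Surname Initials, quoted, with [AU] on the end), in case it helps anyone who builds searches from a reference list. The function names are made up for illustration, and "Smith J" is a placeholder second author, not a real co-author of anything in this thread.

```python
def author_term(surname, initials):
    """Format one author as "Surname Initials"[AU], dropping dots and spaces in the initials."""
    compact = initials.replace(".", "").replace(" ", "").upper()
    return f'"{surname.strip()} {compact}"[AU]'


def authors_query(authors):
    """Require every listed author by joining the author terms with AND."""
    return " AND ".join(author_term(surname, initials) for surname, initials in authors)


print(author_term("Schaffer", "P. A."))
# "Smith J" is a placeholder used only to show the two-author case.
print(authors_query([("Schaffer", "PA"), ("Smith", "J")]))
```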
The problem with that specific query was that, yes, PubMed only indexes abstracts whereas Google Scholar has the full-text to work with. Nothing wrong with using two search engines to find a paper though.
One other thing: if you have the reference for a paper, try pasting that directly into CrossRef or HubMed’s citation parser.
Alf Eaton (and others) are technically correct, in that PubMed itself doesn’t “index” full articles. But professional indexers do, and assign subject headings (MeSH) to the articles b/f they’re put into MEDLINE, and then PubMed searches the subject headings, title/abstract, and some other fields.
Anna, as an undergrad a zillion years ago, I always felt like I was smart enough to figure out how to use the library on my own, and that it was somehow an admission of weakness on my part to actually ask a librarian. But you know what? Your tuition dollars are helping to pay for librarians at Countway and other Harvard libraries, and they can sit down with you and show you the ins and outs of the database (and PubMed is a database, NOT a search engine), and probably introduce you to some other resources (like Web of Science or Bio. Abstracts) that will help you in the future. Google does search full-text of articles, which is great, esp. when you’re looking for a new or obscure concept, except that it’s not good at letting you fine-tune a search so that you get fewer than 2000 articles.
One of the “problems” with PubMed is that it serves a huge variety of users—clinical medicine, nursing, bench science/research, public health, etc., and it’s very hard for a database to be all things to all people—esp. when it covers 5000 or so journals, in multiple languages, and contains over 17 million citations. MeSH is great for clinical medicine, maybe not so good for other areas. One reason medical/science librarians are useful is that we spend so much time in these databases (I probably spend 20+ hours per week in PubMed) and know the ins & outs and tricks of the trade. Yes, you know your science, but we know how to get to your science, and can work with you to maximize PubMed’s potential.
When I teach PubMed classes, I emphasize two points: 1) the great thing about PubMed is that there’s at least 3 different ways to do just about anything—but that’s the frustrating thing, too. 2) if you’re spending more than 15 min. looking for something, and not getting anywhere, CALL ME—or another librarian. I get paid to know how to use PubMed and help other people figure it out. You (the general you) get paid to do research (or treat patients, etc.).
Anna,
One of my colleagues, Carrie Iwema (she has PhD in Neuroscience and is also a librarian) authored a post for the Bitesize Bio:the molecular and cell biologist’s companion blog titled, “18 Ways to Improve your PubMed searches” bitesizebio.com/2008/03/05/18-ways-to-improve-your-pubmed-searches/
The post has been very popular, perhaps you’ll find the information useful.
I’ve been following this post for a few days now and wow am I shocked by the unwavering fanboy support of Pubmed.
Pubmed isn’t an iPod with a polished interface that “just works” and it’s certainly not your child (although, if you did code it and are reading this, please take note!). I do not, therefore, understand all the hatred that has spewed out. I have several specific problems with Pubmed and with science online in general. Here’s a wonderful list.
1) A major problem is the lack of a simple user interface. Google can get away with having a homepage with one text box…. google has amazingly talented people and huge resources and has made simplicity rather elegant. Pubmed is lacking one or all of these traits and therefore needs to compensate by promoting more user input.
Don’t get me wrong, you can do it all in one box, but it turns out that that box is about 100 pixels wide and after i have 3 or 4 search terms in there with [TI] and [Jour] (why can’t this just be journal??) and all sorts of other hard to remember qualifiers (“Codes”:http://healthlinks.washington.edu/howto/pubmed_search_tags.html), I can’t see what I’ve written anymore. A simple input page like google scholar’s advanced search would be a welcome improvement. “Scholar”:http://scholar.google.com/advanced_scholar_search?
Also, why can’t I “search within a search” or get prompted for relevant terms that may make my search better?
2) Let’s use a recent review for this one… “Protein translocation across the eukaryotic endoplasmic reticulum and bacterial plasma membranes” Naturally, if I’d like to search for this, i might type in the title… and being the astute pubmed user, I realize I should be using the title [TI] qualifier. Type in “Protein translocation across the eukaryotic endoplasmic reticulum and bacterial plasma membranes [TI]” and… wha? Where’s the paper? I don’t know, but it wasn’t found.
3) Why is it so hard to get PDFs and when you get them, why do they have absurd filenames? I understand that many (most?) PDFs are only available with a subscription, and I’m grateful to PubMed Central (PMC) for doing its part to make more PDFs readily available, but why aren’t the PMC PDFs linked to on the search page directly so I don’t have to click through 4 web pages? Also, are the files that PMC has used for full text search or are only their abstracts still searched? I don’t know!
As for the filenames, why can’t they be something helpful to the scientist like mp3 names (i.e., “First author – Year – Title.pdf” or some variant) or better yet, why can’t they be machine readable by a program like iTunes (as in PMID.pdf)?? Then the computer can do the organization and I don’t have to download the same paper 30 times. Check out “iPapers”:http://mekentosj.com/papers/ to see what COULD be done. PubMed could certainly begin the initiative with its PubMed Central files.
4) Don’t be so clinician-centric. Let’s pretend that some people work on things that aren’t humans. There are important model systems like yeast and E. coli… there are even some people who work on Archaea, whatever that is. Advanced search for PubMed (called Limits for some unknown reason) offers odd parameters like age, sex, and twin studies. I didn’t see E. coli listed.
5) Along the lines of point 4, I see there’s a suggested search strategy on PubMed for all things smallpox, a disease that’s lived solely in two freezers on opposite ends of the earth for the last 30 years. “Smallpox”:http://www.nlm.nih.gov/services/smallpox.html Looking through this, I realize just why PubMed sucks. “Poxvirus” is not equal to “poxviruses”, which is not equal to “pox virus”. Why should the researcher have to non-intuitively type in every possibility? And when would he know to stop… when all the bases were covered?
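To make the complaint in point 5 concrete, here is a minimal sketch of the bookkeeping the researcher is being asked to do by hand: collecting every spelling variant into one OR group. It assumes the search box accepts an uppercase OR between terms and double quotes around multi-word phrases; the function name `expand_term` is made up for illustration and is not part of any PubMed tool.

```python
def expand_term(variants):
    """Join spelling variants of one concept into a single OR group.

    Assumption: the search box understands OR and quoted phrases.
    """
    parts = []
    for variant in variants:
        variant = variant.strip()
        if not variant:
            continue
        # Quote multi-word variants so they are kept together as a phrase.
        parts.append(f'"{variant}"' if " " in variant else variant)
    return "(" + " OR ".join(parts) + ")"


# The three variants named in the comment above.
pox_group = expand_term(["poxvirus", "poxviruses", "pox virus"])
print(pox_group)  # (poxvirus OR poxviruses OR "pox virus")
```

The sketch only joins the variants it is given. Knowing which variants exist is still left to the searcher, which is exactly the objection.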
Anyways, feel free to answer some of these questions or better yet, attempt to fix the problems, but I would rather not sit through another librarian course that gets outdated (and forgotten due to complexity) soon after the hour was wasted. No offense.
Sorry to be tardy to the debate here…
Anna, what I was referring to at the panel you mention goes beyond PubMed, which is indeed a pretty good full text search engine, especially if you know the tips and tricks. The problem here is more fundamental.
Google and other modern search engines work because there is a massive network of hyperlinks between web documents. The page with a given word with the most incoming links gets the high ranking for that word (gross simplification but generally the case) with some adjustments for “popular” pages sending in links counting more than unpopular pages sending in links. This is a function of the data formats of the Web and the culture of open, permissive hyperlinking. The crowd’s decision to link gives us a much better sense of what is relevant, not just what exists.
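For readers who want to see the idea rather than take it on faith, here is a rough PageRank-style sketch of link-based ranking. It is a simplified power iteration under the "gross simplification" already admitted above, not Google's actual algorithm. The function name `rank_pages`, the damping value of 0.85, the iteration count, and the four-page toy web are all assumptions chosen for illustration.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages by incoming links, weighting links from high-scoring pages more.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of links.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical toy web: three pages link to "c", and "c" links back to "a".
toy_web = {"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a"]}
scores = rank_pages(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In the toy web, "c" ends up on top because it has the most incoming links. "a" comes second despite having only one, because that one link comes from the highest-scoring page.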
Scholarly papers aren’t generally full of hyperlinks. At best they’ve got citations. That means that PubMed is stuck doing pre-Google keyword searches, not relevance based searches. You can use ontologies and vocabularies (MeSH for example, or the gene ontology) to get incrementally better results. Check out Alf Eaton’s work at HubMed in particular.
But we don’t need incrementally better search. We need nonlinear change. We need access to the backfile so we can go and create hyperlinks where they don’t exist, so Google starts to work better. We need to create links between the mention of P53 in a paper and the NCBI entry for P53. We need to take the iHOP network and explode its size, scale, scope, and accessibility. And we need a bunch of smart folks figuring out – and competing with each other – to provide killer search applications, query builders, query visualization tools, and so forth.
That’s what the web gives us in culture. Massive scale of data, crosslinked in ways that can be captured and analyzed, letting us use some crowdsourcing to make guesses about relevance. Competition and innovation where little companies try to become the next Google. Systems that run from a web browser without training. That’s what we need. Not incremental improvements to PubMed.
Anyhow, as usual, Your Mileage May Vary.
Hi Anna,
Thank you again for answering some of my questions. I’ve put them on the blog for others to have a peek:
http://weblogs.elearning.ubc.ca/googlescholar/archives/045760.html
Best of luck with your dissertation
Dean
To use any search engine effectively, you have to have some understanding of what a search engine “sees”. Several comments refer to “text” searching. PubMed is NOT fundamentally a text search engine.
PubMed is a search engine for the indexing of articles. This is deliberate because the aim of PubMed is to be as close to universal as possible in the biomedical journal literature.
PubMed does not, in general, ever look at the text of the article, for reasons that fall into two categories: historical limits on storage and processing [which are largely within current capabilities] and legal limitations, in other words copyright law.
For PubMed to search text, it must have access to the text. To do so would require permission and in the real world, hundreds of millions of dollars in licensing fees. The alternative is that the first article scanned would leave the NIH open to a court injunction shutting the entire NCBI down. Technically, someone could face felony charges and one year in prison and $100,000 minimum fine; but the injunction would do.
Google spiders look at sample pages, text out of embargo, and a few odds and ends and give the impression, and it is entirely an impression, of searching text. Google does not care if its search is universal or that it is only looking at a portion of an article or at text only available at one of those libraries you disdain that paid several million dollars so their users can get to the article text. In the majority of cases, all Google is looking at is PubMed because that is the only source that the spiders can reach!
As to the specifics of the search Anna Kushnir described:
PubMed lists authors for indexing as Last-Name[space]initials [no spaces]. Recently, PubMed added the full name, as presented on the article. That is important because I know from experience that as simple a name as “John A Doe” can end up listed 8 different ways.
It may not be “intuitive” but the Help would have told you how to use the tag for an author name. Remember that PubMed is a medical database; there are thousands of syndromes named for the person who described them.
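As a small illustration of the indexing format just described (last name, a space, then the initials run together), here is a sketch that turns a name as printed on a paper into that form. The `[au]` author tag appended at the end is my assumption about which tag the Help refers to, and the function name `author_search_term` is hypothetical.

```python
import re


def author_search_term(last_name, given_names):
    """Format a name as 'Last-Name initials' with an author tag.

    Assumption: '[au]' is the author tag the Help pages describe.
    """
    # Split given names on spaces and hyphens, then keep each first letter.
    parts = [p for p in re.split(r"[\s\-]+", given_names.strip()) if p]
    initials = "".join(part[0].upper() for part in parts)
    return f"{last_name} {initials}[au]"


print(author_search_term("Doe", "John A"))  # Doe JA[au]
```

This covers only the initials-based form. It does nothing about the same person being listed several different ways.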
As an aside, librarians created the first large tagged public databases in the 1960s.
Second, what PubMed looks at is publication information, title, authors, abstract [if available and permission granted], and indexing terms applied by the National Library of Medicine [NLM]. That is all, period. PA K252a will only be searchable if it is in the title or abstract or an index term. As suggested above, there are tools to search for genetic information. Again, you never look at the text.
I should point out that you can search genetic information because the research was government funded and the information was public domain. Drug names are mostly proprietary or temporary and are not used for indexing until a generic name is approved. Until that time, NLM indexes to the class of compounds.
In the future, largely because librarians fought for years to get publicly funded research released from publishers into the public domain, you will be able to search an order of magnitude more text. PubMed is pioneering that search at the moment at PubMedCentral.
Hello Anna, nice post and some interesting responses. PubMed is far from perfect, but it’s well worth reading the manuals fully. I’ve found the linking to pubmed guide and PubMed Search Field Descriptions and Tags pages handy in the past, if you haven’t seen them already.
Hi, Anna,
Great post. I’m a “straddler” in this discussion: I was a neuropathology tech for 20 years, I got my master’s in library science 2 years ago, and I am now a basic sciences support librarian who is getting a second graduate degree in bioinformatics, doing my thesis work in computational analysis of herpesvirus entry glycoproteins.
With feet in both worlds (research and librarianship), I understand the frustration of both sides. Until I became a librarian, I thought I was a GREAT literature searcher, but I realized I wasn’t that good at it when I found out that there was more to various Medline interfaces (PubMed and OVID) than I had ever tried.
Your comments make so much sense: the interface should be more intuitive than it is. I guess one could say the same for many interfaces. I use a number of bioinformatics tools weekly, and some of their interfaces require hours of tinkering to fine tune for my purposes, and they have nothing to do with library or bibliographic literature!
I think when you make a search interface simple, you run the risk of making the results too broad, or in your example of Google Scholar, too inflexible to further sorting or organization.
Your experiences in contributing to Web 2.0 search applications are fantastic. I think that a lot of designers of search tools wish that more researchers could have been involved from the roots of the project when some of the current tools were being developed.
It’s really hard, as a biosciences librarian, to get the research community to believe that I can help them. They don’t have a lot of time to spare. Researchers are the quintessential “do-it-yourselfers” in academia: that’s the nature of research—so they will be less likely to come to the library staff for assistance. Clinicians, on the other hand, seem to be more willing to call a librarian because they don’t want to take the time trying to figure out how to formulate their searches.
It’s interesting: both sets of professionals don’t have a lot of time to mess around with searching, but they deal with their search problems in different ways.
Thanks for the post and the incredibly considerate way you have been responding to the comments.
Thanks to Dean for the interview with Anna.
Anna, good luck with your thesis…I know it’s a crazy time…
Best,
Pamela
@ Karl Erlandson (and Anna of course)
With respect to your second point, Karl: you can easily find a specific (known) paper in PubMed by using the Single Citation Mapper, at the left hand side of the PubMed screen (blue). Besides title words you can fill in the (first) author(s), year of publication, first page (very handy this one, because it is specific). And then press GO! This way it took me a few seconds to find the paper you referred to.
1: Nature. 2007 Nov 29;450(7170):663-9. Protein translocation across the eukaryotic endoplasmic reticulum and bacterial plasma membranes. – Rapoport TA.
In this case (where the title is kind of unique) you can also type the whole sentence in the main search bar (no codes whatsoever) and you get 6 titles, the first being the one you are looking for. If you use the limit button for ‘Title’ you get just 1: the right one.
Perhaps you even knew too much about PubMed codes. The [ti] code does work but only following individual words. You have to fill in: Protein[ti] AND translocation[ti] etcetera. PubMed does not always recognize a phrase. (OVID MEDLINE however does, that’s why I often prefer OVID).
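A minimal sketch of that word-by-word tagging, for anyone who would rather not type it out: it splits a title into words, tags each with [ti], and joins them with AND. Dropping short words such as “the” and “and” is my own assumption (the small `SKIP_WORDS` set is hypothetical, chosen mainly so that a title word “and” does not collide with the AND operator), and `title_query` is an illustrative name, not a PubMed feature.

```python
import string

# Hypothetical skip list: small words left out of the tagged query.
SKIP_WORDS = {"a", "an", "and", "of", "or", "the"}


def title_query(title):
    """Tag each remaining title word with [ti] and join the words with AND."""
    words = []
    for raw in title.split():
        word = raw.strip(string.punctuation)
        if word and word.lower() not in SKIP_WORDS:
            words.append(f"{word}[ti]")
    return " AND ".join(words)


print(title_query(
    "Protein translocation across the eukaryotic endoplasmic "
    "reticulum and bacterial plasma membranes"
))
```

For the review in question this produces a query that begins Protein[ti] AND translocation[ti] AND across[ti] and continues in the same pattern through membranes[ti].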
But the main issue here is that it would have been very easy to find this paper had you known of the existence of the Single Citation Mapper.
Now you may object that it is a shortcoming that this trick isn’t more obvious and I would agree. Many of the tools of PubMed are ‘hidden’. However, one class of an hour or two would be sufficient to teach you the most basic tricks. I can’t understand why scientists find it a waste of time to spend two hours learning the basics of PubMed, when it will save them hours.
Yes I’m a librarian (shame on me ;)) but one who has a PhD and worked for about 15 years in the lab. I understand and share some of your frustrations, especially when you’re not working in the field of medicine. (But what do you think PubMed stands for: Pub….Med!!)
For instance, I find it counter-intuitive that EGFR-inhibitors are indexed as Receptor, Epidermal Growth Factor/antagonists & inhibitors (and this is too broad when you want to differentiate between the specific agents) and that Signal Transducers and Activators of Transcription (STAT) is automatically split into Signal AND “transducers”[MeSH] AND activators AND “transcription, genetic”[MeSH Terms], whereas there is now a good single MeSH term: STAT Transcription Factors.
However, now that I know how it works I can rather easily find the correct terms (MeSH or textword). Try to do this in Google or Google Scholar!!! You have to think of all the synonyms yourself and come up with nonsense that ranks the highest. I do use Google Scholar though, but in a restricted way (nice to find things mentioned in the Method Section for instance, you will never find that in PubMed). But the rubbish you sometimes get on a subject search… pfff.
Making PubMed simpler? They might make it more ‘transparent’ if you like. But they already simplified it by giving it the appearance of Google. That’s why many people (especially of the young generation) use it like Google. Much of PubMed’s functionality is getting lost here.
@ Anna, a lot of success with your thesis. You’re almost there!