Metrics, The art of Counting

Ian Mulvany

Wednesday, 28 May 2008 09:51 UTC

First off, thanks for the event last night, I found it really enjoyable.

I wanted to start this topic in order to put up some information about what the actual metrics are that are currently being used, and to hear from everyone about the various pros and cons of each. I think the question of whether we should measure, and the implications of measurement (gaming, altering research behavior and so forth), certainly deserves its own topic; however, the fact is that we do measure, and so we ought to look closely at the measures that we use.

There are a number of discussions across NN on the topic of impact factors,

see H-index and self citation.

Impact factor revolution.

Impact Factor

The impact factor is the most influential metric. It was created in an ad hoc way by Eugene Garfield in the mid-1950s, and it has not been altered since. There is a reasonable Wikipedia article about it and a good PLoS editorial discussing some of the problems with how it is calculated.

The calculation is done as follows:
A = the number of times articles published in 2001-2 were cited in indexed journals during 2003

B = the number of “citable items” (usually articles, reviews, proceedings or notes; not editorials and letters-to-the-Editor) published in 2001-2

2003 impact factor = A/B

(Note that the 2003 impact factor was actually published in 2004, because it could not be calculated until all of the 2003 publications had been received.)
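
As a minimal sketch of the arithmetic above (the function name and the journal figures are invented for illustration, not taken from any real journal), the calculation could be written like this:

```python
def impact_factor(citations_in_year, citable_items):
    """Return A / B for one journal and one year.

    citations_in_year: A, citations received during the target year
        (e.g. 2003) by articles published in the two preceding years.
    citable_items: B, the number of "citable items" published in
        those two preceding years.
    """
    if citable_items <= 0:
        raise ValueError("need at least one citable item")
    return citations_in_year / citable_items


# Hypothetical journal: 1200 citations in 2003 to its 2001-2 articles,
# and 400 citable items published in 2001-2.
print(impact_factor(1200, 400))  # 3.0
```

Note how sensitive the result is to B: if the same hypothetical journal had only 300 of its items counted as "citable", the figure would rise from 3.0 to 4.0 without a single extra citation.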

One of the big issues is that ‘citable items’ is not well defined. Another issue is that it measures not individual contributions, but some kind of aggregate score for a journal. This may well have been a major cause of the rise of strong stratification in the journal market. A journal’s impact factor will tend to be driven by the top 10% most cited articles. That’s a rule of thumb, but it’s stacked up pretty well in most cases that I have looked at closely.

H-Index, or Hirsch index

This has become a topic of much discussion recently, I think for a few main reasons. It is very easy to calculate. It represents an individual’s contribution. It just seems to make sense when you understand what it is. It was proposed by a physicist, Jorge E. Hirsch (who will now probably be remembered more for this contribution than for any physics he does).

It is calculated thusly:

A scientist has index h if h of his Np papers have at least h citations each, and the other (Np – h) papers have at most h citations each

You list all of your publications in order of citation count, from highest to lowest. Running down the list, your Hirsch index is the last position at which the paper still has at least as many citations as its position on the list; at the next position the citation count drops below the position number. So, for example, if you only have three publications and they all have more than three citations, then no matter how many citations they have, your h index will only be three. In contradistinction, if you have a thousand publications but each is only cited once, then your Hirsch index will be one. It is a measure which tries to balance depth of contribution with volume. Average h indexes for tenured faculty tend to be in the tens. Most Nobel laureates have an h index somewhere in the 80s, and the person with the greatest h index is Edward Witten, with an h index exceeding 130 (confirming him, by this metric, to be the true reincarnation of Einstein himself). If you are interested you can go and look up your own h-index here.
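
The ranking procedure just described can be sketched in a few lines. This is only an illustration: the function name is my own and the citation counts are made up.

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            # This paper still has at least as many citations
            # as its position on the list.
            h = position
        else:
            # Citations have dropped below the position: stop.
            break
    return h


# Three well-cited papers: h is capped by the number of papers.
print(h_index([10, 8, 5]))    # 3
# A thousand papers cited once each: h is capped by the citations.
print(h_index([1] * 1000))    # 1
```

The two printed cases are the same two extremes mentioned in the text: volume without depth and depth without volume both give a small h.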

Mathematics of Networks

Google’s PageRank takes a random walk over the nodes of a network and measures the likelihood, given the link structure of the network, of arriving at a particular node (hence the appearance of spam pages filled with self-links over the last number of years). At the moment not all academic papers are ‘googleable’, and we cannot freely ride the citation graph, but we are getting there. There is hardly any doubt in my mind that when this becomes not only simple to do, but also simple to program against, a slew of network metrics will be instantly calculable on the graph. There are many different types of measurement that one can do on such a graph, but rather than describe them here myself, I will point to a group actually working on this: the MESUR group at Los Alamos. You can see the slides from a recent talk.
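
To make the random-walk idea concrete, here is a rough sketch of the textbook power-iteration form of PageRank applied to a tiny, invented citation graph. It is not Google's production algorithm. The damping value of 0.85, the way papers with no outgoing citations are handled, and the paper names are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by repeated redistribution of rank.

    links: dict mapping each node to the list of nodes it links to
        (for papers: the papers it cites).
    Returns a dict mapping node -> rank; the ranks sum to 1.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # With probability (1 - damping) the walker jumps to a random node.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Otherwise it follows one outgoing link, chosen uniformly.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a node with no outgoing links spreads
                # its rank evenly over every node.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: papers A and B both cite C, and C cites D.
ranks = pagerank({"A": ["C"], "B": ["C"], "C": ["D"], "D": []})
for paper in sorted(ranks, key=ranks.get, reverse=True):
    print(paper, round(ranks[paper], 3))
```

In this toy graph D ends up ranked above C even though D has only one citation, because that citation comes from the well-cited C. That is the sense in which a link from a highly ranked node counts for more than a link from an obscure one.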

In the end, David Colquhoun’s admonition that we should just read the papers to determine the quality of the work is a great ideal, and it is certainly the only way that scientists can determine the value of the contributions of their peers in the published literature. However, there remain many of us who are interested in what is happening in science who are not conversant with the details of particular fields, and we depend on derivative indicators. Beyond the published literature there are many growing areas of contribution that at present are almost totally ignored by “The Man”, such as blogs, (for a long time now) the development of computational tools, and making data sets open. This last point also deserves its own topic, so I’ll not dwell on it here.

  • Replies

    • Excellent post, Ian, thank you.
      Nature has published quite a few news articles and comments on the h index, which are listed here. Philip Ball’s two news stories (here, 2007; and here, 2005) are good assessments, I think.

      As discussed last night, there is also the Scimago database that allows self-ranking, as described here in a Nature news story by Declan Butler. You mentioned last night that Scimago is one of a subset, so you might have more to write about it here.

    • The Scimago Journal Rank Indicator (SJR) was actually developed from Google PageRank. The formula is complicated, but just like PageRank a citation counts more if the citation is from a journal that receives many citations itself. The SJR was compared to Impact Factors in this FASEB Journal Paper.

    • There are three separate problems that need to be kept distinct.

      (1) Are any sort of publications metrics suitable for assessing people?
      (2) Are any sort of publications metrics suitable for assessing institutions?
      (3) How accurately can each sort of metric be measured?

      There is little point in discussing (3) unless the answer to (1) or (2) is yes.

      It is very easy to see that the answer to (1) is NO, simply by applying the proposed measure to someone who commands universal respect in your own field. The examples I gave here (pdf version) show that citation counting is barely less daft than using impact factors, something that even the wretched Eugene Garfield disavowed. (See also Colquhoun D (2003). Challenging the tyranny of impact factors. Nature 423, 479).

      I don’t think it helps very much to point out that “Most Nobel laureates have an h index somewhere in the 80s”. That is being wise after the event. What we need is a way to spot the future Nobel prizewinners when they are still unknown. That, no doubt, is impossible, but it is very important to have mechanisms that avoid firing the future stars because they have not got big enough grants or enough citations. That risk is by no means negligible, as my examples show.

      The second question, are any sort of publications metrics suitable for assessing institutions?, is perhaps more difficult. The argument against using methods like that is partly their undemonstrated worth, but also the distortion of science that their imposition will undoubtedly produce. The pressure to produce cheap headline-grabbing work will be enormous. The long-term reputation of UK science will surely be damaged by this sort of bean-counting approach.

      It was put rather well in a recent ZDnet newsletter:

      “Creation is sloppy; discovery is messy; exploration is dangerous. What’s a manager to do? The answer in general is to encourage curiosity and accept failure. Lots of failure.”

      The Washington Post recently reported (apropos of the Howard Hughes Institute)
      “Today’s medicine is the beneficiary of scientific inquiry that took place decades ago,” “Our goal in funding the basic biomedical sciences is to lay the groundwork for the medical discoveries that will take place 20, 30, 40 years from now.”

      The UK seems now to be heading in exactly the opposite direction, and that does not augur well for the UK’s performance “20, 30, 40 years from now”.

      Finally, I have a question. Now that we are in an era of full economic costing, why is HEFCE assessing research at all? It is surely not their job now. They are, after all, the same folks who seem to be happy to fund BSc degrees that teach that amethysts emit high yin energy. That does not give one much confidence in their ability to judge science.

    • @Maxine,

      Here are some other people who are working on this kind of thing:

      Eigenfactor.org

      The Mesur project as linked to in the opening post,

      The publication harvester

      I have some links to these and other interesting sites on my connotea account http://www.connotea.org/user/IanMulvany/tag/citation

      But the general point I was making last night is that even when you consider the Page Rank algorithm it is only one way of looking at a property of a graph. Other very interesting items to look at, which have so far not really made much headway into the popular consciousness of people who are interested in the spread of influence in science, could be:

      - clustering coefficient
      - betweenness centrality

      And plenty of others that escape me right now.
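
      For readers who have not met the two measures listed above, here is a rough sketch of what each computes on a small undirected graph. The function names and the toy graph are invented for illustration. The betweenness routine follows the standard breadth-first approach for unweighted graphs (usually credited to Brandes) and is not tuned for large citation networks.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(graph, node):
    """Fraction of pairs of a node's neighbours that are themselves linked.

    graph: dict mapping node -> set of neighbours (undirected, symmetric).
    """
    neighbours = graph[node] - {node}
    k = len(neighbours)
    if k < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2.0 * linked / (k * (k - 1))


def betweenness_centrality(graph):
    """For each node, sum over pairs of other nodes of the fraction of
    shortest paths between the pair that pass through the node."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search, counting shortest paths from the source.
        order = []
        preds = {v: [] for v in graph}
        paths = {v: 0 for v in graph}
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in graph}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both ends in an undirected graph.
    return {v: s / 2.0 for v, s in score.items()}


# Toy graph: a triangle A-B-C, with D hanging off C.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(clustering_coefficient(toy, "A"))
print(betweenness_centrality(toy))
```

      In the toy graph, A sits in a tightly knit cluster (both of its neighbours know each other), while C is the only bridge to D and so carries all of the betweenness.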

      I’m going to be heading to this meeting http://www.ifr.ac.uk/netsci08/ later this year and hope to see some interesting results.

      And you should all check out this essay:

      Taraborelli, D. (2008)
      Soft peer review. Social software and distributed scientific evaluation

      it’s on this page: http://nitens.org/taraborelli/papers

      @David,

      I grant your points, and I mentioned in the opening of this forum that these are all very important points, but that I was going to restrict my discussions in this forum topic to the mechanics of metrics. It is a fact that they are being used. I’d prefer to keep this forum to this more mechanical discussion. I invite you to open another forum topic to discuss the issue of whether they should be used at all. I think that that is a much more important topic than the issue I’d like this forum to address (of course if the discussion does head away from that point then I will gracefully fold my hand and jump in here). (I’d start one myself but I’m about to run out of battery, eek!)

    • I’ve made David’s points into a separate forum discussion, Are publication metrics appropriate for assessing people and/or institutes?

    • FWIW, I uploaded a bunch of references to Connotea. I have even more, for which Web of Science couldn’t provide a suitable reference (some had the wrong DOI!). I haven’t gone through them yet, but someone might find them useful.

      I tried to import citations to Hirsch’s h-index paper, but Web of Science thought only two of them had DOIs. Grrrr.

    • Thanks, Bob. I also have a bunch of references on Connotea, see links list on left-side vertical column of Nautilus, using keyword citation. I haven’t added to these for a while but I will now add the keyword “impact factor” as well so that they merge with yours.

      Thanks again. Anyone is welcome to add to the citation and impact factor Connotea libraries; and at Nautilus there are links to some more libraries I’ve created on authorship-related subjects for anyone to use. (I must get to updating them!)
