• FnL

    • Pubmed Faceoff

      Monday, 09 Jun 2008 - 15:46 UTC

      I find the science of face perception fascinating. The human brain is highly tuned to identify, process and interpret faces – understandable, as they play a tremendously important role in our social interactions. It’s a hardwired proficiency that kicks in early and, if anything, works too well (Toast. eBay. $28k. Say no more).

      Chernoff Faces are a visualization technique developed in the 70s to take advantage of our innate ability to detect small differences in the size, shape and expressions of human faces. The idea is to take a dataset and then map each dimension to a different facial feature, be it the slant of the eyebrows, size of the nose or the chubbiness of cheek (Herman Chernoff, who came up with the idea, suggested ten different possibilities).
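
      The mapping described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Chernoff's original formulation: the feature names, their ranges and the example data dimensions are all made up for the sketch.

```python
# Hypothetical facial features and the range each one can take.
# These names and numbers are assumptions made for illustration.
FEATURE_RANGES = {
    "eyebrow_slant": (-30.0, 30.0),  # degrees
    "nose_size": (0.5, 1.5),         # relative scale
    "cheek_fullness": (0.0, 1.0),    # 0 = gaunt, 1 = chubby
}


def normalise(value, lo, hi):
    """Scale value from [lo, hi] to [0, 1], clamping out-of-range inputs."""
    if hi == lo:
        return 0.5
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def to_face(row, data_ranges, mapping):
    """Map each data dimension onto the facial feature assigned to it."""
    face = {}
    for dimension, feature in mapping.items():
        lo, hi = data_ranges[dimension]
        t = normalise(row[dimension], lo, hi)
        f_lo, f_hi = FEATURE_RANGES[feature]
        face[feature] = f_lo + t * (f_hi - f_lo)
    return face


# Example: three invented data dimensions, one facial feature each.
data_ranges = {"height_cm": (150, 200), "weight_kg": (50, 100), "age": (20, 80)}
mapping = {
    "height_cm": "nose_size",
    "weight_kg": "cheek_fullness",
    "age": "eyebrow_slant",
}
face = to_face({"height_cm": 175, "weight_kg": 100, "age": 20}, data_ranges, mapping)
print(face)
```

      Each row of the dataset becomes one dictionary of feature values, which a drawing routine (not shown) would then turn into a face.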

      It’s an appealing concept. Sadly Chernoff Faces never really took off, possibly because existing implementations don’t produce anything that looks like a face. You’d have a hard time finding anybody who prefers the faces produced by R to the data table they were derived from.

      Computer graphics have moved on a bit from 2D lines and circles, though. Photorealistic 3D facial models are de rigueur nowadays in everything from Second Life to video games. What if we took the technology from there and applied it to Chernoff Faces?

      I gave it a go. Check out Pubmed Faceoff (and be gentle – it hooks into other webservices and can be quite slow).

      Pubmed Faceoff is a mashup of PubMed, Carl Bergstrom’s Eigenfactor dataset and Scopus, inspired by something that Pierre Lindenbaum mentioned on Twitter. It renders PubMed results as a set of photorealistic Chernoff Faces whose facial features are determined by the age, citation count and journal impact factor associated with each paper. The idea is that you can tell at a glance which papers are new, exciting and high impact and which are languishing, uncited and unread.
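
      Roughly, each paper boils down to three numbers that pick out a face. The sketch below is not the site's actual code. It assumes each metric is bucketed into a small number of levels and that the three levels select one pre-rendered image (the comments below mention that the faces are pre-generated). The number of levels, the cut-off values, the log scaling of citations and the filename pattern are all hypothetical.

```python
import math
from datetime import date

LEVELS = 5  # hypothetical number of pre-rendered steps per feature


def bucket(value, lo, hi, levels=LEVELS):
    """Clamp value to [lo, hi] and return an integer level in 0..levels-1."""
    t = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    return min(levels - 1, int(t * levels))


def face_key(published, citations, impact, today):
    """Return (age_level, citation_level, impact_level) for one paper."""
    years_old = (today - published).days / 365.25
    age_level = bucket(years_old, 0, 20)
    # Citation counts are heavily skewed, so bucket on a log scale (assumption).
    citation_level = bucket(math.log1p(citations), 0, math.log1p(500))
    impact_level = bucket(impact, 0, 30)
    return age_level, citation_level, impact_level


def face_filename(key):
    """Build the name of the pre-rendered image for a key (made-up pattern)."""
    return "face_{}_{}_{}.jpg".format(*key)


key = face_key(date(2008, 1, 1), citations=0, impact=0.0, today=date(2008, 6, 9))
print(face_filename(key))
# With three features the full set of images is LEVELS ** 3 faces.
print(LEVELS ** 3)
```

      Bucketing keeps the number of distinct faces small enough to render ahead of time, at the cost of hiding small differences between papers.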

      I’m quite pleased with how the system turned out, although to be honest I still think the usefulness of Chernoff Faces is debatable. Does it actually work? Does adjusting to scanning the faces take longer than simply scanning a table of data? Or is it just cute?

      The gender and ethnicity of each face are picked at random to add a bit of visual interest, but personally I find it slightly easier to interpret the faces when they’re all male and European. That I’m rubbish at reading women comes as no surprise, but the ethnicity thing is interesting, as it fits with research into cross-race facial recognition suggesting that we’re each better at recognizing the types of faces we see every day.

      While the photorealism helps, with Chernoff Faces it’s important to map dimensions to the right features to aid comprehension. It definitely helps that it’s a short logical leap from ‘happy faces’ to ‘happy papers’ (those in good journals that have been cited lots). Mapping the age of the face to the age of the paper is also a no-brainer.

      It’d be interesting to incorporate other dimensions into the faces, though. Perhaps the number of authors of a paper could determine how fat or thin a face is? A spotty complexion could indicate a first time author? Nature papers could be represented by Chuck Norris?

      Update: for more on the ‘sort by impact’ idea, have a look at the commentary surrounding Pierre’s original tweet.

      Last updated: Monday, 09 Jun 2008 - 15:46 UTC

      • Comments

        • Date:
          Monday, 09 Jun 2008 - 16:11 UTC
          Matt Brown said:

          Awesome post, Euan! A couple more ideas…

          Bigness of head proportional to number of self-citations.

          Greyness of hair depends on time between submission and publication.

        • Date:
          Monday, 09 Jun 2008 - 16:45 UTC
          Euan Adie said:

          As Ian suggested the real possibilities come once they’re animated. And can speak with Cockney accents.

        • Date:
          Monday, 09 Jun 2008 - 17:49 UTC
          Graham Steel said:

          Excellent excellent mashup. A bit more mashing and…….bingo

        • Date:
          Monday, 09 Jun 2008 - 17:52 UTC
          Sabine Hossenfelder said:

          Nobody wants to publish my paper!

          ;-p

        • Date:
          Monday, 09 Jun 2008 - 18:56 UTC
          Richard Akerman said:

          I will never see impact ranking in quite the same way. Neat work.

        • Date:
          Monday, 09 Jun 2008 - 19:53 UTC
          Pedro Beltrao said:

          Amazing as usual ;)

        • Date:
          Monday, 09 Jun 2008 - 21:40 UTC
          Henry Gee said:

          He’s got his father’s ears.

        • Date:
          Monday, 09 Jun 2008 - 22:25 UTC
          Scott Keir said:

          That’s very very cute. I wondered why you hadn’t been blogging in a while.

          I wonder how wise it is to use colour randomly though – we (well, those of us not colourblind) attach great importance to colour, not least it being one of the most obvious things for us to tell apart. So does the random colour thing help or hinder?

        • Date:
          Monday, 09 Jun 2008 - 22:36 UTC
          Scott Keir said:

          Oh, and I love Chuck Morris. Is it an interesting paper?

        • Date:
          Monday, 09 Jun 2008 - 22:38 UTC
          Euan Adie said:

          I think it probably hinders. But it’s prettier!

          As skin tone and masculinity / femininity are two things that you immediately pick up on when you see a face it’d be good to be able to use them as features. I’m not sure I want to be the one who decides which gender and ethnicity gets to represent papers that don’t get cited though.

        • Date:
          Monday, 09 Jun 2008 - 22:46 UTC
          Euan Adie said:

          Also: Chuck Norris destroyed the periodic table, because Chuck Norris only recognizes the element of surprise.

        • Date:
          Tuesday, 10 Jun 2008 - 05:13 UTC
          Pedro Beltrao said:

          Now that I have played with it a little, I agree that it is easier to see the differences when setting every face to the same skin tone and gender. I am curious about how you generate the faces.

        • Date:
          Tuesday, 10 Jun 2008 - 15:07 UTC
          Frank Norman said:

          I found the darker skin tones made it harder to see the features. To be honest, I think cartoon faces would make it a whole lot clearer.

        • Date:
          Tuesday, 10 Jun 2008 - 15:19 UTC
          Euan Adie said:

          @pedro They’re pre-generated. I scripted Facegen using AutoIt to render all the different possibilities of face, then left the PC on overnight.

          The problem with the pre-generated approach is that adding a new feature increases the, uh, face space exponentially. There’s an SDK that could do it dynamically but it costs $$$.

        • Date:
          Tuesday, 10 Jun 2008 - 23:37 UTC
          Scott Keir said:

          I vote for Matt spending all of NN’s pocket money on commissioning you to do a version of this as avatars for NN’s bloggers – with ethnicity and sex based on each blogger’s details, and the remaining features based on linkages, technorati authority, age of last post, etc.

        • Date:
          Wednesday, 11 Jun 2008 - 15:35 UTC
          Eva Amsen said:

          My paper’s face is so scary and ugly! It’s the “good journal” lady with the “few citations” desperate expression. (Also, it says I have 2 citations, but Google Scholar says I have 3. I’m grasping at straws here, but if a paper apparently sucks as much as mine, it matters!)

        • Date:
          Wednesday, 11 Jun 2008 - 15:40 UTC
          Eva Amsen said:

          Oh, it looks a lot better if I change gender and ethnicity. He just looks a bit flabbergasted now, and I instantly feel better about my work!

        • Date:
          Wednesday, 11 Jun 2008 - 15:55 UTC
          Euan Adie said:

          Yeah, though a cool service Scopus sometimes lags behind on citations as it’s (allegedly) powered by thousands of Vietnamese street orphan curators. They can only read so many papers per day… so if you were cited recently the database may not be up to date yet.

        • Date:
          Thursday, 24 Jul 2008 - 14:38 UTC
          deathraypizza software said:

          If you have a Mac you should try PubSearch, a new, fast, efficient tool for searching PubMed: http://www.deathraypizza.com

          It’s much faster and easier to use than the pubmed web interface

