My brief excursion into the open source software world this week has a phrase ringing around in my head.
It’s from Bryan Kirschner at Microsoft (he has the once-unthinkable title of “Director of Open Source at Microsoft”). While we were talking over dinner he came back repeatedly to the idea that if you’re not going to ship code, you should share the code.
This is an idea that could really benefit the science community. So much work gets left behind on the laboratory equivalent of the cutting room floor that the adoption of this piece of open source philosophy would be welcome.
But, as I get tired of saying, science is a lot more complicated. It takes some work to make the stuff on the cutting room floor useful for other people, whether it’s data, or lab protocols, or DNA vectors. Some of that work becomes part of the lab’s institutional memory and finds its way into other projects at other times. Ship it or share it is going to have a hard row to hoe before it becomes a widely accepted policy.
I would however love to see this become a piece of open notebook science. I’m not sure that the hot stuff is ever going to be in open notebooks, but it’s a good place to do quick and dirty micropublication of otherwise lost information.
The issue of how to cite and what citations mean in such an environment is an interesting one, however – you don’t get credit for musing about science, you get credit for proving stuff. We need to have more ways to measure the genealogy of ideas than simple systems based on antique systems of citation, too.
There is Nature Precedings!
I’ve used my blog a couple of times to put up ideas that aren’t big enough to make into a full paper. I’ve had people wanting to cite what I’ve written, and I agree that giving credit could be a problem. But then if I was worried about someone stealing my idea without credit, I wouldn’t blog it in the first place.
On code specifically, I do a lot of statistical analyses, and this often involves writing code in R or BUGS, so I’m moving more towards including my code in supplements to papers. Last week I was in Dublin for a stats meeting, and they had a “meet the editors” session. One problem I can see in the future is that there won’t be standards for depositing code (it’s a few hundred lines at the most, and usually much shorter), so it will be a pain to sort through. Only one journal was looking at standardising formats, which I thought was a bit disappointing.
Should there be a more formal system of micro-publication? Essentially, micro-papers would need a doi and I guess would need some level of refereeing, but if they were short and quick to review it might not be too much of a burden.
Absolutely right, John. And amazing coming from Microsoft. Incidentally, I presume from your photo that John Wilbanks is really a pseudonym for Bill Gates?
@Bob there are standard formats and places for dropping code (sourceforge, googlecode etc) but they aren’t widely used by scientists (although this is changing). I think the problem is assuming that it should somehow be ‘in the paper’. PDBs and genbank files are citeable, and don’t have doi’s, so at the end of the day, why can’t you just cite a blog post? Having just been at a meeting which had a significant focus on archiving I can see the problems but in many ways blogs may be more stable than some online journals are – and as Craig says you can always drop a copy in Nature Precedings or use Gunther Eysenbach’s WebCite service if you’re worried about it.
@Brian – I hadn’t really twigged before but yes that photo does make John look a lot like a young Bill Gates. I can attest, however, to the fact that John does really exist and doesn’t really look that much like BG in real life. Shame really – great potential for jokes in that.
photo changed. gaaaaaahhhhhhhhh!
LOL