perl vs. python

Eva Amsen

Thursday, 22 May 2008 21:55 UTC

My only bioinformatics experience is as a user. I’ve TA-ed a course for 4th-year Biochemistry students where they learn how to visualize proteins and make phylogenetic trees, sequence logos, alignments, etc., and have to understand how the programs work at a very basic level (e.g. what a Hidden Markov Model is).

I’d like to learn the other end of the field and want to teach myself a useful programming language. I’ve been leaning towards Python after several people’s input, but I know that Perl is more established in bioinformatics. Would I be able to be of any use if I knew only Python? Would I be behind the times if I knew only Perl? Is either one harder to pick up after knowing only the other?

If you had the choice to learn one of these languages from scratch, which would you pick?

(I feel the same way as when I had the choice between learning to ski or learning to snowboard.)

  • Replies

    • Hi Eva,
      I think that any language is a good language as long as your program is doing the right job.
      Python, Perl, R, PHP, Ruby, C, C++, Java or whatever are all good choices.

      However, I suggest you master a scripting language (Perl/Python) and a compiled one (C/C++/Java), especially for large projects.

      Hope it helps.

    • I agree with Pierre.

      Any language that helps you get the job done is good. Of equal importance is approach. As you know, scientific domains are very changeable: learning and using tools such as automation and testing (in whatever language) can help you retain flexibility and speed up the development process.

    • I agree with Pierre – what matters is that you can get the job done; and it’s good to know (at least) one scripted, one compiled.

      Libraries are another consideration – you can save yourself a lot of coding if other people have gone before you. Many of the major languages have “Bio*” libraries (BioPerl, BioPython, BioRuby, BioJava) – of which BioPerl is the oldest, largest and most established.

      I started with Perl ~ 8 years ago, largely due to BioPerl and Perl’s long pedigree in bioinformatics. If I were starting over today, I’d probably go with Python or Ruby.

    • Any language is NOT a good language to start with! Don’t start with a scripting language like Perl or Python. You will acquire bad programming habits that will be dangerous and difficult to get rid of. Learn Java. It is a modern GUI cross-platform language with a huge user base, and enforces good programming practices.

      I don’t buy the arguments about all that matters is getting the job done. It seems that you don’t have a particular job you want doing. What you want is to acquire the skills to do any job that emerges. As you know, there is already software available to do all the common bioinformatics tasks, so you’re not going to be doing that.

    • I do agree with Neil, the BioPerl libraries are far better established than the BioPython libraries. However, I’m not going to sit on the fence when responding to this one.

      Learn Python. If you are at the beginning stage, the learning curve is far less steep for Python, and you’ll make more progress in a shorter time. This particularly comes into play when reading other people’s code … Python is often inherently more readable. In the longer term, you’ll know a good, clean language which is far more manageable than Perl when scaled up to larger projects. Python is also emerging as a popular language for web applications, more so than Perl. Note that I’m biased, since I mostly only code in Python now, after learning some basic Perl and then deciding I could write far more comprehensible and scalable code in Python. In the end, both languages can do most tasks, just with different degrees of pain, so it might be worth trying out a few tutorials for both and seeing which one sticks.

      (Hey, and once you know both Perl and Python, you can port some of that good BioPerl stuff over to the BioPython project :) )

      I hope my comments won’t cause this thread to disintegrate into a religious war … I’m not really anti-Perl, just pro-Python.

    • As other people in the forum suggested, Perl has a long history in bioinformatics. But if you are a starting programmer, this is not a good time to start. Perl 6 is around the corner. You might want to wait for it to be released. On the other hand, you won’t get a job in bioinformatics unless you know Perl.

    • My earlier reply was cut short. In any case, it continues…

      If you look at the market share of languages in bioinformatics, it might go like this:

      1. Perl
      2. C and Java (Java may be slightly higher)
      3. Python and Ruby

      If you look at job descriptions, most of them require one scripting language (Perl/Python/Ruby) and one compiled language (C/C++/Java). Additionally, you will need a knowledge of SQL. Knowledge of R is a plus.

      In any programming job the skill to read C code is essential even if you never code in C. And there are miles of legacy code written in Perl in bioinformatics. So a cursory knowledge of Perl is also essential.

      Following that logic, and to minimize hard work, I suggest learning Perl first. That will give you a good hold on the field. Then, if you have time, learn Java. That will cover almost 70% of the requirements.

      Best of luck.

    • Thanks for all the comments!

      I didn’t mention, but I learned C in undergrad. It was eleven years ago, and I haven’t used it since, but I can probably pick that up again. I never took it very far though, but it would be much easier to pick up something I already did before, so that covers the compiled language.

      I’m not working on anything in particular, but I want to be able to play around with all the information that’s coming out of the various genome and proteome projects, and if that ends up piquing my interest to the point that I want to pursue something further (academically) I want to be able to do so, but it’s not necessarily my direct goal right now to be a full-time bioinformatician. So I guess in that case it doesn’t matter what I would be using myself, is that right?

      Sorry this is so vague. I’m in the middle of figuring out my career path in terms of research/bench/non-research/academic/non-academic work and this is just a tiny part of the puzzle.

    • As usual I arrived late to the discussion, and I think everyone above has already said what I would say. I agree with most of it, except for the claim that Perl 6 is “around the corner”. Anyway, I have two blog entries on the issue:

      http://blindscientist.genedrift.org/2007/09/12/bioinformatic-perl-or-python/
      http://blindscientist.genedrift.org/2007/09/16/apart-from-python-and-perl-in-bioinformatics/

      Cheers

    • Following up on David’s comment, I would suggest taking a class that covers not only syntax but also good programming practices and design (perhaps an object oriented design class?) for whichever language you choose. I’ve found that learning a new syntax is rather easy (esp. if you’ve mastered one), but unless you understand how to structure the code or project, you’re limited in the scale/scope of projects you can later build. Understanding good design/coding practices can also make it easier to dive into existing programs as you can then understand how/why the code is structured.

      Of course, this depends on what you want to do—obviously you don’t need to understand code structure/design to write a script that just parses a text file.
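
      For a sense of scale, a script of that kind can be about a dozen lines. Below is a minimal sketch in Python, assuming the text file is in FASTA format (a ">" header line followed by sequence lines). The format choice, the function name read_fasta, and the sample data are illustrative assumptions, not something taken from this thread.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if header is not None:
        yield header, "".join(chunks)


# Made-up sample data standing in for a real file on disk.
sample = StringIO(">seq1 example\nATGC\nGGTA\n>seq2\nTTAA\n")
records = list(read_fasta(sample))
for name, seq in records:
    print(name, len(seq))
```

      To read a real file, the same function could be given an open file object instead of the StringIO sample. For anything beyond a quick look, an established parser such as the one in the Bio* libraries mentioned above is probably the safer choice.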
