-
perl-for-reysenbach-lab
These are perl scripts I developed over many years as a Bioinformaticist for the Reysenbach Lab at PSU. The Reysenbach Lab studies microbial diversity in extreme environments. Lotta fasta utilities here if you are into that sort of thing.
-
faster-perl-for-reysenbach
Tracks the progress of making old Perl scripts faster and more maintainable. Working from Meneghin's perl-for-reysenbach-lab repository of bioinformatics scripts.
I saw Jennifer's post about rewriting her Perl scripts in Python and how she saw a 2.5 times speed improvement.
I have an interest in Perl and science, so it's time to roll up my sleeves and learn me some profiling/benchmarking. What follows is my internal monologue and the notes I scribbled down during the learning process. For those who want to follow along, I've created a small repo.
There are no more obvious or easy gains here. Further work is likely to yield small returns. Go outside, have a life, or at the very least consult the relevant chart.
Search "benchmarking tools for linux" and decide that hyperfine is good for what I'm doing. Run Jennifer's new Python script against my refactored Perl and find that the Python is 1.26 times faster for k=3 and 1.47 times faster for k=4. For the COVID-19 sequence, both runs take on the order of hundreds of milliseconds.
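For anyone wondering where an "N times faster" figure comes from, here is a rough Python sketch of the idea: run each command several times, average the wall-clock durations, and divide one mean by the other. This is not hyperfine itself, which also handles things like shell start-up correction and outlier warnings. The function names `time_command` and `times_faster` are made up for illustration, and the two commands are stand-ins, since the real script invocations aren't shown here.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5, warmup=1):
    """Run cmd `warmup` times untimed, then return `runs` wall-clock durations (seconds)."""
    for _ in range(warmup):
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def times_faster(slow_durations, fast_durations):
    """Ratio of mean runtimes: how many times faster the second command ran."""
    return statistics.mean(slow_durations) / statistics.mean(fast_durations)


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere (assumption: replace
    # these with the real Perl and Python script invocations).
    cmd_a = [sys.executable, "-c", "sum(range(200_000))"]
    cmd_b = [sys.executable, "-c", "sum(range(100_000))"]
    a = time_command(cmd_a)
    b = time_command(cmd_b)
    print(f"cmd_b ran {times_faster(a, b):.2f} times faster than cmd_a")
```

When the runs only take hundreds of milliseconds, interpreter start-up is a noticeable share of each measurement, so the warm-up runs and repeated timings matter more than they would for a long job.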