
Numba in the real world

Numba is a just-in-time compiler (JIT) for Python code focused on NumPy arrays and scientific Python. I've seen various tutorials around the web and in conferences, but I have yet to see someone use Numba "in the wild". In the past few months, I've been using Numba in my own code, and I recently released my first real package using Numba, skan. The short version is that Numba is amazing and you should strongly consider it to speed up your scientific Python bottlenecks. Read on for the longer version.

Read more…

Trump's win

Like many of you, I watched in horror two days ago as the night unfolded, and the unthinkable slowly came to pass. After a Netflix binge to try to numb the fear, I dived into a clickhole of social media posts and news articles to try to make sense of what had happened. I hope that writing a synthesis of that will let me get on with my life in this brave new world.

I am deeply, depressingly pessimistic about the future of the planet under Trump. Let's take the very best, most ludicrously optimistic scenario: that Trump swings to the center and doesn't make good on his many horrid promises. Even then, his election, and the Republicans' victory in the House and Senate, represent game over in the fight to avoid climate change.

Read more…

Review: Sony Digital Paper

Three years ago I excitedly posted about Sony's then-new writeable e-paper tablet, called Sony Digital Paper System (DPS-1).

Now it is finally mine and I love it.

Here's what I wrote about it when Sony announced it:

the iPad (et al) sucks for some things. Three of those are: (1) taking handwritten notes, (2) reading (some) pdfs in full-page view, and (3) reading in full daylight. By the sound of it, Sony’s new tablet will excel at all three

Having had it for about a month, I can confidently say that it does indeed excel at those three things, beyond my wildest dreams. Even with the improved competition of the identically-priced iPad Pro (which can now handle points 1 and 2 with aplomb), I still prefer the Sony. Here's why:

Read more…

Why scientists should code in the open

All too often, I encounter published papers in which the code is "available upon request", or "available in the supplementary materials" (as a zip file). This is not just poor form. It also hurts your software's future. (And, in my opinion, when results depend on software, it is inexcusable.)

Given the numerous options for posting code online, there's just no excuse to give code in a less-than-convenient format upon publication. When you publish, put your code on GitHub or Bitbucket.

In this piece, I'll go even further: put your code there from the beginning. Put your code there as soon as you finish reading this article. Here's why:

Read more…

The cost of a Python function call

I've read in various places that the Python function call overhead is very high. As I was parroting this "fact" to Ed Schofield recently, he asked me what the cost of a function call actually was. I had no idea. This prompted us to do a few quick benchmarks.

The short version is that it takes about 150ns to call a function in Python (on my laptop). This doesn't sound like a lot, but it means that you can make at most 6.7 million calls per second, two to three orders of magnitude slower than your processor's clock speed.

If you want your function to do something, such as, oh, I don't know, receive an input argument, this goes up to 350ns, throttling you at 2.8 million calls per second.
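If you want to check numbers like these on your own machine, a minimal sketch using the standard library's timeit module might look like the following. The names `empty`, `one_arg`, and `ns_per_call` are made up for illustration, and this is not necessarily how the benchmarks above were run. Expect different figures on different hardware and Python versions.

```python
import timeit


def empty():
    """A function that does nothing and takes no arguments."""
    pass


def one_arg(x):
    """A function that does nothing but receives one argument."""
    pass


def ns_per_call(stmt, namespace, number=200_000, repeat=5):
    """Best-of-`repeat` time for one execution of `stmt`, in nanoseconds.

    Taking the minimum reduces noise from other processes.
    """
    times = timeit.repeat(stmt, globals=namespace, number=number, repeat=repeat)
    return min(times) / number * 1e9


namespace = {"empty": empty, "one_arg": one_arg}
no_arg = ns_per_call("empty()", namespace)
with_arg = ns_per_call("one_arg(1)", namespace)

# Calls per second is just the reciprocal of the time per call.
for label, ns in [("no argument", no_arg), ("one argument", with_arg)]:
    print(f"{label}: {ns:.0f} ns per call, {1e9 / ns / 1e6:.1f} million calls/s")
```

Note that the measured time includes a small amount of loop overhead from timeit itself, so treat the result as an upper bound on the cost of the call.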

Read more…

My first use of Python 3's `yield from`!

I never really understood why `yield from` was useful. Last weekend, I wanted to use Python 3.5's new `os.scandir` to explore a directory (and its subdirectories). Tragically, `os.scandir` is not recursive, and I find `os.walk`'s 3-tuple values obnoxious. Lo and behold, while I was trying to implement a recursive version of `scandir`, a `yield from` use just popped right out!

import os
def rscandir(path):
    for entry in os.scandir(path):
        yield entry
        if entry.is_dir():
            yield from rscandir(entry.path)

That's it! I have to admit that reads wonderfully. The Legacy Python (aka Python 2.x) alternative is quite a bit uglier:

import os
def rscandir(path):
    for name in os.listdir(path):
        # os.listdir returns bare names, so join them onto the parent path
        p = os.path.join(path, name)
        yield p
        if os.path.isdir(p):
            for q in rscandir(p):
                yield q

Yuck. So, yet again: time to move away from Legacy Python! ;)
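To see the `yield from` version in action, here is a small sketch that builds a throwaway directory tree and walks it. The tree layout and file names (`a`, `b`, `top.txt`, and so on) are invented purely for the demonstration.

```python
import os
import tempfile


def rscandir(path):
    """Recursively yield os.DirEntry objects for everything under `path`."""
    for entry in os.scandir(path):
        yield entry
        if entry.is_dir():
            # Delegate to the recursive generator instead of looping over it.
            yield from rscandir(entry.path)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: root/top.txt, root/a/mid.txt, root/a/b/deep.txt
    os.makedirs(os.path.join(root, "a", "b"))
    for rel in ("top.txt",
                os.path.join("a", "mid.txt"),
                os.path.join("a", "b", "deep.txt")):
        open(os.path.join(root, rel), "w").close()

    # Collect paths relative to the root; sort because scandir order is arbitrary.
    found = sorted(os.path.relpath(entry.path, root) for entry in rscandir(root))

print(found)
```

Both directories and files show up in the result, since each entry is yielded before the function descends into it.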