The SciPy ecosystem belongs to everyone

I use Twitter favourites almost exclusively to mark posts that I know will be useful in some not-too-distant future; kind of like a Twitter Evernote. Recently I was looking through my list in search of this excellent blog post detailing how to build cross-platform binary distributions for conda.

I came across two other tweets from the EuroSciPy 2014 conference: this one by Ian Ozsvald about his IPython memory usage profiler, right next to this one by Alexandre Chabot about Aaron O'Leary's notedown. I'd forgotten that this was how I came across these two tools, but since then I have contributed code to both (1, 2). I'd met Ian at EuroSciPy 2013, but I've never met Aaron, and yet there is my code in the latest version of his notedown library.

Read more…

Calling out SciPy on diversity (even though it hurts)

Over the past few weeks, I've been heavily promoting the SciPy conference, a meeting about scientific programming in Python. I've been telling everyone who would listen that they should submit a talk abstract and go, because scientific programming is increasingly common in any scientist's work and SciPy massively improves how you do that.

I have also been guiltily omitting that the speaker and attendee diversity at SciPy is shockingly bad. Last year, for example, 15% of attendees were women, and that was an improvement over the ratio three years ago, when just 3% (!!!) were women.

Read more…

Go to SciPy 2015

SciPy is my favourite conference. My goal with this post is to convince someone to go who hasn't had that chance yet.

Photo by Ian Rees (from the SciPy 2012 conference website)

Why SciPy?

Most scientists go to conferences in their own field: neuroscientists go to the monstrous Society for Neuroscience (SfN); bioinformaticians go to RECOMB, ISMB, or PSB; and so on.

People go to these to keep up with the latest advances in their field, and often, to do a bit of networking.

SciPy is a different kind of conference. It changes the way you do science. You learn about the latest free and open source software to help you with your work. You learn to write functions and interfaces instead of scripts, and to write tests so you don't break your code. You also learn to contribute these to bigger projects, maximising the reach and impact of your work (see "sprints", below).

Read more…

Experiences porting a medium-sized library from Python 2 to 3

Update: Much of the information in this post is outdated (especially the part about Python 3 being slower: Python 3.7 is the fastest version of Python ever created). Take everything you read here with a grain of salt.

Prompted in part by some discussions with Ed Schofield, creator of python-future.org, I've been going on a bit of a porting spree to Python 3. I just finished with my gala segmentation library. (Find it on GitHub and ReadTheDocs.) Overall, the process is nowhere near as onerous as you might think it is. Getting started really is the hardest part. If you have more than yourself as a user, you should definitely just get on with it and port.

The second hardest part is the testing. In particular, you will need to be careful with dictionary iteration, pickled objects, and file persistence in general. I'll go through these gotchas in more detail below.
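The excerpt names the problem areas without showing them, so here is a minimal sketch of how each commonly bites in Python 3. It is an illustration under assumptions, not necessarily the cases the full post covers: the `counts` dictionary is a made-up example, and the `encoding="latin1"` argument is a workaround that is often (not always) needed when loading pickles written by Python 2.

```python
import io
import pickle

# Hypothetical data, used only for illustration.
counts = {"a": 1, "b": 2, "c": 3}

# Gotcha 1: in Python 3, dict.keys() returns a view, not a list,
# so code that indexes into it raises TypeError.
keys_view = counts.keys()
try:
    keys_view[0]
    indexable = True
except TypeError:
    indexable = False

# Build a real list (sorted here, for a deterministic order) when one is needed.
keys_list = sorted(counts)

# Gotcha 2: deleting entries while iterating over the dict itself raises
# RuntimeError, so iterate over a snapshot of the keys instead.
for key in list(counts):
    if counts[key] < 2:
        del counts[key]

# Gotcha 3: pickles are bytes, so files must be opened in binary mode.
# An in-memory BytesIO stands in for a file opened with "wb" / "rb".
buffer = io.BytesIO()
pickle.dump(counts, buffer, protocol=2)  # protocol 2 is also readable by Python 2
buffer.seek(0)

# The encoding argument only matters for pickles produced by Python 2;
# it is harmless for a pickle written by Python 3, as here.
restored = pickle.load(buffer, encoding="latin1")
print(indexable, keys_list, restored)
```

Tests that round-trip data through a file, as the last step does, are a cheap way to catch persistence problems before users do.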

Read more…

Clarifications about our book, Elegant SciPy (and our call for code submissions)

Short version

Thank you to everyone who has already submitted, retweeted, and spread the word about our book, Elegant SciPy! We are still looking for code submissions meeting these criteria:

- Submissions must use NumPy, SciPy, or a closely related library in a non-trivial way.
- Submissions must be (re)licensed as BSD, MIT, public domain, or something similarly liberal. (This is easy if you are the author.)
- Code should be satisfying in some way, such as speed, conciseness, broad applicability...
- Preferably, nominate someone else's code that impressed you.
- Include a scientific application on real data.

Read more…

Call for code nominations for Elegant SciPy!

Update: See also the clarifications to this post, and submit code by creating an issue in our GitHub repo!

It's official! Harriet Dashnow, Stéfan van der Walt, and I will be writing an O'Reilly book about the SciPy library and the surrounding ecosystem. The book is called Elegant SciPy, and is intended to teach SciPy to fledgling Pythonistas, guided by the most elegant SciPy code examples we can find.

So, if you recently came across scientific Python code that made you go "Wow!" with its elegance, simplicity, cleverness, or power, please point us to it! As an example, have a look at Vighnesh Birodkar's code to build a region adjacency graph from an n-dimensional image, which I highlighted previously here.

Read more…

Some things I learned while building a site on GitHub Pages

Understatement: I'm not much of a web developer. However, we all have to become a little bit versed in web-dev if we want to publish things these days. GitHub Pages makes it really easy to publish a site (check out the official guide, and Thinkful's truly excellent interactive getting started guide).

If you just want to publish a static set of HTML files using absolute paths, you'll be fine. However, Pages uses Jekyll, a static site generator that transforms collections of Markdown, HTML, and other files into full-fledged websites. The process is definitely full of gotchas, though, and you'll run into issues for anything other than single pages. I'm making this list for my own future reference, and so that I can finally close the umpteen tabs I have open on the topic! But I hope someone else will find it useful.

Read more…

Continuous integration in Python, 7: some helper tools and final thoughts

It's time to draw my "continuous integration in Python" series to a close. This final post ties all six previous posts together and is the preferred write-up to share more widely and on which to provide feedback.

Almost everything I know about good Python development I've learned from Stéfan van der Walt, Tony Yu, and the rest of the scikit-image team. But a few weeks ago, I was trying to emulate the scikit-image CI process for my own project: cellom2tif, a tool to liberate images from a rather useless proprietary format. (I consider this parenthetical comment sufficient fanfare to announce the 0.2 release!) As I started copying and editing config files, I found that even from a complete template, getting started was not very straightforward. First, scikit-image has much more complicated requirements, so that a lot of the .travis.yml file was just noise for my purposes. And second, as detailed in the previous posts, a lot of the steps are not found or recorded anywhere in the repository, but rather must be navigated to on the webpages of GitHub, Travis, and Coveralls. I therefore decided to write this series as both a notetaking exercise and a guide for future CI novices. (Such as future me.)

Read more…

Continuous integration in Python, 6: Show off your work

We're finally ready to wrap up this topic. By now you can:

But, much as exercise is wasted if your bathroom scale doesn't automatically tweet about it, all this effort is for naught if visitors to your GitHub page can't see it!

Most high-profile open-source projects these days advertise their CI efforts. Above, I cheekily called this showing off, but it's truly important: anyone who lands on your GitHub page is a potential user or contributor, and if they see evidence that your codebase is stable and well-tested, they are more likely to stick around.

Read more…

Continuous integration in Python, 5: report test coverage using Coveralls

Travis runs whatever commands you tell it to run in your .travis.yml file. Normally, that's just installing your program and its requirements, and running your tests. If you wanted instead to launch some nuclear missiles, you could do that. (Assuming you were happy to put the launch keys in a public git repository... =P)

The Coveralls service, once again free for open-source repositories, takes advantage of this: you just need to install an extra piece of software from PyPI, and run it after your tests have passed. Do so by adding the line pip install coveralls to your before_install section, and just the coveralls command to a new after_success section:
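As a rough sketch, the two sections described above would look something like this in `.travis.yml`. Only the two commands named in the text are shown; the rest of the file (language, install, and test commands) is omitted and depends on your project.

```yaml
# Fragment of .travis.yml: only the Coveralls-related sections.
before_install:
  # Install the Coveralls client from PyPI before the build starts.
  - pip install coveralls
after_success:
  # Runs only if the tests passed; uploads the coverage report.
  - coveralls
```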

Read more…