
Continuous integration in Python, 4: set up Travis-CI

This is the fourth post in my Continuous integration (CI) in Python series, and the one that puts the "continuous" in CI! For the first three posts, see 1: automated tests with pytest, 2: measuring test coverage, and 3: set up your test configuration files.

Introduction to Travis-CI

Once you've set up your tests locally, it does you no good if you don't remember to run them! Travis-CI makes this seamless, because it will check out your code and run your tests for each and every commit you push to GitHub! (This is even more important when you are receiving pull requests on GitHub: the tests will be run online, without you having to individually check out each PR and run the tests on your machine!)

This is what continuous integration is all about. Once upon a time, the common practice was to pile new features onto a codebase, and then, come release time, declare a feature freeze and spend some time cleaning up code and removing bugs. With continuous integration, instead, no new feature is allowed into the codebase until it is bug-free, as demonstrated by the test suite.

What to do

You need to first add a .travis.yml file to the root of your project. This tells Travis how to install your program's dependencies, install your program, and run the tests. Here's an example file to run tests and coverage on our maths.py sample project:

[code lang=text]
language: python
python:
    - "2.7"
    - "3.4"
before_install:
    - pip install pytest pytest-cov
script:
    - py.test
[/code]

Pretty simple: tell Travis the language of your project, which Python versions to test on (you can specify multiple versions, one per line, to test on all of them!), how to install the test dependencies, and, finally, the command to run your tests. (This can be anything, not just py.test or another testing framework, as long as a shell exit status of 0 means "success" and anything else means "failure".)

You can read more about the syntax and options for your .travis.yml in the Travis documentation. There are other sections you can add, such as "virtualenv" to set up Python virtual environments, "install" to add compilation and other installation steps for your library, before testing, and "after_success" to enable e.g. custom notifications. (We will use this last one in the next post.)
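To give you a flavour, here's a sketch of a slightly fuller file using those sections. (Assumptions on my part: your project has a standard setup.py, so pip install . works, and the coveralls line is a teaser for the next post.)

[code lang=text]
language: python
python:
    - "2.7"
    - "3.4"
before_install:
    - pip install pytest pytest-cov coveralls
install:
    - pip install .
script:
    - py.test
after_success:
    - coveralls
[/code]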

Once you have created .travis.yml for your project, you need to turn it on for your repo. This is, currently, a fairly painful process, and I can't wait for the day that it's enabled by default on GitHub. [Update: see below] In the meantime though:

[Update 2014-10-28: Thanks to @hugovk for pointing out that the first four points above can be skipped. It turns out that when you first log in to Travis-CI using your GitHub account, you give them write access to your webhooks. So, when you add a repo from their end, they go ahead and add themselves on the GitHub end! Boom. Way easier.]

https://twitter.com/hugovk/status/526798595442618368

Voilà! Every push and pull-request to your repository will trigger a job on Travis's servers, building your dependencies and your software, running your tests, and emailing you if anything went wrong! Amazingly, this is completely free for open source projects, so you really have no excuse for not using it!

Travis build page

Follow this blog to learn how to continuously check test coverage using Coveralls, coming in the next post!

Update: Volume 5: turn on Coveralls

Continuous integration in Python, 3: set up your test configuration files

This is the third post in my series on continuous integration (CI) in Python. For the first two posts, see 1: automated tests with pytest, and 2: measuring test coverage.

By now, you've got a bunch of tests and doctests set up in your project, which you run with the command:

[code lang=text]
py.test --doctest-modules --cov . --cov-report term-missing
[/code]

That's a lot of typing for something you run many times a day. Fortunately, both pytest and coverage can read their default options from configuration files, which use the simple INI format (http://en.wikipedia.org/wiki/INI_file): pytest options live in a setup.cfg file at the root of your project, while coverage options live in a .coveragerc file, where, for example, omit = *script.py keeps your scripts out of the coverage report.
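Concretely, the two files might look like this (a minimal sketch, using the options from the previous two posts). In setup.cfg:

[code lang=text]
[pytest]
addopts = --doctest-modules --cov . --cov-report term-missing
[/code]

and in .coveragerc:

[code lang=text]
[run]
omit = *script.py
[/code]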

Once you've done this, you can invoke your tests with a simple, undecorated call to just py.test.

To find out more about your configuration options, see the pytest basic test configuration page, and Ned Batchelder's excellent .coveragerc syntax guide.

That's it for this entry of my CI series. Follow this blog for the next two entries, setting up Travis CI and Coveralls.

Update: Volume 4: Set up Travis-CI

Continuous integration in Python, Volume 2: measuring test coverage

(Edit: I initially thought it would be cute to number from 0. But it turns out it becomes rather obnoxious to relate English (first, second, ...) to 0-indexing. So this was formerly volume 1. But everything else remains the same.)

This is the second post in a series about setting up continuous integration for a Python project from scratch. For the first post, see Automated tests with pytest.

When you've written only a handful of test cases for a tiny project, it's easy to see which parts of your code they exercise. For even moderately big projects, though, you will need tools that automatically check what parts of your code are actually tested. The proportion of lines of code that are run at least once during your tests is called your test coverage.

For the same reasons that testing is important, measuring coverage is important. Pytest can measure coverage for you with the coverage plugin, found in the pytest-cov package. Once you've installed the plugin, a test coverage measurement is just a command-line option away:

[code lang=text]
 ~/projects/maths $ py.test --doctest-modules --cov .
============================= test session starts ==============================
platform darwin -- Python 2.7.8 -- py-1.4.25 -- pytest-2.6.3
plugins: cov
collected 2 items 

maths.py .
test_maths.py .
--------------- coverage: platform darwin, python 2.7.8-final-0 ----------------
Name         Stmts   Miss  Cover
--------------------------------
maths            2      0   100%
test_maths       4      0   100%
--------------------------------
TOTAL            6      0   100%

=========================== 2 passed in 0.07 seconds ===========================
[/code]

(The --cov option takes a directory as input, which I find obnoxious, given that py.test so naturally defaults to the current directory. But it is what it is.)

Now, if I add a function without a test, I'll see my coverage drop:

[code lang=python]
def sqrt(x):
    """Return the square root of `x`."""
    return x * 0.5
[/code]

(The typo is intentional.)

[code lang=text]
--------------- coverage: platform darwin, python 2.7.8-final-0 ----------------
Name         Stmts   Miss  Cover
--------------------------------
maths            4      1    75%
test_maths       4      0   100%
--------------------------------
TOTAL            8      1    88%
[/code]

With one more option, --cov-report term-missing, I can see which lines I haven't covered, so I can try to design tests specifically for that code:

[code lang=text]
--------------- coverage: platform darwin, python 2.7.8-final-0 ----------------
Name         Stmts   Miss  Cover   Missing
------------------------------------------
maths            4      1    75%   24
test_maths       4      0   100%   
------------------------------------------
TOTAL            8      1    88%   
[/code]

Do note that 100% coverage does not ensure correctness. For example, suppose I test my sqrt function like so:

[code lang=python]
def sqrt(x):
    """Return the square root of `x`.

    Examples
    --------
    >>> sqrt(4.0)
    2.0
    """
    return x * 0.5
[/code]

Even though my test is correct, and I now have 100% test coverage, I haven't detected my mistake. Oops!
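In this case, one more doctest example would have caught the bug (a hypothetical addition to the docstring above):

[code lang=python]
def sqrt(x):
    """Return the square root of `x`.

    Examples
    --------
    >>> sqrt(4.0)
    2.0
    >>> sqrt(9.0)
    3.0
    """
    return x * 0.5
[/code]

Now the second example fails, because sqrt(9.0) returns 4.5. Coverage tells you what code you've run, not whether you've tested it well.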

But, keeping that caveat in mind, full test coverage is a wonderful thing, and if you don't test something, you're guaranteed not to catch errors. Further, my example above is quite contrived, and in most situations full test coverage will spot most errors.

That's it for part 2. Tune in next time to learn how to turn on Travis continuous integration for your GitHub projects!

Update: Volume 3: set up config files.

Continuous integration in Python, Volume 1: automated tests with pytest

(Edit: I initially thought it would be cute to number from 0. But it turns out it becomes rather obnoxious to relate English (first, second, …) to 0-indexing. So this was formerly volume 0. But everything else remains the same.)

I just finished the process of setting up continuous integration from scratch for one of my projects, cellom2tif, a simple image file converter/liberator. I thought I would write a blog post about that process, but it has slowly mutated into a hefty document that I thought would work better as a series. I'll cover automated testing, test coverage, and how to get these to run automatically for your project with Travis-CI and Coveralls.

Without further ado, here goes the first post: how to set up automated testing for your Python project using pytest.

Automated tests, and why you need them

Software engineering is hard, and it's incredibly easy to mess things up. That's why you should write tests for all your functions: they ensure that nothing obviously stupid is going wrong. Tests can take a lot of different forms, but here's a really basic example. Suppose this is a function in your file, maths.py:
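(Any small function will do for illustration; here's a toy add function, with a doctest showing its expected behaviour:)

[code lang=python]
def add(x, y):
    """Return the sum of `x` and `y`.

    Examples
    --------
    >>> add(2, 2)
    4
    """
    return x + y
[/code]

Next to it, put a test file, test_maths.py, containing plain functions that exercise your code with assert:

[code lang=python]
from maths import add

def test_add():
    assert add(2, 2) == 4
    assert add(-1, 1) == 0
[/code]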

The test_ prefixes are pytest's naming convention, and following the conventions of your programming community is always a good idea (http://ilovesymposia.com/2014/01/09/best-practices-addendum-find-and-follow-the-conventions-of-your-programming-community/). Now run py.test, adding the --doctest-modules flag so that doctests are collected too:

[code lang=text]
~/projects/maths $ py.test --doctest-modules
============================= test session starts ==============================
platform darwin -- Python 2.7.8 -- py-1.4.20 -- pytest-2.5.2
collected 2 items

maths.py .
test_maths.py .

=========================== 2 passed in 0.06 seconds ===========================
[/code]

Test-driven development

That was easy! And yet most people, my past self included, neglect tests, thinking they'll do them eventually, when the software is ready. This is backwards. You've probably heard the phrase "Test-driven development (TDD)"; this is what they're talking about: writing your tests before you've written the functionality to pass them. It might initially seem like wasted effort, like you're not making progress in what you actually want to do, which is write your software. But it's not:

https://twitter.com/zspencer/status/514447236239859712

By spending a bit of extra effort to prevent bugs down the road, you will get to where you want to go faster.

That's it for volume 1! Watch out for the next post: ensuring your tests are thorough by measuring test coverage.

Update: Volume 2: measuring test coverage

Use pmset to restore your Mac's instant-wake after upgrading to Yosemite

I just upgraded to Mac OS X Yosemite Beta 3, and sure enough, Apple has once again broken my time-to-deep-sleep setting. After I upgraded, I noticed that my computer took a few seconds to wake from sleep, rather than the usual instant-on that defined OS X for so long. I checked pmset, Mac's power management command-line utility:
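[code lang=text]
pmset -g | grep standbydelay
[/code]

(pmset -g prints all active power settings; the grep is just my quick way of pulling out the one we care about.)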

Sure enough, the upgrade had reset my standbydelay. As I wrote when the same thing happened after the Mavericks upgrade (http://ilovesymposia.com/2013/11/05/speed-up-your-macs-wake-up-time-using-pmset-do-it-again-after-upgrading-to-mavericks/), the fix is a single command, setting the delay before deep sleep to a full day (86400 seconds):

[code lang=text]
sudo pmset -a standbydelay 86400
[/code]

Check in for a new post come next upgrade! (Mammoth?)

Read microscopy images to numpy arrays with python-bioformats

The python-bioformats library lets you seamlessly read microscopy images into numpy arrays from pretty much any file format.

I recently explained how to use Fiji's Jython interpreter to open BioFormats images, do some processing, and save the result to a standard format such as TIFF. Since then, the CellProfiler team has released an incredible tool, the python-bioformats library. It uses the Java Native Interface (JNI) to access the Java BioFormats library and give you back a numpy array within Python. In other words, it's magic.
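A minimal sketch of what that looks like (the filename here is a made-up example; load_image reads a single 2D plane at the given t and z):

[code lang=python]
import javabridge
import bioformats

# python-bioformats needs a running Java VM to call into the Java library.
javabridge.start_vm(class_path=bioformats.JARS)

# Read one plane from (almost) any microscopy format into a numpy array.
image = bioformats.load_image('embryo.lif', t=0, z=0)

javabridge.kill_vm()
[/code]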

Some of the stuff I was doing in the Jython interpreter was not going to fly for a 400GB image file produced by Leica (namely, setting the flag setOpenAllSeries(True)). This file contained 3D images of multiple zebrafish embryos, obtained every 30 minutes for three days. I needed to process each image sequentially.

The first problem was that even reading the metadata from the file resulted in a Java out-of-memory error! With the help of Lee Kamentsky, one of the creators of python-bioformats, I figured out that Java allocates a maximum memory footprint of just 256MB. With the raw metadata string occupying 27MB, this was not enough to contain the full structure of the parsed metadata tree. The solution was simply to set a much larger maximum memory allocation to the JVM:

(You can see the full code in my lifio wrapper module: https://github.com/jni/lesion/blob/c3223687d35a7f81da7305e1e041f9c5a53104b1/lesion/lifio.py#L79) The key is to pass a bigger heap size when starting the JVM, something like:

[code lang=python]
import javabridge
import bioformats

# The default 256MB heap chokes on the metadata alone; give the JVM room.
javabridge.start_vm(class_path=bioformats.JARS, max_heap_size='8G')
[/code]
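Once the JVM is happy, reading a chunk of the file and displaying a single plane with matplotlib is easy (a sketch: image stands in for the 5D TZYXC array returned by my wrapper):

[code lang=python]
from matplotlib import pyplot as plt, cm

# Show the first timepoint, first z-slice, first channel.
plt.imshow(image[0, 0, :, :, 0], cmap=cm.gray)
plt.show()
[/code]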

2D slice from 5D volume

Boom. Using Python BioFormats, I've read in a small(ish) part of a quasi-terabyte image file into a numpy array, ready for further processing.

Note: the dimension order here is time, z, y, x, channel, or TZYXC, which I think is the most efficient way to read these files in. My wrapper allows arbitrary dimension order, so it'll be good to use it to figure out the fastest way to iterate through the volume.

In my case, I'm looking to extract statistics using scikit-image's profile_line function, and plot their evolution over time. Here's the min/max intensity profile along the embryo for a sample stack:

min/max vs time

I still need to clean up the code, in particular to detect bad images (no prizes for guessing which timepoint was bad here), but my point for now is that, thanks to Python BioFormats, doing your entire bioimage analysis in Python just got a heck of a lot easier.

SciPy 2014: an extremely useful conference with a diversity problem

I just got back home from the SciPy 2014 conference in Austin, TX. Here are my thoughts after this year's conference.

About SciPy in general

Since my first SciPy in 2012, I've decided to prioritise programming conferences over scientific ones, because the value I get for my time is just that much higher. At a scientific conference, I certainly become aware of way-cool stuff going on in other research labs in my area. Once I get home, however, I go back to whatever I was doing. In contrast, at programming conferences, I become aware of new tools and practices that change the way I do science. In his keynote this year, Greg Wilson said of Software Carpentry, "We save researchers a day a week for the rest of their careers." I feel the same way about SciPy in general.

In the 2012 sprints, I learned about GitHub Pull Requests and code review, the lingua franca of open source development today. I can't express how useful that's been. I also started my ongoing collaboration with the scikit-image project, which has enabled my research to reach far more users than I ever could have achieved on my own.

No scientific conference I've been to has had such an impact on my science output, nor can I imagine one doing so.

This year's highlights

This year was no different. Without further ado, here are my top hits from this year's conference:

  • Michael Droettboom talked about his continuous benchmarking project, Airspeed Velocity. It is hilariously named and incredibly useful. asv checks out code from your Git repo at regular intervals and runs benchmarks (which you define), and plots your code's performance over time. It's an incredible guard against feature creep slowing down your code base.
  • IPython recently unveiled their modal version 2.0 interface, sending vimmers worldwide rejoicing. But a few key bindings are just wrong from a vim perspective. Most egregiously, i, which should enter edit mode, interrupts the kernel when pressed twice! That's just evil. Paul Ivanov goes all in with vim keybindings with his hilarious and life-changing IPython vimception talk. The title is more appropriate than I realised. Must-watch.
  • Damián Avila revealed (heh) his live IPython presentations with Reveal.js, forever changing how Python programmers present their work. I was actually aware of this before the conference and used it in my own talk, but you should definitely watch his talk and get the extension from his repo.
  • Min RK gave an excellent tutorial on IPython parallel (repo, videos 1, 2, 3). It's remarkable what the IPython team have achieved thanks to their decoupling of the interactive shell and the computation "kernel". (I still hate that word.)
  • Brian Granger and Jon Frederic gave an excellent tutorial on IPython interactive widgets (notebook here). They provide a simple and fast way to interactively explore your data. I've already started using these on my own problems.
  • Aaron Meurer gave the best introduction to the Python packaging problem that I've ever seen, and how Continuum's conda project solves it. I think we still need an in-depth tutorial on how package distributors should use conda, but for users, conda is just awesome, and this talk explains why.
  • Matt Rocklin has a gift for crystal clear speaking, despite his incredible speed, and it was on full display in his (and Mark Wiebe's) talk on Blaze, Continuum's new array abstraction library. I'm not sure I'll be using Blaze immediately but it's certainly on my radar now!
  • Lightning talks are always fun: days 1, 2, 3. Watch out for Fernando Pérez's announcement of Project Jupyter, the evolution of the IPython notebook, and for Damon McDougall's riveting history of waffles. (You think I'm joking.)

Apologies if I've missed anyone: with three tracks, an unofficial fourth one showing the World Cup matches ;), and my own talk preparations, "overwhelming" does not begin to describe the conference! I will second Aaron Meurer's assertion that there were no bad talks. Which brings us to...

On my to-watch

Jake Vanderplas recently wrote a series of blog posts (last one here, with links to earlier posts) comparing frequentist and Bayesian approaches to statistical analysis, in which he makes a compelling argument that we should all relegate frequentism to the dustbin of history. As such, I intend to go over Chris Fonnesbeck's tutorial (2, 3) and talk about Bayesian analysis in Python using PyMC.

David Sanders also did a Julia tutorial (part 2) that was scheduled at the same time as my own scikit-image tutorial, but I'm interested to see how the Julia ecosystem is progressing, so that should be a good place to start. (Although I'm still bitter that they went with 1-based indexing!)

The reproducible science tutorial (part 2) generated quite a bit of buzz so I would be interested to go over that one as well.

For those interested in computing education or in geoscience, the conference had dedicated tracks for each of those, so you are bound to find something you like (not least, Lorena Barba's and Greg Wilson's keynotes). Have a look at the full listing of videos here. These might be easier to navigate by looking at the conference schedule.

The SciPy experience

I want to close this post with a few reflections on the conference itself.

SciPy is broken up into three "stages": two days of tutorials, three days of talks, and two days of sprints. Above, I covered the tutorials and talks, but to me, the sprints are what truly distinguish SciPy. The spirit of collaboration is unparalleled, and an astonishing amount of value is generated in those two days, either in the shape of code, or in introducing newcomers to new projects and new ways to collaborate in programming.

My biggest regret of the conference was not giving a lightning talk urging people to come to the sprints. I repeatedly asked people whether they were coming to the sprints, and almost invariably the answer was that they didn't feel they were good enough to contribute. To reiterate my previous statements: (1) when I attended my first sprint in 2012, I had never done a pull request; (2) sprints are an excellent way to introduce newcomers to projects and to the pull request development model. All the buzz around the sprints was how welcoming all of the teams were, but I think there is a massive number of missed opportunities because this is not made obvious to attendees before the sprints.

Lastly, a few notes on diversity. During the conference, April Wright, a student in evolutionary biology at UT Austin, wrote a heartbreaking blog post about how excluded she felt from a conference where only 15% of attendees were women. That particular incident was joyfully resolved, with plenty of SciPyers reaching out to April and inviting her along to sprints and other events. But it highlighted just how poorly we are doing in terms of diversity. Andy Terrel, one of the conference organisers, pointed out that 15% is much better than 2012's three (women, not percent!), but (a) that is still extremely low, and (b) I was horrified to read this because I was there in 2012... And I did not notice that anything was wrong. How can it be, in 2012, that it can seem normal to be at a professional conference and have effectively zero women around? It doesn't matter what one says about the background percentage of women in our industry and so on... Maybe SciPy is doing all it can about diversity. (Though I doubt it.) The point is that a scene like that should feel like one of those deserted cityscapes in post-apocalyptic movies. As long as it doesn't, as long as SciPy feels normal, we will continue to have diversity problems. I hope my fellow SciPyers look at these numbers, feel appalled as I have, and try to improve.

... And on cue, while I was writing this post, Andy Terrel wrote a great post of his own about this very topic: http://andy.terrel.us/blog/2014/07/17/

I still consider SciPy a fantastic conference. Jonathan Eisen (@phylogenomics), whom I admire, would undoubtedly boycott it because of the problems outlined above, but I am heartened that the organising committee is taking this as a serious problem and trying hard to fix it. I hope next time it is even better.

A clever use of SciPy's ndimage.generic_filter for n-dimensional image processing

This year I am privileged to be a mentor in the Google Summer of Code for the scikit-image project, as part of the Python Software Foundation organisation. Our student, Vighnesh Birodkar, recently came up with a clever use of SciPy’s ndimage.generic_filter that is certainly worth sharing widely.

Vighnesh is tasked with implementing region adjacency graphs and graph based methods for image segmentation. He initially wrote specific functions for 2D and 3D images, and I suggested that he should merge them: either with n-dimensional code, or, at the very least, by making 2D a special case of 3D. He chose the former, and produced extremely elegant code. Three nested for loops and a large number of neighbour computations were replaced by a function call and a simple loop. Read on to find out how.

Iterating over an array of unknown dimension is not trivial a priori, but thankfully, someone else has already solved that problem: NumPy’s nditer and ndindex functions allow one to efficiently iterate through every point of an n-dimensional array. However, that still leaves the problem of finding neighbours, to determine which regions are adjacent to each other. Again, this is not trivial to do in nD.

scipy.ndimage provides a suitable function, generic_filter. Typically, a filter is used to iterate a “selector” (called a structuring element) over an array, compute some function of all the values covered by the structuring element, and replace the central value by the output of the function. For example, using the structuring element:

[code lang=python]
import numpy as np

fp = np.array([[0, 1, 0],
               [1, 1, 1],
               [0, 1, 0]], np.uint8)
[/code]

and the function np.median on a 2D image produces a median filter over a pixel’s immediate neighbours. That is,

[code lang=python]
import functools
from scipy.ndimage import generic_filter

median_filter = functools.partial(generic_filter,
                                  function=np.median,
                                  footprint=fp)
[/code]
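Applying it is then a single call (the image below is just a stand-in for any 2D array):

[code lang=python]
image = np.random.random((512, 512))  # any 2D image will do
smoothed = median_filter(image)
[/code]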

Here, we don’t want to create an output array, but an output graph. What to do? As it turns out, the fact that Python passes objects by reference, so that a function can modify the objects it receives, allowed Vighnesh to do this quite easily using the “extra_arguments” keyword to generic_filter: we can write a filter function that receives the graph and updates it when two distinct values are adjacent! generic_filter passes all values covered by a structuring element as a flat array, in the array order of the structuring element. So Vighnesh wrote the following function:

[code lang=python]
def _add_edge_filter(values, g):
    """Add an edge between first element in `values` and
    all other elements of `values` in the graph `g`.
    `values[0]` is expected to be the central value of
    the footprint used.

    Parameters
    ----------
    values : array
        The array to process.
    g : RAG
        The graph to add edges in.

    Returns
    -------
    0.0 : float
        Always returns 0.

    """
    values = values.astype(int)
    current = values[0]
    for value in values[1:]:
        g.add_edge(current, value)
    return 0.0
[/code]

Then, using the footprint:

[code lang=python]
fp = np.array([[0, 0, 0],
               [0, 1, 1],
               [0, 1, 0]], np.uint8)
[/code]

(or its n-dimensional analog), this filter is called as follows on labels, the image containing the region labels:

[code lang=python]
from scipy.ndimage import filters

filters.generic_filter(labels,
                       function=_add_edge_filter,
                       footprint=fp,
                       mode='nearest',
                       extra_arguments=(g,))
[/code]

This is a rather unconventional use of generic_filter, which is normally used for its output array. Note how the return value of the filter function, _add_edge_filter, is just 0! In our case, the output array contains all 0s, but we use the filter exclusively for its side-effect: adding an edge to the graph g when there is more than one unique value in the footprint. That’s cool.

Continuing, in this first RAG implementation, Vighnesh wanted to segment according to average color, so he further needed to iterate over each pixel/voxel/hypervoxel and keep a running total of the color and the pixel count. He used elements in the graph node dictionary for this and updated them using ndindex:

[code lang=python]
for index in np.ndindex(labels.shape):
    current = labels[index]
    g.node[current]['pixel count'] += 1
    g.node[current]['total color'] += image[index]
[/code]
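From there, each region's mean colour is one more pass over the nodes (a sketch; the "mean color" key is my own naming, not necessarily the one in the PR):

[code lang=python]
for n in g:
    g.node[n]['mean color'] = (g.node[n]['total color'] /
                               g.node[n]['pixel count'])
[/code]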

Thus, together, numpy’s nditer, ndindex, and scipy.ndimage’s generic_filter provide a powerful way to perform a large variety of operations on n-dimensional arrays… Much larger than I’d realised!

You can see Vighnesh’s complete pull request here and follow his blog here.

If you use NumPy arrays and their massive bag of tricks, please cite the paper below!

Stéfan van der Walt, S. Chris Colbert, & Gaël Varoquaux (2011). The NumPy array: a structure for efficient numerical computation. Computing in Science & Engineering, 13(2), 22-30. arXiv: 1102.1523v1

Elsevier et al.'s pricing douchebaggery exposed

Ted Bergstrom and a few colleagues have just come out with an epic paper in which they reveal how much for-profit academic publishing companies charge university libraries, numbers that had previously been kept secret. The paper is ostensibly in the field of economics, but it could be more accurately described as "sticking-it-to-the-man-ology".

This paragraph in the footnotes was particularly concise and fun to read:

Elsevier contested our contract request from Washington State University on the grounds that their pricing policy was a trade secret, and brought suit against the university. The Superior Court judge ruled that Washington State University could release the contracts to us. Elsevier and Springer also contested our request for contracts from the University of Texas (UT) System. The Texas state attorney general opined that the UT System was required to release copies of all of these contracts.
In other words: in the interest of full disclosure, suck it, Elsevier!

The executive summary is that these companies will extort as much as they possibly can from individual universities, then do everything to keep that number a secret so that they can more freely extort even more from other universities. You can read more about it in Science.

Oh, and in the ultimate twist of irony, the paper itself is behind PNAS's paywall. How much did your university pay for that?

Bergstrom, T., Courant, P., McAfee, R., & Williams, M. (2014). Evaluating big deal journal bundles. Proceedings of the National Academy of Sciences. DOI: 10.1073/pnas.1403006111

An update on mixing Java and Python with Fiji

Two weeks ago I posted about invoking ImageJ functions from Python using Fiji’s Jython interpreter. A couple of updates on the topic:

First, I’ve made a repository with a template project encapsulating my tips from that post. It’s very simple to get a Fiji Jython script working from that template. As an example, here’s a script to evaluate segmentations using the metric used by the SNEMI3D segmentation challenge (a slightly modified version of the adapted Rand error).

Second, this entire discussion might be rendered obsolete by two incredible projects from the CellProfiler team: Python-Javabridge, which allows Python to interact seamlessly with Java code, and Python-Bioformats, which uses Python-Javabridge to read Bioformats images into Python. I have yet to play with them, but both look like cleaner alternatives to interact with ImageJ than my Jython scripting! At some point I’ll write a post exploring these tools, but if you get to it before me, please mention it in the comments!