An update on mixing Java and Python with Fiji

Two weeks ago I posted about invoking ImageJ functions from Python using Fiji’s Jython interpreter. A couple of updates on the topic:

First, I’ve made a repository with a template project encapsulating my tips from that post. It’s very simple to get a Fiji Jython script working from that template. As an example, here’s a script to evaluate segmentations using the metric used by the SNEMI3D segmentation challenge (a slightly modified version of the adapted Rand error).

Second, this entire discussion might be rendered obsolete by two incredible projects from the CellProfiler team: Python-Javabridge, which allows Python to interact seamlessly with Java code, and Python-Bioformats, which uses Python-Javabridge to read Bioformats images into Python. I have yet to play with them, but both look like cleaner alternatives to interact with ImageJ than my Jython scripting! At some point I’ll write a post exploring these tools, but if you get to it before me, please mention it in the comments!

Get the best of both worlds with Fiji’s Jython interpreter

Fiji is just ImageJ, with batteries included. It contains plugins to do virtually anything you would want to do to an image. Since my go-to programming language is Python, my favorite feature of Fiji is its language-agnostic API, which supports a plethora of languages, including Java, JavaScript, Clojure, and of course Python; seven languages in all. (Find these under Plugins/Scripting/Script Editor.) Read on to learn more about the ins and outs of using Python to drive Fiji.

Among the plugin smorgasbord of Fiji is the Bio-Formats importer, which can open any proprietary microscopy file under the sun. (And there are a lot of them!) Below I will use Jython to open some .lifs, do some processing, and output some .pngs that I can process further using Python/NumPy/scikit-image. (A .lif is a Leica Image File, because there were not enough image file formats before Leica came along.)

The first thing to note is that Jython is not Python, and it is certainly not Python 2.7. In fact, the Fiji Jython interpreter implements Python 2.5, which means no argparse. Not to worry though, as argparse is implemented in a single, pure Python file distributed under the Python license. So:

Tip #1: copy argparse.py into your project.

This way you’ll have access to the state of the art in command line argument processing from within the Jython interpreter.

To get Fiji to run your code, you simply feed it your source file on the command line. So, let’s try it out with a simple example, echo.py:

import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description=
                                     "Parrot back your arguments.")
    parser.add_argument('args', nargs="*", help="The input arguments.")
    args = parser.parse_args()
    for arg in args.args:
        print(arg)

Now we can just run this:

$ fiji echo.py hello world
hello
world

But sadly, Fiji captures any -h calls, which defeats the purpose of using argparse in the first place!

$ fiji echo.py -h
Usage: /Applications/Fiji.app/Contents/MacOS/fiji-macosx [<Java options>.. --] [<ImageJ options>..] [<files>..]

Java options are passed to the Java Runtime, ImageJ
options to ImageJ (or Jython, JRuby, ...).

In addition, the following options are supported by ImageJ:
General options:
--help, -h
	show this help
--dry-run
	show the command line, but do not run anything
--debug
	verbose output

(… and so on, the output is quite huge.)

(Note also that I aliased the Fiji binary, that long path under /Applications, to a simple fiji command; I recommend you do the same.)

However, we can work around this by calling help using Python as the interpreter, and only using Fiji to actually run the file:

$ python echo.py -h
usage: echo.py [-h] [args [args ...]]

Parrot back your arguments.

positional arguments:
  args        The input arguments.

optional arguments:
  -h, --help  show this help message and exit

That’s more like it! Now we can start to build something a bit more interesting, for example, something that converts arbitrary image files to png:

import argparse
from ij import IJ # the IJ class has utility methods for many common tasks.

def convert_file(fn):
    """Convert the input file to png format.

    Parameters
    ----------
    fn : string
        The filename of the image to be converted.
    """
    imp = IJ.openImage(fn)
    # imp is the common name for an ImagePlus object,
    # ImageJ's base image class
    fnout = fn.rsplit('.', 1)[0] + '.png'
    IJ.saveAs(imp, 'png', fnout)

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="Convert images to PNG.")
    parser.add_argument('images', nargs='+', help="Input images.")

    args = parser.parse_args()
    for fn in args.images:
        convert_file(fn)

Boom, we’re done. But wait, we actually broke the Python interpreter compatibility, since ij is not a Python library!

$ python convert2png.py -h
Traceback (most recent call last):
  File "convert2png.py", line 2, in <module>
    from ij import IJ # the IJ class has utility methods for many common tasks.
ImportError: No module named ij

Which brings us to:

Tip #2: only import Java API functions within the functions that use them.

By moving the from ij import IJ statement into the convert_file function, we maintain compatibility with Python, and can continue to use argparse’s helpful documentation strings.
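As a minimal sketch of Tip #2, the script could be laid out as below. The helpers png_name, build_parser, and main are illustrative additions, not part of the original script; the ImageJ calls are the same two used above. Because the import only runs when convert_file() is called, plain Python can still import the file and print the help.

```python
import argparse


def png_name(fn):
    """Return the output filename: same stem, .png extension."""
    return fn.rsplit('.', 1)[0] + '.png'


def convert_file(fn):
    """Convert the input file to png format (only works under Fiji)."""
    # Deferred import: plain Python never reaches this line unless
    # convert_file() is actually called.
    from ij import IJ
    imp = IJ.openImage(fn)
    IJ.saveAs(imp, 'png', png_name(fn))


def build_parser():
    """Build the command-line parser without touching any Java API."""
    parser = argparse.ArgumentParser(description="Convert images to PNG.")
    parser.add_argument('images', nargs='+', help="Input images.")
    return parser


def main(argv=None):
    """Parse arguments, then convert each image in turn."""
    args = build_parser().parse_args(argv)
    for fn in args.images:
        convert_file(fn)
```

In a real script you would call main() under the usual if __name__ == '__main__': guard, as in the examples above.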

Next, we want to use the Bio-Formats importer, which is class BF in loci.plugins. Figuring out the class hierarchy for arbitrary plugins is tricky, but you can find it here for core ImageJ (using lovely 1990s-style frames) and here for Bio-Formats, and Curtis Rueden has made this list for other common plugins.

When you try to open a file with Bio-Formats importer using the Fiji GUI, you get the following dialog:

BioFormats import window

That’s a lot of options, and we actually want to set some of them. If you look at the BF.openImagePlus documentation, you can see that this is done through an ImporterOptions class located in loci.plugins.in. You’ll notice that “in” is a reserved word in Python, so from loci.plugins.in import ImporterOptions is not a valid Python statement. Yay! My workaround:

Tip #3: move your Fiji imports to an external file.

So I have a jython_imports.py file with just:

from ij import IJ
from loci.plugins import BF
from loci.plugins.in import ImporterOptions

Then, inside the convert_file() function, we just do:

from jython_imports import IJ, BF, ImporterOptions

This way, the main file remains Python-compatible until the convert_file() function is actually called, regardless of whatever funky and unpythonic stuff is happening in jython_imports.py.

On to the options. If you untick “Open files individually”, it will open up all matching files in a directory, regardless of your input filename! Not good. So now we play a pattern-matching game in which we match the option description in the above dialog with the ImporterOptions API calls. In this case, we setUngroupFiles(True). To specify a filename, we setId(filename). Additionally, because we want all of the images in the .lif file, we setOpenAllSeries(True).
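The three settings above can be bundled into one small function. This is only a sketch: configure_options is a made-up helper name, and RecordingOptions is a hypothetical stand-in that simply records calls, so the function can be exercised outside Fiji. Inside Fiji you would pass a real ImporterOptions() instead.

```python
def configure_options(opts, filename):
    """Apply the three settings discussed above to an ImporterOptions-like object."""
    opts.setId(filename)          # which file to open
    opts.setUngroupFiles(True)    # the "Open files individually" tick box
    opts.setOpenAllSeries(True)   # read every series in the .lif
    return opts


class RecordingOptions(object):
    """Hypothetical stand-in for ImporterOptions, for use outside Fiji."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any unknown method call is recorded as (name, arguments).
        def record(*args):
            self.calls.append((name, args))
        return record


opts = configure_options(RecordingOptions(), "myFile.lif")
for name, args in opts.calls:
    print(name, args)
```

Under Fiji, the configured object would then be handed to BF.openImagePlus(opts).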

Next, each image in the series is 3D and has three channels, but we are only interested in a summed z-projection of the first channel. There’s a set of ImporterOptions methods tantalizingly named setCBegin, setCEnd, and setCStep, but this is where I found the documentation sorely lacking. The functions take (int s, int value) as arguments, but what’s s??? Are the limits closed or open? Code review is a wonderful thing, and this would not have passed it. To figure things out:

Tip #4: use Fiji’s interactive Jython interpreter to figure things out quickly.

You can find the Jython interpreter under Plugins/Scripting/Jython Interpreter. It’s no IPython, but it is extremely helpful to answer the questions I had above. My hypothesis was that s was the series, and that the intervals would be closed. So:

>>> from loci.plugins import BF
>>> from loci.plugins.in import ImporterOptions
>>> opts = ImporterOptions()
>>> opts.setId("myFile.lif")
>>> opts.setOpenAllSeries(True)
>>> opts.setUngroupFiles(True)
>>> imps = BF.openImagePlus(opts)

Now we can play around, with one slight annoyance: the interpreter won’t print the output of your last statement, so you have to specify it:

>>> len(imps)
>>> print(len(imps))
18

Which is what I expected, as there are 18 series in my .lif file. The image shape is given by the getDimensions() method of the ImagePlus class:

>>> print(imps[0].getDimensions())
array('i', [1024, 1024, 3, 31, 1])

>>> print(imps[1].getDimensions())
array('i', [1024, 1024, 3, 34, 1])

That’s (x, y, channels, z, time).

Now, let’s try the same thing with setCEnd, assuming closed interval:

>>> opts.setCEnd(0, 0) ## only read channels up to 0 for series 0?
>>> opts.setCEnd(2, 0) ## only read channels up to 0 for series 2?
>>> imps = BF.openImagePlus(opts)
>>> print(imps[0].getDimensions())
array('i', [1024, 1024, 1, 31, 1])

>>> print(imps[1].getDimensions())
array('i', [1024, 1024, 3, 34, 1])

>>> print(imps[2].getDimensions())
array('i', [1024, 1024, 1, 30, 1])

Nothing there to disprove my hypothesis! So we move on to the final step, which is to z-project the stack by summing the intensity over all z values. This is normally accessed via Image/Stacks/Z Project in the Fiji GUI, and I found the corresponding ij.plugin.ZProjector class by searching for “proj” in the ImageJ documentation. A ZProjector object has a setMethod method that usefully takes an int as an argument, with no explanation in its docstring as to which int translates to which method (sum, average, max, etc.). A little more digging in the source code reveals some class static variables, AVG_METHOD, MAX_METHOD, and so on.

Tip #5: don’t be afraid to look at the source code. It’s one of the main advantages of working in open-source.

So:

>>> from ij.plugin import ZProjector
>>> proj = ZProjector()
>>> proj.setMethod(ZProjector.SUM_METHOD)
>>> proj.setImage(imps[0])
>>> proj.doProjection()
>>> impout = proj.getProjection()
>>> print(impout.getDimensions())
array('i', [1024, 1024, 1, 1, 1])

The output is actually a float-typed image, which will get rescaled to [0, 255] uint8 on save if we don’t fix it. So, to wrap up, we convert the image to 16 bits (making sure to turn off scaling), use the series title to generate a unique filename, and save as a PNG:

>>> from ij.process import ImageConverter
>>> ImageConverter.setDoScaling(False)
>>> conv = ImageConverter(impout)
>>> conv.convertToGray16()
>>> title = imps[0].getTitle().rsplit(" ", 1)[-1]
>>> IJ.saveAs(impout, 'png', "myFile-" + title + ".png")
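The filename logic in the last two lines is plain string handling, so it can be checked outside Fiji. Here is a small sketch; series_png_name is an illustrative helper, and the example title is an assumed format, since the exact titles depend on the file and on Bio-Formats.

```python
def series_png_name(source_fn, title):
    """Build a per-series PNG name from the source file and the image title."""
    stem = source_fn.rsplit('.', 1)[0]   # drop the extension, e.g. ".lif"
    series = title.rsplit(' ', 1)[-1]    # keep only the last word of the title
    return stem + '-' + series + '.png'


# Hypothetical title, for illustration only.
print(series_png_name('myFile.lif', 'myFile.lif - Series003'))
```

As long as the last word of each title is unique within the file, the output names will not collide.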

You can see the final result of my sleuthing in lif2png.py and jython_imports.py. If you would do something differently, pull requests are always welcome.

Before I sign off, let me recap my tips:

1. copy argparse.py into your project;

2. only import Java API functions within the functions that use them;

3. move your Fiji imports to an external file;

4. use Fiji’s interactive Jython interpreter to figure things out quickly; and

5. don’t be afraid to look at the source code.

And let me add a few final comments: once I started digging into all of Fiji’s plugins, I found documentation of very variable quality, and worse, virtually zero consistency between the interfaces to each plugin. Some work on “the currently active image”, some take an ImagePlus instance as input, and others still a filename or a directory name. Outputs are equally variable. This has been a huge pain when trying to work with these plugins.

But, on the flipside, this is the most complete collection of image processing functions anywhere. Along with the seamless access to all those functions from Jython and other languages, that makes Fiji very worthy of your attention.

Acknowledgements

This post was possible thanks to the help of Albert Cardona, Johannes Schindelin, Wayne Rasband, and Jan Eglinger, who tirelessly respond to (it seems) every query on the ImageJ mailing list. Thanks!

References

Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, Preibisch S, Rueden C, Saalfeld S, Schmid B, Tinevez JY, White DJ, Hartenstein V, Eliceiri K, Tomancak P, & Cardona A (2012). Fiji: an open-source platform for biological-image analysis. Nature methods, 9 (7), 676-82 PMID: 22743772

Linkert M, Rueden CT, Allan C, Burel JM, Moore W, Patterson A, Loranger B, Moore J, Neves C, Macdonald D, Tarkowska A, Sticco C, Hill E, Rossner M, Eliceiri KW, & Swedlow JR (2010). Metadata matters: access to image data in the real world. The Journal of cell biology, 189 (5), 777-82 PMID: 20513764

Best practices addendum: find and follow the conventions of your programming community

The bioinformatics community is all atwitter about the recent PLOS Biology article, Best Practices for Scientific Computing. Its main points should be obvious to most quasi-experienced programmers, but I can certainly remember a time when they did not seem so obvious to me (last week I think). As such, it’s a valuable addition to the written record on scientific computing. One of their code snippets, however, is pretty annoying:

def scan(op, values, seed=None):
# Apply a binary operator cumulatively to the values given
# from lowest to highest, returning a list of results.
# For example, if "op" is "add" and "values" is "[1,3,5]",
# the result is "[1, 4, 9]" (i.e., the running total of the
# given values). The result always has the same length as
# the input.
# If "seed" is given, the result is initialized with that
# value instead of with the first item in "values", and
# the final item is omitted from the result.
# Ex : scan(add, [1, 3, 5] , seed=10)
# produces [10, 11, 14]
...implementation...

First, this code ignores the article’s own advice, (1b) make names consistent, distinctive, and meaningful. I would argue that “scan” here is neither distinctive (many other operations could be called “scan”) nor meaningful (the function purpose is not at all clear from the name). My suggestion would be “cumulative_reduce”.

It also does not address another important piece of advice that I would add to their list, maybe as (1d): Find out, and follow, the conventions of the programming community you’re joining. This will allow others to use and assess your code more readily, and you to contribute to other code libraries more easily. Here, although they have made efforts to make their advice language-agnostic, the authors have chosen Python to illustrate their point. Python happens to have strong style and documentation prescriptions in the form of Python Enhancement Proposals PEP-8: Style Guide for Python Code and PEP-257: Docstring conventions. Following PEP-8 and PEP-257, the above comments become an actual docstring (which Python attaches to the function automatically, and which documentation-generating tools can then read):

def cumulative_reduce(op, values, seed=None):
    """Apply a binary operator cumulatively to the values given.

    The operator is applied from left to right.

    For example, if "op" is "add" and "values" is "[1,3,5]",
    the result is "[1, 4, 9]" (i.e., the running total of the
    given values). The result always has the same length as
    the input.

    If "seed" is given, the result is initialized with that
    value instead of with the first item in "values", and
    the final item is omitted from the result.
    Ex: cumulative_reduce(add, [1, 3, 5], seed=10)
    produces [10, 11, 14]
    """
    ...implementation...

In addition, the Scientific Python community in particular has adopted a few docstring conventions of their own, including the NumPy docstring conventions, which divide the docstring into meaningful sections using reStructuredText, and the doctest convention to format examples, so the documentation acts as unit tests. So, to further refine their example code:

def cumulative_reduce(op, values, seed=None):
    """Apply a binary operator cumulatively to the values given.

    The operator is applied from left to right.

    Parameters
    ----------
    op : binary function
        An operator taking as input two values of the type contained in
        `values` and returning a value of the same type.
    values : list
        The list of input values.
    seed : type contained in `values`, optional
        A seed to start the reduce operation.

    Returns
    -------
    reduced : list, same type as `values`
        The accumulated list.

    Examples
    --------
    >>> add = lambda x, y: x + y
    >>> cumulative_reduce(add, [1, 3, 5])
    [1, 4, 9]

    If "seed" is given, the result is initialized with that
    value instead of with the first item in "values", and
    the final item is omitted from the result.

    >>> cumulative_reduce(add, [1, 3, 5], seed=10)
    [10, 11, 14]
    """
    ...implementation...
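The article leaves the implementation out. For completeness, here is one possible implementation, inferred only from the behaviour the docstring describes (so treat it as a sketch, not the article's code). The docstring is shortened here to the doctest examples.

```python
def cumulative_reduce(op, values, seed=None):
    """Apply a binary operator cumulatively to the values given.

    Examples
    --------
    >>> add = lambda x, y: x + y
    >>> cumulative_reduce(add, [1, 3, 5])
    [1, 4, 9]
    >>> cumulative_reduce(add, [1, 3, 5], seed=10)
    [10, 11, 14]
    """
    values = list(values)
    if not values:
        return []
    if seed is None:
        # Start from the first value and fold in the rest.
        reduced, rest = [values[0]], values[1:]
    else:
        # Start from the seed; dropping the final value keeps the
        # result the same length as the input.
        reduced, rest = [seed], values[:-1]
    for value in rest:
        reduced.append(op(reduced[-1], value))
    return reduced
```

Saved to a file, the examples can be run as tests with python -m doctest -v followed by the filename, which is the point of formatting them this way.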

Obviously, these conventions are specific to scientific Python. But the key is that other communities will have their own, and you should find out what those conventions are and adopt them. When in Rome, do as the Romans do. It’s actually taken me quite a few years of scientific programming to realise this (and internalise it). I hope this post will help someone get up to speed more quickly than I have.

(Incidentally, the WordPress/Chrome/OSX spell checker doesn’t bat an eye at “atwitter”. That’s awesome.)

Reference

Greg Wilson, DA Aruliah, C Titus Brown, Neil P Chue Hong, Matt Davis, Richard T Guy, Steven HD Haddock, Kathryn D Huff, Ian M Mitchell, Mark D Plumbley, Ben Waugh, Ethan P White, & Paul Wilson (2014). Best Practices for Scientific Computing PLoS Biol, 12 (1) DOI: 10.1371/journal.pbio.1001745

Our environmental future

Another link post, to a worthwhile article by Veronique Greenwood for Aeon (emphases mine):

For much of the thousands of years of human existence, our species has treated the world more or less as an open system. [...] the general faith was that there were, say, more whales somewhere [...] more trees somewhere [...]. Even today, in the face of imminent climate change, we continue to function as though there’s more atmosphere somewhere, ready to whisk off our waste to someplace else. It is time, though, to think of the world as a closed system. When you look at the resources involved in maintaining even a single member of a developed society, it’s hard to avoid the knowledge that this cannot continue. Last year, Tim De Chant, an American journalist who runs the blog Per Square Mile, made striking depictions of the space required if everyone in the world lived like the inhabitants of a number of countries. If we all lived like Americans, even four planet Earths would not be enough.

The article does suggest, however, that a change of mindset will push us to inventive solutions to our environmental problems. I hope she’s right.

My review of the Roost laptop stand

In short: it’s awesome; the best stand I have ever used, by a wide margin. Read on for details.

The Roost is an ingeniously designed laptop stand that folds away to nothing, so you can always carry it with you. It’s another Kickstarter success story. (It’s the third I’ve participated in, after the Elevate iPhone dock and the Pebble watch. I absolutely love the Kickstarter economy.)

Here’s a picture of the Roost in its carry bag. You can see that it’s just tiny:

Image

And unwrapped:

Image

And yet for all its diminutive size, this stand gives my laptop wicked air:

roost-before roost-after

The laptop screen actually sits higher (closer to eye level) than on other stands I’ve used from Griffin or Xbrand, despite the Roost being much lighter and smaller. Folding and unfolding the Roost is fantastically easy, smooth, and fast. It’s just excellent design.

The laptop is held up by two tiny tabs that latch underneath the display’s hinge:

Image

Ingenious!

If you’re at all thinking about purchasing a laptop stand, I can’t recommend the Roost highly enough. Buy it now.

Speed up your Mac’s wake up time using pmset. Do it again after upgrading to Mavericks

Last year I got a 15″ Retina Macbook Pro, an excellent machine. However, it was taking way longer than my 13″ MBP to wake up from sleep. After a few months of just accepting it as a flaw of the new machines and the cost of being an early adopter, I finally decided to look into the problem. Sure enough, I came across this excellent post from OS X Daily:

Is Your Mac Slow to Wake from Sleep? Try this pmset Workaround

Oooh, sweet goodness: basically, after 1h10min asleep, your Mac goes into a “deep sleep” mode that dumps the contents of RAM into your HDD/SSD and powers off the RAM. On wake, it needs to load up all the RAM contents again. This is slow when your machine has 16GB of RAM! Thankfully, you can make your Mac wait any amount of time before going into deep sleep. This will eat up your battery a bit more, but it’s worth it. Just type this into the Terminal:

sudo pmset -a standbydelay 86400

This changes the time to deep sleep to 24h. Since I rarely spend more than 24h without using my computer, I now have instant-on every time I open up my laptop!
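The value passed to standbydelay is in seconds, which is easy to get wrong by hand. A tiny sketch for building the command for any delay follows; standbydelay_command is a made-up helper name, and it only produces the command string, it does not run anything.

```python
def standbydelay_command(hours):
    """Return the pmset command that sets the deep-sleep delay to `hours`."""
    seconds = int(hours * 60 * 60)   # pmset expects the delay in seconds
    return "sudo pmset -a standbydelay %d" % seconds


print(standbydelay_command(24))   # the 24h setting used above
```

By the same arithmetic, the 1h10min default works out to 70 * 60 = 4200 seconds.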

Finally, the reason I wrote this now: upgrading to Mavericks sneakily resets your standbydelay to 4200. (Or, at least, it did for me.) Just run the above command again and you’ll be set, at least until the next OS upgrade comes along!

Update: the original source of this tip appears to be a post from Erv Walter on his site, Ewal.net. It goes into a lot more detail about the origin of this sleep mode — which indeed did not exist when I bought my previous Macbook Pro.

OSX software watch: use Photosweeper to remove duplicates in your image collection

It’s no secret that the photo management problem is a huge mess. As new cameras, software, and online storage and sharing services come and go, our collections end up strewn all over the place, often in duplicate. This eats up precious storage space and makes finding that one photo an exercise in frustration.

Peter Nixey has an excellent post on the disappointing state of affairs (to put it kindly) and an excellent follow-up on how Dropbox could fix it. You should definitely read those.

But, while Apple and/or Dropbox get their act together (I’m not holding my breath), you have to make sense of your photos in your Pictures folder, in your Dropbox Photos folder, in various other Dropbox shared folders, on your Desktop, in your Lightroom, Aperture, and iPhoto collections, and so on. A lot of these might be duplicated because, for example, you were just trying out Lightroom and didn’t want to commit to it so you put your pics there but also in Aperture. And by you I mean I.

So, the first step to photo sanity is to get rid of these duplicates. Thankfully, there is an excellent OSX app called Photosweeper made for just this purpose. I used it yesterday to clear 34GB of wasted space on my HDD. (I was too excited to take screenshots of the process, unfortunately!)

There’s a lot to love about Photosweeper. First, it is happy to look at all the sources I mentioned above, and compare pics across them. Second, it lets you automatically define a priority for which version of a duplicate photo to save. In my case, I told it to keep iPhoto images first (since these are most likely to have ratings, captions, and so on), then Aperture, then whatever’s on my HDD somewhere. If a duplicate was found within iPhoto, it should keep the most recent one.

But, third, what makes Photosweeper truly useful: it won’t do a thing without letting you review everything, and it offers a great reviewing interface. It places duplicates side-by-side, marking which photo it will keep and which it will trash. Best of all, this view shows everything you need to make sure you’re not deleting a high-res original in favour of the downscaled version you emailed your family: filename, date, resolution, DPI, and file size. Click on each file and the full path (even within an iPhoto or Aperture library) becomes visible. This is in stark contrast to iPhoto’s lame “hey, this is a duplicate file” dialog that shows you two downscaled versions of the images with no further information.

Once you bite the bullet, it does exactly the right thing with every duplicate: iPhoto duplicates get put in the iPhoto Trash, Lightroom duplicates get marked “Rejected” and put in a special “Trash (Photosweeper)” collection, and filesystem duplicates get moved to the OSX Trash. Lesser software might have moved all the iPhoto files to the OSX Trash, leaving the iPhoto library broken.

In all, I was really impressed with Photosweeper. 34GB is nothing to sniff at and getting rid of those duplicates is the first step to consolidating all my files. It does this in a very accountable, safe way. At no point did I get that sinking feeling of “there is no undo.”

Finally, I should mention that Photosweeper also has a “photo similarity” mode that finds not only duplicates, but very similar series of photos. This is really good for when you snapped 15 pics of the same thing so that one might turn out ok. But I’m too much of a digital hoarder to take that step!

Photosweeper currently sells for $10 on the Mac App Store.

All journals should require authors to publish their raw data

This is just a link post. The excellent and excellently-named Data Colada blog has a brilliant analysis of scientific fraud exposed by the raw data. Figures can obscure flaws that are immediately obvious in the numbers. (Although, Matt Terry’s awesome and hilarious Yoink might alleviate this.) In this case, averages of four numbers turning out to be integers every single time, and two independent experiments giving almost exactly the same distribution of values. (Frankly, if you can’t simulate random sampling from an underlying distribution, you don’t belong in the fraud world!)
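To see why all-integer averages are suspicious, here is a rough simulation. It assumes honest integer measurements drawn uniformly from 0 to 99, which is an assumption of this sketch and not a detail taken from the Data Colada analysis. Under that assumption the mean of four values is a whole number only when their sum is divisible by 4, which happens about a quarter of the time.

```python
import random


def mean_is_integer(values):
    """True if the average of the integer `values` is a whole number."""
    return sum(values) % len(values) == 0


def fraction_integer_means(n_trials, group_size=4, seed=0):
    """Simulate groups of random integers; return the share with integer means."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        group = [rng.randrange(100) for _ in range(group_size)]
        if mean_is_integer(group):
            hits += 1
    return hits / float(n_trials)


def prob_all_integer(n_means, p_single=0.25):
    """Chance that `n_means` independent averages are all whole numbers."""
    return p_single ** n_means


print(fraction_integer_means(20000))   # close to one in four
print(prob_all_integer(10))            # ten in a row is already very unlikely
```

With independent groups the probability shrinks geometrically, so even a short table of exclusively integer means is hard to explain by chance.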

The post demonstrates the importance of publishing as much data (and code) as possible with a paper. Words are fuzzy; data and code are precise.

See here for more.

Why PLOS ONE is no longer my default journal

Time-to-publication at the world’s biggest scientific journal has grown dramatically, but the nail in the coffin was its poor production policies.

When PLOS ONE was announced in 2006, its charter immediately resonated with me. This would be the first journal where only scientific accuracy mattered. Judgments of “impact” and “interest” would be left to posterity, which is the right strategy when publishing is cheap and searching and filtering are easy. The whole endeavour would be a huge boon to “in-between” scientists straddling established fields — such as bioinformaticians.

My first first-author paper, Joint Genome-Wide Profiling of miRNA and mRNA Expression in Alzheimer’s Disease Cortex Reveals Altered miRNA Regulation, went through a fairly standard journal loop. We first submitted it to Genome Biology, which (editorially) deemed it uninteresting to a sufficiently broad readership; then to RNA, which (editorially) decided that our sample size was too small; and finally to PLOS ONE, where it went out to review. After a single revision loop, it was accepted for publication. It’s been cited more than 15 times a year, which is modest but above the Journal Impact Factor for Genome Biology — which means that the editors made a bad call rejecting it outright. (I’m not bitter!)

Overall, it was a very positive first experience at PLOS. Time to acceptance was under 3 months, time to publication under 4. The reviewers were no less harsh than in my previous experiences, so I felt (and still feel) that the reputation of PLOS ONE as a “junk” journal was (is) highly undeserved. (Update: There’s been a big hullabaloo about a recent sting targeting open access journals with a fake paper. PLOS ONE came away unscathed. See also the take of Mike Eisen, co-founder of PLOS.) And the number of citations certainly vindicated PLOS ONE’s approach of ignoring apparent impact.

So, when looking for a home for my equally-awkward postdoc paper (not quite computer vision, not quite neuroscience), PLOS ONE was a natural first choice.

The first thing to go wrong was the time to publication, about 6 months. Still better than many top-tier journals, but no longer a crushing advantage. And it’s not just me: there’s been plenty of discussion about time-to-publication steadily increasing at PLOS ONE. But I was not too worried about the publication time, since I’d put my paper up on the arXiv (and revised it at each round of peer-review, so you can see the revision history there — but not on PLOS ONE).

But, after multiple rounds of review, the time came for production, at which point they messed up two things: they did not include my present address; and they messed up Figure 1, which is supposed to be a small, single-column, illustrative figure, and which they made page-width. The effect is almost comical, and my first impression seeing page 2 would be to think that the authors are trying to mask their incompetence with giant pictures. (We’re not, I swear!)

Figure 1 of our paper on arXiv (left) and PLOS ONE (right)

Both of these mistakes could have been avoided if PLOS ONE did not have a policy of not letting you see the camera-ready pdf before it is published, and of not allowing corrections to papers unless they are technical or scientific, regardless of fault. Not to mention they could have, you know, actually looked at the dimensions embedded in the submitted TIFFs. With a $1,300 publication fee, PLOS could afford to take a little bit of extra care with production. Both of the above policies are utterly unnecessary: the added cost of sending authors a production proof is close to nil, and keeping track of revisions on online publications is also trivial (see the 22-year-old arXiv for an example).

We scientists live and die by our papers. We don’t want the culmination of years of work to be marred by a silly, easily-fixed formatting error, ossified by an unwieldy bureaucracy. I’ve been an avid promoter of PLOS (and PLOS ONE in particular) over the past few years, but I’m sad to say that’s not where my next paper will end up.

Ultimately, PLOS ONE’s model, groundbreaking though it was, is already being supplanted by newcomers. PeerJ offers everything PLOS ONE does at a fraction of the cost, and further includes a preprint service and open peer-review. Ditto for F1000 Research, which in addition offers unlimited revisions (a topic close to my heart ;). And both use the excellent MathJax to render mathematical formulas, unlike PLOS’s archaic use of embedded images. They get my vote for the journals of the future.

[Note: the views expressed herein are mine alone; no co-authors were harmed (or consulted) in the writing of this blog post.]

References

Nunez-Iglesias J, Liu CC, Morgan TE, Finch CE, & Zhou XJ (2010). Joint genome-wide profiling of miRNA and mRNA expression in Alzheimer’s disease cortex reveals altered miRNA regulation. PloS one, 5 (2) PMID: 20126538

Kravitz DJ, & Baker CI (2011). Toward a new model of scientific publishing: discussion and a proposal. Frontiers in computational neuroscience, 5 PMID: 22164143

Juan Nunez-Iglesias, Ryan Kennedy, Toufiq Parag, Jianbo Shi, & Dmitri B. Chklovskii (2013). Machine learning of hierarchical clustering to segment 2D and 3D images arXiv arXiv: 1303.6163v3

Nunez-Iglesias J, Kennedy R, Parag T, Shi J, & Chklovskii DB (2013). Machine Learning of Hierarchical Clustering to Segment 2D and 3D Images. PloS one, 8 (8) PMID: 23977123