When I need to manipulate large data files (hundreds of GB, or even TB), I usually write a script in NCL or Python to do the job. Recently I’ve realized that this is often a huge waste of time! My approach requires the task to be broken down into chunks, and the overhead of iterating over those chunks can slow the process by orders of magnitude, on top of the time it takes me to write the code.

A much faster and cleaner way to manipulate large datasets is to use command line operators that are specifically designed for climate datasets. Two libraries have been independently developed to do this: CDO and NCO.

The “Climate Data Operators” (CDO) library was developed at the Max-Planck Institute in Hamburg, Germany, whereas the “NetCDF Operators” (NCO) library was developed as an open-source project by various people. They can both do the same things, but the commands look very different. They also work differently under the hood, which can result in different performance outcomes for the same calculation.

Here’s a nice list of simple NCO examples. NCO commands can be more complicated than CDO commands, but the nice thing about NCO is that there is only a short list of basic commands:

  • ncap     – NetCDF Arithmetic Processor (superseded by ncap2)
  • ncatted  – NetCDF Attribute Editor
  • ncbo     – NetCDF Binary Operator (e.g. ncadd, ncmultiply)
  • ncea     – NetCDF Ensemble Averager (called nces in newer releases)
  • ncecat   – NetCDF Ensemble Concatenator
  • ncflint  – NetCDF File Interpolator
  • ncks     – NetCDF Kitchen Sink
  • ncpdq    – NetCDF Permute Dimensions Quickly, Pack Data Quietly
  • ncra     – NetCDF Record Averager
  • ncrcat   – NetCDF Record Concatenator
  • ncrename – NetCDF Renamer
  • ncwa     – NetCDF Weighted Averager

CDO has a long list of operators, which can be hard to remember. I still need to look them up every time I use them, but I imagine I’ll start to remember a few over time.

Here’s a simple example of combining a list of files with different timesteps into a single output file:

cdo copy  ifile1 ifile2 ifile3  ofile

This is pretty straightforward. Doing the same thing with NCO is also pretty simple:

ncrcat -h  ifile1 ifile2 ifile3  ofile
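
When the same concatenation has to be launched from a script, both commands can be assembled as argument lists. The following is only a minimal sketch: `build_concat_command` and `run` are hypothetical helper names, not part of CDO or NCO, and actually executing the result assumes that `cdo` and `ncrcat` are installed and on the PATH.

```python
import shlex
import subprocess


def build_concat_command(tool, input_files, output_file):
    """Return the argument list that concatenates input_files along time.

    Hypothetical helper for illustration; it only mirrors the two
    commands shown above and does not touch any files itself.
    """
    if not input_files:
        raise ValueError("at least one input file is required")
    if tool == "cdo":
        prefix = ["cdo", "copy"]
    elif tool == "nco":
        # -h keeps ncrcat from appending the command line to the
        # global "history" attribute of the output file.
        prefix = ["ncrcat", "-h"]
    else:
        raise ValueError(f"unknown tool: {tool!r}")
    return prefix + list(input_files) + [output_file]


def run(command, dry_run=True):
    """Print the command, and execute it only when dry_run is False."""
    print(shlex.join(command))
    if not dry_run:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    files = ["ifile1", "ifile2", "ifile3"]
    run(build_concat_command("cdo", files, "ofile"))
    run(build_concat_command("nco", files, "ofile"))
```

With the default `dry_run=True` the commands are only printed, which is a cheap way to check them before committing to a long run on a large dataset.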

One instance where CDO wins over NCO is converting GRIB files to NetCDF:

cdo -f nc copy file.grb file.nc
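
For a whole directory of GRIB files, the conversion command can be generated per file. This is a sketch under one assumption of mine: the output is written next to the input with a `.nc` suffix. `grib_to_netcdf_command` is a hypothetical name, and the file names in the usage section are made up for illustration.

```python
from pathlib import Path


def grib_to_netcdf_command(grib_path):
    """Return (command, output_path) for converting one GRIB file with CDO.

    Assumption: the NetCDF file gets the same name as the GRIB file,
    with the suffix replaced by .nc.
    """
    grib_path = Path(grib_path)
    nc_path = grib_path.with_suffix(".nc")
    # "-f nc" asks CDO to write the output stream in NetCDF format.
    command = ["cdo", "-f", "nc", "copy", str(grib_path), str(nc_path)]
    return command, nc_path


if __name__ == "__main__":
    # Hypothetical file names, used only to show the generated commands.
    for name in ["file.grb", "other_file.grb"]:
        command, nc_path = grib_to_netcdf_command(name)
        print(" ".join(command))
```

Each returned list can be handed to `subprocess.run(command, check=True)` once the printed commands look right.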

An instance where NCO seems to have the advantage is editing variable attributes. The NCO attribute editor “ncatted” makes this pretty simple:

ncatted -O -a units,U,m,c,"m/s" ifile
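
The argument that ncatted takes with its `-a` option packs five comma-separated fields: attribute name, variable name, mode, attribute type, and value. In the example above that is the attribute `units` of variable `U`, mode `m` (modify), type `c` (character), and value `m/s`. A small sketch of building that argument from a script follows; the helper names are hypothetical, and the mode table lists only the modes I am sure of, so check the NCO manual for your version.

```python
# Modes accepted in the third field of an ncatted -a specification.
# Newer NCO releases may accept more than the ones listed here.
NCATTED_MODES = {
    "a": "append",
    "c": "create",
    "d": "delete",
    "m": "modify",
    "n": "nappend",
    "o": "overwrite",
}


def attribute_spec(att_name, var_name, mode, att_type, value):
    """Build the 'att,var,mode,type,value' string for ncatted -a."""
    if mode not in NCATTED_MODES:
        raise ValueError(f"unknown ncatted mode: {mode!r}")
    return f"{att_name},{var_name},{mode},{att_type},{value}"


def ncatted_command(spec, nc_file):
    """Return the argument list that edits nc_file in place (-O)."""
    return ["ncatted", "-O", "-a", spec, nc_file]


if __name__ == "__main__":
    spec = attribute_spec("units", "U", "m", "c", "m/s")
    print(" ".join(ncatted_command(spec, "ifile")))
```

Because the command is passed as a list rather than through a shell, the value does not need the quotes that the command line version uses.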

In the end, it’s not a matter of choosing the “best” library to add to the toolbox. Both have strengths and weaknesses that should be exploited. Either way, they can save us huge amounts of time!

Over time I will be posting a series of short articles with examples and tricks. These are mostly just for my own reference, but I hope other people will find them useful. A list of these can be found here on my Publications page.

3 thoughts on “CDO vs NCO”

  1. David Gold

    Nice write up. I recently discovered the ‘cdo’ utility when fumbling around trying to figure out how to convert from GRIB to NetCDF. The trickiest issue was installing the utility, since there are numerous dependencies. I have also found that some GRIB files have strange packing and have to be converted to ‘grid_simple’ packing first using the ECMWF GRIB API before being passed to cdo. But on the whole I’ve found it to be a very fast way to manipulate GRIB/NetCDF files.

    1. Andualem

      NCL’s ncl_convert2nc script might be a better candidate here. It is just a one-line command that can be run from the terminal, e.g.
      ncl_convert2nc U12345.grb -v PRES_6_SFC,PRES_6_TRO -L

