Phil Bentley edited this page Nov 7, 2013 · 17 revisions

About cdlparser

The cdlparser module implements a Python-based parser for reading files or text strings encoded in netCDF-3 Common Data form Language, a.k.a. CDL. The parser logic is based upon the tokens and rules defined in the flex and yacc files as used by the ncgen3 utility that ships with the standard netCDF distribution.

If all you want to do is convert a CDL file to a netCDF file, then ncgen is still the command to use. If, however, you want to incorporate CDL-reading capability into an existing or new Python-based workflow, and you don't want to drop out to a shell to invoke ncgen, then cdlparser may work for you.

One potentially useful application of cdlparser is to generate netCDF files on-the-fly from CDL text strings or files as part of a Python unit test suite. In this way you don't need to maintain a collection of netCDF binary files; instead you can simply keep the equivalent CDL text files in your repository and use those with your unit tests. (Naturally this is only practicable for small test cases; but then you wouldn't want to create huge files for a unit test now, would you? ;-)
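As a rough sketch of that idea, a test case might build its fixture from a CDL string in setUp(). The class name, the scratch-directory handling and the no-argument CDL3Parser() call are assumptions made for illustration; the dataset queries use the ordinary netCDF4.Dataset interface. The test is skipped when cdlparser or netCDF4 is not installed.

```python
import importlib.util
import os
import tempfile
import unittest

# Skip cleanly when the parser (or the netCDF4 bindings it relies on) is absent.
HAVE_DEPS = all(
    importlib.util.find_spec(name) is not None
    for name in ("cdlparser", "netCDF4")
)

CDL_TEXT = r'netcdf mydataset { dimensions: dim1=10; variables: float var1(dim1); var1:comment="blah blah"; }'


@unittest.skipUnless(HAVE_DEPS, "cdlparser and netCDF4 are required")
class MyDatasetTest(unittest.TestCase):
    def setUp(self):
        from cdlparser import CDL3Parser  # imported lazily so the skip works

        # Work in a scratch directory so mydataset.nc does not litter the repo.
        scratch = tempfile.TemporaryDirectory()
        self.addCleanup(scratch.cleanup)
        self.addCleanup(os.chdir, os.getcwd())
        os.chdir(scratch.name)

        # Assumption: CDL3Parser() can be created without arguments.
        self.dataset = CDL3Parser().parse_text(CDL_TEXT)
        # Cleanups run in reverse order, so the dataset is closed first.
        self.addCleanup(self.dataset.close)

    def test_dimension_length(self):
        self.assertEqual(len(self.dataset.dimensions["dim1"]), 10)

    def test_comment_attribute(self):
        comment = self.dataset.variables["var1"].getncattr("comment")
        self.assertEqual(comment, "blah blah")
```

Saved in a test module, this can be run with python -m unittest in the usual way.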

Package Dependencies

The cdlparser module depends upon the following Python packages. If you don't already have these then you'll need to download and install them.

  • PLY (Python Lex-Yacc)
  • netCDF4-python

The latter package has several dependencies of its own, of course (e.g. netCDF4, HDF5, NumPy, and so on). These too will need to be installed. Please refer to the respective documentation. I'm rather assuming that if you're interested in parsing CDL files then you're probably working with these packages already.

Installation

Once the above-mentioned packages have been installed (if necessary), simply download the cdlparser.py file and copy it to a suitable location on your Python path. You may need administrator privileges if you plan to copy the module to one of the default Python directories on your system.

Basic Usage

The basic usage idiom for parsing CDL text files is as follows:

from cdlparser import CDL3Parser
myparser = CDL3Parser(...)
ncdataset = myparser.parse_file(cdlfilename, ...)

If the input CDL file is valid then the above code should result in a netCDF file being generated. On completion of parsing the output filename can be obtained by querying the myparser.ncfile attribute.

The ncdataset variable returned by the parse_file() method is a handle to a netCDF4.Dataset object, which you can then query and manipulate as you wish. By default this dataset handle is left open when parsing has completed; hence you will need to call the object's close() method when you're done with it. If you know that you won't need to manipulate the dataset after parsing then you can set the close_on_completion keyword argument to True when the parser object is created, thus:

myparser = CDL3Parser(close_on_completion=True, ...)
ncdataset = myparser.parse_file(cdlfilename, ...)
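If you do leave the dataset open, one way to make sure it always gets closed is to wrap it in contextlib.closing(). The helper below is only a sketch: summarise_cdl is a made-up name, and it takes the parser as an argument (a CDL3Parser created with the default close_on_completion setting is what I have in mind) so that nothing here depends on constructor arguments not described above.

```python
from contextlib import closing


def summarise_cdl(parser, cdlfilename):
    """Parse a CDL file and return (output filename, variable names).

    `parser` is expected to be a CDL3Parser instance that leaves the
    dataset open on completion (the default behaviour described above).
    """
    # closing() calls dataset.close() on exit, even if the body raises.
    with closing(parser.parse_file(cdlfilename)) as dataset:
        names = sorted(dataset.variables)
    # parser.ncfile holds the output filename once parsing has completed.
    return parser.ncfile, names
```

Something like summarise_cdl(CDL3Parser(), "mydataset.cdl") would then hand back the name of the generated file together with the variable names, assuming the parser can be created without arguments.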

By default the name of the netCDF file produced by the parse_file() method is taken from the dataset name defined in the first line of the CDL file (with a '.nc' extension appended), just as the ncgen command does. You can supply a different filename, however, via the optional ncfile keyword argument, e.g.:

ncdataset = myparser.parse_file(cdlfilename, ncfile="/my/nc/folder/stuff.nc", ...)

Note: the '.nc' extension is merely a convention, albeit a very widely used one. You can use whatever extension makes sense in your context.
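To make the default naming rule concrete, here is a small standalone sketch that predicts the filename from the CDL header. It is not how cdlparser itself works out the name; default_ncfile_name is a hypothetical helper, and the regular expression is deliberately simplistic (it does not cope with comments before the header or with escaped characters in the dataset name).

```python
import re

# Matches a header such as "netcdf mydataset {" at the start of the text.
_HEADER = re.compile(r"\s*netcdf\s+([^\s{]+)\s*\{")


def default_ncfile_name(cdltext):
    """Return the dataset name from the CDL header with '.nc' appended."""
    match = _HEADER.match(cdltext)
    if match is None:
        raise ValueError("no 'netcdf <name> {' header found")
    return match.group(1) + ".nc"


print(default_ncfile_name("netcdf mydataset { dimensions: dim1=10; }"))
```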

In addition to parsing CDL files, you can also parse CDL definitions stored in plain text strings. The parse_text() method is used in this case, as shown below:

cdltext = r'netcdf mydataset { dimensions: dim1=10; variables: float var1(dim1); var1:comment="blah blah"; }'
myparser = CDL3Parser(...)
ncdataset = myparser.parse_text(cdltext)

The above code should create a netCDF file called 'mydataset.nc' in the current working directory. Note that the CDL text will usually need to be a raw string of the form r'...' in order for the string to be passed unmodified to the parser.
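The reason for the raw string is that Python processes backslash escapes in ordinary string literals before the parser ever sees the text. The snippet below is plain Python with no cdlparser involved; the attribute text is made up for illustration.

```python
# The same CDL attribute written as a raw literal and as an ordinary literal.
raw_cdl = r'var1:comment = "line one\nline two";'
cooked_cdl = 'var1:comment = "line one\nline two";'

# The raw literal still contains the two characters backslash and 'n' ...
print("\\n" in raw_cdl)

# ... whereas Python has already turned them into a real newline here,
# so no backslash is left in the text.
print("\n" in cooked_cdl, "\\" in cooked_cdl)

# The ordinary literal is therefore one character shorter.
print(len(raw_cdl) - len(cooked_cdl))
```

If your CDL text contains no backslashes then the two forms are identical, which is why the raw prefix is usually, rather than always, needed.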

Unsupported Features

The following CDL v3 features are not currently supported by cdlparser:

  • Use of the 'l' or 'L' suffix to indicate integer constants. This feature has been deprecated.
  • Use of the lexical tokens DoubleInf, NaN, +/-Infinity, FloatInf, +/-Inff. These tokens were used to indicate fill values prior to netCDF 2.4 (according to the ncgen.l file).
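If you want to check a CDL string for these constructs before handing it to the parser, a rough pre-flight scan along the following lines may help. It is a heuristic of my own devising, not part of cdlparser: find_unsupported is a hypothetical helper, it matches the tokens exactly as spelled above, and it makes no attempt to skip comments or quoted strings, so false positives are possible.

```python
import re

_UNSUPPORTED = re.compile(
    r"""
      (?<![\w.]) [-+]? \d+ [lL] \b           # integer constant with l/L suffix
    | \b (?:DoubleInf|FloatInf|NaN) \b       # named special values
    | (?<!\w) [-+]? (?:Infinity|Inff) \b     # optionally signed infinities
    """,
    re.VERBOSE,
)


def find_unsupported(cdltext):
    """Return the unsupported tokens found in cdltext, in order of appearance."""
    return [match.group(0) for match in _UNSUPPORTED.finditer(cdltext)]


sample = "data: a = 10L, 3; b = NaN; c = -Infinity;"
print(find_unsupported(sample))
```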

Enjoy!

-- rockdoc
