The netCDF File Format

 

  1. What is netCDF?
    1. When to use netCDF
    2. The netCDF file format is well suited to storing gridded or time series data for a large number of variables. NetCDF files are self-documenting: they include the units of each variable along with notes about what it means and how it was collected. This extra information is called "metadata".

    3. How to access netCDF data
    4. You get data into and out of a netCDF file through a library of functions from UNIDATA and other organizations. UNIDATA provides the library for C, C++, FORTRAN, PERL, and Java. Other sites provide it for MATLAB and various other languages; use Google to find them if you need them in the future. Some languages need no library at all: Yorick, for example, has netCDF access built in. The Meteorology Department's Linux and Windows systems usually have the netCDF library installed for MATLAB and various other languages. In this course we'll concern ourselves only with the MATLAB routines running under Windows.

    5. How to access netCDF documentation
    6. General netCDF documentation is available from the UNIDATA site, both as HTML that you can browse on their site and as PDF files that you can download and read on your own Windows or Linux system. The netCDF function library includes functions for accessing both the data and the "metadata" that describes it. The documentation for the USGS/Rutgers MATLAB Toolkit we'll be using to access netCDF includes both a detailed user's guide to the netCDF functions in MATLAB and a nice tutorial introduction to the basic concepts of netCDF.

    7. Accessing netCDF files from MATLAB

       

  2. Online sources of netCDF programming information.

  3. Common features
    1. Variables and Constants
    2. NetCDF supports a wide range of data variable types (byte, char, short, int, float, double) and requires strong typing.

      Dimensions represent either a physical dimension in space or time or the index of a list. A dimension has just a name and a length, which tells you how big an array is along that dimension. Special variables called coordinate variables hold the values of the grid or index associated with a dimension. A coordinate variable has the same name as the dimension it goes with. Additional information about a variable may be stored in attributes attached to that variable.

    3. Data Structures
    4. NetCDF's main role is to store multidimensional arrays. It has none of the more complex data structures such as hashes.

    5. Operators
    6. NetCDF has no operators because it isn't a language. It is just a file format and a set of functions in some other language (C, for example) to read and write those files.

    7. Functions
    8. NetCDF data access functions are strongly typed, as are their parameters. There are no user-generated functions, because a main purpose of netCDF is to protect the user from having to know the details of how the data is stored in the file. As with any file type, you must open a netCDF file before you access its data. You then need to find out what information is in the file: inquire about the number of dimensions, the number of variables, the number of global attributes, and the ID of the dimension with unlimited length. After that you get the dimension information, then the variable information. Then, the point of the whole exercise, you get the variable values. You can read an entire array at once or just parts of it by using different functions from chapter 7 of the manual. Finally, you need to close the netCDF file.

      Equivalent functions are available in the MATLAB interface. Check the user's guide for the exact MATLAB syntax, which may vary somewhat from that described in the more general UNIDATA documentation linked to in the paragraph above. Sample code later in this lecture will demonstrate how to put all these pieces together.

    9. Objects
    10. NetCDF has no objects because it isn't a language. It does feel rather object-oriented, though, because it packages the data description right in with the data. Thus, thinking like an object-oriented programmer helps a lot when you're dealing with a netCDF dataset. The C++ version of the netCDF library is, of course, object-oriented. Other versions, including the MATLAB version, return some information as structures. These structures are accessed with the dot syntax familiar from most object-oriented languages.

    11. Flow control
    12. NetCDF has no operators or flow control because it isn't a language.

    13. I/O
    14. NetCDF is all about file I/O, so every function in the netCDF library relates to file I/O in some way.

  1. Using the GUI