CDAAC

06/13/2009

File Format documentation for CDAAC.

CDAAC data files are documented in a specialized XML format. A Perl script converts files in this format to HTML for display on the web. Because much of the information stored in the XML files is already available elsewhere in the CDAAC system, a second Perl script adds or updates that information in the XML files.

The HTML renderings of the XML files are reached from the web through an image map (the image is called pub.png) that shows how file types are organized in the /pub directory hierarchy.

The /ops/tools/www/cdaac/fileFormats/ directory contains several file types:

  1. XML files: filetype.xml (e.g. atmPrf.xml). Each file contains an XML description of one CDAAC data file type. These XML files can have the following tags:

    • data_format
      The root element for all files. Contains the 'name' attribute, plus other attributes that describe the file naming convention.
    • description
      Contains a text/html description of the data file.
    • global_data
      For netCDF files, introduces the global or attribute section.
    • profile_data
      For netCDF files, introduces the vector or profile data section.
    • ncfield
      For netCDF files. Contained within the global_data or profile_data sections. Has several attributes: desc, name, valid_range, type, missing_value, unit.
    • part
      For binary (such as binex) files. Introduces one section of the binary file. Has attributes name and desc.
    • binfield
      For binary files. Analogous to the ncfield tag. Contains several attributes: desc, name, type, vals, size.
    • multiple
      For binary files. Introduces a repeated section of the file. Contains attributes num and name.
  2. These XML files are processed by Perl programs that are also contained in this directory:

    • extractNetcdfDoc.pl
      This program is an XML-to-XML filter. It takes a basic XML file and adds information drawn from the netCDF file it documents (for example, atmPrf.xml documents files like atmPrf_CHAM.2001.139.00.10.G05_0031.0002_nc) and from the PubFile database. The added information includes naming convention info for the data_format tag and netCDF field info for the ncfield tags.
    • updateMap.pl
      This program generates the image and image map that show the sample CDAAC /pub hierarchy. These files are pub.png and pub.map.
  3. pub.tar
    This file contains a sample /pub hierarchy which shows how the various file types are organized in the /pub directory. This file is used by the program updateMap.pl to generate the image and image map.
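
To make the tag list above concrete, here is a minimal sketch in Python of what a documentation file for a netCDF type might look like and how its fields could be enumerated. The nesting is inferred from the tag descriptions. The field names, the attribute values, and the list_fields helper are hypothetical, invented for illustration, and are not taken from any real CDAAC file or tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical documentation file: tag and attribute names follow the list
# above, but every value here is invented for illustration.
SAMPLE_XML = """
<data_format name="foobar">
  <description>The 'foobar' file contains foo data in the bar format.</description>
  <global_data>
    <ncfield name="year" desc="Year of the observation" type="int"/>
  </global_data>
  <profile_data>
    <ncfield name="foo" desc="Foo value at each level" type="float" unit="K"
             missing_value="-999"/>
  </profile_data>
</data_format>
"""

def list_fields(xml_text):
    """Return {section_name: [ncfield names]} for the two netCDF sections."""
    root = ET.fromstring(xml_text)
    fields = {}
    for section in ("global_data", "profile_data"):
        node = root.find(section)
        # A section may be absent (e.g. in a file describing a binary type).
        fields[section] = [] if node is None else [
            f.get("name") for f in node.findall("ncfield")
        ]
    return fields

print(list_fields(SAMPLE_XML))
```

A file describing a binary type would presumably use part, binfield, and multiple in place of the two netCDF sections; the sketch simply reports empty lists for such a file.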

How to add and update data file documentation for CDAAC.

To add documentation for a new file type (say data type foobar), do the following (assuming that the file type already exists in PubFile and there is an example in the /pub area):

  1. First create a foobar.xml file like this:
    <data_format>
    
    <description>
    The 'foobar' file contains foo data in the bar format.
    More text...
    </description>
    
    <ncfield name="foo"/>  
    
    </data_format>
    
    For netCDF file types, each ncfield tag you specify causes documentation for that field to be extracted from the netCDF file and added to the XML.
  2. Then run extractNetcdfDoc.pl:
    cd /ops/tools/www/cdaac/fileFormats
    ./extractNetcdfDoc.pl foobar_2003.290.00.34_nc fmission
    
    This will add naming convention info and (if it's a netCDF file) documentation info about netCDF fields.
  3. Now make any final edits to foobar.xml. The fileFormats.cgi script, already located in the main web area, will take care of dynamically rendering HTML content from the XML file.
  4. If you want to add the foobar file type to the pub.png image map, first untar the pub.tar file:
    tar -xvf pub.tar
    
    This will extract a directory ./pub in the fileFormats directory.
  5. Now add a foobar directory in this hierarchy:
    cd ./pub/champ/level1b # (say)
    mkdir foobar
    cd /ops/tools/www/cdaac/fileFormats
    
  6. Now, create a new pub.tar file:
    mv pub.tar pub.tar.bak
    tar -cvf - pub > pub.tar
    
  7. Finally, update the image and image map:
    ./updateMap.pl
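
Steps 4 through 6 above can also be scripted. The following is a minimal sketch in Python using the standard tarfile module; the add_type_dir function and its parameters are hypothetical and are not part of the CDAAC tools. It assumes the archive has a single top-level directory (pub), as described above.

```python
import os
import shutil
import tarfile

def add_type_dir(tar_path, rel_dir, work_dir="."):
    """Unpack tar_path, create rel_dir inside it, and rebuild the archive.

    rel_dir is relative to work_dir, e.g. "pub/champ/level1b/foobar".
    The previous archive is kept as tar_path + ".bak".
    """
    with tarfile.open(tar_path) as tar:
        tar.extractall(work_dir)                    # step 4: tar -xvf pub.tar
    os.makedirs(os.path.join(work_dir, rel_dir), exist_ok=True)  # step 5: mkdir
    shutil.move(tar_path, tar_path + ".bak")        # step 6: keep a backup
    top = rel_dir.split("/")[0]                     # top-level directory, "pub"
    with tarfile.open(tar_path, "w") as tar:
        tar.add(os.path.join(work_dir, top), arcname=top)  # step 6: new archive

# Example call (run from the fileFormats directory):
#   add_type_dir("pub.tar", "pub/champ/level1b/foobar")
```

The sketch only rebuilds pub.tar; ./updateMap.pl still has to be run afterwards (step 7) to regenerate pub.png and pub.map.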