[Top] [Prev] [Next]

2.2 HDF File Format

An HDF file contains a file header, at least one data descriptor block, and zero or more data elements as depicted in Figure 2a.

FIGURE 2a - The Physical Layout of an HDF File Containing One Data Object

The file header identifies the file as an HDF file. A data descriptor block contains a number of data descriptors. A data descriptor and a data element together form a data object, which is the basic conglomerate structure for encapsulating data in the HDF file. Each of these terms is described in the following sections.

2.2.1 File Header

The first component of an HDF file is the file header, which takes up the first four bytes of the HDF file. Specifically, it consists of four one-byte values that are ASCII representations of control characters: the first is a control-N, the second is a control-C , the third is a control-S and the fourth is a control-A (^N^C^S^A).

Note that, on some machines, the order of bytes in the file header might be swapped when the header is written to an HDF file, causing these characters to be written in little-endian order. To maintain the portability of HDF file header data when developing software for such machines, this byte swapping must be counteracted by ensuring the characters are read and written in the desired order.

2.2.2 Data Object

A data object is comprised of a data descriptor and a data element. The data descriptor consists of information about the type, location, and size of the data element. The data element contains the actual data. This organization of HDF data makes HDF files self-describing. Figure 2b shows two examples of data objects.

FIGURE 2b - Two Data Objects

2.2.2.1 Data Descriptor

All data descriptors are twelve bytes long and contain four fields, as depicted in Figure 2c. These fields are: a 16-bit tag, a 16-bit reference number, a 32-bit data offset and a 32-bit data length.

FIGURE 2c - The Contents of a Data Descriptor

Tag

A tag is the data descriptor field that identifies the type of data stored in the corresponding data element. A tag is a 16-bit unsigned integer between 1 and 65,535, and is associated with a mnemonic name to promote ease to use and the readability of user programs.

If a data descriptor has no corresponding data element, the value of its tag is DFTAG_NULL (or 0).

Tags are assigned by the HDF Group as part of the HDF specification. The following are the ranges of tag values and their descriptions:

1 to 32,767 - Tags reserved for HDF Group use
32,768 to 64,999 - User-definable tags
65,000 to 65,535 - Tags reserved for expansion of the HDF specification
A list of commonly-used tags and their descriptions is included in Appendix A, NCSA HDF Tags, of this document.

Reference Number

For each occurrence of a tag in an HDF file, a unique reference number is assigned by the library with the tag in the data descriptor. A reference number is a 16-bit unsigned integer and can not be changed during the life of the data object that the reference number specifies.

The combination of a tag and a reference number uniquely identifies the corresponding data object in the file.

Reference numbers are not necessarily assigned consecutively, so it cannot be assumed that the value of a reference number has any meaning beyond providing a way of distinguishing among objects with the same tag. While application programmers may find it convenient to impart some additional meaning to reference numbers in their code, it is emphasized that the HDF library will not internally recognize any such meaning.

Data Offset and Length

The data offset field points to the location of the data element in the file by storing the number of bytes from the beginning of the file to the beginning of the data element. The length field contains the size of the data element in bytes. The data offset and the length are both 32-bit unsigned integers.

2.2.2.2 Data Elements

The data element is the raw data portion of a data object.

2.2.3 Data Descriptor Block

Data descriptors are physically stored in a linked list of blocks called data descriptor blocks. The relationship between the data descriptor block to the other components of an HDF file is illustrated in Figure 2a on page 7. The individual components of a data descriptor block are depicted in Figure 2d on page 10. Each data descriptor in a data descriptor block is assumed to be associated with a data element unless it contains the tag DFTAG_NULL (or 0),which indicates that there is no associated data element. By default, a data descriptor block contains 16 (defined as DEF_NDDS) data descriptors. The user may reset this limit when creating the HDF file. Refer to Section 2.3.2 on page 11 for more details.

In addition to data descriptors, each data descriptor block contains a data descriptor header. The data descriptor header contains two fields: block size and next block. The block size field is a 16-bit unsigned integer indicating the number of data descriptors in the data descriptor block. The next block field is a 32-bit unsigned integer indicating the offset of the next data descriptor block, if one exists. The last data descriptor header in the list contains a value of 0 in its next block field.

Figure 2d illustrates the layout of a data descriptor block.

FIGURE 2d - Data Descriptor Block

2.2.4 Grouping Data Objects in an HDF File

Data objects containing related data in HDF files are usually grouped together by the library. These groups of data objects are called data sets. The HDF user uses the application interface to manipulate data sets in a file. As an example, an 8-bit raster image data set requires three objects: a group object identifying the members of the set, an image object containing the image data, and a dimension object indicating the size of the image.

Data objects are individually accessible even if they are included in a set, therefore data objects can belong to more than one set and sets can be included in larger groups. For example, a palette object included in one raster image set may also be a part of another raster image set if its tag and reference number are included in a data descriptor within that second set.

Additional information about data objects, including the options available for storing them, can be found in the HDF Specifications and Developer's Guide v3.2 from the HDF WWW home page at http://hdf.ncsa.uiuc.edu/.



[Top] [Prev] [Next]

hdfhelp@ncsa.uiuc.edu
HDF User's Guide - 05/19/99, NCSA HDF Development Group.