NCSA HDF Specification and Developer’s Guide
National Center for Supercomputing Applications
November 8, 1993

Chapter 3
General Purpose Interface

Chapter Overview

This chapter provides a detailed description of the routines that make up the HDF general purpose interface.

Introduction

HDF supports several interfaces, which can be categorized as high level and general purpose interfaces:

• High level interfaces support utilities and applications.
• General purpose interfaces perform basic operations on HDF files.

These levels are illustrated in Figure 3.1, “HDF Software Layers.”

Figure 3.1. HDF Software Layers

This chapter is concerned only with the general purpose routines. Using these routines, you will be able to build and manipulate HDF objects of any type, including those of your own design. All HDF applications developed at NCSA use them as basic building blocks. The general purpose routines are all written in C but are typically accessible from FORTRAN.

New General Purpose Routines with Version 3.2

The general purpose routines described in this chapter were new with HDF Version 3.2, released in June 1992; they replace the routines provided with earlier versions. The new routines provide better performance and increased functionality, and users are strongly advised to use them in new applications. The old routines are supported through emulation, but may be eliminated from the HDF library in a future release.

The new lower layer incorporates the following improvements:

• More consistent data and function types
• More meaningful and extensive error reporting
• Simplification of key lower level functions
• Simplified techniques to facilitate portability
• Support for alternate forms of physical storage, such as linked blocks storage and storage of the data portion of an object in an external file
• A version tag to indicate which version of the HDF library last changed a file
• Support for simultaneous access to multiple files
• Support for simultaneous access to multiple objects within a single file

The previous lower layer was called the DF layer because all routines began with the letters DF (e.g., DFopen and DFclose). The new lower layer is called the H layer because all routines begin with the letter H (e.g., Hopen, Hclose, and Hwrite). The source modules containing these routines begin with the letter h (see Table 2.1, “HDF Version 3.2 source code modules”):

hfile.c      Basic I/O routines
herr.c       Error-handling routines
hkit.c       General purpose routines
hblocks.c    Routines to support linked block storage
hextelt.c    Routines to support external storage of HDF data elements

Overview of the Interface

This section provides specifications and descriptions of the public functions of the general purpose interface.

Opening and Closing HDF Files

These calls are used to open and close HDF files:

Hopen     Provides an access path to an HDF file and reads all of the DD blocks in the file into memory
Hclose    Closes the access path to a file

Locating Elements for Access and Getting Information

These routines locate elements or acquire other information about an HDF file or its data objects. Except for Hendaccess, they initialize the element that they locate and return an access ID that is used in later references to the data element. Calls can include wildcards so that one can search for unknown tags and reference numbers (tag/refs).

Hstartread         Locates an existing data element with matching tag/ref and returns an access ID for reading it
Hnextread          Continues the search with the same access ID
Hendaccess         Disposes of access ID for tag/ref
Hinquire           Returns access information about a data element
Hishdf             Determines whether a file is an HDF file
Hnumber            Returns the number of occurrences of a specified tag in a file
Hgetlibversion     Returns version information for the current HDF library
Hgetfileversion    Returns version information for an HDF file

Reading and Writing Entire Data Elements

There are two sets of routines for reading and writing data elements. The routines described here are used to store and retrieve entire data elements.

Hputelement    Adds or replaces elements in a file
Hgetelement    Reads data elements in a file

A second set of routines, described in the next section, may be used if you wish to access only part of a data element.

Reading and Writing Part of a Data Element

The second set of routines for reading and writing data elements makes it possible to read or write all or part of a data element. One of the access routines Hstartread or Hstartwrite must be called before Hwrite, Hread, or Hseek:

Hstartwrite    Sets up writing to the object with the supplied tag/ref. If the object exists, it will be modified; otherwise it will be created.
Hwrite         Writes data to a data element where the last Hwrite or Hseek stopped. If the space reserved is less than the length to write, then only as much as can fit is written.
Hread          Reads a portion of a data element. It starts at the last position left by an Hread or Hseek call and reads any data that remains in the element up to a specified number of bytes.
Hseek          Sets the access pointer to an offset within a data element. The next time Hread or Hwrite is called, the access occurs from the new position. The location to seek can be specified as an offset from the current location, from the start of the element, or from the end of the element.
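The positioning rules described above for Hwrite, Hread, and Hseek can be made concrete with a small stand-alone model. The sketch below is not HDF library code: toy_access, toy_seek, toy_write, and toy_read are hypothetical names invented for illustration, and an in-memory buffer stands in for a data element. It assumes that a write is truncated to the reserved space, that a read returns only what remains, and that a seek to a position outside the element fails. It accepts an end-relative origin, as the overview describes.

```c
#include <stddef.h>
#include <string.h>

#define TOY_FAIL (-1)

/* Origin codes for this model only (hypothetical names). */
enum { TOY_START = 0, TOY_CURRENT = 1, TOY_END = 2 };

/* Hypothetical stand-in for an access element: a fixed-size region
   of storage plus a current position. */
typedef struct {
    unsigned char *data;   /* storage reserved for the element */
    long length;           /* reserved length in bytes */
    long posn;             /* current access position */
} toy_access;

/* Move the position; fail if the target lies outside the element. */
long toy_seek(toy_access *a, long offset, int origin)
{
    long base, target;

    switch (origin) {
    case TOY_START:   base = 0;         break;
    case TOY_CURRENT: base = a->posn;   break;
    case TOY_END:     base = a->length; break;
    default:          return TOY_FAIL;
    }
    target = base + offset;
    if (target < 0 || target > a->length)
        return TOY_FAIL;
    a->posn = target;
    return 0;
}

/* Write up to len bytes at the current position. Bytes that do not fit
   in the reserved space are dropped. Returns the count actually written. */
long toy_write(toy_access *a, const void *buf, long len)
{
    long room = a->length - a->posn;

    if (len < 0)
        return TOY_FAIL;
    if (len > room)
        len = room;
    memcpy(a->data + a->posn, buf, (size_t)len);
    a->posn += len;
    return len;
}

/* Read up to len bytes from the current position. Stops at the end of
   the element. Returns the count actually read. */
long toy_read(toy_access *a, void *buf, long len)
{
    long room = a->length - a->posn;

    if (len < 0)
        return TOY_FAIL;
    if (len > room)
        len = room;
    memcpy(buf, a->data + a->posn, (size_t)len);
    a->posn += len;
    return len;
}
```

In this model, writing 10 bytes into an 8-byte element stores only the first 8, and a later read near the end of the element returns fewer bytes than requested rather than failing.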

Manipulating Data Descriptors (DDs)

These routines perform operations on DDs without doing anything with the data to which the DDs refer:

Hdupdd     Generates new references to data that is already referenced from somewhere else
Hdeldd     Deletes a tag/ref from the list of DDs
Hnewref    Returns the next available reference number for the HDF file

Creating Special Data Elements

HDF 3.2 introduces two alternate methods of storing HDF objects: linked blocks and external elements. In previous releases, any data element had to be stored contiguously and all of the objects in an HDF file had to be in the same physical file. The contiguous requirement caused many problems, especially with regard to appending to existing objects. If you wanted to append data to an object, the entire data element had to be deleted and rewritten to the end of the file.

Linked blocks allow elements in a single HDF file to be non-contiguous. External elements allow a single HDF object to be stored in an external file. It is not currently possible to store a single object (such as a very large data set) in multiple files. Nor can multiple objects be stored in one external file.

Once they are created with the following routines, these special data elements can be accessed with the routines used for normal data elements:

HLcreate    Creates a new linked block special data element
HXcreate    Creates a new external file special data element

These routines have two modes of operation. Calling HLcreate with a tag/ref that does not exist in a file will create a new element with the given tag/ref which will be stored as linked blocks. On the other hand, if the tag/ref already exists in the file, the referenced object will be promoted to linked block status. All data which had been stored in the object before the promotion will be retained. HXcreate behaves similarly.

Development Routines

The HDF library provides the following developer-level routines that simplify the task of writing HDF applications.
Most of these routines mirror basic C library functions which are, unfortunately, not always completely portable in their library form:

HDgettagname    Returns a pointer to a text string describing a given tag
HDgetspace      Allocates space
HDfreespace     Frees space
HDstrncpy       Copies a string from one location to another up to a given number of characters

Error Reporting

The HDF library incorporates the notion of an error stack. This allows much of the context to be known when trying to decipher an error message. Error reporting is handled by the following routines:

HEprint     Prints out all of the errors on the error stack to a specified file
HEclear     Clears the error stack
HERROR      Reports an error. Pushes the following information onto the error stack: the error type, the source file name, the line number, and the name of the function reporting the error.
HEreport    Adds a text string to the description of the most recently reported error (only one text string per error)

Standard C does not enable the code inside a function to know the name of the function. Therefore, to use the macro HERROR to report errors, there must exist a variable FUNC which points to a string containing the name of the reporting function.

Other

The Hsync routine has been defined and implemented to synchronize a file with its image in memory. Currently it is not very useful because the HDF software includes no buffering mechanism and the two images are always identical.
Hsync will become useful when buffering is implemented:

Hsync    Synchronizes the stored version of an HDF file with the image in memory

Function Specifications

The terms IN: and OUT: are used as follows in this discussion:

IN:     Value as input parameter
OUT:    Value as output parameter

Opening and Closing Files

Hopen
int32 Hopen(char *path, int access, int16 ndds)
    path      IN:  Name of file to be opened
    access    IN:  DFACC_READ, DFACC_RDWR, DFACC_CREATE, DFACC_ALL, or DFACC_WRITE
    ndds      IN:  Number of DDs in a block if this file needs to be created
Purpose
    Provides an access path to an HDF file and reads all of the DD blocks in the file into primary memory.
Return value
    Returns file ID if successful and FAIL (-1) otherwise.
Description
    Opens an HDF file. The following events occur on successful exit:
    • File_rec members are filled in. (File_rec is an internal HDF structure containing information about the opened file.)
    • The requested file is opened with the relevant permission.
    • Information about DDs is set up in memory.
    • The file headers and initial information are set up for new files.
Access privilege codes
    HDF provides several constants for use as access privilege codes as listed below. Note that these constants are not bit-flags and should not be ORed together to combine access modes. Doing so may cause odd behavior and, in some cases, loss of data:
    Recommended:
    DFACC_READ      Open for read only. If file does not exist, error.
    DFACC_RDWR      Open for read/write. If file does not exist, create it.
    DFACC_CREATE    Force creation. If file exists, delete it, then open a new file for read/write (in the spirit of the UNIX System command clobber).
    Others:
    DFACC_ALL       Same as DFACC_RDWR (obsolete but still supported).
    DFACC_WRITE     Same as DFACC_RDWR (obsolete but still supported).

Hclose
intn Hclose(int32 id)
    id    IN:  The file ID of the file to be closed
Purpose
    Closes the access path to the file.
Return value
    Returns SUCCEED (0) if successful and FAIL (-1) otherwise.
Description
    id is first validated. If valid, the function closes the access path to the file. If there are still access elements attached to the file, the error DFE_OPENAID is pushed onto the error stack and the file is not closed. This is a fairly common error when developing new interfaces. See the discussion of Hendaccess below for debugging hints.

Locating Elements for Access and Getting Information

Hstartread
int32 Hstartread(int32 file_id, uint16 tag, uint16 ref)
    file_id    IN:  ID of file to attach access element to
    tag        IN:  Tag to search for
    ref        IN:  Reference number to search for
Purpose
    Locates an existing data element with matching tag/ref and returns an access ID for reading it.
Return value
    Returns access element ID if successful and FAIL (-1) otherwise.
Description
    Searches the DDs for a particular tag/ref combination. If the search is successful, an access element is created, attached to the file, and positioned at the start of that data element; otherwise an error is returned. Searching on wildcards begins from the beginning of the DD list. Wildcards can be used for the tag or reference number (DFTAG_WILDCARD and DFREF_WILDCARD) and they match any values.

Hnextread
intn Hnextread(int32 access_id, uint16 tag, uint16 ref, int origin)
    access_id    IN:  ID of a READ access element
    tag          IN:  Tag to search for
    ref          IN:  Reference number to search for
    origin       IN:  Position at which to start searching
Purpose
    Locates and positions a read access ID on next occurrence of tag/ref.
Return value
    Returns SUCCEED (0) if successful and FAIL (-1) otherwise.
Description
    Searches for the next DD that fits the tag/ref. Wildcards apply. If origin is DF_START, searches from start of DD list; if origin is DF_CURRENT, searches from current position. Searching from the end of the file via DF_END is not yet implemented. If the search is successful, then the access element is positioned at the start of that tag/ref; otherwise, the access ID is not modified.
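The wildcard search that Hstartread and Hnextread perform can be pictured with a stand-alone model of a DD list. The sketch below is an illustration, not the library's implementation. The names dd_entry, dd_find, and dd_next are hypothetical, the wildcard is assumed to be the value 0 for both tag and ref, and a search from the current position is assumed to resume at the entry after the current one. As in the Hnextread description, a failed search leaves the cursor unchanged.

```c
#include <stddef.h>

#define DD_WILDCARD 0   /* assumed wildcard value for both tag and ref */

/* Search origins for this model only (hypothetical names). */
enum { DD_FROM_START = 0, DD_FROM_CURRENT = 1 };

/* One entry of a toy DD list: just the tag/ref pair. */
typedef struct {
    unsigned short tag;
    unsigned short ref;
} dd_entry;

/* A wildcard in either field matches any value in that field. */
static int dd_matches(const dd_entry *dd, unsigned short tag, unsigned short ref)
{
    return (tag == DD_WILDCARD || dd->tag == tag) &&
           (ref == DD_WILDCARD || dd->ref == ref);
}

/* Return the index of the first matching entry at or after `from`,
   or -1 if there is none. */
int dd_find(const dd_entry *list, int count, int from,
            unsigned short tag, unsigned short ref)
{
    int i;

    for (i = (from < 0 ? 0 : from); i < count; i++)
        if (dd_matches(&list[i], tag, ref))
            return i;
    return -1;
}

/* Advance *cursor to the next match. Searching from the start rescans
   the whole list; searching from the current position resumes after
   *cursor. On failure *cursor is left untouched and -1 is returned. */
int dd_next(const dd_entry *list, int count, int *cursor,
            unsigned short tag, unsigned short ref, int origin)
{
    int from = (origin == DD_FROM_START) ? 0 : *cursor + 1;
    int hit = dd_find(list, count, from, tag, ref);

    if (hit < 0)
        return -1;
    *cursor = hit;
    return 0;
}
```

A caller would typically locate the first match with dd_find and then call dd_next with DD_FROM_CURRENT in a loop until it returns -1, which mirrors the Hstartread and Hnextread pairing.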

Hstartwrite
int32 Hstartwrite(int32 file_id, uint16 tag, uint16 ref, int32 length)
    file_id    IN:  ID of file to write to
    tag        IN:  Tag to write to
    ref        IN:  Reference number to write to
    length     IN:  Length of the data element
Purpose
    Creates or replaces data element with matching tag/ref.
Return value
    Returns access element ID if successful and FAIL (-1) otherwise.
Description
    Sets up an access element to write a data element. The DD list of the file is searched first; if the tag/ref is found, the data element can be modified. If an object with the corresponding tag/ref is not found, a new one is created.

Hendaccess
int32 Hendaccess(int access_id)
    access_id    IN:  ID of access element to dispose of
Purpose
    Disposes of access element for tag/ref.
Return value
    Returns SUCCEED (0) if successful and FAIL (-1) otherwise.
Description
    Disposes of an access element. Only a finite number of access elements can be active at a given time, so it is important to call Hendaccess whenever you are done using an element.
    When developing new interfaces, a common mistake is to fail to call Hendaccess for all of the elements accessed. When this happens, Hclose will return FAIL and the dump of the error stack (see HEprint below) will tell how many access elements are still active. This can be a difficult problem to debug, as the low levels of the HDF library have no idea who or what opened an access element and forgot to release it. A tedious but effective means of debugging this problem is to annotate with comments the locations where the attached count of a file record is changed. This occurs in the files hfile.c, hblocks.c, and hextelt.c.
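The interaction between Hendaccess and Hclose comes down to a counter of attached access elements. The following stand-alone sketch models only that counter. The frec structure and the frec_* functions are hypothetical names made up for illustration, and the real file record in the library holds much more state. It shows why a close attempt fails while any access element is still attached.

```c
#define FREC_SUCCEED 0
#define FREC_FAIL    (-1)

/* Toy file record: an open flag and the count of attached access elements. */
typedef struct {
    int open;     /* nonzero while the file is open */
    int attach;   /* number of access elements still attached */
} frec;

/* Model of starting an access: attaches one more access element. */
int frec_start_access(frec *f)
{
    if (!f->open)
        return FREC_FAIL;
    f->attach++;
    return FREC_SUCCEED;
}

/* Model of ending an access: detaches one access element. */
int frec_end_access(frec *f)
{
    if (f->attach <= 0)
        return FREC_FAIL;
    f->attach--;
    return FREC_SUCCEED;
}

/* Model of closing: refuses to close while anything is still attached,
   leaving the record open so the caller can release what it forgot. */
int frec_close(frec *f)
{
    if (!f->open || f->attach > 0)
        return FREC_FAIL;
    f->open = 0;
    return FREC_SUCCEED;
}
```

Pairing every start with exactly one end before closing keeps the count at zero. When a close fails in this model, the value of attach is the number of releases that are missing.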

Hinquire
intn Hinquire(int32 access_id, int32 *pfile_id, uint16 *ptag, uint16 *pref, int32 *plength, int32 *poffset, int32 *pposn, int *paccess, int16 *pspecial)
    access_id    IN:   Access element ID
    pfile_id     OUT:  File ID
    ptag         OUT:  Tag of the element pointed to
    pref         OUT:  Reference number of the element pointed to
    plength      OUT:  Length of the element pointed to
    poffset      OUT:  Offset of element in the file
    pposn        OUT:  Position pointed to within the data element
    paccess      OUT:  Access type of this access element
    pspecial     OUT:  Special code
Purpose
    Returns access information for a data element.
Return value
    Returns SUCCEED (0) if the access element points to some data element and FAIL (-1) otherwise.
Description
    Inquires for the statistics of the data element pointed to by the access element. If a piece of information is not needed, a NULL can be sent in for that value. Convenience macros for calls to Hinquire (HQueryposition, HQuerylength, etc.) are defined in hdf.h.

Hishdf
int32 Hishdf(char *path)
    path    IN:  Name of file
Purpose
    Determines whether a file is an HDF file.
Return value
    Returns TRUE (non-zero) if file is an HDF file and FALSE (0) otherwise.
Description
    The decision as to whether a file is an HDF file is based solely on the magic number stored in the first four bytes of an HDF file. Hishdf may sometimes identify a file as an HDF file that Hopen is unable to open (e.g., an HDF file with a corrupted DD list).
    Note: Hishdf only determines whether a file is an HDF file. It does not verify that the file is readable.

Hnumber
int Hnumber(int32 file_id, uint16 tag)
    file_id    IN:  File ID
    tag        IN:  Tag to be counted
Purpose
    Counts the number of occurrences of a tag in a file.
Return value
    The number of occurrences of a tag in a file.
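The magic-number test that Hishdf is described as performing can be sketched without the library. The code below is a hedged illustration, not the Hishdf source. The function names looks_like_hdf and file_looks_like_hdf are hypothetical, and the byte values 0x0E 0x03 0x13 0x01 (Control-N, Control-C, Control-S, Control-A) are taken from the HDF file format description rather than from this chapter. Like Hishdf, the check says nothing about whether the rest of the file is readable.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* The four magic bytes at the start of an HDF file (value assumed from
   the HDF file format description). */
static const unsigned char hdf_magic[4] = { 0x0E, 0x03, 0x13, 0x01 };

/* Return 1 if the buffer starts with the magic number, 0 otherwise. */
int looks_like_hdf(const unsigned char *buf, size_t len)
{
    return len >= 4 && memcmp(buf, hdf_magic, 4) == 0;
}

/* Read the first four bytes of a file and apply the same test.
   Returns 0 if the file cannot be opened or is shorter than four bytes. */
int file_looks_like_hdf(const char *path)
{
    unsigned char head[4];
    size_t got;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return 0;
    got = fread(head, 1, sizeof head, fp);
    fclose(fp);
    return looks_like_hdf(head, got);
}
```

Because only the first four bytes are inspected, a file with a damaged DD list would still pass this test, which is the situation the note on Hishdf warns about.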

Hgetlibversion
Hgetlibversion(uint32 *majorv, uint32 *minorv, uint32 *release, char string[])
    majorv     OUT:  Major version number
    minorv     OUT:  Minor version number
    release    OUT:  Release number
    string     OUT:  Informational text string
Purpose
    Gets version information for current HDF library.
Return value
    Returns SUCCEED (0).
Description
    Returns the version of the HDF library. The version information is compiled into the HDF library, so it is not necessary to have any open files for this function to execute.

Hgetfileversion
Hgetfileversion(uint32 file_id, uint32 *majorv, uint32 *minorv, uint32 *release, char *string)
    file_id    IN:   File ID
    majorv     OUT:  Major version number
    minorv     OUT:  Minor version number
    release    OUT:  Release number
    string     OUT:  Informational text string
Purpose
    Gets version information for an HDF file.
Return value
    Returns SUCCEED (0) if successful and FAIL (-1) otherwise.
Description
    Returns the HDF version information stored in the given file. Exactly what the version information of a file should mean is still an open question, so user code should not call this function.

Reading and Writing Entire Data Elements

Hputelement
int Hputelement(int32 file_id, uint16 tag, uint16 ref, uint8 *data, int32 length)
    file_id    IN:  File ID
    tag        IN:  Tag of data element to put
    ref        IN:  Reference number of data element to put
    data       IN:  Pointer to buffer
    length     IN:  Length of data
Purpose
    Adds or replaces an element in a file.
Return value
    Returns SUCCEED (0) if successful and FAIL (-1) otherwise.
Description
    Writes a new data element or replaces an existing data element in an HDF file. Uses Hwrite and its associated routines.

Hgetelement
int Hgetelement(int32 file_id, uint16 tag, uint16 ref, uint8 *data)
    file_id    IN:   ID of the file to read from
    tag        IN:   Tag of data element to read
    ref        IN:   Reference number of data element to read
    data       OUT:  Buffer to read into
Purpose
    Obtains the data referred to by the passed tag/ref.
Return value
    Returns SUCCEED (0) if successful and FAIL (-1) otherwise.
Description
    Reads a data element from an HDF file and puts it into the buffer pointed to by data. The space allocated for the buffer is assumed to be large enough.
    Note: Hgetelement assumes that the buffer is large enough to hold the data being read. It is the user’s responsibility to prevent data loss by ensuring that this is the case.

Reading and Writing Part of a Data Element

Hread
int32 Hread(int32 access_id, int32 length, uint8 *data)
    access_id    IN:   Read access element ID
    length       IN:   Length of segment to read in
    data         OUT:  Pointer to data array to read to
Purpose
    Reads a portion of a data element.
Return value
    Returns length of segment actually read if successful and FAIL (-1) otherwise.
Description
    Reads in the next segment in the data element pointed to by the access element. Hread starts at the last position left by an Hread or Hseek call and reads any data that remains in the element up to length bytes. If the data element is too short (less than length bytes long), Hread reads to the end of the data element.

Hwrite
int32 Hwrite(int32 access_id, int32 length, uint8 *data)
    access_id    IN:  Write access element ID
    length       IN:  Length of segment to write
    data         IN:  Pointer to data to write
Purpose
    Writes next data segment to data element.
Return value
    Returns length of segment successfully written if successful and FAIL (-1) otherwise.
Description
    Writes the data to the data element where the last Hwrite or Hseek stopped. Hwrite starts at the last position left by an Hwrite or Hseek call, writes up to a specified number of bytes, and leaves the write pointer at the end of the data written. If the space reserved is less than the length to write, then only as much as can fit is written.
    It is the user’s responsibility to ensure that no two access elements are writing to the same data element. Note that a user can interlace writes to multiple data elements in the same file.

Hseek
intn Hseek(int32 access_id, int32 offset, int origin)
    access_id    IN:  Access element ID
    offset       IN:  Offset to seek to
    origin       IN:  Position to seek from:
                      DF_START (0)      offset from beginning of data element
                      DF_CURRENT (1)    offset from current position
                      DF_END (2)        offset from end of data element
Purpose
    Sets the access pointer to an offset within a data element. The next time Hread or Hwrite is called, the read or write occurs from the new position.
Return value
    Returns SUCCEED (0) if successful and FAIL (-1) otherwise.
Description
    Sets the position of an access element in a data element so that the next Hread or Hwrite will start from that position. origin determines the position from which offset should be counted.
    This routine fails if the access element is not associated with a data element or if the position sought is outside of the data element. Seeking from the end of a data element is not currently supported.

Manipulating Data Descriptors

Hdupdd
int Hdupdd(int32 file_id, uint16 tag, uint16 ref, uint16 old_tag, uint16 old_ref)
    file_id    IN:  File ID
    tag        IN:  Tag of new data descriptor
    ref        IN:  Reference number of new data descriptor
    old_tag    IN:  Tag of data descriptor to duplicate
    old_ref    IN:  Reference number of data descriptor to duplicate
Purpose
    Generates new references to data that is already referenced from somewhere else.
Return value
    Returns SUCCEED (0) if successful and FAIL (-1) otherwise.
Description
    Duplicates a data descriptor so that the new tag/ref points to the same data element pointed to by the old tag/ref.

Hdeldd
int Hdeldd(int32 file_id, uint16 tag, uint16 ref)
    file_id    IN:  File ID
    tag        IN:  Tag of data descriptor to delete
    ref        IN:  Reference number of data descriptor to delete
Purpose
    Deletes a tag/ref from the list of DDs.
Return value
    Returns SUCCEED (0) if successful and FAIL (-1) otherwise.
Description
    Deletes the data descriptor of tag/ref from the DD list of the file.
    This routine is unsafe and may leave a file in a condition that is not usable by some routines. Use with care.

Hnewref
uint16 Hnewref(int32 file_id)
    file_id    IN:  File ID
Purpose
    Returns the next available reference number.
Return value
    Returns the reference number if successful and 0 otherwise.
Description
    Returns a reference number that can be used with any tag to produce a unique tag/ref. Successive calls to Hnewref will generate a strictly increasing sequence until the highest possible reference number has been returned; then Hnewref will return unused reference numbers starting from 1.

Creating Special Data Elements

HLcreate
int32 HLcreate(int32 file_id, uint16 tag, uint16 ref, int32 block_length, int32 number_blocks)
    file_id          IN:  File ID
    tag              IN:  Tag of new data element (or object)
    ref              IN:  Reference number of new data element (or object)
    block_length     IN:  Length of blocks to be used
    number_blocks    IN:  Number of blocks to use per linked block record
Purpose
    Creates a new linked block special data element.
Return value
    Returns access ID for special data element if successful and FAIL (-1) otherwise.
Description
    Appending to existing HDF elements was a problem prior to HDF Version 3.2 because HDF objects had to be stored contiguously. When appending, the HDF library forced the user to delete the existing element and rewrite it at the end of the file. HDF Version 3.2 introduced the concept of linked blocks, which allow unlimited appending to existing elements without copying over existing data.
    This routine can be used to create an object with the given tag/ref as a linked block element or to promote an existing element to be stored in linked blocks. Initially, a table is set up to accommodate number_blocks linked blocks for the specified data object. Each block has block_length bytes. If an existing object is being promoted, block_length does not have to be the same size as the original element.
    HLcreate returns an active access ID with write permission to the linked block element.

HXcreate
int32 HXcreate(int32 file_id, uint16 tag, uint16 ref, char *extern_file_name)
    file_id             IN:  File record ID
    tag                 IN:  Tag of the special data element to create or promote
    ref                 IN:  Reference number of the special data element to create or promote
    extern_file_name    IN:  Name of the external file to use for the data element
Purpose
    Creates a new external file special data element.
Return value
    Returns access ID for special data element if successful and FAIL (-1) otherwise.
Description
    Creates a new element in an external file or promotes an existing element to be stored in an external file. If an existing element is to be promoted, it is deleted (using Hdeldd) from the original file and copied into the new external file.
    Distributing a single object over multiple external files is not currently supported. In addition, one cannot place multiple objects in the same external file.
    This routine returns an active access ID with write permission to the external element.

Development Routines

HDgettagname
char *HDgettagname(uint16 tag)
    tag    IN:  Tag to look up
Purpose
    Gets a meaningful description of a tag.
Return value
    Returns a pointer to a string describing this tag or NULL if the tag is unknown.
Description
    To reduce the amount of duplicated code, this routine can be used to map a tag to a character string containing the name of the tag. The string returned by this routine is guaranteed to be 30 characters or less.

HDgetspace
void *HDgetspace(uint32 qty)
    qty    IN:  Number of bytes to allocate
Purpose
    Allocates space.
Return value
    If successful, returns a pointer to space that was allocated; otherwise returns NULL.
Description
    Uses an appropriate allocation routine on the local machine to get space.

HDfreespace
void *HDfreespace(void *ptr)
    ptr    IN:  Pointer to previously-allocated space that is to be freed
Purpose
    Frees space.
Return value
    Returns NULL.
Description
    Uses an appropriate routine on the local machine to free space. This routine is platform dependent.

HDstrncpy
char *HDstrncpy(register char *dest, register char *source, int32 length)
    dest      OUT:  Pointer to area to copy string to
    source    IN:   Pointer to area to copy string from
    length    IN:   Maximum number of bytes to copy
Purpose
    Copies a string with maximum length length.
Return value
    Returns address of dest.
Description
    Creates a string in dest that is at most length characters long. The number of characters must include the NULL terminator for historical reasons. Hence, if you are working with the string Foo, you must call this copy function with the value 4 (three characters plus the NULL terminator) in length.

Error Reporting

HEprint
void HEprint(FILE *stream, int32 level)
    stream    IN:  Stream to print error messages on
    level     IN:  Level of the error stack to print
Purpose
    Prints information on the error stack.
Return value
    Has no return value.
Description
    Prints information on reported errors. If level is zero, all of the errors currently on the error stack are printed. Output from this function is sent to the file pointed to by stream.
    The following information is printed:
    • An ASCII description of the error
    • The reporting routine
    • The reporting routine’s source file name
    • The line at which the error was reported
    If the programmer has supplied extra information by means of HEreport, this information is printed as well.

HEclear
void HEclear(void)
Purpose
    Clears all information on reported errors off of the error stack.
Return value
    Has no return value.
Description
    Clears all of the information off of the error stack.

HERROR
void HERROR(int16 number)
    number    IN:  Error number
Purpose
    Reports an error.
Return value
    Has no return value.
Description
    Reports an error. Any function calling HERROR must have a variable FUNC which points to a string containing the name of the function. HERROR is implemented as a macro.

HEreport
void HEreport(char *format, ...)
    format    IN:  printf-style format and arguments
Purpose
    Provides extra information to the error reporting routines.
Return value
    Has no return value.
Description
    Provides further annotation to an error report. Only one such annotation is remembered for each error report. The arguments to this routine follow the style of printf.
    Consider the following example from hfile.c:

        char *FUNC = "Hclose";
        ....
        if (file_rec->attach > 0) {
            file_rec->refcount++;
            HERROR(DFE_OPENAID);
            HEreport("There are still %d active aids attached", file_rec->attach);
            return FAIL;
        }

Other

Hsync
int Hsync(int32 file_id)
    file_id    IN:  ID of the file to synchronize
Purpose
    Synchronizes on-disk HDF file with image in memory.
Return value
    Returns SUCCEED.
Description
    Hsync is not included in the current HDF library release because the on-disk representation of an HDF file is always the same as its in-memory representation. Hsync will be provided when future releases implement buffering schemes.
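Returning to the error stack described under Error Reporting, the following stand-alone sketch models the behavior the text attributes to HERROR, HEreport, HEprint, and HEclear. It is an illustration under stated assumptions, not the contents of herr.c. Every es_* name, the ES_ERROR macro, the stack depth, the buffer size, and the error number 1 are hypothetical. The model keeps one annotation per error (a later annotation replaces the earlier one) and prints the most recent error first, which is a choice made here rather than something the text specifies.

```c
#include <stdarg.h>
#include <stdio.h>

#define ES_MAX  10    /* assumed stack depth */
#define ES_DESC 128   /* assumed size of the annotation buffer */

/* One reported error: number, reporting function, file, line, annotation. */
typedef struct {
    int number;
    const char *func;
    const char *file;
    int line;
    char desc[ES_DESC];
} es_record;

static es_record es_stack[ES_MAX];
static int es_top = 0;

/* Clear the error stack. */
void es_clear(void) { es_top = 0; }

/* Number of errors currently on the stack. */
int es_depth(void) { return es_top; }

/* Push an error record. Silently drops the error if the stack is full. */
void es_push(int number, const char *func, const char *file, int line)
{
    if (es_top < ES_MAX) {
        es_record *r = &es_stack[es_top++];
        r->number = number;
        r->func = func;
        r->file = file;
        r->line = line;
        r->desc[0] = '\0';
    }
}

/* Like the HERROR convention: relies on a variable named FUNC in scope. */
#define ES_ERROR(n) es_push((n), FUNC, __FILE__, __LINE__)

/* Attach a printf-style annotation to the most recent error,
   replacing any annotation already stored for it. */
void es_report(const char *format, ...)
{
    va_list ap;

    if (es_top == 0)
        return;
    va_start(ap, format);
    vsnprintf(es_stack[es_top - 1].desc, ES_DESC, format, ap);
    va_end(ap);
}

/* Annotation of the most recent error, or "" if the stack is empty. */
const char *es_last_desc(void)
{
    return es_top ? es_stack[es_top - 1].desc : "";
}

/* Print every error on the stack, most recent first. */
void es_print(FILE *stream)
{
    int i;

    for (i = es_top - 1; i >= 0; i--) {
        const es_record *r = &es_stack[i];
        fprintf(stream, "error %d in %s (%s line %d)\n",
                r->number, r->func, r->file, r->line);
        if (r->desc[0] != '\0')
            fprintf(stream, "    %s\n", r->desc);
    }
}

/* Example reporter that follows the FUNC convention. */
int es_demo_close(int attached)
{
    static const char *FUNC = "es_demo_close";

    if (attached > 0) {
        ES_ERROR(1);   /* hypothetical error number */
        es_report("There are still %d active aids attached", attached);
        return -1;
    }
    return 0;
}
```

A caller that sees es_demo_close fail could call es_print(stderr) to dump the context and then es_clear() before continuing, which parallels the HEprint and HEclear pairing.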