The HDF Group

HDF User’s Guide

Version 4.2.6




Chapter 16 -- Raw Data Information


16.1 Chapter Overview

In 2011, to support the HDF4 File Content Map Project, HDF 4.2.6 introduced a set of routines that allow applications to access raw data directly by providing the locations and sizes (i.e., offsets and lengths) of the data in an HDF file. The data can be stored in one block or scattered across various locations because of a linked-block or chunked storage scheme. This chapter describes these data information retrieval functions and provides examples of their usage.
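To illustrate what an application can do with such offsets and lengths, the following is a minimal sketch in standard C that gathers scattered blocks from a file into one buffer. The function name read_blocks and its parameters are hypothetical and not part of the HDF library; the sketch also assumes the bytes are wanted exactly as stored, with no decompression or number-type conversion.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Read n_blocks blocks from fp, where block i starts at offsets[i] and is
 * lengths[i] bytes long, and append them in order to out.
 * Returns the total number of bytes copied, or -1 on any error.
 * (Hypothetical helper: not an HDF library routine.)
 */
long read_blocks(FILE *fp, const int32_t *offsets, const int32_t *lengths,
                 size_t n_blocks, unsigned char *out, size_t out_size)
{
    size_t total = 0;

    assert(fp != NULL && out != NULL);
    for (size_t i = 0; i < n_blocks; i++) {
        if (offsets[i] < 0 || lengths[i] < 0)
            return -1;                      /* invalid offset/length pair */
        size_t len = (size_t)lengths[i];
        if (len > out_size - total)
            return -1;                      /* output buffer too small */
        if (fseek(fp, (long)offsets[i], SEEK_SET) != 0)
            return -1;                      /* cannot position at the block */
        if (fread(out + total, 1, len, fp) != len)
            return -1;                      /* short read or I/O error */
        total += len;
    }
    return (long)total;
}
```

A caller would fill offsets and lengths with the values returned by one of the routines described in this chapter and size out to hold the sum of the lengths.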

16.2 The Data Information Retrieval Routines

Several data information retrieval functions are available across the AN, SD, GR, V, and VS interfaces. The prefix of each function's name follows the same rule as other functions in the same interface, and every name contains "datainfo" because the purpose of these functions is data information retrieval. Table 16A lists these routines. Fortran versions of these functions are not currently implemented.

TABLE 16A Raw Data Information Retrieval Routines

Interface  Routine Name (C)     FORTRAN-77   Description and Reference
AN         ANgetdatainfo        unavailable  Retrieves data information of an annotation's data (Section 16.3.1)
SD         SDgetanndatainfo     unavailable  Retrieves data information of a DFSD API annotation's data (Section 16.4.4)
SD         SDgetattdatainfo     unavailable  Retrieves offset and length of an SD API attribute's data (Section 16.4.2)
SD         SDgetdatainfo        unavailable  Retrieves offset and length of a data set's data (Section 16.4.1)
SD         SDgetoldattdatainfo  unavailable  Retrieves offset and length of a DFSD API attribute's data (Section 16.4.3)
GR         GRgetattdatainfo     unavailable  Retrieves offset and length of a GR API attribute's data (Section 16.5.2)
GR         GRgetdatainfo        unavailable  Retrieves offset and length of a raster image's data (Section 16.5.1)
V          Vgetattdatainfo      unavailable  Retrieves offset and length of a V API attribute's data (Section 16.6.1)
VS         VSgetattdatainfo     unavailable  Retrieves offset and length of a VS API attribute's data (Section 16.7.2)
VS         VSgetdatainfo        unavailable  Retrieves offset and length of a vdata or a vdata field's data (Section 16.7.1)

There is no additional header file required for these new functions. As with existing API functions, the header file mfhdf.h must be included in programs that invoke SD interface routines, and hdf.h for non-SD ones.

16.3 Addition to the AN Interface

There is one routine added to the AN API for raw data information retrieval, ANgetdatainfo, and it is described in the following sub-section.

16.3.1 Retrieving Data Information of an Annotation: ANgetdatainfo

ANgetdatainfo retrieves the offset and length that locate the data of a specified annotation. The syntax of ANgetdatainfo is as follows:

C: intn ANgetdatainfo(int32 ann_id, int32 *offset, int32 *length);

The annotation is specified by its identifier, ann_id. The offset and length are retrieved into the user-supplied buffers offset and length. Note that an annotation's data is stored in one contiguous block only.

ANgetdatainfo returns SUCCEED (or 0), if successful, or FAIL (or -1), otherwise. The parameters of ANgetdatainfo are specified in Table 16B.

TABLE 16B ANgetdatainfo Parameter List

Routine Name [Return Type] (FORTRAN-77): ANgetdatainfo [intn] (unavailable)

Parameter  C Type   FORTRAN-77 Type  Description
ann_id     int32    N/A              Annotation identifier
offset     int32 *  N/A              Buffer for offset of annotation's data
length     int32 *  N/A              Buffer for length of annotation's data

16.4 Addition to the SD Interface

Four functions were added to the SD API for raw data information retrieval: SDgetdatainfo, SDgetattdatainfo, SDgetoldattdatainfo, and SDgetanndatainfo. These functions are described in the following sub-sections.

16.4.1 Retrieving Data Information of an SDS: SDgetdatainfo

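SDgetdatainfo retrieves the offsets and lengths of the data blocks in a specified data set. The syntax of SDgetdatainfo is as follows:

C: intn SDgetdatainfo(int32 sds_id, int32 *origin, uintn start_block, uintn info_count, int32 *offsetarray, int32 *lengtharray);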

The offsets and lengths are retrieved into the user-supplied lists offsetarray and lengtharray.

The parameter origin must be NULL when the data is not stored in a chunked layout. When the data is chunked, SDgetdatainfo can be called on a single chunk, and origin specifies the coordinates of that chunk.

The parameter info_count specifies the maximum number of items the offset and length lists are allocated to hold. Applications, however, can pass in 0 for info_count and NULL for these arrays when only the actual number of data blocks in the data set is desired.

The parameter start_block was intended to let retrieval start at an arbitrary block in the data. Applications would start retrieving at the beginning of the data by specifying start_block as 0, or at a later block by specifying start_block as a value between 1 and the number of blocks in the data. In release 4.2.6, however, start_block has no effect except for contiguous data, in which case SDgetdatainfo fails when start_block is greater than 1. The supporting project did not need this feature. Until the feature is supported, applications should pass 0 for start_block; retrieval then starts at the beginning of the data and returns up to info_count blocks or the total number of data blocks, whichever is smaller.
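The count-then-fill calling pattern described above can be sketched in standard C as follows. The names datainfo_fn, block_list, fetch_all_blocks, and demo_query are hypothetical and exist only for this illustration; in a real program the callback would wrap a call such as SDgetdatainfo(sds_id, NULL, start_block, info_count, offsets, lengths). The block values in demo_query are made-up demonstration data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical callback type: same count/fill contract as the *getdatainfo
 * routines. Returns the number of blocks, or -1 on failure. */
typedef int (*datainfo_fn)(void *ctx, unsigned start_block,
                           unsigned info_count,
                           int32_t *offsets, int32_t *lengths);

typedef struct {
    int      count;     /* number of offset/length pairs retrieved */
    int32_t *offsets;
    int32_t *lengths;
} block_list;

/* Query the block count, allocate the lists, then retrieve all pairs.
 * Returns 0 on success, -1 on failure. The caller frees both arrays. */
int fetch_all_blocks(datainfo_fn query, void *ctx, block_list *out)
{
    assert(query != NULL && out != NULL);
    out->count = 0;
    out->offsets = NULL;
    out->lengths = NULL;

    /* First call: 0 and NULL request only the number of data blocks. */
    int n = query(ctx, 0, 0, NULL, NULL);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;

    out->offsets = malloc((size_t)n * sizeof *out->offsets);
    out->lengths = malloc((size_t)n * sizeof *out->lengths);
    if (out->offsets == NULL || out->lengths == NULL) {
        free(out->offsets);
        free(out->lengths);
        out->offsets = out->lengths = NULL;
        return -1;
    }

    /* Second call: start_block stays 0, info_count is the list capacity. */
    int got = query(ctx, 0, (unsigned)n, out->offsets, out->lengths);
    if (got < 0) {
        free(out->offsets);
        free(out->lengths);
        out->offsets = out->lengths = NULL;
        return -1;
    }
    out->count = got;
    return 0;
}

/* Stand-in for a real wrapper: reports three made-up blocks. */
int demo_query(void *ctx, unsigned start_block, unsigned info_count,
               int32_t *offsets, int32_t *lengths)
{
    static const int32_t offs[3] = {294, 1024, 4096};
    static const int32_t lens[3] = {64, 64, 32};

    (void)ctx;
    (void)start_block;
    if (info_count == 0 || offsets == NULL || lengths == NULL)
        return 3;                           /* count-only request */
    unsigned n = info_count < 3 ? info_count : 3;
    for (unsigned i = 0; i < n; i++) {
        offsets[i] = offs[i];
        lengths[i] = lens[i];
    }
    return (int)n;
}
```

Because the GR and VS routines described later in this chapter take the same start_block and info_count parameters, one helper of this shape could serve all three interfaces through different callbacks.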

SDgetdatainfo returns the number of offset/length pairs retrieved, if successful, or FAIL (or -1), otherwise. The parameters of SDgetdatainfo are specified in Table 16C.

TABLE 16C SDgetdatainfo Parameter List

Routine Name [Return Type] (FORTRAN-77): SDgetdatainfo [intn] (unavailable)

Parameter    C Type   FORTRAN-77 Type  Description
sds_id       int32    N/A              Data set identifier
origin       int32 *  N/A              Coordinates of the origin of the chunk to be read
start_block  uintn    N/A              Indicating where to start reading offsets
info_count   uintn    N/A              Length of the offset and length lists
offsetarray  int32 *  N/A              Array to hold offsets of the data blocks
lengtharray  int32 *  N/A              Array to hold lengths of the data blocks

16.4.2 Retrieving Data Information of an Attribute: SDgetattdatainfo

SDgetattdatainfo retrieves the offset and length of the data in a specified attribute. The syntax of SDgetattdatainfo is as follows:

C: intn SDgetattdatainfo(int32 id, int32 attr_index, int32 *offset, int32 *length);

The attribute is specified by its index and can be one that belongs to an SD file, a data set, or a dimension. The offset and length are retrieved into the user-supplied buffers offset and length. Note that an attribute's data is stored in one contiguous block only.

Attributes can be created either by SDsetattr or by the DFSD API functions. Refer to Appendix C, "Attributes in HDF," for more details. SDgetattdatainfo can only retrieve data information of attributes that were created by SDsetattr. If the inquired attribute was created by the DFSD API functions, SDgetattdatainfo fails with the error code DFE_NOVGREP, and the caller can then call SDgetoldattdatainfo to get the attribute's data information.

SDgetattdatainfo returns the number of offset/length pairs retrieved, which should be 1, if successful, or FAIL (or -1), otherwise. The parameters of SDgetattdatainfo are specified in Table 16D.

TABLE 16D SDgetattdatainfo Parameter List

Routine Name [Return Type] (FORTRAN-77): SDgetattdatainfo [intn] (unavailable)

Parameter   C Type   FORTRAN-77 Type  Description
id          int32    N/A              SD, SDS, or dimension identifier
attr_index  int32    N/A              Index of the attribute being inquired
offset      int32 *  N/A              Buffer for offset of attribute's data
length      int32 *  N/A              Buffer for length of attribute's data

16.4.3 Retrieving Data Information of a DFSD API Attribute: SDgetoldattdatainfo

SDgetoldattdatainfo retrieves the offset and length of the data in a specified attribute that was created by the DFSD API routines. Attributes created in this manner are not stored as vdatas like those created by SDsetattr. This type of attribute is often seen in older files, circa 1993. However, later files may still contain such attributes if they were written with the DFSD API routines. In addition, this type of attribute can only be predefined; there are no user-defined attributes in the DFSD API.

SDgetoldattdatainfo only works on DFSD-created attributes, while its counterpart SDgetattdatainfo only works on attributes created with SDsetattr. An application might call SDgetattdatainfo first. When a DFSD-created attribute is encountered, SDgetattdatainfo fails with the error code DFE_NOVGREP, which means that there is no vgroup representation for the SDS and that the SDS's attributes are stored differently from those created with SDsetattr. When this error code is detected, the application must call SDgetoldattdatainfo to get the data information of those attributes. For further information about this attribute issue, refer to Appendix C, "Attributes in HDF," in this document. The syntax of SDgetoldattdatainfo is as follows:

C: intn SDgetoldattdatainfo(int32 dim_id, int32 sds_id, char *attr_name, int32 *offset, int32 *length);

SDgetoldattdatainfo takes both an SDS identifier and a dimension identifier when the inquired attribute belongs to a dimension. When the inquired attribute belongs to an SDS, the dimension identifier is not needed and should be 0.

The attribute can be one that belongs to a data set or a dimension and is specified by its name, which can be one of the predefined names in Table 16E. The offset and length are retrieved into the user-supplied buffers offset and length. Note that an attribute's data is stored in one contiguous block only.

TABLE 16E HDF4 Predefined Attributes

Predefined Name      Actual Text         Applicable To
_HDF_LongName        "long_name"         Dimension & SDS
_HDF_Units           "units"             Dimension & SDS
_HDF_Format          "format"            Dimension & SDS
_HDF_CoordSys        "coordsys"          Only SDS
_HDF_ScaleFactorErr  "scale_factor_err"  Only SDS
_HDF_AddOffset       "add_offset"        Only SDS
_HDF_ValidRange      "valid_range"       Only SDS
_HDF_ScaleFactor     "scale_factor"      Only SDS
_HDF_AddOffsetErr    "add_offset_err"    Only SDS
_HDF_CalibratedNt    "calibrated_nt"     Only SDS
_HDF_ValidMax        "valid_max"         Only SDS
_HDF_ValidMin        "valid_min"         Only SDS
_FillValue           "_FillValue"        Only SDS
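
An application that wants to check an attribute name before calling SDgetoldattdatainfo could encode Table 16E as a small lookup, as in the sketch below. The names predefined_attr, PREDEFINED, and predefined_attr_applies are hypothetical and not part of the HDF library; the strings are the "Actual Text" values from the table, and real programs would normally use the predefined macros from the HDF headers instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One row of Table 16E: the attribute text and whether it may also be
 * attached to a dimension (every entry applies to an SDS). */
typedef struct {
    const char *text;
    bool        dim_ok;
} predefined_attr;

static const predefined_attr PREDEFINED[] = {
    {"long_name", true},         {"units", true},
    {"format", true},            {"coordsys", false},
    {"scale_factor_err", false}, {"add_offset", false},
    {"valid_range", false},      {"scale_factor", false},
    {"add_offset_err", false},   {"calibrated_nt", false},
    {"valid_max", false},        {"valid_min", false},
    {"_FillValue", false},
};

/* Returns true if name is a predefined attribute that applies to the given
 * kind of object: a dimension when is_dimension is true, otherwise an SDS. */
bool predefined_attr_applies(const char *name, bool is_dimension)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof PREDEFINED / sizeof PREDEFINED[0]; i++) {
        if (strcmp(PREDEFINED[i].text, name) == 0)
            return is_dimension ? PREDEFINED[i].dim_ok : true;
    }
    return false;   /* not a predefined name */
}
```

For instance, "units" passes for both object kinds, while "valid_range" passes only when is_dimension is false.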

SDgetoldattdatainfo returns the number of offset/length pairs retrieved, which should be 1, if successful, or FAIL (or -1), otherwise. The parameters of SDgetoldattdatainfo are specified in Table 16F.

TABLE 16F SDgetoldattdatainfo Parameter List

Routine Name [Return Type] (FORTRAN-77): SDgetoldattdatainfo [intn] (unavailable)

Parameter  C Type   FORTRAN-77 Type  Description
dim_id     int32    N/A              Dimension identifier
sds_id     int32    N/A              SDS identifier
attr_name  char *   N/A              Name of the attribute being inquired
offset     int32 *  N/A              Buffer for offset of attribute's data
length     int32 *  N/A              Buffer for length of attribute's data

16.4.4 Retrieving Data Information of an SDS Annotation: SDgetanndatainfo

SDgetanndatainfo retrieves the offsets and lengths of the data belonging to the annotations of a given type. These annotations were created with the DFAN API. The syntax of SDgetanndatainfo is as follows:

C: intn SDgetanndatainfo(int32 id, ann_type annotype, uintn size, int32 *offsetarray, int32 *lengtharray);

The parameter id can be an SD or SDS identifier. However, when id is an SD identifier, the annotation's type must be either AN_FILE_LABEL (or 2) or AN_FILE_DESC (or 3), and when it is an SDS identifier, the type must be AN_DATA_LABEL (or 0) or AN_DATA_DESC (or 1). The offsets and lengths of the specified annotations are retrieved into the user-supplied buffers offsetarray and lengtharray. Note that an annotation's data is stored in one contiguous block only, but there can be more than one annotation of the specified type. The parameter size specifies the number of elements that offsetarray and lengtharray can hold.
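
The pairing rule between identifier kind and annotation type can be written down as a small check. The sketch below uses local, hypothetical enums (demo_ann_type, demo_id_kind) that only mirror the numeric values quoted above; a real program would use the ann_type values from the HDF headers, and would know from its own bookkeeping whether an identifier came from SDstart or SDselect.

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of the annotation type values quoted in the text. */
typedef enum {
    DEMO_AN_DATA_LABEL = 0,
    DEMO_AN_DATA_DESC  = 1,
    DEMO_AN_FILE_LABEL = 2,
    DEMO_AN_FILE_DESC  = 3
} demo_ann_type;

/* Kind of identifier the caller intends to pass as id. */
typedef enum { DEMO_ID_SD, DEMO_ID_SDS } demo_id_kind;

/* Returns true when the annotation type is allowed for the identifier kind:
 * file labels/descriptions for an SD id, data labels/descriptions for an
 * SDS id. */
bool annotype_matches_id(demo_id_kind kind, demo_ann_type type)
{
    assert(type <= DEMO_AN_FILE_DESC);
    switch (kind) {
    case DEMO_ID_SD:
        return type == DEMO_AN_FILE_LABEL || type == DEMO_AN_FILE_DESC;
    case DEMO_ID_SDS:
        return type == DEMO_AN_DATA_LABEL || type == DEMO_AN_DATA_DESC;
    }
    return false;
}
```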

SDgetanndatainfo returns the number of offset/length pairs retrieved, if successful, or FAIL (or -1), otherwise. The parameters of SDgetanndatainfo are specified in Table 16G.

TABLE 16G SDgetanndatainfo Parameter List

Routine Name [Return Type] (FORTRAN-77): SDgetanndatainfo [intn] (unavailable)

Parameter    C Type    FORTRAN-77 Type  Description
id           int32     N/A              SD or SDS identifier
annotype     ann_type  N/A              Type of annotations to retrieve data info
size         uintn     N/A              Length of the offset and length arrays
offsetarray  int32 *   N/A              Buffer for offset of annotations' data
lengtharray  int32 *   N/A              Buffer for length of annotations' data

16.5 Addition to the GR Interface

There are two routines added to the GR API for raw data information retrieval, GRgetdatainfo and GRgetattdatainfo, and they are described in the following sub-sections.

16.5.1 Retrieving Data Information of a Raster Image: GRgetdatainfo

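GRgetdatainfo retrieves the offsets and lengths of the data blocks in a specified raster image. The syntax of GRgetdatainfo is as follows:

C: intn GRgetdatainfo(int32 ri_id, uintn start_block, uintn info_count, int32 *offsetarray, int32 *lengtharray);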

The offsets and lengths are retrieved into the user-supplied lists offsetarray and lengtharray.

The parameter info_count specifies the maximum number of items that the offset and length lists are allocated to hold. Applications, however, can pass in 0 for info_count and NULL for these arrays when only the actual number of data blocks in the raster image is desired.

The parameter start_block was intended to let retrieval start at an arbitrary block in the data. Applications would start retrieving at the beginning of the data by specifying start_block as 0, or at a later block by specifying start_block as a value between 1 and the number of blocks in the data. In release 4.2.6, however, start_block has no effect except for contiguous data, in which case GRgetdatainfo fails when start_block is greater than 1. The supporting project did not need this feature. Until the feature is supported, applications should pass 0 for start_block; retrieval then starts at the beginning of the data and returns up to info_count blocks or the total number of data blocks, whichever is smaller.

GRgetdatainfo returns the number of offset/length pairs retrieved, if successful, or FAIL (or -1), otherwise. The parameters of GRgetdatainfo are specified in Table 16H.

TABLE 16H GRgetdatainfo Parameter List

Routine Name [Return Type] (FORTRAN-77): GRgetdatainfo [intn] (unavailable)

Parameter    C Type   FORTRAN-77 Type  Description
ri_id        int32    N/A              Raster image identifier
start_block  uintn    N/A              Indicating where to start reading offsets
info_count   uintn    N/A              Length of the offset and length lists
offsetarray  int32 *  N/A              Array to hold offsets of the data blocks
lengtharray  int32 *  N/A              Array to hold lengths of the data blocks

16.5.2 Retrieving Data Information of a GR API Attribute: GRgetattdatainfo

GRgetattdatainfo retrieves the offset and length of the data in a specified attribute. The syntax of GRgetattdatainfo is as follows:

C: intn GRgetattdatainfo(int32 id, int32 attr_index, int32 *offset, int32 *length);

The attribute is specified by its index and can be one that belongs to a GR file or a raster image. The offset and length are retrieved into the user-supplied buffers offset and length. Note that an attribute's data is stored in one contiguous block only.

GRgetattdatainfo returns the number of offset/length pairs retrieved, which should be 1, if successful, or FAIL (or -1), otherwise. The parameters of GRgetattdatainfo are specified in Table 16I.

TABLE 16I GRgetattdatainfo Parameter List

Routine Name [Return Type] (FORTRAN-77): GRgetattdatainfo [intn] (unavailable)

Parameter   C Type   FORTRAN-77 Type  Description
id          int32    N/A              GR or raster image identifier
attr_index  int32    N/A              Index of the attribute being inquired
offset      int32 *  N/A              Buffer for offset of attribute's data
length      int32 *  N/A              Buffer for length of attribute's data

16.6 Addition to the V Interface

There is one routine added to the V API for raw data information retrieval, Vgetattdatainfo, and it is described in the following sub-section.

16.6.1 Retrieving Data Information of a V API Attribute: Vgetattdatainfo

Vgetattdatainfo retrieves the offset and length that locate the data of a specified attribute. The syntax of Vgetattdatainfo is as follows:

C: intn Vgetattdatainfo(int32 vgroup_id, intn attr_index, int32 *offset, int32 *length);

The attribute is specified by its index, attr_index, within the vgroup identified by vgroup_id. The offset and length are retrieved into the user-supplied buffers offset and length. Note that an attribute's data is stored in one contiguous block only.

There are two types of attributes for vgroups: those created by Vsetattr (new style) and those created by non-Vsetattr approaches (old style). Refer to the section about Vnattrs and Vnattrs2 and to the appendix "Attributes in HDF" in this guide for details. Vgetattdatainfo can access both types of attributes. However, an application must use Vnattrs2 instead of Vnattrs to get the number of attributes in order to include both types. Note that, when a vgroup has both types of attributes, the old-style attributes precede the new ones, regardless of when they were created. The best way to access these attributes is through a loop.

Vgetattdatainfo returns the number of data blocks, which should be 1, if successful, or FAIL (or -1), otherwise. The parameters of Vgetattdatainfo are specified in Table 16J.

TABLE 16J Vgetattdatainfo Parameter List

Routine Name [Return Type] (FORTRAN-77): Vgetattdatainfo [intn] (unavailable)

Parameter   C Type   FORTRAN-77 Type  Description
vgroup_id   int32    N/A              Vgroup identifier
attr_index  intn     N/A              Index of the inquired attribute
offset      int32 *  N/A              Buffer for offset of attribute's data
length      int32 *  N/A              Buffer for length of attribute's data

16.7 Addition to the VS Interface

There are two routines added to the VS API for raw data information retrieval, VSgetdatainfo and VSgetattdatainfo, and they are described in the following sub-sections.

16.7.1 Retrieving Data Information of a Vdata: VSgetdatainfo

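VSgetdatainfo retrieves the offsets and lengths of the data blocks in a specified vdata. The syntax of VSgetdatainfo is as follows:

C: intn VSgetdatainfo(int32 vdata_id, uintn start_block, uintn info_count, int32 *offsetarray, int32 *lengtharray);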

The offsets and lengths are retrieved into the user-supplied lists offsetarray and lengtharray.

The parameter info_count specifies the maximum number of items that the offset and length lists are allocated to hold. Applications, however, can pass in 0 for info_count and NULL for these arrays when only the actual number of data blocks in the vdata is desired.

The parameter start_block was intended to let retrieval start at an arbitrary block in the data. Applications would start retrieving at the beginning of the data by specifying start_block as 0, or at a later block by specifying start_block as a value between 1 and the number of blocks in the data. In release 4.2.6, however, start_block has no effect except for contiguous data, in which case VSgetdatainfo fails when start_block is greater than 1. The supporting project did not need this feature. Until the feature is supported, applications should pass 0 for start_block; retrieval then starts at the beginning of the data and returns up to info_count blocks or the total number of data blocks, whichever is smaller.

VSgetdatainfo returns the number of offset/length pairs retrieved, if successful, or FAIL (or -1), otherwise. The parameters of VSgetdatainfo are specified in Table 16K.

TABLE 16K VSgetdatainfo Parameter List

Routine Name [Return Type] (FORTRAN-77): VSgetdatainfo [intn] (unavailable)

Parameter    C Type   FORTRAN-77 Type  Description
vdata_id     int32    N/A              Vdata identifier
start_block  uintn    N/A              Indicating where to start reading offsets
info_count   uintn    N/A              Length of the offset and length lists
offsetarray  int32 *  N/A              Array to hold offsets of the data blocks
lengtharray  int32 *  N/A              Array to hold lengths of the data blocks

16.7.2 Retrieving Data Information of a VS API Attribute: VSgetattdatainfo

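VSgetattdatainfo retrieves the offset and length of the data in a specified attribute. The syntax of VSgetattdatainfo is as follows:

C: intn VSgetattdatainfo(int32 vdata_id, int32 findex, int32 attr_index, int32 *offset, int32 *length);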

The attribute is specified by its index, attr_index, and can be one that belongs to a vdata or to a field of the vdata. If findex is _HDF_VDATA (or -1), the attribute is associated with the vdata. If findex is the index of a vdata field, the attribute is associated with that field. The parameter attr_index specifies the attribute's index within the vdata's or the field's attribute list. Thus, its value must be between 0 and one less than the number of attributes in the associated list.
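
The constraints on findex and attr_index can be checked before the call, as in the following sketch. The function vs_attr_request_valid and the macro DEMO_HDF_VDATA are hypothetical; the macro only mirrors the value -1 that the text gives for _HDF_VDATA, and the counts n_fields and n_attrs are assumed to have been obtained by the caller from the VS interface beforehand.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_HDF_VDATA (-1)   /* mirrors the value given for _HDF_VDATA */

/*
 * Returns true when (findex, attr_index) is a plausible request:
 *   - findex is DEMO_HDF_VDATA (attribute of the vdata itself) or a
 *     zero-based field index below n_fields;
 *   - attr_index is a zero-based index below n_attrs, the number of
 *     attributes in the list selected by findex.
 */
bool vs_attr_request_valid(int findex, int n_fields,
                           int attr_index, int n_attrs)
{
    assert(n_fields >= 0 && n_attrs >= 0);
    bool target_ok = (findex == DEMO_HDF_VDATA) ||
                     (findex >= 0 && findex < n_fields);
    bool index_ok = attr_index >= 0 && attr_index < n_attrs;
    return target_ok && index_ok;
}
```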

The offset and length are retrieved into the user-supplied buffers offset and length. Note that an attribute's data is stored in one contiguous block only.

VSgetattdatainfo returns the number of offset/length pairs retrieved, which should be 1, if successful, or FAIL (or -1), otherwise. The parameters of VSgetattdatainfo are specified in Table 16L.

TABLE 16L VSgetattdatainfo Parameter List

Routine Name [Return Type] (FORTRAN-77): VSgetattdatainfo [intn] (unavailable)

Parameter   C Type   FORTRAN-77 Type  Description
vdata_id    int32    N/A              Vdata identifier
findex      int32    N/A              Vdata's field index or _HDF_VDATA
attr_index  int32    N/A              Index of the attribute being inquired
offset      int32 *  N/A              Buffer for offset of attribute's data
length      int32 *  N/A              Buffer for length of attribute's data

EXAMPLE 1. Getting Data Information of SDS.

This example demonstrates the use of the routine SDgetdatainfo with simple, contiguous data in a data set.

C:
#include <fcntl.h>    /* open */
#include <unistd.h>   /* lseek, read, close */
#include "mfhdf.h"
#define SIMPLE_FILE  "datainfo_simple.hdf"   /* data file previously written */
int main(void)
{ 
    /*********************** Variable Declaration **************************/ 
    int32 sd_id, sds_id; 
    int32 offset, length; 
    uintn info_count = 0; 
    intn status; 
    /* 
     * Open the file for reading. 
     */ 
    sd_id = SDstart(SIMPLE_FILE, DFACC_READ); 
    /*********************************************************************** 
     Read data info for later accessing data without the use of HDF4 library 
     ***********************************************************************/ 
    /* 
     * Open the second dataset, get the number of data block, which is 1, then 
     * retrieve and record the offset/length 
     */ 
    sds_id = SDselect(sd_id, 1); 
    /* 
     * Passing in 0 for the info count and NULL for the offset and length 
     * arrays to get the number of data blocks in the data set.  Note that 
     * the second parameter is for chunk coordinates and because this data 
     * set is not chunked, NULL should be passed in.  The third parameter 
     * indicates to start retrieval at the beginning of the data. 
     */ 
    info_count = SDgetdatainfo(sds_id, NULL, 0, 0, NULL, NULL); 
    /* 
     * Call SDgetdatainfo again to retrieve the offset and length of the 
     * data block.  The info count is now 1 to specify the number of elements 
     * in the offset and length arrays. 
     */ 
    status = SDgetdatainfo(sds_id, NULL, 0, info_count, &offset, &length); 
    /* 
     * Terminate access to the data set. 
     */ 
    status = SDendaccess(sds_id); 
    /* 
     * Close the file. 
     */ 
    status = SDend(sd_id); 
    /****************************************************************** 
     Read data using previously obtained data info without HDF4 library 
     ******************************************************************/ 
    /* Open file and read in data without using SD API */ 
    { 
        int fd; /* for open */
        int32 ret32; /* for DFKconvert */
        ssize_t readlen = 0; /* for read */
        int32 *readibuf, *readibuf_swapped;
        /*
         * Number of 32-bit integers in the block, derived from its length.
         */
        int32 n_values = length / (int32)sizeof(int32);
        /*
         * Open the file for reading without SD API.
         */
        fd = open(SIMPLE_FILE, O_RDONLY);
        /*
         * Forward to the position of the data.
         */
        lseek(fd, (off_t)offset, SEEK_SET);
        /*
         * Allocate buffers for SDS' data.
         */
        readibuf = (int32 *) HDmalloc(n_values * sizeof(int32));
        readibuf_swapped = (int32 *) HDmalloc(n_values * sizeof(int32));
        /*
         * Read in this block of data.
         */
        readlen = read(fd, (VOIDP) readibuf, (size_t)length);
        /*
         * Convert data back to format on local machine.
         */
        ret32 = DFKconvert(readibuf, readibuf_swapped, DFNT_INT32,
                           n_values, DFACC_WRITE, 0, 0);
        /* 
         * Free resources. 
         */ 
        HDfree (readibuf_swapped); 
        HDfree (readibuf); 
        /* 
         * Close the file. 
         */ 
        close(fd); 
    } 
    return 0;
}
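
The DFKconvert call in Example 1 still relies on the HDF library for the final conversion step. For an application that must avoid the library entirely, the conversion can be sketched in portable C as shown below. This assumes the data set holds DFNT_INT32 values stored in big-endian byte order (HDF's standard layout for that number type); data written with a native or little-endian number type would need different handling. The function name decode_be_int32 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode n big-endian 32-bit integers from the raw bytes in raw into dst.
 * Byte shifts are used instead of pointer casts, so the result does not
 * depend on the byte order of the machine running the code.
 */
void decode_be_int32(const unsigned char *raw, size_t n, int32_t *dst)
{
    assert(raw != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        const unsigned char *p = raw + 4 * i;
        uint32_t u = ((uint32_t)p[0] << 24) |
                     ((uint32_t)p[1] << 16) |
                     ((uint32_t)p[2] << 8)  |
                      (uint32_t)p[3];
        /* Conversion of values above INT32_MAX is implementation-defined in
         * C; mainstream two's-complement compilers yield the expected
         * negative number. */
        dst[i] = (int32_t)u;
    }
}
```

In Example 1, raw would be the buffer filled by read, and n would be the block length in bytes divided by 4.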

HDF 4.2.6 - August 2011