ABOUT_4.0.alpha

This file was last updated: November 8, 1994

INTRODUCTION

This is a preliminary document describing the differences between
HDF4.0 (Alpha) and HDF3.3r3.  It is written for people who already
use HDF3.3r3 or earlier versions and wish to be HDF4.0 Alpha testers.
Special emphasis is given to changes that might be required in
existing code.

The files ABOUT_3.3r3, ABOUT_3.3r2 and ABOUT_3.3r1, which were
released along with previous releases, contain detailed descriptions
of HDF3.3.  Those files can be found in this directory.

First-time HDF users are encouraged to read the FAQ file in the
directory HDF/ for more information about HDF and where to get HDF
documentation.

If you have any questions or comments, please send them to:
hdfhelp@ncsa.uiuc.edu

Contents

1. Changes in include file names for FORTRAN programs

2. New features supported by HDF4.0
      ANSI C only
      n-bit SDS
      Reading CDF files
      Parallel I/O interface on CM5
      Installing HDF Libraries With CM5 Parallel IO Extension

3. Changes in HDF utilities
      hdp -- HDF dumper
      ristosds
      hdfls
      hdfpack
      hdfunpac

4. Platforms tested

5. Limits of the current release


1. Changes in include file names for FORTRAN programs

In hdf/ there are two files for FORTRAN programs to include, holding
the values and functions defined in HDF.  They were originally named
constant.i and dffunct.i.  The extension .i causes problems on some
machines, since *.i is used by cc for the "intermediate" file produced
by the cpp preprocessor.  In HDF4.0, dffunct.i has been renamed
dffunct.inc, and constant.i has been renamed hdf.inc.  Existing
FORTRAN application programs that include the .i files must make the
corresponding changes in order to compile with HDF4.0.


2. New features supported by HDF4.0

ANSI C only

As previously noted in the HDF newsletters, the next major release of
the HDF library will compile only with ANSI C compilers.  Backward
compatibility will be provided through an ANSI->K&R filter, which will
need to be run on each source file to convert the ANSI style code into
K&R style code.  Currently the entire HDF library has been converted
to ANSI C, but the filter is not yet in place.  Future alpha releases
may have the code filter in place; it will definitely be in place for
the first beta release.

This shift to ANSI C compliance has been accompanied by a large
cleanup of the source code.  An attempt has been made to remove all
warnings and informational messages that the compilers on supported
platforms occasionally emit, but the build may not be completely
clean at all user sites.

n-bit SDS

Support for n-bit integer data has been incorporated into this
release of the HDF library.  The n-bit support is currently available
through the call SDsetnbitdataset; future releases may also provide
high-level access through the DFSD interface.  Access to the data
stored in an n-bit data item is transparent to the calling program.

For example, to store an unsigned 12-bit integer (which is
represented unpacked in memory as an unsigned 16-bit integer), with
no sign extension or bit filling, and which starts at bit 14 (counting
from the right, with bit zero being the lowest), the following setup
and call would be appropriate:

    intn sign_ext  = FALSE;
    intn fill_one  = FALSE;
    intn start_bit = 14;
    intn bit_len   = 12;

    SDsetnbitdataset(sds_id, start_bit, bit_len, sign_ext, fill_one);

Further reads and writes to this dataset would transparently convert
the 16-bit unsigned integers in memory into 12-bit unsigned integers
stored on disk.
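Putting the pieces together, a complete write sequence might look like
the following.  This is a minimal sketch rather than code from the
distribution: the file name, dataset name and data values are
hypothetical, and error checking is omitted for brevity.

    #include "mfhdf.h"

    int main(void)
    {
        int32  dims[1]  = {100};
        int32  start[1] = {0};
        int32  edges[1] = {100};
        uint16 data[100];          /* unpacked in memory as uint16 */
        int32  sd_id, sds_id;
        intn   i;

        /* position the 12 significant bits at bits 3..14 to match
           start_bit = 14, bit_len = 12 */
        for (i = 0; i < 100; i++)
            data[i] = (uint16)((i & 0x0fff) << 3);

        sd_id  = SDstart("nbit.hdf", DFACC_CREATE);
        sds_id = SDcreate(sd_id, "nbit_data", DFNT_UINT16, 1, dims);

        /* store only bits 14..3 of each value on disk */
        SDsetnbitdataset(sds_id, 14, 12, FALSE, FALSE);

        SDwritedata(sds_id, start, NULL, edges, (VOIDP)data);

        SDendaccess(sds_id);
        SDend(sd_id);
        return 0;
    }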
The corresponding FORTRAN function is sfsnbit, which takes the same
parameters in the same order.

A breakdown of the parameters to the SDsetnbitdataset call follows:

int32 sds_id    - The id of a scientific dataset returned by SDcreate
                  or SDselect.

intn start_bit  - This value determines the bit position of the
                  highest end of the n-bit data to write out.  Bits
                  in all number types are counted from the right,
                  starting with 0.  For example, in the bit data
                  "01111011", bits 2 and 7 are set to 0 and all the
                  other bits are set to 1.

intn bit_len    - The number of bits in the n-bit data to write,
                  including the starting bit and counting towards the
                  right (i.e. towards lower bit numbers).  For
                  example, starting at bit 5 and writing 4 bits of the
                  bit data "01111011" would write the bit data "1110"
                  to the dataset on disk.

intn sign_ext   - Whether to use the top bit of the n-bit data to
                  sign-extend to the highest bit in the memory
                  representation of the data.  For example, if 9-bit
                  signed integer data is being extracted from bits
                  17-25 (nt=DFNT_INT32, start_bit=25, bit_len=9) and
                  the bit in position 25 is a 1, then when the data is
                  read back from disk, bits 26-31 will be set to 1;
                  otherwise bit 25 is a 0 and bits 26-31 will be set
                  to 0.  This bit-filling takes higher precedence
                  (i.e. is performed after) the fill_one bit-filling
                  described below.

intn fill_one   - Whether to fill the "background" bits with 1's or
                  0's.  The "background" bits of an n-bit dataset are
                  those bits in the in-memory representation which
                  fall outside the actual n-bit field stored on disk.
                  For example, if 5 bits of an unsigned 16-bit integer
                  (in-memory) dataset located in bits 5-9 are written
                  to disk with the fill_one parameter set to TRUE (or
                  1), then when the data is read back into memory at a
                  future time, bits 0-4 and 10-15 will be set to 1.
                  If the same 5-bit data were written with a fill_one
                  value of FALSE (or 0), then bits 0-4 and 10-15 would
                  be set to 0.  This setting has a lower precedence
                  (i.e. is performed first) than the sign_ext setting.
                  For example, using the sign_ext example above, bits
                  0-16 and 26-31 will first be set to either 1 or 0
                  based on the fill_one parameter, and then bits 26-31
                  will be set to 1 or 0 based on bit 25's value.

Reading CDF files

With HDF4.0, limited support for reading CDF files has been added to
the library.  This support is still somewhat in the development stage
and is therefore limited.  To begin with, unlike the netCDF merger,
the CDF API is not supported.  Rather, the SD and netCDF APIs can be
used to access information pulled out of CDF files.  The files
supported are limited to CDF 2.x files; the header information
differs between CDF 1.x and 2.x files.  In addition, all of the files
must be stored as single-file CDFs in network encoding.  If there is
user demand, and support, the types of CDF files readable may be
increased in the future.
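For example, the datasets pulled out of a CDF 2.x file can be listed
through the multi-file SD interface.  The following is a minimal
sketch, assuming the merged library can open the CDF file directly
through SDstart (as it does for netCDF files); the file name is
hypothetical and error checking is omitted:

    #include <stdio.h>
    #include "mfhdf.h"

    int main(void)
    {
        int32 sd_id, sds_id, rank, nt, nattrs;
        int32 ndatasets, nglobal_attrs;
        int32 dimsizes[MAX_VAR_DIMS];
        char  name[MAX_NC_NAME];
        intn  i;

        sd_id = SDstart("example.cdf", DFACC_RDONLY);
        SDfileinfo(sd_id, &ndatasets, &nglobal_attrs);

        /* list every dataset pulled out of the CDF file */
        for (i = 0; i < ndatasets; i++) {
            sds_id = SDselect(sd_id, i);
            SDgetinfo(sds_id, name, &rank, dimsizes, &nt, &nattrs);
            printf("dataset %d: %s (rank %d)\n", (int)i, name,
                   (int)rank);
            SDendaccess(sds_id);
        }

        SDend(sd_id);
        return 0;
    }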
Parallel I/O interface on CM5

An extension using the parallel IO on the CM5 has been added to the
SDS interface.  Initial tests have yielded about 25 MBytes/second IO
throughput using the SDA (Scalable Disk Array) file system.  The
library provides interfaces for both the C* and CMF programming
languages.  Read the section "Installing HDF Libraries With CM5
Parallel IO Extension" below for specific installation instructions.
Users will find some examples in the directory mfhdf/CM5/Examples.
Please send comments, bug reports, etc. to acheng@ncsa.uiuc.edu.

The parallel I/O interface stores scientific datasets in external
files.  New options have been added to hdfls and hdfpack to handle
them.  A new utility program, hdfunpac, has also been created to
handle external files.  See the man pages for details.

Installing HDF Libraries With CM5 Parallel IO Extension

The current alpha version requires two major steps to install the HDF
libraries (libdf.a and libnetcdf.a).  Work is in progress to make
this simpler in the production release; please bear with us for now.

1) Compile and install the ordinary HDF libraries, include files and
   utilities according to the instructions for a Sun Microsystems
   machine.

2) Make the HDF library with the CM5 parallel IO extension.  There
   are two new libraries, libdfcm5.a and libnetcdfcm5.a, that are
   similar to libdf.a and libnetcdf.a.

   For libdf.a:

       cd hdf
       cp MAKE.CM5 Makefile
       cp src/Makefile.CM5 src/Makefile
       make libdf        # create the parallel IO libdf.a
       # to install it in /usr/local/lib
       cp src/libdf.a /usr/local/lib/libdfcm5.a
       ranlib /usr/local/lib/libdfcm5.a

   For libnetcdf.a:

       cd mfhdf
       # edit CUSTOMIZE to use "gcc" as the CC compiler
       # and add "-DCM5" to the CFLAGS variable.
       ./configure
       (cd libsrc; make)        # compile the library
       # to install it in /usr/local/lib
       cp libsrc/libnetcdf.a /usr/local/lib/libnetcdfcm5.a
       ranlib /usr/local/lib/libnetcdfcm5.a


3. Changes in HDF utilities

hdp -- HDF dumper

    A new utility, hdp, is under development to list the contents of
    HDF files and to dump the data of HDF objects.  A prototype is
    included in HDF4.0 Alpha for users to try out and comment on.
    Development will continue based on users' feedback.  More
    information is contained in HDF/HDF4.0.alpha/mfhdf/dumper/README.

ristosds

    ristosds now converts several raster images into a 3D uint8 SDS,
    instead of a float32 SDS.

hdfls

    New options to recognize external elements.

hdfpack

    New options to pack external elements back into the main file.

hdfunpac

    New utility program to unpack scientific datasets into external
    elements.  Can be used to prepare a file for CM5 parallel IO
    access.


4. HDF4.0 Alpha has been tested on the following machines

   Platform                  'base library'    HDF/netCDF
   ---------------------------------------------------------------
   Sun4/SunOS                      X               X
   Sun4/Solaris                    X               X
   IBM/RS6000                      X               X
   SGI/IRIX4                       X               X
   Convex/ConvexOS *               X               X
   Cray Y-MP/UNICOS                X               X
   Cray/C90                        X               X
   NeXT/NeXTSTEP                   X               X
   HP/UX 9.01                      X               X
   DecStation/MIPSEL               X               X
   IBM PC - MSDOS                  **              ***
   IBM PC - Windows 3.1            **              ***
   IBM PC - Windows NT             X               X
   DEC Alpha/OSF                   X               X
   CM5                             X               X
   Fujitsu VP/UXPM                 X
   Intel i860                      X
   Mac/MacOS
   VMS

   *   When compiling the mfhdf section of the library on a Convex3,
       you will need to set the environment variable MACHINE to 'c3'
       before running the configure script.

   **  There is no FORTRAN support for either PC version of HDF4.0
       Alpha.

   *** The netCDF half of the HDF/netCDF merger is not working
       correctly, but the multi-file SD interface is working
       correctly.


5. Limits of the current release

Sometimes it is important for HDF users to be aware of certain limits
in using HDF files and HDF libraries.  This section is aimed at HDF
application programmers and reflects the upper bounds as of HDF4.0.

Limits that are #define'd are fully capitalized, and the file where
the symbol is defined is given in parentheses at the end of the
sentence.  If the #define's are changed to meet the needs of an
application, it is important to make sure that all other users who
share the HDF library and the HDF files of that application are aware
of the changes.

If a limit has no #define, the size of the maximum storage allocated
for that item is given; changing it would, generally, require a large
amount of modification to the HDF library.  If a limit is listed as a
number type (e.g. int16), it refers to the largest number that can be
represented by that type.  That is:

    int16 -- 32,767
    int32 -- 2,147,483,647
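The limits are listed by interface below.  As a quick check of the
values compiled into a particular installation, a small program can
print the #define'd limits.  This is a minimal sketch, assuming the
macros are visible through mfhdf.h (which pulls in netcdf.h) and
hdf.h, as indicated in the lists that follow:

    #include <stdio.h>
    #include "mfhdf.h"

    int main(void)
    {
        /* SD limits (netcdf.h, included by mfhdf.h) */
        printf("MAX_VAR_DIMS    = %d\n", MAX_VAR_DIMS);
        printf("MAX_NC_ATTRS    = %d\n", MAX_NC_ATTRS);
        printf("MAX_NC_NAME     = %d\n", MAX_NC_NAME);

        /* Vdata limits (hdf.h) */
        printf("VSFIELDMAX      = %d\n", VSFIELDMAX);
        printf("FIELDNAMELENMAX = %d\n", FIELDNAMELENMAX);
        printf("VSNAMELENMAX    = %d\n", VSNAMELENMAX);

        return 0;
    }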
H-Level Limits
--------------
MAX_FILE          files open at a single time (hfile.h)
MAX_ACC           access records open at a single time (hfile.h)
int16             total tags (fixed)
int32             max length and offset of an element in an HDF file
                  (fixed)

Vgroup Limits
-------------
MAX_VFILE         vset files open at a single time (hdf.h)
int16             elements in a Vgroup (fixed)
VGNAMELENMAX      max length of a Vgroup name or class (vg.h)

Vdata Limits
------------
MAX_VFILE         vset files open at a single time (hdf.h)
VSFIELDMAX        fields in a Vdata (hdf.h)
FIELDNAMELENMAX   characters in a single field name (hdf.h)
MAX_ORDER         max field order in a Vdata (hdf.h)
VSNAMELENMAX      max length of a Vdata name or class (hdf.h)
int16             max width in bytes of a Vdata record (fixed)
MAX_FIELD_SIZE    max field width in bytes of a Vdata (hdf.h)

Raster Images
-------------
int32             width or height of a raster image (fixed)

SD Limits
---------
MAX_VAR_DIMS      dimensions per dataset (defined in netcdf.h,
                  included by mfhdf.h)
int32             maximum dimension length (fixed)
MAX_NC_ATTRS      attributes for a given object (defined in netcdf.h,
                  included by mfhdf.h)
MAX_NC_NAME       maximum length of the name of a dataset (defined in
                  netcdf.h, included by mfhdf.h)

Other Conventions / Issues
--------------------------
Some utility programs (e.g. ncgen) expect dataset names to be composed
of only alphanumeric, '-' and '_' characters.
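For code that generates dataset names automatically, it may be worth
normalizing the names before calling SDcreate.  The helper below is a
minimal sketch of one way to do this; the function name is
hypothetical and not part of the library:

    #include <ctype.h>

    /* Replace any character other than letters, digits, '-' and '_'
       with '_', so that utilities such as ncgen can handle the
       dataset name. */
    static void sanitize_name(char *name)
    {
        for (; *name != '\0'; name++)
            if (!isalnum((unsigned char)*name) &&
                *name != '-' && *name != '_')
                *name = '_';
    }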