ABOUT HDF4.0 Release 1
February 7, 1996
INTRODUCTION
This document describes the differences between HDF4.0r1 and
HDF3.3r4. It is written for people who are familiar with
previous releases of HDF and wish to migrate to HDF4.0r1.
The documentation and release notes provide more in-depth
information concerning the topics discussed here. The HDF 4.0
documentation can be found on the NCSA ftp server in the
directory /HDF/Documentation/HDF4.0/Users_Guide. For more
history behind the implementation of the items listed here,
refer to the ABOUT_4.0.alpha, ABOUT_4.0b1 and ABOUT_4.0b2 files.
First-time HDF users are encouraged to read the FAQ in this
release for more information about HDF. Users can also look
at the home page for HDF at:
http://hdf.ncsa.uiuc.edu/
If you have any questions or comments, please send them to:
hdfhelp@ncsa.uiuc.edu
CONTENTS
- Important Changes (that will affect you)
- New Features and Changes
- Changes in Utilities
- Known Problems
Important Changes:
-----------------
1. Several changes have been made to the libraries in HDF4.0
which affect the way that users compile and link their
programs:
* The mfhdf library, named libnetcdf.a in previous releases,
has been renamed libmfhdf.a.
* HDF 4.0 libraries now use v5 of the Independent JPEG Group
(IJG) JPEG file access library.
* Gzip library libz.a is added in HDF4.0r1, in order to
support "deflate" style compression of any object in
HDF files.
Due to these changes, users must specify four libraries
when compiling and linking a program: libmfhdf.a, libdf.a,
libjpeg.a and libz.a, even if the program does not use
JPEG or GZIP compression. For example:
For C:
cc -o myprog myprog.c -I<path for hdf include> \
<path>/libmfhdf.a <path>/libdf.a <path>/libjpeg.a <path>/libz.a
or
cc -o myprog myprog.c -I<path for hdf include> \
-L<path for hdf libraries> -lmfhdf -ldf -ljpeg -lz
For FORTRAN:
f77 -o myprog myprog.f \
<path>/libmfhdf.a <path>/libdf.a <path>/libjpeg.a <path>/libz.a
or
f77 -o myprog myprog.f -L<path for hdf libraries> \
-lmfhdf -ldf -ljpeg -lz
NOTE: The order of the libraries is important: libmfhdf.a
first, then libdf.a, followed by libjpeg.a and libz.a.
This is also discussed in Items 1, 2, and 3 of the New
Features and Changes section of this document.
2. The HDF 4.0 library will ONLY compile with ANSI C compilers.
See Item 4 in the New Features and Changes section of this
document for more information.
3. The HDF library and netCDF library on Unix systems can now be
automatically configured and built with one command. See Item 5
in the New Features and Changes section of this document for more
information.
4. In HDF 4.0, the FORTRAN include files dffunct.i and constant.i
have been renamed dffunct.inc and hdf.inc. See Item 16 in
the New Features and Changes section of this document for more
information.
5. Platforms tested on: IRIX (5.3, 6.1 (32 bit and 64 bit)),
SunOS 4.1.4, Solaris (ver 2.4, 2.5), Solaris x86,
HP-UX, Digital Unix, AIX, LINUX (A.OUT), CM5, YMP,
FreeBSD, C90, Exemplar, Open VMS, and SP2 (single node only).
HDF4.0r1 is not yet available on the Macintosh.
6. The HDF 4.0 binaries for each tested platform are available.
Unix binaries are located in the bin/ directory. Binaries for
Windows NT are located in the zip/ directory.
New Features and Changes:
------------------------
1. Changes to the mfhdf library
The mfhdf library, named libnetcdf.a in previous releases, has
been renamed libmfhdf.a. To link a program with HDF4.0r1
libraries, four libraries are required: libmfhdf.a, libdf.a,
libjpeg.a and libz.a.
See Item 1 of 'Important Changes' for examples of how you would
compile and link your programs.
2. JPEG Group v5b library
HDF Version 4.0 libraries now use v5 of the Independent
JPEG Group (IJG) JPEG file access library.
The JPEG library must be linked with users' applications
whether or not they use JPEG compression.
See Item 1 of 'Important Changes' for examples of how you would
compile and link your programs.
3. Gzip library added
New with this release is support for gzip "deflate" style
compression of any object in an HDF file. This is supported
through the standard compression interface function calls
(HCcreate, SDsetcompress, GRsetcompress). The ABOUT_4.0b2
file contains additional information on this.
See Item 1 of 'Important Changes' for examples of how you would
compile and link your programs.
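As a rough sketch of requesting gzip compression through the SD-level call (the file and dataset names below are invented for illustration, and error checking is omitted):

```c
#include "mfhdf.h"   /* SD interface; also pulls in hdf.h */

int main(void)
{
    int32 dims[2] = {100, 100};
    int32 sd_id, sds_id;
    comp_info c_info;

    sd_id  = SDstart("example.hdf", DFACC_CREATE);
    sds_id = SDcreate(sd_id, "data", DFNT_INT32, 2, dims);

    /* Request gzip ("deflate") compression, level 6 (1-9). */
    c_info.deflate.level = 6;
    SDsetcompress(sds_id, COMP_CODE_DEFLATE, &c_info);

    /* ... SDwritedata() calls here; data is compressed on write ... */

    SDendaccess(sds_id);
    SDend(sd_id);
    return 0;
}
```

Compression takes effect on subsequent writes; reading the data back requires no special calls.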
4. ANSI C only
As was previously noted in the HDF newsletters, this release
of the HDF library will compile only with ANSI C compilers.
This shift to ANSI C compliance has been accompanied by a large
clean up in the source code. An attempt has been made to remove
all warnings and informational messages that the compilers on
supported platforms occasionally emit, but this may not be
completely clean for all user sites.
5. Auto configuration
Both the HDF library and the netCDF library on Unix systems now use
the same configure script and can be configured uniformly with
one command. See the README and the INSTALL files at the top
level of HDF4.0r1 for detailed instructions on configuration and
installation.
A consequence of the auto configuration is that on UNIX systems
without FORTRAN installed, the 'FC' macro in the top level
config/mh-<machine> file must be set to "NONE" for correct
configuration.
6. New version of dimension record
In HDF4.0b1 and previous releases of the SDS interface, a vgroup
was used to represent a dimension. The vgroup had a single-field
vdata with a class of "DimVal0.0". The vdata had as many records
as the dimension size, with each record holding a fake value:
0, 1, 2, ..., (dimension size - 1). The fake values were not
really required and took up a large amount of space. For
applications that created large one-dimensional array datasets, the
disk space taken by these fake values almost doubled the size of the
HDF file. In order to omit the fake values, a new version of the
dimension vdata was implemented.
The new version uses the same structure as the old version. The
only differences are that the vdata has a single record, whose
value is the dimension size, and that the vdata's class is
"DimVal0.1", to distinguish it from the old version.
No change was made in unlimited dimensions.
Functions added to support this are:
- SDsetdimval_comp -- sets the backward compatibility mode for a
dimension. The default mode is compatible in HDF4.0r1, and
will be incompatible in HDF4.1. See the man page of
SDsetdimval_comp(3) for details.
- SDisdimval_bwcomp(dimid) -- gets the backward compatibility
mode of a dimension. See the man page of SDisdimval_bwcomp(3)
for details.
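A minimal sketch of these calls (file and dataset names are invented; error checking omitted):

```c
#include "mfhdf.h"

int main(void)
{
    int32 dims[1] = {10000};
    int32 sd_id  = SDstart("newdim.hdf", DFACC_CREATE);
    int32 sds_id = SDcreate(sd_id, "bigvec", DFNT_FLOAT32, 1, dims);
    int32 dim_id = SDgetdimid(sds_id, 0);   /* first dimension */

    /* Store the dimension as "DimVal0.1": not readable by pre-4.0
       libraries, but no fake values written to disk. */
    SDsetdimval_comp(dim_id, SD_DIMVAL_BW_INCOMP);

    if (SDisdimval_bwcomp(dim_id) == SD_DIMVAL_BW_INCOMP)
        ; /* dimension will be written in the new format */

    SDendaccess(sds_id);
    SDend(sd_id);
    return 0;
}
```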
7. Reading CDF files
With HDF 4.0, limited support for reading CDF files was added to
the library. This support is still in the development stage and
is therefore limited.
To begin with, unlike the netCDF merger, the CDF API is not
supported. Rather, the SD and netCDF APIs can be used to access
information pulled out of CDF files.
Support is limited to CDF 2.X files; the header information
differs between CDF 1.X and 2.X files. In addition, all of the
files must be stored as single-file CDFs in network encoding.
If there is user demand and support, the range of readable CDF
files may be expanded in the future.
8. Parallel I/O interface on CM5
An extension using parallel I/O on the CM5 has been added to
the SDS interface. Initial tests have shown an I/O throughput
of about 25 MBytes/second using the SDA (Scalable
Disk Array) file system. The library provides interfaces
for both C* and CMF programming languages. The ABOUT_4.0.alpha
file has more information concerning this.
Users will find some examples in the directory
mfhdf/CM5/Examples.
The parallel I/O interface stores scientific datasets in
external files. New options have been added to hdfls and
hdfpack to handle them. A new utility, hdfunpac, was also
created for external file handling.
9. Support for SGI Power Challenge running IRIX6.1
The Power Challenge is now supported, in both the native 64-bit
and the 32-bit object modes. Note that the Power Challenge
native 64-bit objects use 64-bit long integers. Users should
be careful when using the netCDF interface: variables should
be declared as "nclong", not "long".
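For example, a netCDF-2 style read on this platform might declare its buffer as follows (the function and variable names here are illustrative only):

```c
#include "netcdf.h"   /* netCDF-2 API bundled with mfhdf */

void read_counts(int ncid, int varid)
{
    long start[1] = {0};
    long count[1] = {100};
    /* Use nclong (always 32 bits), not long (64 bits in the
       native 64-bit object mode on the Power Challenge). */
    nclong values[100];

    ncvarget(ncid, varid, start, count, (void *) values);
}
```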
10. Multi-file Annotation Interface (ANxxx)
The multi-file annotation Interface is for accessing
file labels and descriptions, and object labels and
descriptions. It allows users to keep open more than
one file at a time, and to access more than one
annotation at a time. It also allows multiple labels
and multiple descriptions to be applied to an HDF object
or HDF file.
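A minimal sketch of writing one file label with the new interface (the file name and label text are invented; error checking omitted):

```c
#include <string.h>
#include "hdf.h"   /* multi-file annotation (AN) interface */

int main(void)
{
    int32 file_id, an_id, ann_id;
    static const char label[] = "sample file label";

    /* The AN interface works on a file opened with Hopen(). */
    file_id = Hopen("annot.hdf", DFACC_CREATE, 0);
    an_id   = ANstart(file_id);

    /* Create and write one file label; more labels and
       descriptions could be added the same way. */
    ann_id = ANcreatef(an_id, AN_FILE_LABEL);
    ANwriteann(ann_id, label, (int32) strlen(label));
    ANendaccess(ann_id);

    ANend(an_id);
    Hclose(file_id);
    return 0;
}
```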
11. Multi-file Raster Image (GRxxx) interface
The new Generic Raster (GR) interface provides a set of
functions for manipulating raster images of all kinds.
This interface allows users to keep open more than one
file at a time, and to "attach" more than one raster
image at a time. It supports a general framework
for attributes within the RIS data-model, allowing
'name = value' style metadata. It allows access to
subsamples and subsets of images.
The GRreqlutil and GRreqimageil functions allow for different
methods of interlacing images in memory. The images are
interlaced in memory only, and are actually written to disk in
"pixel" interlacing.
12. Compression for HDF SDS
Two new compression functions have been added to the SD
interface for HDF 4.0: SDsetcompress and SDsetnbitdataset.
SDsetcompress allows users to compress a scientific dataset
using any of several compression methods. Initially, three
schemes are available: RLE encoding, an adaptive Huffman
compression algorithm, and gzip 'deflate' compression.
SDsetnbitdataset allows for storing a scientific dataset
using integers whose size is any number of bits between 1 and
32 (instead of being restricted to 8, 16 or 32-bit sizes).
Access to the data stored in an n-bit data item is transparent
to the calling program. The ABOUT_4.0.alpha file has an in-depth
description concerning this ("n-bit SDS" listed under Item 2).
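A minimal sketch of SDsetnbitdataset (names are invented; error checking omitted):

```c
#include "mfhdf.h"

int main(void)
{
    int32 dims[1] = {512};
    int32 sd_id  = SDstart("nbit.hdf", DFACC_CREATE);
    int32 sds_id = SDcreate(sd_id, "nbitdata", DFNT_INT32, 1, dims);

    /* Store only 6 bits per value (bits 5..0 of each int32),
       with no sign extension and no fill-one padding. */
    SDsetnbitdataset(sds_id, 5, 6, FALSE, FALSE);

    /* ... SDwritedata() as usual; the packing is transparent ... */

    SDendaccess(sds_id);
    SDend(sd_id);
    return 0;
}
```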
13. External Path Handling
New functions have been added to allow applications to
specify directories to create or search for external
files.
- HXsetcreatedir (hxscdir for FORTRAN)
- HXsetdir (hxsdir for FORTRAN)
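A minimal sketch of the new calls (the directory paths are invented for illustration):

```c
#include "hdf.h"

int main(void)
{
    /* Create new external element files in this directory ... */
    HXsetcreatedir("/tmp/hdfext");

    /* ... and look for existing external files here. */
    HXsetdir("/data/hdfext");

    /* Subsequent HDF calls that create or open external
       elements will use these directories. */
    return 0;
}
```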
14. I/O performance improvement
HDF 4.0 unofficially supports file page buffering; with HDF 4.1
it will be officially supported. File page buffering allows
the file to be mapped to user memory on a per-page basis, i.e.
as a memory pool of the file. Pages can be allocated at the
file system page size or at a multiple of it, which allows fewer
pages to be managed and accommodates the user's file usage
pattern. See the top level INSTALL file and the
release_notes/page_buf.txt file for instructions on building the
library with this support and using it.
15. Improvement in memory usage and general optimizations
Considerable effort was put into this release (since the b2
release) to reduce the amount of memory used per file and by the
library in general. In general terms, we believe the library's
memory footprint during execution has been at least halved, and
that the library is more frugal about allocating large chunks of
memory.
Much time was also spent optimizing the low-level HDF routines to
be faster than in the past. Applications that make use of files
with many (1000+) datasets should notice significant improvements
in execution speed.
16. In hdf/ there are two files for FORTRAN programs to include
the values and functions defined in HDF. They were
originally named constant.i and dffunct.i. The .i extension
caused problems on some machines, since *.i is used by cc for
"intermediate" files produced by the cpp preprocessor. In
HDF 4.0, dffunct.i has been renamed dffunct.inc, and constant.i
has been renamed hdf.inc. Existing FORTRAN application
programs that include the .i files must be changed accordingly
in order to compile with HDF4.0.
17. Limits file
A new file, limits.txt, has been added to the ftp server.
It is aimed at HDF applications programmers and defines the
upper bounds of HDF 4.0. This information is also found in the
hdf/src/hlimits.h file. Refer to the ABOUT_4.0.alpha for
historical information concerning this.
18. Pablo available
HDF4.0 supports creating an instrumented version of the HDF
library (libdf-inst.a). This library, along with the Pablo
performance data capture libraries, can be used to gather data
about I/O behavior and procedure execution times. See the top
level INSTALL file and the hdf/pablo/README.Pablo file for
further information.
19. Support for the IBM SP-2
The HDF library has been ported to run on a single SP-2 node.
It does not yet support parallel or distributed computing
across multiple SP-2 nodes.
20. Miscellaneous fixes
- To avoid conflicts with C++, internal structures' fields which
were named 'new' have been renamed.
- The maximum number of fields in a vdata is now determined by
VSFIELDMAX.
- The platform number subclass problem with external data files
in little-endian format has been fixed.
- Unlimited dimensions were not handled correctly by the HDF3.3r4
FORTRAN interface. This problem has been fixed in HDF4.0r1.
Changes to utilities:
--------------------
o hdf/util/ristosds
Ristosds now converts several raster images into a 3D uint8
SDS, instead of a float32 SDS.
o hdf/util/hdfls
New options have been added to support the parallel I/O
interface on CM5.
o hdf/util/hdfpack
New options have been added to support the parallel I/O
interface on CM5.
o hdf/util/hdfunpac
This is a new utility for external file handling for the
parallel I/O interface on CM5.
o mfhdf/dumper/hdp
Hdp is a new command line utility designed for quick display of
the contents and data objects of HDF files. It can list the
contents of HDF files at various levels with different amounts of
detail, and can also dump the data of one or more specific objects
in a file. See hdp.txt in the release notes for more information.
Known Problems:
--------------
o On the IRIX4 platform, fp2hdf creates float32 and float64 values
incorrectly.
o On the SP2, the hdp command gives a false message of "Failure to
allocate space" when the hdf file has no annotations to list.
o On the C90, hdfed fails inconsistently when opening hdf files
more than once in the same hdfed session.
o Currently there is a problem in re-writing data in the middle
of compressed objects.
o VMS gives an error on the test for Little Endian float64.
o If the external element test in hdf/test/testhdf fails
and there is no subdirectory "testdir" in hdf/test/,
create one via "mkdir" and run the test again. (The
"testdir" should have been created by "make". But
the "make" in some old systems does not support the
creation commands.)