- All Implemented Interfaces:
- DataFormat, MetaDataContainer, Serializable
- Direct Known Subclasses:
- Attribute, CompoundDS, ScalarDS
This class provides two convenience functions, read() and write(), to read and write data values. Reading or writing data can take many library calls if the library APIs are used directly. The read() and write() functions hide the details of these calls from users.
For more details on datasets, see the HDF5 User's Guide.
- Version:
- 1.1 9/4/2007
- Author:
- Peter X. Cao
- See Also:
- ScalarDS, CompoundDS, Serialized Form
- 
Field Summary
Fields (Modifier and Type, Field: Description)
- protected long[] chunkSize: The array of dimension sizes for a chunk.
- protected StringBuilder compression: The compression information.
- static String COMPRESSION_GZIP_TXT
- protected boolean convertByteToString: Flag to indicate if the byte[] array is converted to strings.
- protected Object convertedBuf: The array that holds the converted data of unsigned C-type integers.
- protected Object data: The memory buffer that holds the raw data array of the dataset.
- protected Datatype datatype: The datatype object of the dataset.
- protected String[] dimNames: Array of strings that represent the dimension names.
- protected long[] dims: The current dimension sizes of the dataset.
- protected StringBuilder filters: The filters information.
- protected boolean inited: Flag to indicate if this dataset has been initialized.
- protected boolean isDataLoaded: Flag to indicate if data values are loaded into memory.
- protected long[] maxDims: The max dimension sizes of the dataset.
- protected long nPoints: The number of data points in the memory buffer.
- protected Object originalBuf: The data buffer that contains the raw data directly reading from file (before any data conversion).
- protected int rank: The number of dimensions of the dataset.
- protected long[] selectedDims: Array that contains the number of data points selected (for read/write) in each dimension.
- protected int[] selectedIndex: Array that contains the indices of the dimensions selected for display.
- protected long[] selectedStride: The number of elements to move from the start location in each dimension.
- protected long[] startDims: The starting position of each dimension of a selected subset.
- protected StringBuilder storage: The storage information.
- protected StringBuilder storageLayout: The storage layout information.
Fields inherited from class hdf.object.HObject: fileFormat, linkTargetObjName, oid, SEPARATOR
- 
Constructor Summary
Constructors (Constructor: Description)
- Dataset(FileFormat theFile, String dsName, String dsPath): Constructs a Dataset object with a given file, name and path.
- Dataset(FileFormat theFile, String dsName, String dsPath, long[] oid): Deprecated. Not for public use in the future.
- 
Method Summary
Methods (Modifier and Type, Method: Description)
- static String[] byteToString(byte[] bytes, int length): Converts an array of bytes into an array of Strings for a fixed string dataset.
- void clear(): Clears memory held by the dataset, such as the data buffer.
- void clearData(): Clears the current data buffer in memory and forces the next read() to load the data from file.
- static Object convertFromUnsignedC(Object dataIN): Deprecated. Not for public use in the future.
- static Object convertFromUnsignedC(Object dataIN, Object dataOUT): Converts one-dimension array of unsigned C-type integers to a new array of appropriate Java integer in memory.
- static Object convertToUnsignedC(Object dataIN): Deprecated. Not for public use in the future.
- static Object convertToUnsignedC(Object dataIN, Object dataOUT): Converts the array of converted unsigned integers back to unsigned C-type integer data in memory.
- abstract Dataset copy: Creates a new dataset and writes the data buffer to the new dataset.
- long[] getChunkSize(): Returns the array that contains the dimension sizes of the chunk of the dataset.
- getCompression(): Returns the string representation of compression information.
- boolean getConvertByteToString(): Returns the flag that indicates if a byte array is converted to a string array.
- getData(): Returns the data buffer of the dataset in memory.
- getDatatype(): Returns the datatype of the data object.
- String[] getDimNames(): Returns the array of strings that represent the dimension names.
- long[] getDims(): Returns the array that contains the dimension sizes of the dataset.
- getFilters(): Returns the string representation of filter information.
- long getHeight(): Returns the dimension size of the vertical axis.
- long[] getMaxDims(): Returns the array that contains the max dimension sizes of the dataset.
- getOriginalClass(): Get Class of the original data buffer if converted.
- int getRank(): Returns the rank (number of dimensions) of the dataset.
- long[] getSelectedDims(): Returns the dimension sizes of the selected subset.
- int[] getSelectedIndex(): Returns the indices of display order.
- long getSize(long tid): Returns the size in bytes of a given datatype.
- long[] getStartDims(): Returns the starting position of a selected subset.
- getStorage(): Returns the string representation of storage information.
- getStorageLayout(): Returns the string representation of storage layout information.
- long[] getStride(): Returns the selectedStride of the selected dataset.
- getVirtualFilename(int index)
- int getVirtualMaps
- long getWidth(): Returns the dimension size of the horizontal axis.
- boolean isInited()
- boolean isString(long tid): Checks if a given datatype is a string.
- boolean isVirtual
- abstract byte[] readBytes(): Reads the raw data of the dataset from file to a byte array.
- void setConvertByteToString(boolean b): Sets the flag that indicates if a byte array is converted to a string array.
- void setData: Not for public use in the future.
- static byte[] stringToByte(String[] strings, int length): Converts a string array into an array of bytes for a fixed string dataset.
- void write(): Writes the memory buffer of this dataset to file.
Methods inherited from class hdf.object.HObject: close, createFullname, debug, equals, equals, equalsOID, getFID, getFile, getFileFormat, getFullName, getLinkTargetObjName, getName, getOID, getPath, hashCode, open, setFullname, setLinkTargetObjName, setName, setPath, toString
Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface hdf.object.DataFormat: convertFromUnsignedC, convertToUnsignedC, getFillValue, init, read, write
Methods inherited from interface hdf.object.MetaDataContainer: getMetadata, hasAttribute, removeMetadata, updateMetadata, writeMetadata
- 
Field Details
- 
data
The memory buffer that holds the raw data array of the dataset.
- 
rank
The number of dimensions of the dataset.
- 
dims
The current dimension sizes of the dataset.
- 
maxDims
The max dimension sizes of the dataset.
- 
selectedDims
Array that contains the number of data points selected (for read/write) in each dimension.
The selected size must be less than or equal to the current dimension size. A subset of a rectangle selection is defined by the starting position and selected sizes.
For example, if a 4 X 5 dataset is as follows:
    0, 1, 2, 3, 4
    10, 11, 12, 13, 14
    20, 21, 22, 23, 24
    30, 31, 32, 33, 34
    long[] dims = {4, 5};
    long[] startDims = {1, 2};
    long[] selectedDims = {3, 3};
then the following subset is selected by the startDims and selectedDims above:
    12, 13, 14
    22, 23, 24
    32, 33, 34
- 
startDims
The starting position of each dimension of a selected subset. With both the starting position and selected sizes, the subset of a rectangle selection is fully defined.
- 
selectedIndex
Array that contains the indices of the dimensions selected for display.
selectedIndex[] is provided for two purposes:
- selectedIndex[] is used to indicate the order of dimensions for display, i.e. selectedIndex[0] = row, selectedIndex[1] = column and selectedIndex[2] = depth. For example, for a four-dimensional dataset, if selectedIndex[] is {1, 2, 3}, then dim[1] is selected as row index, dim[2] is selected as column index and dim[3] is selected as depth index.
- selectedIndex[] is also used to select dimensions for display for datasets with three or more dimensions. We assume that applications such as HDFView can only display data up to three dimensions (a 2D spreadsheet/image with a third dimension that the 2D spreadsheet/image is cut from). For datasets with more than three dimensions, we need selectedIndex[] to store which three dimensions are chosen for display. For example, for a four-dimensional dataset, if selectedIndex[] = {1, 2, 3}, then dim[1] is selected as row index, dim[2] is selected as column index and dim[3] is selected as depth index. dim[0] is not selected. Its location is fixed at 0 by default.
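The mapping from selectedIndex[] to displayed and fixed dimensions can be sketched in plain Java. This is a standalone illustration that does not call hdf.object; the class DisplayDims and the helper fixedDims are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;

final class DisplayDims {
    /** Returns the dimensions that are NOT listed in selectedIndex (hypothetical helper). */
    static List<Integer> fixedDims(int rank, int[] selectedIndex) {
        boolean[] shown = new boolean[rank];
        for (int idx : selectedIndex) {
            shown[idx] = true;
        }
        List<Integer> fixed = new ArrayList<>();
        for (int d = 0; d < rank; d++) {
            if (!shown[d]) {
                fixed.add(d);
            }
        }
        return fixed;
    }

    public static void main(String[] args) {
        int[] selectedIndex = {1, 2, 3}; // row = dim 1, column = dim 2, depth = dim 3
        System.out.println("row dimension:    " + selectedIndex[0]);
        System.out.println("column dimension: " + selectedIndex[1]);
        System.out.println("depth dimension:  " + selectedIndex[2]);
        System.out.println("fixed dimensions: " + fixedDims(4, selectedIndex));
    }
}
```

For the four-dimensional case above, only dimension 0 is left out of the display, which matches the statement that its location stays fixed.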
 
- 
selectedStride
The number of elements to move from the start location in each dimension. For example, if selectedStride[0] = 2, every other data point is selected along dim[0].
- 
chunkSize
The array of dimension sizes for a chunk.
- 
compression
The compression information.
- 
COMPRESSION_GZIP_TXT
- See Also:
- Constant Field Values
 
- 
filters
The filters information.
- 
storageLayout
The storage layout information.
- 
storage
The storage information.
- 
datatype
The datatype object of the dataset.
- 
dimNames
Array of strings that represent the dimension names. It is null if dimension names do not exist.
- 
convertByteToString
Flag to indicate if the byte[] array is converted to strings.
- 
isDataLoaded
Flag to indicate if data values are loaded into memory.
- 
inited
Flag to indicate if this dataset has been initialized.
- 
nPoints
The number of data points in the memory buffer.
- 
originalBuf
The data buffer that contains the raw data directly reading from file (before any data conversion).
- 
convertedBuf
The array that holds the converted data of unsigned C-type integers.
For example, suppose that the original data is an array of unsigned 16-bit short integers. Since Java does not support unsigned integers, the data is converted to an array of 32-bit signed integers. In that case, the converted buffer is the array of 32-bit signed integers.
 
- 
- 
Constructor Details
- 
Dataset
Constructs a Dataset object with a given file, name and path.
- Parameters:
- theFile- the file that contains the dataset.
- dsName- the name of the Dataset, e.g. "dset1".
- dsPath- the full group path of this Dataset, e.g. "/arrays/".
 
- 
Dataset
Deprecated. Not for public use in the future. Use Dataset(FileFormat, String, String) instead.
- Parameters:
- theFile- the file that contains the dataset.
- dsName- the name of the Dataset, e.g. "dset1".
- dsPath- the full group path of this Dataset, e.g. "/arrays/".
- oid- the oid of this Dataset.
 
 
- 
- 
Method Details
- 
clear
Clears memory held by the dataset, such as the data buffer.
- 
getRank
Returns the rank (number of dimensions) of the dataset.
- Specified by:
- getRank in interface DataFormat
- Returns:
- the number of dimensions of the dataset.
 
- 
getDims
Returns the array that contains the dimension sizes of the dataset.
- Specified by:
- getDims in interface DataFormat
- Returns:
- the dimension sizes of the dataset.
 
- 
getMaxDims
Returns the array that contains the max dimension sizes of the dataset.
- Returns:
- the max dimension sizes of the dataset.
 
- 
getSelectedDims
Returns the dimension sizes of the selected subset.
The selectedDims array holds the number of data points of the selected subset. Applications can use this array to change the size of the selected subset.
The selected size must be less than or equal to the current dimension size. Combined with the starting position, selected sizes and stride, the subset of a rectangle selection is fully defined.
For example, if a 4 X 5 dataset is as follows:
    0, 1, 2, 3, 4
    10, 11, 12, 13, 14
    20, 21, 22, 23, 24
    30, 31, 32, 33, 34
    long[] dims = {4, 5};
    long[] startDims = {1, 2};
    long[] selectedDims = {3, 3};
    long[] selectedStride = {1, 1};
then the following subset is selected by the startDims and selectedDims:
    12, 13, 14
    22, 23, 24
    32, 33, 34
- Specified by:
- getSelectedDims in interface DataFormat
- Returns:
- the dimension sizes of the selected subset.
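The size constraint can be checked before reading. The following is a minimal sketch in plain Java, assuming the last selected index along dimension i is start[i] + (count[i] - 1) * stride[i]; the class SelectionCheck and the method fits are hypothetical and are not part of hdf.object.

```java
final class SelectionCheck {
    /**
     * Returns true if the start/count/stride selection stays inside dims.
     * Hypothetical helper: the library may validate selections differently.
     */
    static boolean fits(long[] dims, long[] start, long[] count, long[] stride) {
        if (start.length != dims.length || count.length != dims.length
                || stride.length != dims.length) {
            return false;
        }
        for (int i = 0; i < dims.length; i++) {
            if (start[i] < 0 || count[i] < 1 || stride[i] < 1) {
                return false;
            }
            // index of the last element touched along dimension i
            long last = start[i] + (count[i] - 1) * stride[i];
            if (last >= dims[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] dims = {4, 5};
        long[] stride = {1, 1};
        System.out.println(fits(dims, new long[] {1, 2}, new long[] {3, 3}, stride));
        System.out.println(fits(dims, new long[] {2, 2}, new long[] {3, 3}, stride));
    }
}
```

The first call uses the selection from the example above; the second moves the start one row down, so the third selected row would fall outside the 4 X 5 dataset.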
 
- 
getStartDims
Returns the starting position of a selected subset.
Applications can use this array to change the starting position of a selection. Combined with the selected dimensions, selected sizes and stride, the subset of a rectangle selection is fully defined.
For example, if a 4 X 5 dataset is as follows:
    0, 1, 2, 3, 4
    10, 11, 12, 13, 14
    20, 21, 22, 23, 24
    30, 31, 32, 33, 34
    long[] dims = {4, 5};
    long[] startDims = {1, 2};
    long[] selectedDims = {3, 3};
    long[] selectedStride = {1, 1};
then the following subset is selected by the startDims and selectedDims:
    12, 13, 14
    22, 23, 24
    32, 33, 34
- Specified by:
- getStartDims in interface DataFormat
- Returns:
- the starting position of a selected subset.
 
- 
getStride
Returns the selectedStride of the selected dataset.
Applications can use this array to change how many elements to move in each dimension. Combined with the starting position and selected sizes, the subset of a rectangle selection is defined.
For example, if a 4 X 5 dataset is as follows:
    0, 1, 2, 3, 4
    10, 11, 12, 13, 14
    20, 21, 22, 23, 24
    30, 31, 32, 33, 34
    long[] dims = {4, 5};
    long[] startDims = {0, 0};
    long[] selectedDims = {2, 2};
    long[] selectedStride = {2, 3};
then the following subset is selected by the startDims, selectedDims and selectedStride:
    0, 3
    20, 23
- Specified by:
- getStride in interface DataFormat
- Returns:
- the selectedStride of the selected dataset.
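How start, selected size and stride combine can be reproduced on an in-memory 2D array. This is a standalone sketch in plain Java, not the library's read path; the class StridedSubset, its DATA constant and the select method are hypothetical names used only for illustration.

```java
import java.util.Arrays;

final class StridedSubset {
    /** The 4 X 5 dataset from the example, held in memory. */
    static final long[][] DATA = {
        {0, 1, 2, 3, 4},
        {10, 11, 12, 13, 14},
        {20, 21, 22, 23, 24},
        {30, 31, 32, 33, 34}
    };

    /** Picks count[d] elements per dimension, beginning at start[d] and stepping by stride[d]. */
    static long[][] select(long[][] data, long[] start, long[] count, long[] stride) {
        long[][] out = new long[(int) count[0]][(int) count[1]];
        for (int r = 0; r < count[0]; r++) {
            for (int c = 0; c < count[1]; c++) {
                int srcRow = (int) (start[0] + r * stride[0]);
                int srcCol = (int) (start[1] + c * stride[1]);
                out[r][c] = data[srcRow][srcCol];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[][] strided = select(DATA, new long[] {0, 0}, new long[] {2, 2}, new long[] {2, 3});
        System.out.println(Arrays.deepToString(strided)); // [[0, 3], [20, 23]]

        long[][] block = select(DATA, new long[] {1, 2}, new long[] {3, 3}, new long[] {1, 1});
        System.out.println(Arrays.deepToString(block));
    }
}
```

With a stride of 1 in every dimension the selection degenerates to a contiguous block, which is the case shown for getSelectedDims() and getStartDims().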
 
- 
setConvertByteToString
Sets the flag that indicates if a byte array is converted to a string array.
In a string dataset, the raw data from file is stored in a byte array. By default, this byte array is converted to an array of strings. For a large dataset (e.g. more than one million strings), the conversion takes a long time and requires a lot of memory space to store the strings. In some applications, such a conversion can be delayed. For example, a GUI application may convert only the part of the strings that is visible to the users, not the entire data array.
setConvertByteToString(boolean b) allows users to set the flag so that applications can choose whether to perform the byte-to-string conversion. If the flag is set to false, getData() returns an array of bytes instead of an array of strings.
- Parameters:
- b- convert bytes to strings if b is true; if false, do not convert bytes to strings.
 
- 
getConvertByteToString
Returns the flag that indicates if a byte array is converted to a string array.
- Returns:
- true if byte array is converted to string; otherwise, returns false if there is no conversion.
 
- 
readBytes
Reads the raw data of the dataset from file to a byte array.
readBytes() reads raw data to an array of bytes instead of an array of its datatype. For example, for a one-dimensional 32-bit integer dataset of size 5, readBytes() returns a byte array of size 20 instead of an int array of size 5.
readBytes() can be used to copy data from one dataset to another efficiently: because the raw data is not converted to its native type, it saves memory space and CPU time.
- Returns:
- the byte array of the raw data.
- Throws:
- Exception- if data can not be read
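The size relationship between typed data and raw bytes can be checked with java.nio. This is a standalone sketch, not a call into readBytes(); the class RawBytesDemo and its methods are hypothetical, and the byte order is passed in explicitly because the order of the bytes in a real file is an assumption here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class RawBytesDemo {
    /** Packs 32-bit integers into a raw byte array, four bytes per value. */
    static byte[] toRawBytes(int[] values, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES).order(order);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    /** Decodes a raw byte array back into 32-bit integers. */
    static int[] fromRawBytes(byte[] raw, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(order);
        int[] values = new int[raw.length / Integer.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getInt();
        }
        return values;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4, 5};
        byte[] raw = toRawBytes(values, ByteOrder.LITTLE_ENDIAN);
        System.out.println(raw.length); // 20
    }
}
```

Copying the raw array from one place to another skips the decode step entirely, which is the saving the description refers to.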
 
- 
write
Writes the memory buffer of this dataset to file.
- Specified by:
- write in interface DataFormat
- Throws:
- Exception- if buffer can not be written
 
- 
copy
Creates a new dataset and writes the data buffer to the new dataset.
This function allows applications to create a new dataset for a given data buffer. For example, users can select a specific interesting part from a large image and create a new image with the selection.
The new dataset retains the datatype and dataset creation properties of this dataset.
- Parameters:
- pgroup- the group which the dataset is copied to.
- name- the name of the new dataset.
- dims- the dimension sizes of the new dataset.
- data- the data values of the subset to be copied.
- Returns:
- the new dataset.
- Throws:
- Exception- if dataset can not be copied
 
- 
isInited
- Specified by:
- isInited in interface DataFormat
 
- 
getData
Returns the data buffer of the dataset in memory.
If data is already loaded into memory, returns the data; otherwise, calls read() to read data from file into a memory buffer and returns the memory buffer.
By default, the whole dataset is read into memory. Users can also select a subset to read. Subsetting is done in an implicit way.
How to Select a Subset
A selection is specified by three arrays: start, stride and count.
- start: offset of a selection
- stride: determines how many elements to move in each dimension
- count: number of elements to select in each dimension
 The following example shows how to make a subset. In the example, the dataset is a 4-dimensional array of [200][100][50][10], i.e. dims[0]=200; dims[1]=100; dims[2]=50; dims[3]=10; 
 We want to select every other data point in dims[1] and dims[2]:
    int rank = dataset.getRank(); // number of dimensions of the dataset
    long[] dims = dataset.getDims(); // the dimension sizes of the dataset
    long[] selected = dataset.getSelectedDims(); // the selected size of the dataset
    long[] start = dataset.getStartDims(); // the offset of the selection
    long[] stride = dataset.getStride(); // the stride of the dataset
    int[] selectedIndex = dataset.getSelectedIndex(); // the selected dimensions for display

    // select dim1 and dim2 as 2D data for display, and slice through dim0
    selectedIndex[0] = 1;
    selectedIndex[1] = 2;
    selectedIndex[2] = 0;

    // reset the selection arrays
    for (int i = 0; i < rank; i++) {
        start[i] = 0;
        selected[i] = 1;
        stride[i] = 1;
    }

    // set stride to 2 on dim1 and dim2 so that every other data point is
    // selected.
    stride[1] = 2;
    stride[2] = 2;

    // set the selection size of dim1 and dim2
    selected[1] = dims[1] / stride[1];
    selected[2] = dims[2] / stride[2];

    // when dataset.getData() is called, the selection above will be used since
    // the dimension arrays are passed by reference. Changes of these arrays
    // outside the dataset object directly change the values of these arrays
    // in the dataset object.
For ScalarDS, the memory data buffer is a one-dimensional array of byte, short, int, float, double or String type based on the datatype of the dataset.
For CompoundDS, the memory data object is a java.util.List object. Each element of the list is a data array that corresponds to a compound field.
For example, if compound dataset "comp" has the following nested structure and member datatypes
    comp --> m01 (int)
    comp --> m02 (float)
    comp --> nest1 --> m11 (char)
    comp --> nest1 --> m12 (String)
    comp --> nest1 --> nest2 --> m21 (long)
    comp --> nest1 --> nest2 --> m22 (double)
getData() returns a list of six arrays: {int[], float[], char[], String[], long[] and double[]}.
- Specified by:
- getData in interface DataFormat
- Returns:
- the memory buffer of the dataset.
- Throws:
- Exception- if object can not be read
- OutOfMemoryError- if memory is exhausted
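The pass-by-reference behavior of the selection arrays can be demonstrated without the library. In this sketch, FakeDataset is a hypothetical stand-in that, like the class described here, returns its internal arrays rather than copies; it is not the real Dataset implementation.

```java
final class LiveArrayDemo {
    /** Hypothetical stand-in that exposes its selection arrays by reference. */
    static final class FakeDataset {
        private final long[] dims = {200, 100, 50, 10};
        private final long[] selected = {1, 1, 1, 1};
        private final long[] stride = {1, 1, 1, 1};

        long[] getDims() { return dims; }
        long[] getSelectedDims() { return selected; }
        long[] getStride() { return stride; }

        /** Number of points the current selection would read. */
        long selectedPointCount() {
            long n = 1;
            for (long s : selected) {
                n *= s;
            }
            return n;
        }
    }

    /** Selects every other point along dim1 and dim2 by editing the returned arrays. */
    static FakeDataset selectEveryOther() {
        FakeDataset ds = new FakeDataset();
        long[] dims = ds.getDims();
        long[] selected = ds.getSelectedDims();
        long[] stride = ds.getStride();
        stride[1] = 2;
        stride[2] = 2;
        selected[1] = dims[1] / stride[1]; // 100 / 2
        selected[2] = dims[2] / stride[2]; // 50 / 2
        return ds;
    }

    public static void main(String[] args) {
        FakeDataset ds = selectEveryOther();
        // No setter was called, yet the object's own state reflects the edits.
        System.out.println(ds.selectedPointCount()); // 1250
    }
}
```

Because the caller and the object share the same arrays, no setter is needed; the trade-off is that any code holding the array can change the selection.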
 
- 
setData
Not for public use in the future.
setData() is not safe to use because it changes the memory buffer of the dataset object. Dataset operations such as write/read will fail if the buffer type or size is changed.
- Specified by:
- setData in interface DataFormat
- Parameters:
- d- the object data, which must be an array of Objects
 
- 
clearData
Clears the current data buffer in memory and forces the next read() to load the data from file.
The function read() loads data from file into memory only if the data has not already been read. If data is already in memory, read() just returns the memory buffer. Sometimes we want to force read() to re-read data from file. For example, when the selection is changed, we need to re-read the data.
- Specified by:
- clearData in interface DataFormat
- See Also:
- getData(),- DataFormat.read()
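The load-once, clear-to-reload behavior is a simple caching pattern. The sketch below imitates it with a hypothetical CachedSource class whose "file read" is simulated by a counter; it illustrates the pattern only and is not the library's code.

```java
final class LazyBufferDemo {
    /** Hypothetical cache: loads on first access, reloads only after clearData(). */
    static final class CachedSource {
        private int[] data;    // null means "not loaded"
        private int loadCount; // how many times the simulated file was read

        int[] getData() {
            if (data == null) {
                data = new int[] {1, 2, 3}; // stands in for reading from file
                loadCount++;
            }
            return data;
        }

        void clearData() {
            data = null;
        }

        int getLoadCount() {
            return loadCount;
        }
    }

    public static void main(String[] args) {
        CachedSource src = new CachedSource();
        src.getData();
        src.getData();                          // served from memory
        System.out.println(src.getLoadCount()); // 1
        src.clearData();
        src.getData();                          // forces a reload
        System.out.println(src.getLoadCount()); // 2
    }
}
```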
 
- 
getHeight
Returns the dimension size of the vertical axis.
This function is used by GUI applications such as HDFView. GUI applications display a dataset in a 2D table or 2D image. The display order is specified by the index array of selectedIndex as follows:
- selectedIndex[0] -- height
- The vertical axis
- selectedIndex[1] -- width
- The horizontal axis
- selectedIndex[2] -- depth
- The depth axis is used for datasets with three or more dimensions.
    int[] selectedIndex = dataset.getSelectedIndex();
    selectedIndex[0] = 0;
    selectedIndex[1] = 1;
- Specified by:
- getHeight in interface DataFormat
- Returns:
- the size of dimension of the vertical axis.
- See Also:
- getSelectedIndex(),- getWidth()
 
- 
getWidth
Returns the dimension size of the horizontal axis.
This function is used by GUI applications such as HDFView. GUI applications display a dataset in a 2D table or 2D image. The display order is specified by the index array of selectedIndex as follows:
- selectedIndex[0] -- height
- The vertical axis
- selectedIndex[1] -- width
- The horizontal axis
- selectedIndex[2] -- depth
- The depth axis is used for datasets with three or more dimensions.
    int[] selectedIndex = dataset.getSelectedIndex();
    selectedIndex[0] = 0;
    selectedIndex[1] = 1;
- Specified by:
- getWidth in interface DataFormat
- Returns:
- the size of dimension of the horizontal axis.
- See Also:
- getSelectedIndex(),- getHeight()
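One way to read the height and width descriptions is that each axis size is the selected size of the dimension that selectedIndex[] assigns to it. The sketch below encodes that reading in plain Java; it is an assumption about the behavior, and the class DisplaySize with its two methods is hypothetical.

```java
final class DisplaySize {
    /** Selected size of the dimension shown on the vertical axis (assumed behavior). */
    static long height(long[] selectedDims, int[] selectedIndex) {
        return selectedDims[selectedIndex[0]];
    }

    /** Selected size of the dimension shown on the horizontal axis (assumed behavior). */
    static long width(long[] selectedDims, int[] selectedIndex) {
        return selectedDims[selectedIndex[1]];
    }

    public static void main(String[] args) {
        long[] selectedDims = {1, 50, 25, 1};
        int[] selectedIndex = {1, 2, 0}; // rows from dim1, columns from dim2
        System.out.println(height(selectedDims, selectedIndex) + " x "
                + width(selectedDims, selectedIndex));
    }
}
```

Swapping the first two entries of selectedIndex[] transposes the display without touching the selection itself.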
 
- 
getSelectedIndex
Returns the indices of display order.
selectedIndex[] is provided for two purposes:
- selectedIndex[] is used to indicate the order of dimensions for display. selectedIndex[0] is for the row, selectedIndex[1] is for the column and selectedIndex[2] for the depth.
 For example, for a four-dimensional dataset, if selectedIndex[] = {1, 2, 3}, then dim[1] is selected as row index, dim[2] is selected as column index and dim[3] is selected as depth index.
- selectedIndex[] is also used to select dimensions for display for datasets with three or more dimensions. We assume that applications such as HDFView can only display data values up to three dimensions (2D spreadsheet/image with a third dimension which the 2D spreadsheet/image is selected from). For datasets with more than three dimensions, we need selectedIndex[] to tell applications which three dimensions are chosen for display.
 For example, for a four-dimensional dataset, if selectedIndex[] = {1, 2, 3}, then dim[1] is selected as row index, dim[2] is selected as column index and dim[3] is selected as depth index. dim[0] is not selected. Its location is fixed at 0 by default.
- Specified by:
- getSelectedIndex in interface DataFormat
- Returns:
- the array of the indices of display order.
 
- 
getCompression
Returns the string representation of compression information.
For example, "SZIP: Pixels per block = 8: H5Z_FILTER_CONFIG_DECODE_ENABLED".
- Specified by:
- getCompression in interface DataFormat
- Returns:
- the string representation of compression information.
 
- 
getFilters
Returns the string representation of filter information.
- Returns:
- the string representation of filter information.
 
- 
getStorageLayout
Returns the string representation of storage layout information.
- Returns:
- the string representation of storage layout information.
 
- 
getStorage
Returns the string representation of storage information.
- Returns:
- the string representation of storage information.
 
- 
getChunkSize
Returns the array that contains the dimension sizes of the chunk of the dataset. Returns null if the dataset is not chunked.
- Returns:
- the array of chunk sizes, or null if the dataset is not chunked.
 
- 
getDatatype
Description copied from interface: DataFormat
Returns the datatype of the data object.
- Specified by:
- getDatatype in interface DataFormat
- Returns:
- the datatype of the data object.
 
- 
convertFromUnsignedC
Deprecated. Not for public use in the future. Use convertFromUnsignedC(Object, Object) instead.
- Parameters:
- dataIN- the object data
- Returns:
- the converted object
 
- 
convertFromUnsignedC
Converts a one-dimensional array of unsigned C-type integers to a new array of the appropriate Java integer type in memory.
Since Java does not support unsigned integers, values of unsigned C-type integers must be converted into the appropriate Java integer type. Otherwise, the data values will not be displayed correctly. For example, if an unsigned C byte, x = 200, is stored into a Java byte y, y will be -56 instead of the correct value of 200.
Unsigned C integers are upgraded to Java integers according to the following table:
Mapping Unsigned C Integers to Java Integers
    Unsigned C Integer -> Java Integer
    unsigned byte      -> signed short
    unsigned short     -> signed int
    unsigned int       -> signed long
    unsigned long      -> signed long
NOTE: this conversion cannot deal with unsigned 64-bit integers. Therefore, the values of unsigned 64-bit datasets may be wrong in Java applications.
If memory data of unsigned integers is converted by convertFromUnsignedC(), convertToUnsignedC() must be called to convert the data back to unsigned C before data is written into file.
- Parameters:
- dataIN- the input 1D array of the unsigned C-type integers.
- dataOUT- the output converted (or upgraded) 1D array of Java integers.
- Returns:
- the upgraded 1D array of Java integers.
- See Also:
- convertToUnsignedC(Object, Object)
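The widening described in the mapping table can be reproduced with bit masks. This is a minimal plain-Java sketch of the idea, not the implementation of convertFromUnsignedC(); the class UnsignedWiden and its methods are hypothetical names.

```java
final class UnsignedWiden {
    /** unsigned byte -> signed short: mask off the sign extension. */
    static short[] widen(byte[] raw) {
        short[] out = new short[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = (short) (raw[i] & 0xFF);
        }
        return out;
    }

    /** unsigned int -> signed long: same idea with a 32-bit mask. */
    static long[] widen(int[] raw) {
        long[] out = new long[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFFFFFFFFL;
        }
        return out;
    }

    /** For 64-bit values no wider primitive exists; only the text form can show the true value. */
    static String unsignedText(long raw) {
        return Long.toUnsignedString(raw);
    }

    public static void main(String[] args) {
        byte y = (byte) 200;
        System.out.println(y);                            // -56
        System.out.println(widen(new byte[] {y})[0]);     // 200
        System.out.println(widen(new int[] {-1})[0]);     // 4294967295
        System.out.println(unsignedText(-1L));            // 18446744073709551615
    }
}
```

The last line illustrates the NOTE above: an unsigned 64-bit value above Long.MAX_VALUE has no signed primitive that can hold it, so a long buffer shows it as negative.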
 
- 
convertToUnsignedC
Deprecated. Not for public use in the future. Use convertToUnsignedC(Object, Object) instead.
- Parameters:
- dataIN- the input 1D array of Java integers.
- Returns:
- the converted 1D array of unsigned C-type integers.
 
- 
convertToUnsignedC
Converts the array of converted unsigned integers back to unsigned C-type integer data in memory.
If memory data of unsigned integers is converted by convertFromUnsignedC(), convertToUnsignedC() must be called to convert the data back to unsigned C before data is written into file.
- Parameters:
- dataIN- the input array of the Java integer.
- dataOUT- the output array of the unsigned C-type integer.
- Returns:
- the converted data of unsigned C-type integer array.
- See Also:
- convertFromUnsignedC(Object, Object)
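The reverse step is a narrowing cast that keeps only the low-order bits. The sketch below shows it for the short-to-byte case in plain Java; the class UnsignedNarrow and its narrow method are hypothetical and do not reproduce the library's code.

```java
import java.util.Arrays;

final class UnsignedNarrow {
    /** signed short holding 0..255 -> byte carrying the same bit pattern as the unsigned C byte. */
    static byte[] narrow(short[] widened) {
        byte[] out = new byte[widened.length];
        for (int i = 0; i < widened.length; i++) {
            out[i] = (byte) widened[i];
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = narrow(new short[] {200, 0, 255});
        System.out.println(Arrays.toString(raw)); // [-56, 0, -1]
        System.out.println(raw[0] & 0xFF);        // 200
    }
}
```

The bytes print as negative numbers in Java, but their bit patterns are the ones an unsigned C byte would hold, so they can be written to file unchanged.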
 
- 
byteToString
Converts an array of bytes into an array of Strings for a fixed string dataset.
A C-string is an array of chars while a Java String is an object. When a string dataset is read into a Java application, the data is stored in an array of Java bytes. byteToString() is used to convert the array of bytes into an array of Java strings so that applications can display and modify the data content.
For example, the content of a two-element C string dataset is {"ABC", "abc"}. Java applications will read the data into a byte array of {65, 66, 67, 97, 98, 99}. byteToString(bytes, 3) returns an array of Java Strings with strs[0]="ABC" and strs[1]="abc".
If memory data of strings is converted to Java Strings, stringToByte() must be called to convert the memory data back to a byte array before data is written to file.
- Parameters:
- bytes- the array of bytes to convert.
- length- the length of string.
- Returns:
- the array of Java String.
- See Also:
- stringToByte(String[], int)
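The fixed-length split can be sketched in plain Java as follows. This is not the library's byteToString(); the class FixedStrings and the method split are hypothetical, and stopping each string at the first zero byte and decoding as US-ASCII are assumptions made for the illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class FixedStrings {
    /** Cuts bytes into chunks of the given length and decodes each chunk as one string. */
    static String[] split(byte[] bytes, int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        int n = bytes.length / length;
        String[] out = new String[n];
        for (int i = 0; i < n; i++) {
            int offset = i * length;
            int end = 0;
            // assumption: a zero byte terminates the string inside its slot
            while (end < length && bytes[offset + end] != 0) {
                end++;
            }
            out[i] = new String(bytes, offset, end, StandardCharsets.US_ASCII);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = {65, 66, 67, 97, 98, 99};
        System.out.println(Arrays.toString(split(bytes, 3))); // [ABC, abc]
    }
}
```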
 
- 
stringToByte
Converts a string array into an array of bytes for a fixed string dataset.
If memory data of strings is converted to Java Strings, stringToByte() must be called to convert the memory data back to a byte array before data is written to file.
- Parameters:
- strings- the array of string.
- length- the length of string.
- Returns:
- the array of bytes.
- See Also:
- byteToString(byte[] bytes, int length)
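Packing strings back into fixed-length slots is the mirror operation. The sketch below is a plain-Java illustration, not the library's stringToByte(); the class FixedStringPacker and the method pack are hypothetical, and zero padding, truncation of over-long strings and US-ASCII encoding are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class FixedStringPacker {
    /** Writes each string into its own slot of the given length. */
    static byte[] pack(String[] strings, int length) {
        byte[] out = new byte[strings.length * length]; // slots start zero-filled
        for (int i = 0; i < strings.length; i++) {
            byte[] src = strings[i].getBytes(StandardCharsets.US_ASCII);
            // assumption: strings longer than the slot are truncated
            System.arraycopy(src, 0, out, i * length, Math.min(src.length, length));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = pack(new String[] {"ABC", "abc"}, 3);
        System.out.println(Arrays.toString(bytes)); // [65, 66, 67, 97, 98, 99]
    }
}
```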
 
- 
getDimNames
Returns the array of strings that represent the dimension names. Returns null if there is no dimension name.
Some datasets have pre-defined names for each dimension such as "Latitude" and "Longitude". getDimNames() returns these pre-defined names.
- Returns:
- the names of dimensions, or null if there is no dimension name.
 
- 
isString
Checks if a given datatype is a string. Sub-classes must replace this default implementation.
- Parameters:
- tid- The data type identifier.
- Returns:
- true if the datatype is a string; otherwise returns false.
 
- 
getSize
Returns the size in bytes of a given datatype. Sub-classes must replace this default implementation.
- Parameters:
- tid- The data type identifier.
- Returns:
- The size of the datatype
 
- 
getOriginalClass
Get Class of the original data buffer if converted.
- Specified by:
- getOriginalClass in interface DataFormat
- Returns:
- the Class of originalBuf
 
- 
isVirtual
- 
getVirtualFilename
- 
getVirtualMaps
 
-