File Structure for Datasets

The CNN microservice stores simulation data in the HDF5 binary file format. All pre-prepared data sets are available through the microservice API.

HDF5 files have the following structure:

  • “/camera” : Group with the camera geometry description (the current implementation supports only hexagonal grid cameras)
    • “/camera/x” : Dataset of float64 values with shape (N_pixels,) – x coordinates of the camera pixels
    • “/camera/y” : Dataset of float64 values with shape (N_pixels,) – y coordinates of the camera pixels
  • “/data” : Group with the camera images
    • “/data/images” : Dataset of float64 values with shape (N_images, N_pixels) – pixel values in p.e.
    • “/data/index” : Dataset of int32 values with shape (N_images,) – event indexes
  • “/simulation_metadata” : Group with the description of each event (the current implementation requires this group to be present but does not use its contents)
    • “/simulation_metadata/distance_to_core” : Dataset of float32 values with shape (N_images,) – distance from the telescope to the shower core
    • “/simulation_metadata/energy_tev” : Dataset of float32 values with shape (N_images,) – primary particle energy in TeV
    • “/simulation_metadata/is_gamma” : Dataset of bool values with shape (N_images,) – True if the primary particle is a gamma, False otherwise
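
To make the layout above concrete, the following sketch writes a small synthetic file with the same groups, dataset names, dtypes and shapes. It is only an illustration: the function name write_example_dataset and all values are made up for this example, the pixel coordinates are random rather than a real hexagonal grid, and the distance values carry no particular unit.

```python
import os
import tempfile

import h5py
import numpy as np


def write_example_dataset(path, n_images=10, n_pixels=560, seed=0):
    """Write a synthetic HDF5 file that follows the layout described above.

    All values are random placeholders, not real simulation output.
    """
    rng = np.random.default_rng(seed)
    with h5py.File(path, "w") as f:
        # Camera geometry: one float64 coordinate per pixel.
        camera = f.create_group("camera")
        camera.create_dataset("x", data=rng.uniform(-1.0, 1.0, n_pixels))
        camera.create_dataset("y", data=rng.uniform(-1.0, 1.0, n_pixels))

        # Images: one row of float64 pixel values (p.e.) per event.
        data = f.create_group("data")
        images = rng.poisson(5.0, (n_images, n_pixels)).astype(np.float64)
        data.create_dataset("images", data=images)
        data.create_dataset("index", data=np.arange(n_images, dtype=np.int32))

        # Per-event metadata, one value per image.
        meta = f.create_group("simulation_metadata")
        meta.create_dataset(
            "distance_to_core",
            data=rng.uniform(0.0, 500.0, n_images).astype(np.float32))
        meta.create_dataset(
            "energy_tev",
            data=rng.uniform(1.0, 100.0, n_images).astype(np.float32))
        meta.create_dataset(
            "is_gamma",
            data=rng.integers(0, 2, n_images).astype(bool))


if __name__ == "__main__":
    # Write the example into a temporary directory to avoid clutter.
    target = os.path.join(tempfile.mkdtemp(), "dataset.h5")
    write_example_dataset(target)
    print("wrote", target)
```

A file produced this way is only useful for testing code that reads the format; it says nothing about how the pre-prepared data sets were generated.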

An example of how to print the structure of an HDF5 file in Python:

import h5py


def print_structure_of_hdf5(obj, path="/"):
    """Recursively print every group and dataset below `path`."""
    node = obj[path]
    if isinstance(node, h5py.Dataset):
        print("{} : Dataset of {} with shape {}".format(
            path, node.dtype, node.shape))
    elif isinstance(node, h5py.Group):
        print("{} : Group".format(path))
    else:
        print("{} : {}".format(path, type(node)))
    # Only groups (including the file root) have children to descend into.
    if isinstance(node, h5py.Group):
        for key in node.keys():
            child = "{}/{}".format(path if path != "/" else "", key)
            print_structure_of_hdf5(obj, path=child)


# Open read-only so the data set cannot be modified by accident.
with h5py.File("./dataset.h5", "r") as f:
    print_structure_of_hdf5(f)

Output:

/ : Group
/camera : Group 
/camera/x : Dataset of float64 with shape (560,) 
/camera/y : Dataset of float64 with shape (560,) 
/data : Group 
/data/images : Dataset of float64 with shape (10, 560) 
/data/index : Dataset of int32 with shape (10,) 
/simulation_metadata : Group 
/simulation_metadata/distance_to_core : Dataset of float32 with shape (10,) 
/simulation_metadata/energy_tev : Dataset of float32 with shape (10,) 
/simulation_metadata/is_gamma : Dataset of bool with shape (10,) 
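
Once the structure is known, the datasets can be read into NumPy arrays by path. The sketch below is one possible way to do that, not part of the microservice itself: the helper names load_events and split_by_particle_type are invented for this example. It also checks that the shapes agree with each other as the layout requires.

```python
import h5py


def load_events(path):
    """Read camera geometry, images and metadata into NumPy arrays."""
    with h5py.File(path, "r") as f:
        # Slicing with [:] copies each dataset into memory.
        events = {
            "x": f["/camera/x"][:],
            "y": f["/camera/y"][:],
            "images": f["/data/images"][:],
            "index": f["/data/index"][:],
            "energy_tev": f["/simulation_metadata/energy_tev"][:],
            "is_gamma": f["/simulation_metadata/is_gamma"][:].astype(bool),
        }
    # Cross-check the shapes: (N_images, N_pixels) must match the rest.
    n_images, n_pixels = events["images"].shape
    if events["x"].shape != (n_pixels,) or events["y"].shape != (n_pixels,):
        raise ValueError("camera geometry does not match the image width")
    if events["index"].shape != (n_images,):
        raise ValueError("event index does not match the number of images")
    return events


def split_by_particle_type(events):
    """Return (gamma_images, other_images) using the is_gamma flag."""
    mask = events["is_gamma"]
    return events["images"][mask], events["images"][~mask]
```

For example, `gamma, other = split_by_particle_type(load_events("./dataset.h5"))` yields two arrays whose rows are the images of gamma and non-gamma events respectively.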