rail.core.data module

Rail-specific data management

rail.core.data.DATA_STORE()[source]

Return the factory instance

class rail.core.data.DataHandle(tag, data=None, path=None, creator=None)[source]

Bases: object

Class to act as a handle for a bit of data, associating it with a file and providing tools to read and write it to that file

Parameters:
  • tag (str) – The tag under which this data handle can be found in the store

  • data (any or None) – The associated data

  • path (str or None) – The path to the associated file

  • creator (str or None) – The name of the stage that created this data handle

close(**kwargs)[source]

Close the associated file

finalize_write(**kwargs)[source]

Finalize and close file written by chunks

property has_data

Return true if the data for this handle are loaded

property has_path

Return true if the path for the associated file is defined

initialize_write(data_lenght, **kwargs)[source]

Initialize file to be written by chunks

property is_written

Return true if the associated file has been written
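The three status properties (has_data, has_path, is_written) can be illustrated with a small stand-in class. This is only a sketch: SketchHandle is a hypothetical name, not the RAIL class, and the rule that "written" means "the file exists on disk" is an assumption made for the example.

```python
import os
import tempfile


class SketchHandle:
    """Hypothetical stand-in for a data handle; not the RAIL class itself."""

    def __init__(self, tag, data=None, path=None, creator=None):
        self.tag = tag
        self.data = data
        self.path = path
        self.creator = creator

    @property
    def has_data(self):
        # True once data has been attached to the handle
        return self.data is not None

    @property
    def has_path(self):
        # True once a file path has been defined
        return self.path is not None

    @property
    def is_written(self):
        # Assumption: "written" means the file exists on disk
        return self.has_path and os.path.exists(self.path)


# A handle that only holds data in memory
in_memory = SketchHandle("catalog", data=[1, 2, 3], creator="example_stage")

# A handle that only points at a file, checked before and after the file exists
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "catalog.txt")
    on_disk = SketchHandle("catalog", path=path)
    written_before = on_disk.is_written
    with open(path, "w", encoding="utf-8") as fout:
        fout.write("1\n2\n3\n")
    written_after = on_disk.is_written
```

The point of keeping the three checks separate is that a handle can know about data, about a file, about both, or about neither.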

iterator(**kwargs)[source]

Iterator over the data

classmethod make_name(tag)[source]

Construct and return file name for a particular data tag
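One plausible way a class-level suffix could feed into make_name is sketched below. The naming rule (append ".suffix" when the suffix is non-empty) is an assumption for illustration, and both class names are hypothetical.

```python
class SketchHandle:
    """Hypothetical base handle with an empty suffix."""

    suffix = ''

    @classmethod
    def make_name(cls, tag):
        # Assumption: append ".<suffix>" only when the class defines one
        if cls.suffix:
            return f"{tag}.{cls.suffix}"
        return tag


class SketchHdf5Handle(SketchHandle):
    """Hypothetical subclass that only overrides the suffix."""

    suffix = 'hdf5'


name_plain = SketchHandle.make_name("model")
name_hdf5 = SketchHdf5Handle.make_name("catalog")
```

Because make_name is a classmethod, a subclass changes the resulting file name just by overriding suffix.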

open(**kwargs)[source]

Open and return the associated file

Notes

This will simply open the file and return a file-like object to the caller. It will not read or cache the data

read(force=False, **kwargs)[source]

Read and return the data from the associated file
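A force flag on a read method usually implies caching: return data already held unless a re-read is requested. The sketch below shows that pattern with a JSON file and the standard library. SketchJsonHandle and its n_reads counter are hypothetical, and the caching behaviour is an assumption about what force controls.

```python
import json
import os
import tempfile


class SketchJsonHandle:
    """Hypothetical handle that caches what it reads from a JSON file."""

    def __init__(self, tag, data=None, path=None):
        self.tag = tag
        self.data = data
        self.path = path
        self.n_reads = 0  # counts actual disk reads, for illustration only

    def read(self, force=False):
        # Assumption: reuse cached data unless the caller forces a re-read
        if self.data is not None and not force:
            return self.data
        with open(self.path, "r", encoding="utf-8") as fin:
            self.data = json.load(fin)
        self.n_reads += 1
        return self.data


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "values.json")
    with open(path, "w", encoding="utf-8") as fout:
        json.dump({"z": [0.1, 0.2]}, fout)

    handle = SketchJsonHandle("values", path=path)
    first = handle.read()             # reads from disk
    second = handle.read()            # served from the cache
    third = handle.read(force=True)   # reads from disk again
    n_reads = handle.n_reads
```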

set_data(data, partial=False)[source]

Set the data for this handle and set the partial flag; pass partial=True when the data are only a chunk of the full dataset

size(**kwargs)[source]

Return the size of the data associated to this handle

suffix = ''

write(**kwargs)[source]

Write the data to the associated file

write_chunk(start, end, **kwargs)[source]

Write the chunk of data spanning start to end to the associated file
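The chunked-writing methods (initialize_write, set_data, write_chunk, finalize_write) suggest a loop along the following lines. This is a sketch under stated assumptions: SketchChunkHandle is hypothetical, it writes a plain text file rather than any RAIL format, and the bookkeeping attributes are invented for the example.

```python
import os
import tempfile


class SketchChunkHandle:
    """Hypothetical handle that writes one value per line, chunk by chunk."""

    def __init__(self, path):
        self.path = path
        self.data = None
        self.partial = False
        self._file = None
        self.n_expected = 0
        self.n_written = 0

    def set_data(self, data, partial=False):
        # Hold the current chunk and remember that it is not the full dataset
        self.data = data
        self.partial = partial

    def initialize_write(self, data_length):
        # Open the output file once, before any chunk is written
        self.n_expected = data_length
        self.n_written = 0
        self._file = open(self.path, "w", encoding="utf-8")

    def write_chunk(self, start, end):
        # Append the chunk currently held by the handle
        if end - start != len(self.data):
            raise ValueError("chunk bounds do not match the data held by the handle")
        for value in self.data:
            self._file.write(f"{value}\n")
        self.n_written += end - start

    def finalize_write(self):
        # Close the file after the last chunk
        self._file.close()
        self._file = None


values = list(range(10))
chunk_size = 4

with tempfile.TemporaryDirectory() as tmpdir:
    handle = SketchChunkHandle(os.path.join(tmpdir, "values.txt"))
    handle.initialize_write(len(values))
    for start in range(0, len(values), chunk_size):
        end = min(start + chunk_size, len(values))
        handle.set_data(values[start:end], partial=True)
        handle.write_chunk(start, end)
    handle.finalize_write()

    # Read the file back to confirm that the chunks were appended in order
    with open(handle.path, "r", encoding="utf-8") as fin:
        read_back = [int(line) for line in fin]
```

Only one chunk is held in memory at a time, which is the usual reason for splitting a write into initialize, chunk, and finalize steps.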

class rail.core.data.DataStore(**kwargs)[source]

Bases: dict

Class to provide a transient data store

This class:

  1. Associates data products with keys

  2. Provides functions to read and write the various data products to associated files

add_data(key, data, handle_class, path=None, creator='DataStore')[source]

Create a handle for some data, and insert it into the DataStore

allow_overwrite = True

open(key, mode='r', **kwargs)[source]

Open and return the file associated to a particular key

read(key, force=False, **kwargs)[source]

Read the data associated to a particular key

read_file(key, handle_class, path, creator='DataStore', **kwargs)[source]

Create a handle, use it to read a file, and insert it into the DataStore

write(key, **kwargs)[source]

Write the data associated to a particular key

write_all(force=False, **kwargs)[source]

Write all the data in this DataStore
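The idea of a dict that maps keys to handles, with add_data building the handle from a handle class, can be sketched as follows. SketchStore and SketchHandle are hypothetical names, and the behaviour of allow_overwrite (refuse to replace an existing key when it is False) is an assumption, not something the text above confirms.

```python
class SketchHandle:
    """Hypothetical minimal handle: just records what it was given."""

    def __init__(self, tag, data=None, path=None, creator=None):
        self.tag = tag
        self.data = data
        self.path = path
        self.creator = creator


class SketchStore(dict):
    """Hypothetical transient store: a dict whose values are handles."""

    allow_overwrite = True

    def add_data(self, key, data, handle_class, path=None, creator="SketchStore"):
        # Assumption: an existing key may only be replaced if overwriting is allowed
        if key in self and not self.allow_overwrite:
            raise KeyError(f"key {key!r} is already in the store")
        handle = handle_class(key, data=data, path=path, creator=creator)
        self[key] = handle
        return handle


store = SketchStore()
store.add_data("input", [1, 2, 3], SketchHandle)
store.add_data("input", [4, 5, 6], SketchHandle)  # replaces the first handle

strict = SketchStore()
strict.allow_overwrite = False
strict.add_data("input", [1, 2, 3], SketchHandle)
try:
    strict.add_data("input", [4, 5, 6], SketchHandle)
    overwrite_blocked = False
except KeyError:
    overwrite_blocked = True
```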

class rail.core.data.FitsHandle(tag, data=None, path=None, creator=None)[source]

Bases: TableHandle

DataHandle for a table written to fits

suffix = 'fits'

class rail.core.data.FlowDict[source]

Bases: dict

A specialized dict to keep track of individual flow objects. It is a plain dict with these additional features:

  1. Keys are paths

  2. Values are flow objects; this is checked at runtime.

  3. There is a read(path, force=False) method that reads a flow object and inserts it into the dictionary

  4. There is a single static instance of this class

read(path, force=False)[source]

Read a Flow object from disk and add it to this dictionary
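The features listed for this dict (path keys, a runtime type check on values, a caching read, and a single shared instance) can be sketched without PZFlow. Everything here is hypothetical: SketchFlow stands in for a flow object, TypedPathDict is an invented name, and "reading from disk" is replaced by constructing a placeholder object.

```python
class SketchFlow:
    """Hypothetical placeholder for a flow object loaded from a path."""

    def __init__(self, path):
        self.path = path


class TypedPathDict(dict):
    """Hypothetical dict keyed by path that only accepts SketchFlow values."""

    def __setitem__(self, key, value):
        # Runtime check on assigned values
        if not isinstance(value, SketchFlow):
            raise TypeError(f"expected SketchFlow, got {type(value).__name__}")
        super().__setitem__(key, value)

    def read(self, path, force=False):
        # Assumption: load only if the path is unknown or a re-read is forced
        if force or path not in self:
            self[path] = SketchFlow(path)  # stand-in for loading from disk
        return self[path]


# A single module-level instance shared by all users of the module
FLOWS = TypedPathDict()

flow_a = FLOWS.read("flows/example.pkl")
flow_b = FLOWS.read("flows/example.pkl")              # cached, same object
flow_c = FLOWS.read("flows/example.pkl", force=True)  # replaced by a new object

try:
    FLOWS["flows/bad.pkl"] = "not a flow"
    type_checked = False
except TypeError:
    type_checked = True
```

Note that overriding __setitem__ does not cover dict.update or the dict constructor, so a stricter version would need to guard those paths too.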

class rail.core.data.FlowHandle(tag, data=None, path=None, creator=None)[source]

Bases: ModelHandle

A wrapper around a file that describes a PZFlow object

flow_factory = {}

suffix = 'pkl'

class rail.core.data.Hdf5Handle(tag, data=None, path=None, creator=None)[source]

Bases: TableHandle

DataHandle for a table written to HDF5

suffix = 'hdf5'

class rail.core.data.ModelDict[source]

Bases: dict

A specialized dict to keep track of individual estimation model objects. It is a plain dict with these additional features:

  1. Keys are paths

  2. There is a read(path, force=False) method that reads a model object and inserts it into the dictionary

  3. There is a single static instance of this class

open(path, mode, **kwargs)[source]

Open the file and return the file handle

read(path, force=False, reader=None, **kwargs)[source]

Read a model into this dict

write(model, path, force=False, writer=None, **kwargs)[source]

Write the model; this default implementation uses pickle

class rail.core.data.ModelHandle(tag, data=None, path=None, creator=None)[source]

Bases: DataHandle

DataHandle for machine learning models

model_factory = {}

suffix = 'pkl'

class rail.core.data.PqHandle(tag, data=None, path=None, creator=None)[source]

Bases: TableHandle

DataHandle for a parquet table

suffix = 'pq'

class rail.core.data.QPHandle(tag, data=None, path=None, creator=None)[source]

Bases: DataHandle

DataHandle for qp ensembles

suffix = 'hdf5'

class rail.core.data.TableHandle(tag, data=None, path=None, creator=None)[source]

Bases: DataHandle

DataHandle for single tables of data

suffix = None

rail.core.data.default_model_read(modelfile)[source]

Default function to read model files; simply uses pickle.load

rail.core.data.default_model_write(model, path)[source]

Write the model; this default implementation uses pickle
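A pickle-based read and write pair of the kind described here can be sketched with the standard library alone. The function names sketch_model_read and sketch_model_write are hypothetical stand-ins, and the dictionary used as a "model" is invented for the round trip; the RAIL functions may differ in details such as file handling.

```python
import os
import pickle
import tempfile


def sketch_model_read(modelfile):
    """Hypothetical reader: load a pickled object from a file path."""
    with open(modelfile, "rb") as fin:
        return pickle.load(fin)


def sketch_model_write(model, path):
    """Hypothetical writer: pickle an object to a file path."""
    with open(path, "wb") as fout:
        pickle.dump(model, fout)


# Round trip with an invented "model"
model = {"name": "example_estimator", "weights": [0.25, 0.5, 0.25]}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "model.pkl")
    sketch_model_write(model, path)
    restored = sketch_model_read(path)
```

As with any pickle-based format, files should only be loaded from trusted sources, since unpickling can execute arbitrary code.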