rail.core.stage module

Base class for PipelineStages in Rail

class rail.core.stage.RailPipeline[source]

Bases: MiniPipeline

A pipeline intended for interactive use

Mainly this allows for more concise pipeline specification, along the lines of:

    self.stage_1 = Stage1Class.build(…)
    self.stage_2 = Stage2Class.build(connections=dict(input=self.stage_1.io.output), …)

And end up with a fully specified pipeline.

class rail.core.stage.RailStage(args, comm=None)[source]

Bases: PipelineStage

Base class for rail stages

This inherits from ceci.PipelineStage and implements rail-specific data handling. In particular, it provides two useful features:

1. Access to the DataStore, which keeps track of the various data used in a pipeline, and provides access to each by a unique key.

2. Functionality to help manage multiple instances of a particular class of stage. The original ceci design had no mechanism for this: creating several instances of one class led to name clashes between them. ceci 1.7 added support for multiple instances of a single class. It distinguishes between the class name (cls.name) and the name of the particular instance (self.instance_name), and it adds aliasing for inputs and outputs so that different instances of a PipelineStage can give different names to their inputs and outputs. Using that functionality consistently requires some care, so RailStage provides methods that do so and that use the DataStore to keep track of the various data products.

Notes

These methods typically take a tag as input (i.e., something like “input”), but use the “aliased_tag” (i.e., something like “inform_pz_input”) when interacting with the DataStore.

In particular, the get_handle(), get_data() and input_iterator() will get the data from the DataStore under the aliased tag. E.g., if you call self.get_data(‘input’) for a Stage that has aliased “input” to “special_pz_input”, it will get the data associated to “special_pz_input” in the DataStore.

Similarly, add_handle() and set_data() will add the data to the DataStore under the aliased tag. E.g., if you call self.set_data(‘input’, data) for a Stage that has aliased “input” to “special_pz_input”, it will store the data in the DataStore under the key “special_pz_input”.

And connect_input() will do the alias lookup both on the input and output. I.e., it is the same as calling self.set_data(inputTag, other.get_handle(outputTag, allow_missing=True), do_read=False)
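The alias lookup described in these notes can be illustrated with a minimal, self-contained sketch. ToyStage and its plain-dict data store are hypothetical stand-ins written for this example, not the rail or ceci implementation; only the tag names are taken from the text above.

```python
class ToyStage:
    """Hypothetical stand-in for a stage; not the real RailStage."""

    def __init__(self, store, aliases=None):
        self.store = store                  # shared dict playing the role of the DataStore
        self.aliases = dict(aliases or {})  # maps a tag to its aliased tag

    def get_aliased_tag(self, tag):
        # Fall back to the plain tag when no alias has been configured
        return self.aliases.get(tag, tag)

    def set_data(self, tag, data):
        # Store under the aliased tag, never under the plain tag
        self.store[self.get_aliased_tag(tag)] = data
        return data

    def get_data(self, tag):
        # Look up under the aliased tag as well
        return self.store.get(self.get_aliased_tag(tag))


store = {}
stage = ToyStage(store, aliases={"input": "special_pz_input"})
stage.set_data("input", [0.1, 0.5, 1.2])

print(sorted(store))            # ['special_pz_input']
print(stage.get_data("input"))  # [0.1, 0.5, 1.2]
```

The stage code only ever refers to “input”; the aliased key is an implementation detail of the lookup, which is what lets two instances of the same class coexist in one store.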

add_data(tag, data=None)[source]

Adds a handle to the DataStore associated to a particular tag and attaches data to it.

Parameters:
  • tag (str) – The tag (from cls.inputs or cls.outputs) for this data

  • data (any) – The data to attach to the handle

Returns:

data – The data accessed by the handle associated to the tag

Return type:

any

add_handle(tag, data=None, path=None)[source]

Adds a DataHandle associated to a particular tag

Parameters:
  • tag (str) – The tag (from cls.inputs or cls.outputs) for this data

  • data (any or None) – If not None these data will be associated to the handle

  • path (str or None) – If not None, this will be the path used to read the data

Returns:

handle – The handle that gives access to the associated data

Return type:

DataHandle

classmethod build(**kwargs)[source]

Return an object that can be used to build a stage

config_options = {'output_mode': <ceci.config.StageParameter object>}

connect_input(other, inputTag=None, outputTag=None)[source]

Connect another stage to this stage as an input

Parameters:
  • other (RailStage) – The stage whose output is being connected

  • inputTag (str) – Which input tag of this stage to connect to. None -> self.inputs[0]

  • outputTag (str) – Which output tag of the other stage to connect to. None -> other.outputs[0]

Returns:

handle – The input handle for this stage

get_data(tag, allow_missing=True)[source]

Gets the data associated to a particular tag

Notes

1. This gets the data via the DataHandle, and can and will read the data from disk if needed.

Parameters:
  • tag (str) – The tag (from cls.inputs or cls.outputs) for this data

  • allow_missing (bool) – If False, this will raise a KeyError if the tag is not in the DataStore

Returns:

data – The data accessed by the handle associated to the tag

Return type:

any
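As a rough illustration of the note above, a handle can defer reading until the data are first requested. ToyHandle below is a hypothetical stand-in, not the rail DataHandle, and the JSON file is an assumption made only so that the example runs on its own.

```python
import json
import os
import tempfile


class ToyHandle:
    """Hypothetical handle that reads its data from disk on first access."""

    def __init__(self, tag, path=None, data=None):
        self.tag = tag
        self.path = path
        self.data = data
        self.n_reads = 0  # counts how often the file was actually opened

    def get_data(self):
        # Read only if nothing is cached yet and a path is available
        if self.data is None and self.path is not None:
            with open(self.path, encoding="utf-8") as fin:
                self.data = json.load(fin)
            self.n_reads += 1
        return self.data


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "input.json")
    with open(path, "w", encoding="utf-8") as fout:
        json.dump({"z": [0.1, 0.5]}, fout)

    handle = ToyHandle("input", path=path)
    first = handle.get_data()   # triggers the read from disk
    second = handle.get_data()  # served from memory, no second read
```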

get_handle(tag, path=None, allow_missing=False)[source]

Gets a DataHandle associated to a particular tag

Parameters:
  • tag (str) – The tag (from cls.inputs or cls.outputs) for this data

  • path (str or None) – The path to the data, only needed if we might need to read the data

  • allow_missing (bool) – If False, this will raise a KeyError if the tag is not in the DataStore

Returns:

handle – The handle that gives access to the associated data

Return type:

DataHandle

input_iterator(tag, **kwargs)[source]

Iterate over the input associated to a particular tag

Parameters:
  • tag (str) – The tag (from cls.inputs or cls.outputs) for this data

  • kwargs (dict[str, Any]) – These will be passed to the Handle’s iterator method

classmethod make_and_connect(**kwargs)[source]

Make a stage and connect it to other stages

Notes

kwargs are used to set the stage configuration. They should be key, value pairs, where the key is the parameter name and the value is the value we want to assign.

The ‘connections’ keyword is special: it is a dict[str, DataHandle] and should define the input connections for this stage.

Return type:

A stage

name = 'RailStage'

set_data(tag, data, path=None, do_read=True)[source]

Sets the data associated to a particular tag

Notes

1. If data is a DataHandle and tag is one of the input tags, then this will add an alias between the two, i.e., it will set self.config.alias[tag] = data.tag. This allows the user to make connections between stages simply by passing DataHandles between them.

Parameters:
  • tag (str) – The tag (from cls.inputs or cls.outputs) for this data

  • data (any) – The data being set

  • path (str or None) – Can be used to set the path for the data

  • do_read (bool) – If True, will read the data if it is not set

Returns:

data – The data accessed by the handle associated to the tag

Return type:

any
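The handle-passing behaviour described in the set_data() notes, and the connect_input() equivalence given in the class notes, can be sketched with toy classes. Everything below (ToyHandle, ToyStage, the snake_case argument names, the “output_first” alias) is hypothetical and simplified; it is not the rail code and it omits path and do_read handling.

```python
class ToyHandle:
    """Hypothetical handle: a tag plus optional in-memory data."""

    def __init__(self, tag, data=None):
        self.tag = tag
        self.data = data


class ToyStage:
    """Hypothetical stage with one input and one output tag."""

    inputs = ["input"]
    outputs = ["output"]

    def __init__(self, store):
        self.store = store  # dict: aliased tag -> ToyHandle
        self.alias = {}     # plays the role of self.config.alias

    def get_handle(self, tag):
        aliased = self.alias.get(tag, tag)
        if aliased not in self.store:
            self.store[aliased] = ToyHandle(aliased)
        return self.store[aliased]

    def set_data(self, tag, data):
        if isinstance(data, ToyHandle):
            if tag in self.inputs:
                # Passing a handle to an input tag records an alias
                self.alias[tag] = data.tag
            data = data.data
        handle = self.get_handle(tag)
        if data is not None:
            handle.data = data
        return handle.data

    def connect_input(self, other, input_tag=None, output_tag=None):
        # None falls back to the first input / first output
        input_tag = input_tag or self.inputs[0]
        output_tag = output_tag or other.outputs[0]
        return self.set_data(input_tag, other.get_handle(output_tag))


store = {}
first = ToyStage(store)
first.alias["output"] = "output_first"
first.set_data("output", [1, 2, 3])

second = ToyStage(store)
second.connect_input(first)
print(second.alias)  # {'input': 'output_first'}
```

After the call, both stages resolve to the same entry in the store, so the second stage reads whatever the first one wrote without either of them hard-coding the shared key.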

class rail.core.stage.RailStageBuild(stage_class, **kwargs)[source]

Bases: object

A small utility class for building stages

This provides a mechanism to get the name of the stage from the attribute name in the Pipeline the stage belongs to.

I.e., we can do:

a_pipe.stage_name = StageClass.build(…)

And get a stage named ‘stage_name’, rather than having to do:

    a_stage = StageClass.make_stage(..)
    a_pipe.add_stage(a_stage)
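One way such a mechanism could work is for the pipeline to intercept attribute assignment and finish building the stage once the attribute name is known. The sketch below is an assumption about the general technique, using hypothetical ToyStage, ToyStageBuild and ToyPipeline classes and a made-up nbins option; it does not reproduce the rail or ceci classes.

```python
class ToyStage:
    """Hypothetical stage that just records its name and configuration."""

    def __init__(self, name, **config):
        self.name = name
        self.config = config


class ToyStageBuild:
    """Holds the stage class and kwargs until a name is available."""

    def __init__(self, stage_class, **kwargs):
        self.stage_class = stage_class
        self.kwargs = kwargs

    def build(self, name):
        # Called by the pipeline, which supplies the attribute name
        return self.stage_class(name=name, **self.kwargs)


class ToyPipeline:
    """Hypothetical pipeline that names stages after their attribute."""

    def __init__(self):
        # Bypass __setattr__ so the bookkeeping dict is stored directly
        object.__setattr__(self, "stages", {})

    def __setattr__(self, key, value):
        if isinstance(value, ToyStageBuild):
            value = value.build(key)  # the attribute name becomes the stage name
            self.stages[key] = value
        object.__setattr__(self, key, value)


a_pipe = ToyPipeline()
a_pipe.stage_name = ToyStageBuild(ToyStage, nbins=10)
print(a_pipe.stage_name.name)  # stage_name
```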

build(name)[source]

Actually build the stage. This is called by the pipeline the stage belongs to.

Parameters:

name (str) – The name for this stage we are building

Returns:

stage – The newly built stage

Return type:

RailStage

class rail.core.stage.StageIO(parent)[source]

Bases: object

A small utility class for stage input/output

This makes it possible to access stage inputs and outputs as attributes rather than by using the get_handle() method.

In short it maps

a_stage.get_handle(‘input’, allow_missing=True) to a_stage.io.input

This allows users to be more concise when writing pipelines.
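Attribute-style access of this kind is commonly implemented with `__getattr__`. The following is a minimal sketch under that assumption, with hypothetical ToyStage and ToyStageIO classes and placeholder strings in place of real handles. It exposes the helper as an io attribute, matching the `.io.output` usage in the RailPipeline example.

```python
class ToyStageIO:
    """Hypothetical helper that forwards attribute access to get_handle()."""

    def __init__(self, parent):
        self._parent = parent

    def __getattr__(self, item):
        # Only invoked when normal lookup fails, so _parent itself is unaffected
        return self._parent.get_handle(item, allow_missing=True)


class ToyStage:
    """Hypothetical stage holding placeholder handles keyed by tag."""

    def __init__(self):
        self._handles = {"input": "handle-for-input", "output": "handle-for-output"}
        self.io = ToyStageIO(self)

    def get_handle(self, tag, allow_missing=False):
        if tag not in self._handles and not allow_missing:
            raise KeyError(tag)
        # In this toy version a missing tag simply yields None
        return self._handles.get(tag)


a_stage = ToyStage()
print(a_stage.io.input)   # handle-for-input
print(a_stage.io.output)  # handle-for-output
```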