rail.core.stage module
Base class for PipelineStages in Rail
- class rail.core.stage.RailPipeline[source]
Bases:
MiniPipeline
A pipeline intended for interactive use
Mainly this allows for more concise pipeline specification, along the lines of:
self.stage_1 = Stage1Class.build(…)
self.stage_2 = Stage2Class.build(connections=dict(input=self.stage_1.io.output), …)
This yields a fully specified pipeline.
- class rail.core.stage.RailStage(args, comm=None)[source]
Bases:
PipelineStage
Base class for rail stages
This inherits from ceci.PipelineStage and implements rail-specific data handling. In particular, it provides two features:
1. Access to the DataStore, which keeps track of the various data used in a pipeline, and provides access to each by a unique key.
2. Functionality to help manage multiple instances of a particular class of stage. The original ceci design had no mechanism for this, and attempting it led to name clashes between the different instances. In ceci 1.7 we added functionality to allow multiple instances of a single class: we distinguish between the class name (cls.name) and the name of the particular instance (self.instance_name), and we added aliasing for inputs and outputs, so that different instances of a PipelineStage can give different names to their inputs and outputs. Using that functionality consistently requires some care, so this class provides methods that do so, and that use the DataStore to keep track of the various data products.
Notes
These methods typically take a tag as input (e.g., “input”), but use the “aliased_tag” (e.g., “inform_pz_input”) when interacting with the DataStore.
In particular, get_handle(), get_data() and input_iterator() get the data from the DataStore under the aliased tag. E.g., if you call self.get_data(‘input’) for a Stage that has aliased “input” to “special_pz_input”, it will get the data associated to “special_pz_input” in the DataStore.
Similarly, add_handle() and set_data() add the data to the DataStore under the aliased tag. E.g., if you call self.set_data(‘input’) for a Stage that has aliased “input” to “special_pz_input”, it will store the data in the DataStore under the key “special_pz_input”.
Finally, connect_input() does the alias lookup on both the input and the output, i.e., it is the same as calling self.set_data(inputTag, other.get_handle(outputTag, allow_missing=True), do_read=False)
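The alias lookup described in these notes can be illustrated with a small standalone sketch. It does not use RAIL or ceci: ToyStage, its aliases dictionary and the plain dict standing in for the DataStore are all hypothetical names chosen for illustration, and the real classes may differ in detail.

```python
class ToyStage:
    """Hypothetical stand-in for a stage; not the RAIL implementation."""

    def __init__(self, data_store, aliases=None):
        self.data_store = data_store        # plain dict standing in for the DataStore
        self.aliases = dict(aliases or {})  # tag -> aliased tag

    def get_aliased_tag(self, tag):
        # Fall back to the tag itself when no alias is configured
        return self.aliases.get(tag, tag)

    def set_data(self, tag, data):
        # Store under the aliased tag, not the tag the caller used
        self.data_store[self.get_aliased_tag(tag)] = data
        return data

    def get_data(self, tag):
        # Look up under the aliased tag as well
        return self.data_store[self.get_aliased_tag(tag)]


store = {}
stage = ToyStage(store, aliases={"input": "special_pz_input"})
stage.set_data("input", [1, 2, 3])
```

In this sketch the caller only ever uses the tag "input", while the store only ever sees the key "special_pz_input". That separation is what lets two instances of the same stage class share one store without clashing.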
- add_data(tag, data=None)[source]
Adds a handle to the DataStore associated to a particular tag and attaches data to it.
- Parameters:
tag (str) – The tag (from cls.inputs or cls.outputs) for this data
data (any) – The data to attach to the handle
- Returns:
data – The data accessed by the handle associated to the tag
- Return type:
any
- add_handle(tag, data=None, path=None)[source]
Adds a DataHandle associated to a particular tag
- Parameters:
tag (str) – The tag (from cls.inputs or cls.outputs) for this data
data (any or None) – If not None these data will be associated to the handle
path (str or None) – If not None, this will be the path used to read the data
- Returns:
handle – The handle that gives access to the associated data
- Return type:
- config_options = {'output_mode': <ceci.config.StageParameter object>}
- connect_input(other, inputTag=None, outputTag=None)[source]
Connect another stage to this stage as an input
- Parameters:
other (RailStage) – The stage whose output is being connected
inputTag (str) – Which input tag of this stage to connect to. None -> self.inputs[0]
outputTag (str) – Which output tag of the other stage to connect to. None -> other.outputs[0]
- Returns:
handle – The input handle for this stage
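The default-tag behaviour of connect_input() (None falls back to the first input of this stage and the first output of the other) can be sketched as follows. This is a toy model under stated assumptions: ToyStage, its connections dictionary and the use of plain strings for inputs and outputs are hypothetical, and the real method works through DataHandles and the DataStore.

```python
class ToyStage:
    """Hypothetical stand-in for a stage; not the RAIL implementation."""

    inputs = ["input"]    # assumption: plain tag strings
    outputs = ["output"]

    def __init__(self, name):
        self.name = name
        # input tag -> (upstream stage name, upstream output tag)
        self.connections = {}

    def connect_input(self, other, inputTag=None, outputTag=None):
        # None -> first declared input of this stage
        if inputTag is None:
            inputTag = self.inputs[0]
        # None -> first declared output of the other stage
        if outputTag is None:
            outputTag = other.outputs[0]
        self.connections[inputTag] = (other.name, outputTag)
        return self.connections[inputTag]


upstream = ToyStage("inform_pz")
downstream = ToyStage("estimate_pz")
link = downstream.connect_input(upstream)
```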
- get_data(tag, allow_missing=True)[source]
Gets the data associated to a particular tag
Notes
1. This gets the data via the DataHandle, and can and will read the data from disk if needed.
- Parameters:
tag (str) – The tag (from cls.inputs or cls.outputs) for this data
allow_missing (bool) – If False, this will raise a KeyError if the tag is not in the DataStore
- Returns:
data – The data accessed by the handle associated to the tag
- Return type:
any
- get_handle(tag, path=None, allow_missing=False)[source]
Gets a DataHandle associated to a particular tag
- Parameters:
tag (str) – The tag (from cls.inputs or cls.outputs) for this data
path (str or None) – The path to the data, only needed if we might need to read the data
allow_missing (bool) – If False, this will raise a KeyError if the tag is not in the DataStore
- Returns:
handle – The handle that gives access to the associated data
- Return type:
- input_iterator(tag, **kwargs)[source]
Iterate over the input associated to a particular tag
- Parameters:
tag (str) – The tag (from cls.inputs or cls.outputs) for this data
kwargs (dict[str, Any]) – These will be passed to the Handle’s iterator method
- classmethod make_and_connect(**kwargs)[source]
Make a stage and connect it to other stages
Notes
kwargs are used to set the stage configuration; they should be key-value pairs, where the key is the parameter name and the value is the value we want to assign.
The ‘connections’ keyword is special: it is a dict[str, DataHandle] and should define the input connections for this stage.
- Return type:
A stage
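The split between ordinary configuration keywords and the special ‘connections’ keyword can be sketched as below. This is only an illustration of the calling convention described in the notes: ToyStage and its inputs dictionary are hypothetical, a string stands in for a DataHandle, and nzbins is an invented parameter name.

```python
class ToyStage:
    """Hypothetical stand-in for a stage; not the RAIL implementation."""

    def __init__(self, **config):
        self.config = config  # ordinary configuration parameters
        self.inputs = {}      # input tag -> connected handle

    @classmethod
    def make_and_connect(cls, **kwargs):
        # Copy so the caller's dictionary is left untouched
        kwcopy = dict(kwargs)
        # 'connections' is not a configuration parameter, so pull it out first
        connections = kwcopy.pop("connections", {})
        stage = cls(**kwcopy)
        for tag, handle in connections.items():
            stage.inputs[tag] = handle
        return stage


stage = ToyStage.make_and_connect(
    nzbins=50,                               # hypothetical config parameter
    connections={"input": "upstream_handle"},  # a string stands in for a DataHandle
)
```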
- name = 'RailStage'
- set_data(tag, data, path=None, do_read=True)[source]
Sets the data associated to a particular tag
Notes
1. If data is a DataHandle and tag is one of the input tags, then this will add an alias between the two, i.e., it will set self.config.alias[tag] = data.tag. This allows the user to make connections between stages simply by passing DataHandles between them.
- Parameters:
tag (str) – The tag (from cls.inputs or cls.outputs) for this data
data (any) – The data being set
path (str or None) – Can be used to set the path for the data
do_read (bool) – If True, will read the data if it is not set
- Returns:
data – The data accessed by the handle associated to the tag
- Return type:
any
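The note about passing a DataHandle to set_data() can be illustrated with a toy model. ToyHandle and ToyStage are hypothetical names, the store is a plain dict, and only the aliasing rule stated in the note (alias[tag] = data.tag for input tags) is modelled. Reading from disk, paths and do_read are left out.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyHandle:
    """Hypothetical stand-in for a DataHandle."""
    tag: str
    data: Any = None


class ToyStage:
    """Hypothetical stand-in for a stage; not the RAIL implementation."""

    inputs = ("input",)

    def __init__(self, data_store):
        self.data_store = data_store  # plain dict standing in for the DataStore
        self.alias = {}               # tag -> aliased tag

    def set_data(self, tag, data):
        if isinstance(data, ToyHandle):
            # Passing a handle to an input tag records an alias to the handle's tag
            if tag in self.inputs:
                self.alias[tag] = data.tag
            self.data_store[data.tag] = data
            return data.data
        # Plain data is wrapped in a handle stored under the aliased tag
        aliased = self.alias.get(tag, tag)
        self.data_store[aliased] = ToyHandle(aliased, data)
        return data


store = {}
upstream = ToyHandle(tag="output_estimator", data={"z": [0.1, 0.2]})
stage = ToyStage(store)
returned = stage.set_data("input", upstream)
```

After the call, the stage's "input" tag resolves to "output_estimator", so the connection between the two stages is expressed purely by handing over the handle.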
- class rail.core.stage.RailStageBuild(stage_class, **kwargs)[source]
Bases:
object
A small utility class for building stages
This provides a mechanism to get the name of the stage from the attribute name in the Pipeline the stage belongs to.
I.e., we can do:
a_pipe.stage_name = StageClass.build(…)
And get a stage named ‘stage_name’, rather than having to do:
a_stage = StageClass.make_stage(..)
a_pipe.add_stage(a_stage)
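One way such a mechanism could work is to defer construction until the object is assigned to a pipeline attribute, and take the stage name from that attribute. The sketch below shows the idea with standalone toy classes. ToyBuild, ToyStage and ToyPipeline are hypothetical, nbins is an invented parameter, and the use of \_\_setattr\_\_ is an assumption about the approach rather than a description of the RAIL source.

```python
class ToyBuild:
    """Deferred construction: remembers the stage class and its kwargs."""

    def __init__(self, stage_class, **kwargs):
        self.stage_class = stage_class
        self.kwargs = kwargs

    def build(self, name):
        # The name is supplied later, once the attribute name is known
        return self.stage_class(name=name, **self.kwargs)


class ToyStage:
    def __init__(self, name, **config):
        self.name = name
        self.config = config

    @classmethod
    def build(cls, **kwargs):
        return ToyBuild(cls, **kwargs)


class ToyPipeline:
    def __init__(self):
        # Bypass our own __setattr__ while creating the registry
        object.__setattr__(self, "stages", {})

    def __setattr__(self, attr, value):
        if isinstance(value, ToyBuild):
            # Use the attribute name as the stage name and register the stage
            value = value.build(attr)
            self.stages[attr] = value
        object.__setattr__(self, attr, value)


a_pipe = ToyPipeline()
a_pipe.stage_name = ToyStage.build(nbins=10)  # nbins is a made-up parameter
```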
- class rail.core.stage.StageIO(parent)[source]
Bases:
object
A small utility class for Stage Input/ Output
This makes it possible to access stage inputs and outputs as attributes rather than by using the get_handle() method.
In short it maps
a_stage.get_handle(‘input’, allow_missing=True) to a_stage.input
This allows users to be more concise when writing pipelines.
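Attribute-style access of this kind is commonly implemented by forwarding unknown attribute lookups to get_handle(). The following standalone sketch shows that pattern. ToyStageIO and ToyStage are hypothetical, handles are represented by strings, and creating a placeholder when allow_missing=True is an assumption made only so the example has something to return.

```python
class ToyStageIO:
    """Hypothetical stand-in for StageIO; not the RAIL implementation."""

    def __init__(self, parent):
        self._parent = parent

    def __getattr__(self, tag):
        # __getattr__ is only called when normal attribute lookup fails
        if tag.startswith("_"):
            # Avoid recursion on private names (e.g. during copying or pickling)
            raise AttributeError(tag)
        return self._parent.get_handle(tag, allow_missing=True)


class ToyStage:
    def __init__(self):
        self._handles = {}
        self.io = ToyStageIO(self)

    def get_handle(self, tag, allow_missing=False):
        if tag not in self._handles:
            if not allow_missing:
                raise KeyError(tag)
            # Assumption: a missing handle is created on demand
            self._handles[tag] = f"<handle for {tag}>"
        return self._handles[tag]


a_stage = ToyStage()
handle = a_stage.io.input
```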