cleanX.image_work.journaling_pipeline module
- class cleanX.image_work.journaling_pipeline.JournalingPipeline(steps=None, batch_size=None, journal=True, keep_journal=False)
Bases: Pipeline
This class extends Pipeline with the ability to store its progress and state in a database.
- __init__(steps=None, batch_size=None, journal=True, keep_journal=False)
Initializes the pipeline with two additional arguments controlling the behavior of persistent storage. See Pipeline for the remaining arguments.
- Parameters:
journal (Union[bool, str]) – If True is passed, the pipeline will use a preconfigured directory to store the journal. Otherwise, this must be the path to the directory in which to store the journal database.
keep_journal (bool) – Controls whether the journal is kept after successful completion of the pipeline.
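The two modes of the journal argument described above can be sketched roughly as follows. This is not cleanX's code: resolve_journal_dir and DEFAULT_JOURNAL_DIR are hypothetical names, and the actual preconfigured directory is not stated here, so a temporary-directory path stands in for it.

```python
import os
import tempfile

# Hypothetical stand-in for the "preconfigured directory"; the real
# default location used by the library is an unknown here.
DEFAULT_JOURNAL_DIR = os.path.join(tempfile.gettempdir(), "example-journals")


def resolve_journal_dir(journal, default_dir=DEFAULT_JOURNAL_DIR):
    """Map a `journal` argument to a directory path (illustrative only)."""
    if journal is True:
        # True selects the preconfigured location.
        return default_dir
    if isinstance(journal, (str, os.PathLike)):
        # Anything path-like is taken as the journal directory itself.
        return os.fspath(journal)
    # Other values (for example False) are not covered by the
    # description above, so this sketch refuses them.
    raise ValueError(f"unsupported journal value: {journal!r}")
```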
- classmethod restore(journal_dir, skip=0, **overrides)
Restore the previously created journaling pipeline from the last executed step.
- Parameters:
journal_dir (suitable for os.path.join()) – The directory containing the journal database to restore from.
skip – Skip this many steps before attempting to resume the pipeline. This is useful if you know that the step that failed will fail again, but you want to execute the rest of the steps in the pipeline.
**overrides – Arguments to pass to the created pipeline instance that will override those restored from the journal.
- Returns:
A fresh JournalingPipeline object fast-forwarded to the last executed step + skip.
- Return type:
JournalingPipeline
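The "last executed step + skip" arithmetic can be illustrated with a small stand-alone sketch. It assumes that the last executed step is the one that failed and would be retried when skip is 0; the function remaining_steps and the step names are hypothetical and are not part of cleanX.

```python
def remaining_steps(steps, last_executed, skip=0):
    """Return the steps a restored pipeline would still run (illustrative).

    `last_executed` is the index of the step recorded last in the journal.
    """
    if skip < 0:
        raise ValueError("skip must be non-negative")
    resume_at = last_executed + skip
    return steps[resume_at:]


# Hypothetical step names, used only to show the indexing.
steps = ["acquire", "normalize", "crop", "save"]

# Suppose the journal shows that "crop" (index 2) was the last step executed.
retry_failed = remaining_steps(steps, last_executed=2)          # retry "crop"
skip_failed = remaining_steps(steps, last_executed=2, skip=1)   # move past it
```

Under this reading, skip=1 is the setting to use when the failed step is known to fail again.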
- process(source)
Starts this pipeline.
- Parameters:
source (Iterable) – This must be an iterable that yields file names for the images to be processed.
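Any iterable of file names satisfies the description of source. One way to build such an iterable is a generator that walks a directory tree, sketched below; image_file_names and the set of suffixes are assumptions made for illustration, not something cleanX provides.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the data at hand.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def image_file_names(root):
    """Yield the names of image files found under `root`, as strings."""
    for path in sorted(Path(root).rglob("*")):
        # Keep regular files only, matching the suffix case-insensitively.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            yield str(path)
```

The generator could then be handed to process() directly, for example pipeline.process(image_file_names("images")), where pipeline is an already constructed JournalingPipeline and "images" is a placeholder directory.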
- process_batch_agg(batch, step)
- process_batch_parallel(batch, step)
- process_step(step, srciter)