cleanX.image_work package¶
Submodules¶
- cleanX.image_work.image_functions module
Cv2Error
cv2_imread()
crop_np()
crop_np_white()
find_outliers_sum_of_pixels_across_set()
hist_sum_of_pixels_across_set()
crop()
subtle_sharpie_enhance()
harsh_sharpie_enhance()
salting()
simple_rotate_no_pil()
blur_out_edges()
multi_rotation_augmentation_no_pill()
show_major_lines_on_image()
find_big_lines()
separate_image_averager()
dimensions_to_df()
dimensions_to_histo()
proportions_ht_wt_to_histo()
find_very_hazy()
find_by_sample_upper()
find_sample_upper_greater_than_lower()
find_outliers_by_total_mean()
find_outliers_by_mean_to_df()
create_matrix()
find_tiny_image_differences()
tesseract_specific()
find_suspect_text()
find_suspect_text_by_length()
histogram_difference_for_inverts()
inverts_by_sum_compare()
histogram_difference_for_inverts_todf()
find_duplicated_images()
find_duplicated_images_todf()
show_images_in_df()
dataframe_up_my_pics()
Rotator
simple_spinning_template()
make_contour_image()
avg_image_maker()
set_image_variability()
avg_image_maker_by_label()
zero_to_twofivefive_simplest_norming()
rescale_range_from_histogram_low_end()
make_histo_scaled_folder()
give_size_count_df()
give_size_counted_dfs()
image_quality_by_size()
find_close_images()
show_close_images()
image_to_histo()
black_end_ratio()
outline_segment_by_otsu()
binarize_by_otsu()
column_sum_folder()
blind_quality_matrix()
fourier_transf()
pad_to_size()
cut_to_size()
cut_or_pad()
rotated_with_max_clean_area()
noise_sum_cv()
noise_sum_median_blur()
noise_sum_gaussian()
noise_sum_bilateral()
noise_sum_bilateralLO()
noise_sum_5k()
noise_sum_7k()
blind_noise_matrix()
segmented_blind_noise_matrix()
make_inverted()
cv2_phash_for_dupes()
- cleanX.image_work.journaling_pipeline module
- cleanX.image_work.pipeline module
- cleanX.image_work.steps module
get_known_steps()
RegisteredStep
Step
Aggregate
Mean
GroupHistoHtWt
GroupHistoHtWt.__init__()
GroupHistoHtWt.agg()
GroupHistoHtWt.post()
GroupHistoHtWt.__reduce__()
GroupHistoHtWt.aggregate()
GroupHistoHtWt.apply()
GroupHistoHtWt.begin_transaction()
GroupHistoHtWt.commit_transaction()
GroupHistoHtWt.from_cmd_args()
GroupHistoHtWt.pre()
GroupHistoHtWt.read()
GroupHistoHtWt.to_json()
GroupHistoHtWt.write()
GroupHistoPorportion
GroupHistoPorportion.__init__()
GroupHistoPorportion.agg()
GroupHistoPorportion.post()
GroupHistoPorportion.__reduce__()
GroupHistoPorportion.aggregate()
GroupHistoPorportion.apply()
GroupHistoPorportion.begin_transaction()
GroupHistoPorportion.commit_transaction()
GroupHistoPorportion.from_cmd_args()
GroupHistoPorportion.pre()
GroupHistoPorportion.read()
GroupHistoPorportion.to_json()
GroupHistoPorportion.write()
Acquire
Save
FourierTransf
ContourImage
ProjectionHorizoVert
ProjectionHorizoVert.apply()
ProjectionHorizoVert.__init__()
ProjectionHorizoVert.__reduce__()
ProjectionHorizoVert.begin_transaction()
ProjectionHorizoVert.commit_transaction()
ProjectionHorizoVert.from_cmd_args()
ProjectionHorizoVert.read()
ProjectionHorizoVert.to_json()
ProjectionHorizoVert.write()
BlackEdgeCrop
WhiteEdgeCrop
Sharpie
BlurEdges
CleanRotate
Normalize
HistogramNormalize
InvertImages
OtsuBinarize
OtsuLines
Projection
Module contents¶
- cleanX.image_work.create_pipeline(steps, batch_size=None, journal=None, keep_journal=False)¶
Create a pipeline that will execute steps. If journal is not false, create a journaling pipeline that can be picked up from the failed step.
- Parameters:
  - steps (Sequence[Step]) – A sequence of Step objects to be executed in this pipeline.
  - batch_size (int) – Controls how many steps are processed concurrently.
  - journal (Union[bool, str]) – If True is passed, the pipeline code will use a preconfigured directory to store the journal. Otherwise, this must be the path to the directory in which to store the journal database.
  - keep_journal (bool) – Controls whether the journal is kept after successful completion of the pipeline.
- Returns:
  A Pipeline object or one of its descendants.
- Return type:
  Pipeline
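The relationship between steps, the pipeline, and the journal can be illustrated with a toy stand-in. This is not the cleanX implementation: the Step and Pipeline classes below, and the Sharpen example step, are simplified placeholders written only to show the call shape and the journaling idea.

```python
class Step:
    """Toy stand-in for cleanX's Step: transforms one image."""
    def apply(self, image):
        return image

class Sharpen(Step):
    """Hypothetical example step; 'sharpens' by bumping a pixel value."""
    def apply(self, image):
        return image + 1

class Pipeline:
    """Toy pipeline: runs each step over every item, recording progress."""
    def __init__(self, steps, journal=None):
        self.steps = list(steps)
        # With journaling enabled, completed steps are recorded so a
        # failed run could resume from the last one that finished.
        self.journal = [] if journal else None

    def process(self, images):
        for step in self.steps:
            images = [step.apply(im) for im in images]
            if self.journal is not None:
                self.journal.append(type(step).__name__)
        return images

def create_pipeline(steps, batch_size=None, journal=None, keep_journal=False):
    # journal=True -> journaling pipeline that can be restored after a failure
    return Pipeline(steps, journal=journal)

p = create_pipeline(steps=(Sharpen(), Sharpen()), journal=True)
result = p.process([0, 5])
print(result)     # [2, 7]
print(p.journal)  # ['Sharpen', 'Sharpen']
```

In the real library the journal lives in a database on disk (see restore_pipeline below), not in memory, which is what allows resuming across process restarts.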
- cleanX.image_work.restore_pipeline(journal_dir, skip=0, **overrides)¶
Restores a previously interrupted pipeline. The pipeline should have been created with journal set. If the creating code didn't specify the directory in which to keep the journal, it may be obtained in this way:

    p = create_pipeline(steps=(...), journal=True)
    journal_dir = p.journal_dir
    # After the pipeline failed
    p = restore_pipeline(journal_dir)

- Parameters:
  - journal_dir (Suitable for os.path.join()) – The directory containing the journal database to restore from.
  - skip – Skip this many steps before attempting to resume the pipeline. This is useful if you know that the step that failed will fail again, but you want to execute the rest of the steps in the pipeline.
  - **overrides – Arguments to pass to the created pipeline instance that will override those restored from the journal.
- Returns:
  A fresh JournalingPipeline object fast-forwarded to the last executed step + skip.
- Return type:
  JournalingPipeline
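The skip semantics can be sketched in isolation. The helper below is hypothetical (cleanX does this internally against its journal database); it only illustrates how a journal of completed steps plus a skip count determines where execution resumes.

```python
def restore_pipeline_steps(all_steps, completed, skip=0):
    """Hypothetical helper: return the steps still left to run.

    completed -- step names already recorded in the journal
    skip      -- drop this many further steps (e.g. a step known to
                 keep failing) before resuming
    """
    remaining = all_steps[len(completed):]
    return remaining[skip:]

steps = ["acquire", "normalize", "rotate", "save"]
journal = ["acquire"]  # the run failed during "normalize"

print(restore_pipeline_steps(steps, journal))          # ['normalize', 'rotate', 'save']
print(restore_pipeline_steps(steps, journal, skip=1))  # ['rotate', 'save']
```

With skip=1 the failing "normalize" step is bypassed, matching the documented use case of skipping a step that is known to fail again.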