pdmlabs.postprocessing.min_max_scaler

Min-Max scaling post-processor for anomaly score normalization.

MinMaxPostProcessor normalizes scores to the [0, 1] range. It fits the scaler on the data passed to each transform call, so every batch is scaled relative to its own min/max values. This is useful for:

- Normalizing scores to a probability-like [0, 1] range
- Comparing scores across different models/sources
- Computing normalized confidence scores

Note: The scaler is refit on every call, so different batches may end up with different scales. For consistent scaling across batches, use a pre-fitted scaler.
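To illustrate the batch dependence described in the note, here is a minimal pure-Python sketch. The helper `min_max_scale` and its `lo`/`hi` parameters are hypothetical and are not part of pdmlabs; they only mimic the min-max formula `(x - min) / (max - min)`. The zero-range fallback to 0.0 is likewise an assumption made for this sketch.

```python
def min_max_scale(scores, lo=None, hi=None):
    """Scale scores with (x - lo) / (hi - lo).

    Hypothetical helper for illustration only. When lo/hi are omitted,
    the batch's own min/max are used (per-batch scaling).
    """
    lo = min(scores) if lo is None else lo
    hi = max(scores) if hi is None else hi
    span = hi - lo
    if span == 0:
        # Assumed fallback for a constant batch; not pdmlabs behavior.
        return [0.0 for _ in scores]
    return [(s - lo) / span for s in scores]


batch_a = [0.5, 1.0, 2.0]
batch_b = [1.0, 2.0, 5.0]

# Per-batch scaling: the raw score 2.0 lands on different values.
per_batch_a = min_max_scale(batch_a)  # 2.0 is the maximum of batch_a
per_batch_b = min_max_scale(batch_b)  # 2.0 is near the bottom of batch_b

# Fixed range shared by both batches: the raw score 2.0 maps to the same value.
fixed_a = min_max_scale(batch_a, lo=0.5, hi=5.0)
fixed_b = min_max_scale(batch_b, lo=0.5, hi=5.0)
```

With per-batch scaling, the raw score 2.0 becomes 1.0 in the first batch and 0.25 in the second. With a shared, pre-chosen range it maps to the same value in both, which is the property a pre-fitted scaler provides.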

Classes

MinMaxPostProcessor(event_preferences)

Normalize anomaly scores to [0, 1] using min-max scaling.

class pdmlabs.postprocessing.min_max_scaler.MinMaxPostProcessor(event_preferences: EventPreferences)

Bases: PostProcessorInterface

Normalize anomaly scores to [0, 1] using min-max scaling.

This post-processor scales scores so that min score -> 0 and max score -> 1. Each call to transform() fits a new scaler on that batch.

scores_buffer_per_source

Maintains recent scores per source for online/streaming mode.

Type: dict

Examples

>>> from pdmlabs.postprocessing.min_max_scaler import MinMaxPostProcessor
>>> processor = MinMaxPostProcessor(event_preferences={'failure': [], 'reset': []})
>>> processor.fit([df_train], ['bearing_1'], events_df)  # No-op
>>>
>>> scores = [0.5, 1.0, 2.0, 1.5]  # Range [0.5, 2.0]
>>> normalized = processor.transform(scores, 'bearing_1', events_df)
>>> # Result: [0.0, 0.333..., 1.0, 0.666...]  (scaled to [0, 1])

fit(historic_data: list[DataFrame], historic_sources: list[str], event_data: DataFrame, anomaly_ranges=None) → None

No-op fit (scaler is fitted per transform call).

Parameters:

historic_data (list[pd.DataFrame]) – Ignored.
historic_sources (list[str]) – Ignored.
event_data (pd.DataFrame) – Ignored.
anomaly_ranges – Ignored.

get_params()

Return hyperparameters (none for this post-processor).

Returns:

Empty dict {}.

Return type:

dict

transform(scores: list[float], source: str, event_data: DataFrame) → list[float]

Scale scores to the [0, 1] range based on the min/max of this batch.

Fits a new scaler on the provided scores, then transforms them. Note: Each batch is scaled independently, so the same raw score can map to different values in different batches.

Parameters:

scores (list[float]) – Anomaly scores to normalize.
source (str) – Source identifier (unused).
event_data (pd.DataFrame) – Event log (unused).

Returns:

Normalized scores in [0, 1] range (same length as input).

Return type:

list[float]

Examples

>>> scores = [0.5, 1.0, 2.0, 1.5]
>>> normalized = processor.transform(scores, 'bearing_1', events_df)
>>> print(normalized)  # [0.0, 0.333..., 1.0, 0.666...]

transform_one(score_point: float, source: str, is_event: bool) → float

Scale a single score using the accumulated buffer.

Maintains a buffer of recent scores per source. Fits the scaler to the buffer, then normalizes the new score.

Parameters:

score_point (float) – Single anomaly score to normalize.
source (str) – Source identifier (used to maintain separate buffers).
is_event (bool) – Event flag (unused).

Returns:

Normalized score (scaled relative to buffer min/max).

Return type:

float

Note

If the buffer contains only one unique value, the min-max range is 0 and the normalization is undefined; the call may crash or return an unexpected value.
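The buffer-based scaling and the zero-range case can be sketched as follows. This is a hypothetical stand-in, not the pdmlabs implementation: the class name `BufferedMinMax`, the buffer length of 100, appending the new score before computing min/max, and returning 0.0 when the range is zero are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class BufferedMinMax:
    """Hypothetical per-source min-max scaler for streaming scores."""

    def __init__(self, max_len=100):
        # One bounded buffer per source; max_len is an assumed value.
        self.buffers = defaultdict(lambda: deque(maxlen=max_len))

    def transform_one(self, score, source):
        buf = self.buffers[source]
        buf.append(score)  # assumption: the new score joins the buffer first
        lo, hi = min(buf), max(buf)
        if hi == lo:
            # Zero range: guard against division by zero (assumed fallback).
            return 0.0
        return (score - lo) / (hi - lo)


scaler = BufferedMinMax()
first = scaler.transform_one(1.0, "bearing_1")   # buffer has one value
second = scaler.transform_one(3.0, "bearing_1")  # new maximum of the buffer
third = scaler.transform_one(2.0, "bearing_1")   # midpoint of [1.0, 3.0]
other = scaler.transform_one(2.0, "bearing_2")   # separate buffer per source
```

In this sketch the first score for any source hits the zero-range guard, which is the situation the note warns about. Whether to return 0.0, 0.5, or raise in that case is a design choice that depends on how downstream thresholds interpret the normalized score.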
