pdmlabs.postprocessing.min_max_scaler
Min-Max scaling post-processor for anomaly score normalization.
MinMaxPostProcessor normalizes scores to the [0, 1] range. It fits the scaler on the data passed to each transform call, so each batch is scaled relative to its own min/max values. This is useful for:
- Normalizing scores to a probability-like [0, 1] range
- Comparing scores across different models/sources
- Computing normalized confidence scores
Note: The scaler is refit on every call, so the same raw score can map to different normalized values in different batches. For consistent scaling across batches, use a pre-fitted scaler.
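The batch dependence described in the note can be illustrated with a small standalone sketch. It does not use pdmlabs; `min_max_scale`, `ref_lo`, and `ref_hi` are hypothetical names, and mapping a constant batch to 0.0 is an assumption made only for this example.

```python
def min_max_scale(scores, lo=None, hi=None):
    """Scale scores to [0, 1] using the batch's own min/max unless bounds are given."""
    lo = min(scores) if lo is None else lo
    hi = max(scores) if hi is None else hi
    span = hi - lo
    if span == 0:
        # Assumption for this sketch: a constant batch maps to 0.0.
        return [0.0 for _ in scores]
    return [(s - lo) / span for s in scores]


batch_a = [0.5, 1.0, 2.0]
batch_b = [1.0, 3.0, 5.0]

# Per-batch fitting: the raw score 1.0 lands at a different position in each batch.
per_batch_a = min_max_scale(batch_a)  # 1.0 -> (1.0 - 0.5) / 1.5
per_batch_b = min_max_scale(batch_b)  # 1.0 -> 0.0 (it is the batch minimum)

# Fixed reference bounds (hypothetical values): 1.0 maps to the same value in both.
ref_lo, ref_hi = 0.5, 5.0
fixed_a = min_max_scale(batch_a, ref_lo, ref_hi)
fixed_b = min_max_scale(batch_b, ref_lo, ref_hi)
```

With fixed bounds, scores outside `[ref_lo, ref_hi]` would fall outside [0, 1]; whether to clip them is a separate design choice.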
Classes
MinMaxPostProcessor | Normalize anomaly scores to [0, 1] using min-max scaling.
- class pdmlabs.postprocessing.min_max_scaler.MinMaxPostProcessor(event_preferences: EventPreferences)
Bases: PostProcessorInterface
Normalize anomaly scores to [0, 1] using min-max scaling.
This post-processor scales scores so that min score -> 0 and max score -> 1. Each call to transform() fits a new scaler on that batch.
- scores_buffer_per_source
Maintains recent scores per source for online/streaming mode.
- Type:
dict
Examples
>>> from pdmlabs.postprocessing.min_max_scaler import MinMaxPostProcessor
>>> processor = MinMaxPostProcessor(event_preferences={'failure': [], 'reset': []})
>>> processor.fit([df_train], ['bearing_1'], events_df)  # No-op
>>>
>>> scores = [0.5, 1.0, 2.0, 1.5]  # Range [0.5, 2.0]
>>> normalized = processor.transform(scores, 'bearing_1', events_df)
>>> # Result: [0.0, 0.333..., 1.0, 0.666...] (scaled to [0, 1])
- fit(historic_data: list[DataFrame], historic_sources: list[str], event_data: DataFrame, anomaly_ranges=None) -> None
No-op fit (scaler is fitted per transform call).
- Parameters:
historic_data (list[pd.DataFrame]) – Ignored.
historic_sources (list[str]) – Ignored.
event_data (pd.DataFrame) – Ignored.
anomaly_ranges – Ignored.
- get_params()
Return hyperparameters (none for this post-processor).
- Returns:
Empty dict {}.
- Return type:
dict
- transform(scores: list[float], source: str, event_data: DataFrame) -> list[float]
Scale scores to [0, 1] range based on min/max of this batch.
Fits a new scaler on the provided scores, then transforms them. Note: Different batches will have independent scalings.
- Parameters:
scores (list[float]) – Anomaly scores to normalize.
source (str) – Source identifier (unused).
event_data (pd.DataFrame) – Event log (unused).
- Returns:
Normalized scores in [0, 1] range (same length as input).
- Return type:
list[float]
Examples
>>> scores = [0.5, 1.0, 2.0, 1.5]
>>> normalized = processor.transform(scores, 'bearing_1', events_df)
>>> print(normalized)  # [0.0, 0.333..., 1.0, 0.666...]
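The arithmetic for these example scores can be checked by hand with the standard min-max formula, (x - min) / (max - min). The snippet below is a plain-Python check of that formula and does not call pdmlabs.

```python
scores = [0.5, 1.0, 2.0, 1.5]

lo, hi = min(scores), max(scores)  # 0.5 and 2.0
span = hi - lo                     # 1.5

# Standard min-max formula applied element by element.
normalized = [(s - lo) / span for s in scores]

print([round(v, 3) for v in normalized])  # [0.0, 0.333, 1.0, 0.667]
```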
- transform_one(score_point: float, source: str, is_event: bool) -> float
Scale a single score using the accumulated buffer.
Maintains a buffer of recent scores per source. Fits the scaler to that buffer, then normalizes the new score.
- Parameters:
score_point (float) – Single anomaly score to normalize.
source (str) – Source identifier (used to maintain separate buffers).
is_event (bool) – Event flag (unused).
- Returns:
Normalized score (scaled relative to buffer min/max).
- Return type:
float
Note
If the buffer contains only one unique value, the range (max - min) is 0 and the scaling is undefined, so the method may crash or return an unexpected value.
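One way to picture the per-source buffer and the zero-range case is the sketch below. It is a hypothetical stand-in, not the pdmlabs implementation: the class name `StreamingMinMax`, the `buffer_size` parameter, appending the new score before fitting, and the 0.0 fallback for a zero range are all assumptions.

```python
from collections import defaultdict, deque


class StreamingMinMax:
    """Hypothetical per-source streaming min-max scaler (not the pdmlabs class)."""

    def __init__(self, buffer_size=100):
        # One bounded buffer per source; buffer_size is an assumed parameter.
        self.buffers = defaultdict(lambda: deque(maxlen=buffer_size))

    def transform_one(self, score_point, source):
        buf = self.buffers[source]
        buf.append(score_point)  # assumption: the new score joins the buffer first
        lo, hi = min(buf), max(buf)
        if hi == lo:
            # Zero range: avoid dividing by zero with an assumed fallback of 0.0.
            return 0.0
        return (score_point - lo) / (hi - lo)


scaler = StreamingMinMax(buffer_size=3)
outputs = [scaler.transform_one(s, "bearing_1") for s in [1.0, 1.0, 3.0, 2.0]]
print(outputs)  # [0.0, 0.0, 1.0, 0.5]
```

The first two outputs come from the zero-range fallback, since the buffer holds only the value 1.0 at that point. A different fallback (for example 0.5, or raising an error) may suit other pipelines better.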