pdmlabs.postprocessing.post_processor

Abstract base class interface for post-processors.

Post-processors transform anomaly detection scores to improve decision quality. They operate on MODEL OUTPUT (anomaly scores), not raw sensor data:

  • Smoothing: Reduce score variance (moving average, self-tuning normalization)

  • Thresholding: Convert scores to binary anomaly labels or adaptive thresholds

  • Normalization: Scale scores to [0, 1] for consistent interpretation

  • Filtering: Remove noise or refine predictions across time windows
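The four categories above can be sketched as plain functions over a list of scores. This is only an illustration using the standard library; the function names, the trailing-window convention, and the default window size are assumptions and are not taken from pdmlabs.

```python
from statistics import median


def moving_average(scores: list[float], window: int = 3) -> list[float]:
    """Smoothing: mean of the current score and up to window-1 predecessors."""
    out = []
    for i in range(len(scores)):
        chunk = scores[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def min_max(scores: list[float]) -> list[float]:
    """Normalization: rescale to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def threshold(scores: list[float], cutoff: float) -> list[float]:
    """Thresholding: 1.0 where the score exceeds the cutoff, else 0.0."""
    return [1.0 if s > cutoff else 0.0 for s in scores]


def median_filter(scores: list[float], window: int = 3) -> list[float]:
    """Filtering: a trailing median suppresses isolated spikes."""
    return [
        median(scores[max(0, i - window + 1): i + 1])
        for i in range(len(scores))
    ]


raw = [0.0, 1.0, 10.0, 1.0, 0.0]
normalized = min_max(raw)
labels = threshold(normalized, cutoff=0.5)
filtered = median_filter(raw)
smoothed = moving_average(raw)
print(normalized, labels, filtered, smoothed)
```

Each function returns a list of the same length as its input, which matches the contract documented for transform() further down.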

Typical pipeline: RAW DATA -> (Preprocessing) -> SENSOR FEATURES -> (Detection Model) -> ANOMALY SCORES -> (PostProcessing) -> FINAL ANOMALY LABELS/CONFIDENCE

Classes

PostProcessorInterface(event_preferences)

Abstract base class for anomaly score post-processors.

class pdmlabs.postprocessing.post_processor.PostProcessorInterface(event_preferences: EventPreferences)

Bases: ABC

Abstract base class for anomaly score post-processors.

Post-processors operate on model outputs (anomaly scores) to improve:

  • Score quality (smoothing, normalization)

  • Interpretability (thresholding to binary labels)

  • Robustness (adaptive thresholds based on history)

Each post-processor must implement fit() and get_params(), plus transformation in two modes:

  • Batch mode: transform() processes many scores at once

  • Online/streaming mode: transform_one() processes one score at a time
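A minimal sketch of a concrete subclass may help make the contract tangible. The PostProcessorInterface below is a local stand-in that only mirrors the signatures documented on this page, so the example runs without pdmlabs. The MovingAverage class, its window_length parameter, and passing None for event_preferences are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from collections import defaultdict, deque
from typing import Any


class PostProcessorInterface(ABC):
    """Local stand-in mirroring the documented signatures (not the pdmlabs class)."""

    def __init__(self, event_preferences: Any):
        self.event_preferences = event_preferences

    @abstractmethod
    def fit(self, historic_data, historic_sources, event_data, anomaly_ranges=None) -> None: ...

    @abstractmethod
    def get_params(self) -> dict: ...

    @abstractmethod
    def transform(self, scores: list[float], source: str, event_data) -> list[float]: ...

    @abstractmethod
    def transform_one(self, score_point: float, source: str, is_event: bool) -> float: ...


class MovingAverage(PostProcessorInterface):
    """Hypothetical smoother: trailing mean over the last window_length scores."""

    def __init__(self, event_preferences: Any, window_length: int = 3):
        super().__init__(event_preferences)
        self.window_length = window_length
        # One bounded buffer per source for streaming mode.
        self._buffers = defaultdict(lambda: deque(maxlen=self.window_length))

    def fit(self, historic_data, historic_sources, event_data, anomaly_ranges=None) -> None:
        return None  # stateless: nothing to learn from training data

    def get_params(self) -> dict:
        return {"window_length": self.window_length}

    def transform(self, scores, source, event_data):
        # Batch mode uses a fresh window, so it does not touch streaming state.
        window = deque(maxlen=self.window_length)
        out = []
        for score in scores:
            window.append(score)
            out.append(sum(window) / len(window))
        return out

    def transform_one(self, score_point, source, is_event):
        # is_event is ignored in this sketch.
        buf = self._buffers[source]
        buf.append(score_point)
        return sum(buf) / len(buf)


# event_preferences=None is a placeholder; the real class expects EventPreferences.
pp = MovingAverage(event_preferences=None, window_length=2)
batch = pp.transform([1.0, 3.0, 5.0], source="bearing_1", event_data=None)
stream = [pp.transform_one(s, source="bearing_1", is_event=False) for s in [1.0, 3.0, 5.0]]
print(batch, stream)
```

In this sketch the batch and streaming paths produce the same values for the same input sequence. That is a convenient property to test for, though the page does not state that every post-processor guarantees it.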

event_preferences

Event configuration dict

Type:

EventPreferences

abstract fit(historic_data: list[DataFrame], historic_sources: list[str], event_data: DataFrame, anomaly_ranges=None) -> None

Fit the post-processor on training data (anomaly scores or raw data).

Some post-processors are stateless, so their fit() does nothing. Others compute statistics from the training data to calibrate thresholds or normalization.

Parameters:
  • historic_data (list[pd.DataFrame]) – Training data, one DataFrame per source.

  • historic_sources (list[str]) – Source identifiers.

  • event_data (pd.DataFrame) – Event log with failure/reset events.

  • anomaly_ranges – Optional data structure marking normal/anomalous regions.
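As an illustration of a fit() that is not a no-op, the sketch below calibrates one cutoff per source as mean plus k standard deviations of the historic scores. Everything specific here is an assumption: the ThresholdCalibrator class, the k parameter, the "score" column name, and the pairing of historic_data with historic_sources by position. Plain dicts stand in for DataFrames so the example needs only the standard library; column access with frame["score"] works the same way on a pandas DataFrame.

```python
from statistics import mean, pstdev


class ThresholdCalibrator:
    """Hypothetical post-processor that learns a per-source cutoff in fit()."""

    def __init__(self, k: float = 3.0, score_column: str = "score"):
        self.k = k
        self.score_column = score_column  # assumed column name
        self.cutoffs: dict[str, float] = {}

    def fit(self, historic_data, historic_sources, event_data, anomaly_ranges=None) -> None:
        # Assumption: historic_data[i] belongs to historic_sources[i].
        for frame, source in zip(historic_data, historic_sources):
            values = [float(v) for v in frame[self.score_column]]
            self.cutoffs[source] = mean(values) + self.k * pstdev(values)

    def transform(self, scores, source, event_data):
        cutoff = self.cutoffs[source]
        return [1.0 if s > cutoff else 0.0 for s in scores]


# Dicts of lists stand in for DataFrames in this sketch.
historic = [{"score": [1.0, 2.0, 3.0]}, {"score": [10.0, 10.0, 10.0]}]
calib = ThresholdCalibrator(k=1.0)
calib.fit(historic, ["bearing_1", "bearing_2"], event_data=None)
labels = calib.transform([2.5, 3.5], source="bearing_1", event_data=None)
print(calib.cutoffs, labels)
```

The event_data and anomaly_ranges arguments are accepted but unused here. A real implementation might use them to exclude known anomalous regions before computing the statistics.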

abstract get_params()

Return hyperparameters.

Returns:

Hyperparameter names and values (e.g., {'window_length': 5}).

Return type:

dict

abstract transform(scores: list[float], source: str, event_data: DataFrame) -> list[float]

Transform batch of anomaly scores (offline/batch mode).

Parameters:
  • scores (list[float]) – Anomaly scores to post-process.

  • source (str) – Source identifier (e.g., 'bearing_1').

  • event_data (pd.DataFrame) – Event log (unused by most post-processors).

Returns:

Post-processed scores (same length as input).

Return type:

list[float]

abstract transform_one(score_point: float, source: str, is_event: bool) -> float

Transform single anomaly score (online/streaming mode).

Used when processing one score at a time (e.g., real-time anomaly detection). Maintains an internal buffer for context-aware transformations such as a moving average.

Parameters:
  • score_point (float) – Single anomaly score to post-process.

  • source (str) – Source identifier (used to maintain per-source state).

  • is_event (bool) – Whether this score is from an event sample.

Returns:

Post-processed score.

Return type:

float
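The per-source state mentioned for transform_one() can be sketched with a dictionary keyed by source. The example uses an exponential moving average rather than a windowed buffer, simply to keep the state to one number per source. The StreamingEMA class and its alpha parameter are hypothetical, and resetting the state when is_event is True is an assumed policy, not documented behavior.

```python
class StreamingEMA:
    """Hypothetical streaming smoother keeping one running value per source."""

    def __init__(self, alpha: float = 0.5):
        self.alpha = alpha
        self._state: dict[str, float] = {}

    def transform_one(self, score_point: float, source: str, is_event: bool) -> float:
        if is_event:
            # Assumption: an event (e.g. a reset) invalidates earlier context.
            self._state.pop(source, None)
        previous = self._state.get(source)
        if previous is None:
            smoothed = score_point  # first score after start or reset
        else:
            smoothed = self.alpha * score_point + (1 - self.alpha) * previous
        self._state[source] = smoothed
        return smoothed


ema = StreamingEMA(alpha=0.5)
first = [ema.transform_one(s, "bearing_1", False) for s in [0.0, 1.0, 1.0]]
other = ema.transform_one(4.0, "bearing_2", False)     # independent state
after_event = ema.transform_one(2.0, "bearing_1", True)  # state was reset
print(first, other, after_event)
```

Keying the state by source keeps interleaved streams from different machines from contaminating each other, which is presumably why the method takes a source argument.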