pdmlabs.postprocessing.self_tuning

Self-tuning score normalization post-processor (adaptive z-score).

SelfTuningPostProcessor normalizes anomaly scores using an adaptive z-score transformation based on a window of historical scores:

z = (score - mean) / std_dev

The mean and standard deviation are estimated from the initial window_length scores, and the resulting normalization is then applied to all scores. This rescales the output to the observed score distribution.

Useful when:
  • Anomaly score ranges vary across datasets or models.

  • Scores should be normalized to a standard-normal-like distribution.

  • Thresholding at 0 or at fixed values (e.g., threshold=2.0 for 2-sigma).
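As a rough illustration of the use cases listed above, the following standalone sketch normalizes two score series that live on very different scales and applies the same 2-sigma threshold to both. It does not call pdmlabs: the helper name `normalize`, the sample values, and the use of the population standard deviation (`statistics.pstdev`) are assumptions made for this example only.

```python
import statistics


def normalize(scores, window_length):
    """Z-score every score against the first window_length scores.

    Hypothetical helper for illustration; not part of pdmlabs.
    Assumes the window is not constant (std > 0).
    """
    window = scores[:window_length]
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)  # assumption: population standard deviation
    return [(s - mean) / std for s in scores]


# Two detectors that emit scores on very different scales (made-up values).
model_a = [0.10, 0.12, 0.11, 0.09, 0.10, 0.50]
model_b = [100.0, 120.0, 110.0, 90.0, 100.0, 500.0]

threshold = 2.0  # one "2-sigma" threshold shared by both models
alarms_a = [z > threshold for z in normalize(model_a, window_length=5)]
alarms_b = [z > threshold for z in normalize(model_b, window_length=5)]
```

After normalization both series flag only their final score, even though the raw values differ by three orders of magnitude.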

Classes

SelfTuningPostProcessor(event_preferences, ...)

Normalize scores using adaptive z-score (mean and std from window).

class pdmlabs.postprocessing.self_tuning.SelfTuningPostProcessor(event_preferences: EventPreferences, window_length: int)

Bases: PostProcessorInterface

Normalize scores using adaptive z-score (mean and std from window).

Computes mean and standard deviation from the first window_length scores, then normalizes all scores: (score - mean) / std. Handles edge case where std=0 by returning only (score - mean).
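The zero-variance fallback described above can be sketched in a few lines. This is a minimal standalone approximation, not the pdmlabs implementation: `normalize_with_fallback` is a hypothetical name, and the population standard deviation is an assumption.

```python
import statistics


def normalize_with_fallback(scores, window_length):
    """Hypothetical helper: z-score with a centering-only fallback when std == 0."""
    window = scores[:window_length]
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)  # assumption: population standard deviation
    if std == 0:
        # Constant window: dividing would fail, so only subtract the mean.
        return [s - mean for s in scores]
    return [(s - mean) / std for s in scores]


# Constant window: the output is centered but keeps the original scale.
flat = normalize_with_fallback([0.5, 0.5, 0.5, 0.9], window_length=3)

# Varying window: the output is centered and scaled.
varied = normalize_with_fallback([1.0, 2.0, 3.0, 5.0], window_length=3)
```

In the constant case the last value comes out as roughly 0.4, the raw distance from the window mean, so a threshold tuned for z-scores may need separate handling for such sources.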

window_length

Number of initial scores to use for computing mean and std.

Type:

int

scores_buffer_per_source

Maintains recent scores per source for online/streaming mode.

Type:

dict

Examples

>>> from pdmlabs.postprocessing.self_tuning import SelfTuningPostProcessor
>>> processor = SelfTuningPostProcessor(
...     event_preferences={'failure': [], 'reset': []},
...     window_length=10
... )
>>> # df_train and events_df are existing pandas DataFrames; fit() is a no-op and ignores them
>>> processor.fit([df_train], ['bearing_1'], events_df)
>>>
>>> scores = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 2.0, 3.0]
>>> normalized = processor.transform(scores, 'bearing_1', events_df)
>>> # First 10 scores used to compute mean/std, then all normalized

fit(historic_data: list[DataFrame], historic_sources: list[str], event_data: DataFrame, anomaly_ranges=None) → None

No-op fit (normalization is computed from score window).

Parameters:
  • historic_data (list[pd.DataFrame]) – Ignored.

  • historic_sources (list[str]) – Ignored.

  • event_data (pd.DataFrame) – Ignored.

  • anomaly_ranges – Ignored.

get_params()

Return hyperparameters.

Returns:

{'window_length': number of scores to use for computing mean/std}

Return type:

dict

transform(scores: list[float], source: str, event_data: DataFrame) → list[float]

Normalize scores using z-score from initial window.

Computes mean and std from first window_length scores (removing duplicates). Then normalizes all scores: (score - mean) / std. If std=0, returns (score - mean) instead.
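The note about removing duplicates can be read in more than one way. The sketch below assumes one reading: repeated values inside the first window_length scores are dropped before the statistics are computed. The helper `window_stats`, its `drop_duplicates` flag, and the sample values are hypothetical and only illustrate how de-duplication changes the estimates.

```python
import statistics


def window_stats(scores, window_length, drop_duplicates=True):
    """Mean and population std of the initial window (hypothetical helper)."""
    window = scores[:window_length]
    if drop_duplicates:
        # dict.fromkeys keeps the first occurrence of each value, in order.
        window = list(dict.fromkeys(window))
    return statistics.fmean(window), statistics.pstdev(window)


scores = [1.0, 1.0, 1.0, 1.0, 2.0, 4.0]

# Unique window values are [1.0, 2.0].
mean_unique, std_unique = window_stats(scores, window_length=5)

# Full window is [1.0, 1.0, 1.0, 1.0, 2.0].
mean_all, std_all = window_stats(scores, window_length=5, drop_duplicates=False)
```

Under this reading, a long run of identical scores no longer pulls the mean toward the repeated value or shrinks the standard deviation, which makes later scores look less extreme.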

Parameters:
  • scores (list[float]) – Anomaly scores to normalize.

  • source (str) – Source identifier (unused).

  • event_data (pd.DataFrame) – Event log (unused).

Returns:

Normalized scores (same length as input).

Return type:

list[float]

Examples

>>> # Assumes a processor created with window_length=5
>>> scores = [1.0, 1.1, 1.2, 1.3, 1.4, 2.0, 3.0]  # First 5 have mean 1.2; last 2 are outliers
>>> normalized = processor.transform(scores, 'bearing_1', events_df)
>>> # After normalization the first 5 scores have mean 0 and std 1

transform_one(score_point: float, source: str, is_event: bool) → float

Normalize single score using buffered window (online mode).

Maintains a buffer of the first window_length scores. Once buffer is full, computes mean/std from buffer and normalizes the incoming score.

Parameters:
  • score_point (float) – Single anomaly score to normalize.

  • source (str) – Source identifier (used to maintain separate buffers).

  • is_event (bool) – Event flag (unused).

Returns:

If the buffer holds fewer than window_length scores: returns the score unchanged.

Otherwise: returns the score normalized with the buffered mean/std.

Return type:

float
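The online behavior described for transform_one can be approximated with a small standalone class. This is a sketch under stated assumptions, not the pdmlabs code: the class name `OnlineZScore` and its `buffers` attribute are hypothetical, the population standard deviation is assumed, and the score that fills the buffer is assumed to be returned unchanged along with the earlier ones.

```python
import statistics
from collections import defaultdict


class OnlineZScore:
    """Hypothetical stand-in for the online path; not the pdmlabs class."""

    def __init__(self, window_length):
        self.window_length = window_length
        self.buffers = defaultdict(list)  # one score buffer per source

    def transform_one(self, score, source):
        buffer = self.buffers[source]
        if len(buffer) < self.window_length:
            # Still filling the initial window: store the score, pass it through.
            buffer.append(score)
            return score
        # Buffer is full and frozen: normalize against its statistics.
        mean = statistics.fmean(buffer)
        std = statistics.pstdev(buffer)  # assumption: population standard deviation
        return (score - mean) / std if std != 0 else score - mean


norm = OnlineZScore(window_length=3)
out_a = [norm.transform_one(s, "bearing_1") for s in [1.0, 2.0, 3.0, 5.0]]
out_b = [norm.transform_one(s, "bearing_2") for s in [10.0, 10.0]]
```

Because each source has its own buffer, the scores for "bearing_2" are still passed through unchanged while "bearing_1" has already switched to normalized output.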