pdmlabs.postprocessing.dynamicth

Dynamic adaptive thresholding post-processor (NASA LSTM Anomaly Detection).

DynamicThresholder implements an advanced adaptive thresholding algorithm adapted from NASA’s “Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding” paper. It converts scores to binary labels using statistical methods combined with anomaly sequence detection and pruning.

The algorithm:

1. Finds the optimal threshold by maximizing its impact on the normal vs. anomalous distributions
2. Groups detected anomalies into sequences
3. Prunes false positives using percentage difference criteria
4. Returns binary labels (0=normal, 1=anomaly)
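Step 2 of the algorithm, grouping detected anomalies into sequences, can be pictured with a small standalone sketch. The helper name `group_sequences` and the (start, end) output format are assumptions made for illustration; they are not part of the pdmlabs API.

```python
def group_sequences(labels):
    """Group consecutive 1-labels into (start, end) index pairs, end inclusive."""
    sequences = []
    start = None
    for i, flag in enumerate(labels):
        if flag and start is None:
            start = i                          # a new run of anomalies begins
        elif not flag and start is not None:
            sequences.append((start, i - 1))   # the run ended at the previous index
            start = None
    if start is not None:                      # close a run that reaches the end
        sequences.append((start, len(labels) - 1))
    return sequences


labels = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1]
print(group_sequences(labels))  # [(1, 2), (5, 5), (7, 9)]
```

Treating a run of flagged points as one unit is what lets the later pruning step judge a cluster as a whole instead of point by point.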

Useful when:

- You need sophisticated multi-pass anomaly detection
- The baseline shifts significantly over time
- You want to filter out isolated false positives (pruning)

Functions

dynamicThresholding(MAerror[, ...])

Adaptive thresholding with anomaly sequence detection and pruning.

Classes

DynamicThresholder(event_preferences[, ...])

Advanced adaptive thresholding using statistical and sequence analysis.

class pdmlabs.postprocessing.dynamicth.DynamicThresholder(event_preferences: EventPreferences, epsilon: float = 0.05, history_window=None)

Bases: PostProcessorInterface

Advanced adaptive thresholding using statistical and sequence analysis.

Implements a multi-pass thresholding algorithm that:

- Tests multiple threshold candidates (in range mean ± [3-5]*std)
- Scores each threshold based on its impact on the mean/std of the normal vs. anomaly groups
- Selects the threshold that best separates normal from anomalous scores
- Groups anomalies into sequences and evaluates their statistical significance
- Prunes anomalies with a small impact on the distribution

epsilon

Pruning threshold. An anomaly is kept only if the percentage difference between consecutive anomaly impacts exceeds this value; smaller fluctuations are filtered out.

Type: float

history_window

Number of most recent scores used for the threshold calculation. If None, the entire history is used (the window is then set to 1 and alldata to True).

Type: int

alldata

If True, use the entire history (not just the recent window).

Type: bool

anomaly_scores_dict

Maintains the score history per source.

Type: dict
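To illustrate how a per-source history and a recent-score window can fit together, here is a minimal sketch. The class `ScoreHistory` and its methods are hypothetical stand-ins written for this example; the actual layout of anomaly_scores_dict inside DynamicThresholder may differ.

```python
from collections import defaultdict


class ScoreHistory:
    """Hypothetical helper: one score list per source, optionally windowed."""

    def __init__(self, history_window=None):
        if history_window is not None and history_window < 1:
            raise ValueError("history_window must be a positive integer or None")
        self.history_window = history_window
        self.scores = defaultdict(list)   # source name -> list of scores

    def append(self, source, score):
        self.scores[source].append(score)

    def recent(self, source):
        """Return the scores a threshold calculation would look at."""
        history = self.scores[source]
        if self.history_window is None:   # no window: use everything
            return list(history)
        return history[-self.history_window:]


history = ScoreHistory(history_window=3)
for value in [0.5, 0.6, 0.55, 1.5, 0.7]:
    history.append("bearing_1", value)
print(history.recent("bearing_1"))  # [0.55, 1.5, 0.7]
```

Keeping the histories keyed by source means a noisy sensor does not shift the threshold of a quiet one.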

References

“Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding” - Provides the core algorithm and evaluation metrics.

Examples

>>> from pdmlabs.postprocessing.dynamicth import DynamicThresholder
>>> processor = DynamicThresholder(
...     event_preferences={'failure': [], 'reset': []},
...     epsilon=0.05,  # Prune if < 5% difference
...     history_window=1000,  # Use last 1000 scores
... )
>>> processor.fit([df_train], ['bearing_1'], events_df)
>>>
>>> scores = [0.5, 0.6, 0.55, 1.5, 0.7, 3.5, 0.8]
>>> labels = processor.transform(scores, 'bearing_1', events_df)
>>> # Returns [0, 0, 0, 1, 0, 1, 0]  (thresholds adapt as history grows)

fit(historic_data: list[DataFrame], historic_sources: list[str], event_data: DataFrame, anomaly_ranges=None) → None

No-op fit (thresholds computed on-the-fly during transform).

Parameters:

historic_data (list[pd.DataFrame]) – Ignored.
historic_sources (list[str]) – Ignored.
event_data (pd.DataFrame) – Ignored.
anomaly_ranges – Ignored.

get_params()

Return hyperparameters.

Returns:

{‘epsilon’: pruning threshold, ‘history_window’: window size, ‘All data in history’: whether the entire history is used}

Return type:

dict

transform(scores: list[float], source: str, event_data: DataFrame) → list[float]

Convert scores to binary labels with dynamic thresholding.

Processes scores sequentially, computing an adaptive threshold for each point from the distribution of the scores previously seen for that source (limited to history_window when it is set). Threshold selection and false-positive pruning follow the multi-pass algorithm described for the class above.

Parameters:

scores (list[float]) – Anomaly scores to threshold.
source (str) – Source identifier (used to maintain separate histories).
event_data (pd.DataFrame) – Event log (unused).

Returns:

Binary anomaly labels (0 or 1).

Return type:

list[float]

Examples

>>> scores = [0.5, 0.6, 0.55, 1.2, 0.7, 2.5, 0.8]
>>> labels = processor.transform(scores, 'bearing_1', events_df)
>>> # Returns adaptive binary labels accounting for distribution changes
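The sequential behaviour described for transform can be sketched without the library. In the sketch below, `transform_sequentially` and `stand_in_rule` are hypothetical names, and the rule (mean + 3 standard deviations of the earlier scores) is a deliberately simple placeholder, not the pruning-based algorithm DynamicThresholder uses.

```python
from statistics import mean, pstdev


def stand_in_rule(history):
    """Placeholder decision rule, NOT the library's algorithm:
    flag the newest score if it exceeds mean + 3 * std of the earlier ones."""
    *earlier, newest = history
    if len(earlier) < 2:          # too little data to estimate a spread
        return 0
    spread = pstdev(earlier)
    if spread == 0:               # degenerate history, nothing to compare against
        return 0
    return int(newest > mean(earlier) + 3 * spread)


def transform_sequentially(scores, rule=stand_in_rule):
    """Label each score using only the scores seen up to and including it."""
    history, labels = [], []
    for score in scores:
        history.append(score)
        labels.append(rule(history))
    return labels


print(transform_sequentially([0.5, 0.6, 0.55, 1.2, 0.7, 2.5, 0.8]))
# [0, 0, 0, 1, 0, 1, 0]
```

Because each label depends only on earlier scores, the same loop body can serve an online setting where scores arrive one at a time.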

transform_one(score_point: float, source: str, is_event: bool) → float

Threshold single score using dynamic thresholding (online mode).

Parameters:

score_point (float) – Single anomaly score to threshold.
source (str) – Source identifier (used to maintain separate histories).
is_event (bool) – Event flag (unused).

Returns:

1 if score is flagged as anomaly, 0 otherwise.

Return type:

float

pdmlabs.postprocessing.dynamicth.dynamicThresholding(MAerror, DesentThreshold=0.02, hscaleCount=1000, alldata=False)

Adaptive thresholding with anomaly sequence detection and pruning.

Advanced algorithm from NASA’s spacecraft anomaly detection research. Uses a multi-pass approach:

1. Test multiple threshold candidates (mean ± 3-5 stds)
2. Score each candidate by its impact on distribution separation (Δμ/μ + Δσ/σ)
3. Group detected anomalies into temporal sequences
4. Prune weak anomalies based on a percentage change threshold

This makes detection robust to:

- Isolated false positives (pruned if impact < epsilon)
- Score distribution shifts (adaptive threshold per point)
- Clustered anomalies (treated as a sequence, not as individuals)
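The pruning pass can be sketched as follows. This follows the pruning rule as described in the referenced NASA paper (rank the peak score of each sequence, append the largest non-anomalous score, and keep everything up to the last relative drop that exceeds the threshold). The function name `prune_sequences`, its arguments, and the assumption of strictly positive scores are illustrative; the pdmlabs implementation may differ in detail.

```python
def prune_sequences(sequence_peaks, normal_peak, epsilon=0.05):
    """Return the indices (into sequence_peaks) of sequences that survive pruning.

    sequence_peaks: maximum score of each anomalous sequence
    normal_peak:    maximum score among the non-anomalous points
    epsilon:        minimum relative drop between consecutive ranked peaks
    """
    # Rank sequences by peak score, largest first.
    order = sorted(range(len(sequence_peaks)),
                   key=lambda i: sequence_peaks[i], reverse=True)
    ranked = [sequence_peaks[i] for i in order] + [normal_peak]

    last_kept = -1
    for pos in range(len(order)):
        if ranked[pos] <= 0:      # sketch assumes positive scores; stop otherwise
            break
        drop = (ranked[pos] - ranked[pos + 1]) / ranked[pos]
        if drop > epsilon:        # a clear gap: everything ranked above it stays
            last_kept = pos
    return sorted(order[: last_kept + 1])


# Peaks 3.0 and 2.5 stand well clear of the normal maximum 0.8;
# the peak 0.82 is only about 2.4% above it and is pruned.
print(prune_sequences([3.0, 0.82, 2.5], normal_peak=0.8, epsilon=0.05))  # [0, 2]
```

Working on sequence peaks rather than single points is what allows a cluster to be kept or dropped as one unit.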

Parameters:

MAerror (list[float]) – All anomaly scores observed so far.
DesentThreshold (float, optional) – Pruning parameter. Minimum percentage difference between consecutive anomaly impacts required to keep an anomaly. Range [0, 1]. Because this is a minimum for keeping an anomaly, higher values prune more aggressively. Defaults to 0.02 (2%).
hscaleCount (int, optional) – History window size (recent scores to consider). Defaults to 1000. Used only if alldata=False.
alldata (bool, optional) – If True, use entire history instead of window. Defaults to False.

Returns:

(success_bool, threshold_value)

success_bool: True if anomaly detected and passed all filters, False if threshold couldn’t be computed or anomaly was pruned
threshold_value: Calculated threshold value

Return type:

tuple

Algorithm details:

z-vector: [3, 3.17, 3.33, …, 4.83] sigma multiples for threshold search
Δμ/μ: relative change in mean if anomalies excluded
Δσ/σ: relative change in std if anomalies excluded
Maximization: (Δμ/μ + Δσ/σ) / (num_anomalies + num_sequences * num_sequences)
Pruning: Sorts anomalies by impact, finds elbow point > epsilon
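The candidate search and the maximization criterion listed above can be sketched in a few lines. This is a simplified illustration under stated assumptions: it tests only the upper side (mean + z*std), uses the population standard deviation, and expects non-negative scores. The names `select_threshold` and `count_sequences` are hypothetical and not part of pdmlabs.

```python
from statistics import mean, pstdev


def count_sequences(flags):
    """Count runs of consecutive True values."""
    runs, previous = 0, False
    for flag in flags:
        if flag and not previous:
            runs += 1
        previous = flag
    return runs


def select_threshold(history):
    """Return (best_threshold, best_score), or (None, 0.0) if no candidate qualifies."""
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0 or mu == 0:                 # degenerate input, nothing to separate
        return None, 0.0
    best_threshold, best_score = None, 0.0
    for step in range(12):
        z = 3 + step / 6                      # 3, 3.17, 3.33, ..., 4.83
        threshold = mu + z * sigma
        flags = [value > threshold for value in history]
        normal = [v for v, f in zip(history, flags) if not f]
        n_anom = len(history) - len(normal)
        if n_anom == 0 or not normal:         # candidate flags nothing or everything
            continue
        delta_mu = mu - mean(normal)          # change in mean if anomalies are excluded
        delta_sigma = sigma - pstdev(normal)  # change in std if anomalies are excluded
        n_seq = count_sequences(flags)
        score = (delta_mu / mu + delta_sigma / sigma) / (n_anom + n_seq ** 2)
        if score > best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


history = [0.5, 0.6] * 15 + [5.0]             # 30 ordinary scores and one spike
threshold, score = select_threshold(history)
print(threshold, score)
```

The denominator penalizes candidates that flag many points or many separate sequences, so a threshold that isolates a few strong outliers scores higher than one that flags a long tail. Note that with only a handful of scores no point can sit 3 standard deviations above the mean, so short histories yield no candidate in this sketch.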

Edge cases:

len(history) == 1: Returns False (need more data)
No scores above threshold: Returns False
All scores are anomalies: Returns False (can’t prune reliably)
Degenerate std (all same values): Returns False
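The first and last edge cases above amount to a pre-check on the history before any threshold is attempted. A minimal sketch, assuming a hypothetical helper `has_usable_history` that is not part of pdmlabs:

```python
from statistics import pstdev


def has_usable_history(history):
    """Return False for inputs on which thresholding cannot work."""
    if len(history) <= 1:          # a single score gives no distribution
        return False
    if pstdev(history) == 0:       # all values identical: std is degenerate
        return False
    return True


print(has_usable_history([0.5]))            # False
print(has_usable_history([0.5, 0.5, 0.5]))  # False
print(has_usable_history([0.5, 0.6, 2.5]))  # True
```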

Time complexity: O(n*m) where n=candidates tested (12), m=history_length

References

“Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding” - Provides full algorithm with spacecraft telemetry examples

Examples

>>> scores = [0.5, 0.6, 0.55, 0.7, 0.8, 2.5, 3.0]
>>> success, thresh = dynamicThresholding(scores, DesentThreshold=0.05)
>>> # Evaluates ~12 thresholds, selects best separator
>>> # Groups 2.5, 3.0 as sequence, prunes if together they have low impact
