pdmlabs.thresholding.SurvSuperVisedTH

Survival analysis to RUL (Remaining Useful Life) thresholder.

SurvToRUL converts survival probability scores (failure predictions) into RUL (Remaining Useful Life) estimates. Designed for prognostic health monitoring.

Survival analysis context:
- Survival scores: probability that a component survives until time t
- RUL: predicted time until failure (days, hours, operations, etc.)
- Threshold: optimal survival probability cutoff for RUL prediction

This thresholder learns an optimal threshold from validation data that minimizes error between predicted and true time-to-failure values.

Use cases:
- Scheduled maintenance: When should we service this equipment?
- Resource planning: Do we need a spare part before the next failure?
- Risk assessment: Which equipment will fail soonest?

Example

A survival score of 0.8 at hour 100 means “80% chance the equipment survives past hour 100”.
Using the learned threshold, this is converted to an RUL: “equipment will fail in ~30 hours”.

Classes

SurvToRUL(event_preferences[, threshold_value])

Convert survival probabilities to RUL (Remaining Useful Life) predictions.

class pdmlabs.thresholding.SurvSuperVisedTH.SurvToRUL(event_preferences: EventPreferences, threshold_value=None)#

Bases: ThresholderInterface

Convert survival probabilities to RUL (Remaining Useful Life) predictions.

Learns a threshold mapping survival probabilities to remaining time until failure. Uses Mean Absolute Error (MAE) on validation data to find optimal threshold.

Survival score format: tuple (survival_probability, time_vector)

survival_probability: array of P(survive until each time)
time_vector: corresponding time points

threshold_value#

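Learned threshold on survival probability, in the range [0, 1]. Interpretation, given that the predicted failure time is the first point where the survival curve falls to or below the threshold (see predicted_time):
- ~1.0: the curve crosses almost immediately, so failure is predicted soon
- ~0.5: moderate (balanced)
- ~0.0: the curve rarely falls this low, so failure is predicted far in the future (at most the last time point)

Type: float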

Algorithm:
1. Test 501 threshold values from 0 to 1.
2. For each threshold:
   - Predict RUL for each validation sample
   - Compute MAE vs true time-to-failure
3. Select the threshold with minimum MAE.
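The selection rule above can be written compactly. The notation is introduced here for illustration only and does not come from the library: S_i is the i-th validation survival curve sampled at times x_1 < … < x_N, t_i is its true time-to-failure, M is the number of validation curves, and Θ is the 501-point grid.

```latex
\hat{t}_i(\theta) =
\begin{cases}
\min\{\, x_j : S_i(x_j) \le \theta \,\} & \text{if such an } x_j \text{ exists},\\
x_N & \text{otherwise},
\end{cases}
\qquad
\theta^{*} = \arg\min_{\theta \in \Theta} \frac{1}{M} \sum_{i=1}^{M} \bigl|\hat{t}_i(\theta) - t_i\bigr|,
\qquad
\Theta = \{\, k/500 : k = 0, 1, \dots, 500 \,\}.
```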

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pdmlabs.thresholding.SurvSuperVisedTH import SurvToRUL
>>> thresholder = SurvToRUL(
...     event_preferences={'failure': [], 'reset': []},
...     threshold_value=None  # Learn from data
... )
>>> events_df = pd.DataFrame()  # Event log; documented as unused by fit
>>>
>>> # Survival scores: list of tuples (surv_prob_array, time_vector)
>>> surv_scores = [
...     (np.array([0.95, 0.90, 0.80, 0.60, 0.30]), np.array([1, 2, 3, 4, 5])),
...     (np.array([0.98, 0.95, 0.85, 0.70, 0.40]), np.array([1, 2, 3, 4, 5])),
... ]
>>> true_times = [[2.5], [3.0]]  # Hours until failure
>>> thresholder.fit([surv_scores], ['bearing_1'], events_df, true_times)
>>> print(thresholder.threshold_value)  # Learned threshold in [0, 1]
>>>
>>> # Predict RUL for new data (batch mode; infer_threshold_one returns the threshold itself)
>>> new_surv = (np.array([0.92, 0.87, 0.75, 0.55]), np.array([1, 2, 3, 4]))
>>> dates = [pd.Timestamp('2024-01-01')]  # Placeholder; scores_dates is documented as unused
>>> rul = thresholder.infer_threshold([new_surv], 'bearing_1', events_df, dates)
>>> print(rul)  # Hours remaining until predicted failure, one value per sample

fit(historic_data: list, historic_sources: list[str], event_data: DataFrame, anomaly_ranges=None) → None#

Learn optimal survival probability threshold from labeled data.

Optimization finds threshold that best maps survival curves to time-to-failure.

Parameters:

historic_data (list) – List of per-source survival score lists. Each element is a list of tuples (survival_prob, time_vector), i.e. historic_data[i][j] = (np.array, np.array):
- [0]: survival probabilities at each time
- [1]: time points corresponding to the probabilities
historic_sources (list[str]) – Source identifiers (one per element of historic_data).
event_data (pd.DataFrame) – Event log (unused).
anomaly_ranges (list[list], optional) – True time-to-failure values. Structure: list of lists where element i corresponds to source i. Each inner list contains tuples: [(RUL1, ???), (RUL2, ???), …]. Only the first element of each tuple (the RUL value) is used. Sources whose first RUL value is 0 are skipped.

Notes

Skips sources with no anomaly label (all zeros)
Learns MAE-optimal threshold across all valid sources
If threshold_value is already set, uses that (no learning)
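The data layout described above can be illustrated with a minimal sketch of how per-source inputs might be flattened before optimization. This is not the library's code: the helper name collect_training_pairs is hypothetical, the second tuple element of each label is filled with None because the documentation does not say what it holds, and the sketch assumes that each source's labels align one-to-one with its curves and that all curves share one time vector.

```python
import numpy as np

def collect_training_pairs(historic_data, anomaly_ranges):
    """Flatten per-source data into (curves, x, true_times), skipping unlabeled sources."""
    curves, true_times, x = [], [], None
    for source_scores, source_labels in zip(historic_data, anomaly_ranges):
        # Documented rule: skip a source whose first RUL value is 0.
        if not source_labels or source_labels[0][0] == 0:
            continue
        # Assumption: label j belongs to curve j of the same source.
        for (surv_prob, time_vector), label in zip(source_scores, source_labels):
            curves.append(np.asarray(surv_prob, dtype=float))
            true_times.append(float(label[0]))  # only the first tuple element is used
            if x is None:
                x = np.asarray(time_vector, dtype=float)  # assumes a shared time vector
    return curves, x, true_times

t = np.array([1, 2, 3, 4, 5])
historic_data = [
    [(np.array([0.95, 0.90, 0.80, 0.60, 0.30]), t),
     (np.array([0.98, 0.95, 0.85, 0.70, 0.40]), t)],   # labeled source
    [(np.array([0.99, 0.98, 0.97, 0.96, 0.95]), t)],   # source without a failure label
]
anomaly_ranges = [
    [(2.5, None), (3.0, None)],  # None stands in for the undocumented second element
    [(0, None)],                 # first RUL is 0, so this source is skipped
]
curves, x, true_times = collect_training_pairs(historic_data, anomaly_ranges)
print(len(curves), true_times)
```

Only the two curves from the labeled source survive the filter; the resulting curves, x, and true_times match the argument shapes documented for optimize_threshold.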

get_params()#

Return thresholder parameters.

Returns: {'threshold_value': the learned survival probability threshold}
Return type: dict

infer_threshold(scores: list, source: str, event_data: DataFrame, scores_dates: list[Timestamp]) → list[float]#

Predict RUL for batch of survival scores.

Parameters:

scores (list) – List of tuples (survival_prob_array, time_vector).
source (str) – Source identifier (unused).
event_data (pd.DataFrame) – Event log (unused).
scores_dates (list[pd.Timestamp]) – Score timestamps (unused).

Returns:

Predicted RUL values for each sample.

Return type:

list[float]

infer_threshold_one(score: float, source: str, event_data: DataFrame) → float#

Online-mode entry point for a single survival score. Per the Returns description, it yields the learned threshold rather than a RUL prediction.

Parameters:

score (float) – Single survival score (unused - kept for interface compatibility).
source (str) – Source identifier (unused).
event_data (pd.DataFrame) – Event log (unused).

Returns:

The learned threshold_value itself (not a RUL prediction). Note: this method returns the scalar threshold, whereas batch mode (infer_threshold) computes RUL values.

Return type:

float

optimize_threshold(curves, x, true_times)#

Find threshold that minimizes MAE on validation data.

Tests 501 evenly-spaced thresholds from 0 to 1. For each threshold:
- Predicts RUL by finding where the survival curve crosses the threshold
- Computes the absolute error vs. the true times

Parameters:

curves (list) – List of survival probability arrays.
x (np.array) – Time vector corresponding to survival probabilities. Must be same for all curves.
true_times (list) – Ground truth time-to-failure for each curve.

Returns:

Threshold value (0-1) that minimizes MAE.

Return type:

float
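The grid search described for optimize_threshold can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the library's source: the names first_crossing and optimize_threshold_sketch are hypothetical, the sketch also returns the MAE for inspection, and ties are broken in favour of the lowest threshold, which the documentation does not specify.

```python
import numpy as np

def first_crossing(curve, x, theta):
    """First time where the curve is <= theta, or the last time if it never is."""
    idx = np.flatnonzero(np.asarray(curve, dtype=float) <= theta)
    return float(x[idx[0]]) if idx.size else float(x[-1])

def optimize_threshold_sketch(curves, x, true_times, n_grid=501):
    """Return (best_theta, best_mae) over an evenly spaced grid on [0, 1]."""
    thetas = np.linspace(0.0, 1.0, n_grid)
    true_times = np.asarray(true_times, dtype=float)
    maes = np.empty(n_grid)
    for k, theta in enumerate(thetas):
        preds = np.array([first_crossing(c, x, theta) for c in curves])
        maes[k] = np.mean(np.abs(preds - true_times))
    best = int(np.argmin(maes))  # assumption: ties go to the lowest threshold
    return float(thetas[best]), float(maes[best])

x = np.array([1, 2, 3, 4, 5])
curves = [
    np.array([0.95, 0.90, 0.80, 0.60, 0.30]),
    np.array([0.98, 0.95, 0.85, 0.70, 0.40]),
]
theta, mae = optimize_threshold_sketch(curves, x, true_times=[2.5, 3.0])
print(theta, mae)
```

Because predictions can only land on the points of the time vector, the MAE is piecewise constant in the threshold, and a whole interval of thresholds typically attains the minimum. The tie-breaking rule therefore determines which value is reported.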

predicted_time(curve, x, theta)#

Predict RUL by finding crossing point of survival threshold.

Finds first time point where survival probability <= threshold. Represents the predicted failure time.

Parameters:

curve (np.array) – Survival probability curve [p1, p2, …, pN] representing P(survive until time x[i]) for each position i.
x (np.array) – Time vector corresponding to curve. Must be sorted ascending.
theta (float) – Threshold survival probability (0-1).

Returns:

Predicted time of failure (RUL).

If curve never crosses threshold: returns x[-1] (latest time)
Otherwise: returns first time where curve <= threshold

Return type:

float
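The crossing rule documented for predicted_time can be sketched in a few lines. The function name predicted_time_sketch is hypothetical and the code is a minimal illustration of the documented behaviour, not the library's implementation.

```python
import numpy as np

def predicted_time_sketch(curve, x, theta):
    """Return the first time in x where curve <= theta, else x[-1]."""
    curve = np.asarray(curve, dtype=float)
    x = np.asarray(x, dtype=float)
    below = np.flatnonzero(curve <= theta)  # indices at or under the threshold
    if below.size == 0:
        return float(x[-1])  # curve never crosses: fall back to the latest time
    return float(x[below[0]])

curve = np.array([0.95, 0.90, 0.80, 0.60, 0.30])
x = np.array([1, 2, 3, 4, 5])
print(predicted_time_sketch(curve, x, 0.99))  # 1.0: a high threshold is crossed at once
print(predicted_time_sketch(curve, x, 0.7))   # 4.0: first value at or below 0.7 is 0.60
print(predicted_time_sketch(curve, x, 0.1))   # 5.0: never crossed, so x[-1] is returned
```

For a decreasing survival curve, raising the threshold moves the predicted failure time earlier, and lowering it moves the prediction toward the end of the time vector.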
