pdmlabs.thresholding.SurvSuperVisedTH
Survival analysis to RUL (Remaining Useful Life) thresholder.
SurvToRUL converts survival probability scores (failure predictions) into RUL (Remaining Useful Life) estimates. Designed for prognostic health monitoring.
Survival analysis context:
- Survival scores: probability that a component survives until time t
- RUL: predicted time until failure (days, hours, operations, etc.)
- Threshold: optimal survival probability cutoff for RUL prediction
This thresholder learns an optimal threshold from validation data that minimizes error between predicted and true time-to-failure values.
Use cases:
- Scheduled maintenance: When should we service this equipment?
- Resource planning: Do we need a spare part before the next failure?
- Risk assessment: Which equipment will fail soonest?
Example
A survival score of 0.8 at hour 100 means "80% chance the equipment survives past hour 100". Using the learned threshold, this is converted to an RUL: "equipment will fail in ~30 hours".
Classes
SurvToRUL: Convert survival probabilities to RUL (Remaining Useful Life) predictions.
- class pdmlabs.thresholding.SurvSuperVisedTH.SurvToRUL(event_preferences: EventPreferences, threshold_value=None)
Bases: ThresholderInterface
Convert survival probabilities to RUL (Remaining Useful Life) predictions.
Learns a threshold mapping survival probabilities to remaining time until failure. Uses Mean Absolute Error (MAE) on validation data to find optimal threshold.
- Survival score format: tuple (survival_probability, time_vector)
survival_probability: array of P(survive until each time)
time_vector: corresponding time points
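- threshold_value
Learned threshold on survival probability, in the range [0, 1]. Interpretation, given that predicted_time returns the first time point where survival <= threshold and assuming a non-increasing survival curve:
- ~0.0: the curve rarely drops that low, so failure is predicted late (or at the last time point)
- ~0.5: moderate (balanced)
- ~1.0: the curve crosses almost immediately, so failure is predicted soon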
- Type:
float
Algorithm:
1. Test 501 threshold values from 0 to 1.
2. For each threshold:
   - Predict RUL for each validation sample.
   - Compute MAE vs. true time-to-failure.
3. Select the threshold with minimum MAE.
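The steps above can be sketched as a plain grid search. This is a minimal illustration, not the library's implementation: the helper names `first_crossing_time` and `grid_search_threshold` are hypothetical, and the tie-breaking rule (first minimum on the grid) is an assumption the documentation does not specify.

```python
import numpy as np

def first_crossing_time(curve, x, theta):
    """First time where survival <= theta, else the last time point."""
    below = np.asarray(curve, dtype=float) <= theta
    return float(x[np.argmax(below)]) if below.any() else float(x[-1])

def grid_search_threshold(curves, x, true_times, n_grid=501):
    """Return (threshold, MAE) for the grid threshold with the lowest MAE.

    Ties are resolved by taking the first minimum (an assumption).
    """
    thetas = np.linspace(0.0, 1.0, n_grid)
    true_times = np.asarray(true_times, dtype=float)
    maes = []
    for theta in thetas:
        predicted = np.array([first_crossing_time(c, x, theta) for c in curves])
        maes.append(np.mean(np.abs(predicted - true_times)))
    best = int(np.argmin(maes))
    return float(thetas[best]), float(maes[best])

# Toy validation data: two survival curves on a shared time vector.
x = np.array([1, 2, 3, 4, 5])
curves = [
    np.array([0.95, 0.90, 0.80, 0.60, 0.30]),
    np.array([0.98, 0.95, 0.85, 0.70, 0.40]),
]
best_theta, best_mae = grid_search_threshold(curves, x, true_times=[2.5, 3.0])
print(best_theta, best_mae)
```

Because predictions can only land on points of the time vector, many thresholds give the same MAE; the grid search returns one representative of the best plateau.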
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from pdmlabs.thresholding.SurvSuperVisedTH import SurvToRUL
>>> thresholder = SurvToRUL(
...     event_preferences={'failure': [], 'reset': []},
...     threshold_value=None  # Learn from data
... )
>>>
>>> # Survival scores: list of tuples (surv_prob_array, time_vector)
>>> surv_scores = [
...     (np.array([0.95, 0.90, 0.80, 0.60, 0.30]), np.array([1, 2, 3, 4, 5])),
...     (np.array([0.98, 0.95, 0.85, 0.70, 0.40]), np.array([1, 2, 3, 4, 5])),
... ]
>>> true_times = [[2.5], [3.0]]  # Hours until failure
>>> events_df = pd.DataFrame()  # Placeholder event log; fit documents it as unused
>>> thresholder.fit([surv_scores], ['bearing_1'], events_df, true_times)
>>> print(thresholder.threshold_value)  # Learned threshold in [0, 1]
>>>
>>> # Online mode: per the infer_threshold_one docs below, this returns the
>>> # learned threshold itself; use infer_threshold for batch RUL predictions
>>> new_surv = (np.array([0.92, 0.87, 0.75, 0.55]), np.array([1, 2, 3, 4]))
>>> value = thresholder.infer_threshold_one(new_surv, 'bearing_1', events_df)
>>> print(value)
- fit(historic_data: list, historic_sources: list[str], event_data: DataFrame, anomaly_ranges=None) -> None
Learn optimal survival probability threshold from labeled data.
Optimization finds threshold that best maps survival curves to time-to-failure.
- Parameters:
historic_data (list) – List of survival score lists, one per source. Each element is a list of tuples (survival_prob, time_vector), so historic_data[i][j] = (np.array, np.array):
[0]: survival probabilities at each time
[1]: time points corresponding to the probabilities
historic_sources (list[str]) – Source identifiers (one per element of historic_data).
event_data (pd.DataFrame) – Event log (unused).
anomaly_ranges (list[list], optional) – True time-to-failure values. List of lists where element i corresponds to source i. Each inner list contains tuples: [(RUL1, ???), (RUL2, ???), ...]. Only the first element of each tuple (the RUL value) is used. Sources whose first RUL value is 0 are skipped.
Notes
Skips sources with no anomaly label (all zeros)
Learns MAE-optimal threshold across all valid sources
If threshold_value is already set, uses that (no learning)
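To make the expected input structure concrete, the sketch below pairs each survival curve with its label and applies the documented skip rule. It is an illustration under assumptions, not the library's code: the helper `collect_training_pairs` is hypothetical, it assumes one label tuple per curve, and it uses `None` as a stand-in for the undocumented second tuple element.

```python
import numpy as np

def collect_training_pairs(historic_data, anomaly_ranges):
    """Pair each survival curve with its true RUL, skipping unlabeled sources."""
    curves, true_times, x = [], [], None
    for source_scores, source_labels in zip(historic_data, anomaly_ranges):
        # Documented rule: skip a source whose first RUL value is 0.
        if not source_labels or source_labels[0][0] == 0:
            continue
        # Assumption: labels and curves of a source are aligned one-to-one.
        for (surv_prob, time_vector), label in zip(source_scores, source_labels):
            curves.append(np.asarray(surv_prob, dtype=float))
            true_times.append(float(label[0]))  # only the first tuple element is used
            x = np.asarray(time_vector)         # time vector is shared by all curves
    return curves, x, true_times

t = np.array([1, 2, 3, 4, 5])
historic_data = [
    [(np.array([0.95, 0.90, 0.80, 0.60, 0.30]), t)],  # labeled source
    [(np.array([0.99, 0.98, 0.97, 0.96, 0.95]), t)],  # source with RUL 0, skipped
]
anomaly_ranges = [[(2.5, None)], [(0, None)]]

curves, x, true_times = collect_training_pairs(historic_data, anomaly_ranges)
print(len(curves), true_times)
```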
- get_params()
Return thresholder parameters.
- Returns:
{'threshold_value': the learned survival probability threshold}
- Return type:
dict
- infer_threshold(scores: list, source: str, event_data: DataFrame, scores_dates: list[Timestamp]) -> list[float]
Predict RUL for batch of survival scores.
- Parameters:
scores (list) – List of tuples (survival_prob_array, time_vector).
source (str) – Source identifier (unused).
event_data (pd.DataFrame) – Event log (unused).
scores_dates (list[pd.Timestamp]) – Score timestamps (unused).
- Returns:
Predicted RUL values for each sample.
- Return type:
list[float]
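Batch prediction can be pictured as applying the crossing rule documented for predicted_time to every (survival_prob_array, time_vector) tuple with a fixed threshold. The function `predict_rul_batch` below is a hypothetical standalone sketch of that idea and does not call the library.

```python
import numpy as np

def predict_rul_batch(scores, threshold_value):
    """Apply the first-crossing rule to each (survival_prob, time_vector) tuple."""
    ruls = []
    for surv_prob, time_vector in scores:
        surv_prob = np.asarray(surv_prob, dtype=float)
        time_vector = np.asarray(time_vector, dtype=float)
        below = surv_prob <= threshold_value
        # Fall back to the last time point when the curve never reaches the threshold.
        idx = int(np.argmax(below)) if below.any() else len(time_vector) - 1
        ruls.append(float(time_vector[idx]))
    return ruls

scores = [
    ([0.92, 0.87, 0.75, 0.55], [1, 2, 3, 4]),
    ([0.99, 0.97, 0.96, 0.95], [1, 2, 3, 4]),  # never drops to the threshold
]
ruls = predict_rul_batch(scores, threshold_value=0.8)
print(ruls)  # [3.0, 4.0]
```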
- infer_threshold_one(score: float, source: str, event_data: DataFrame) -> float
Predict RUL for single survival score (online mode).
- Parameters:
score (float) – Single survival score (unused; kept for interface compatibility).
source (str) – Source identifier (unused).
event_data (pd.DataFrame) – Event log (unused).
- Returns:
- The learned threshold_value itself (not a RUL prediction).
Note: This method returns the scalar threshold, whereas batch mode (infer_threshold) computes RUL values.
- Return type:
float
- optimize_threshold(curves, x, true_times)
Find threshold that minimizes MAE on validation data.
Tests 501 evenly spaced thresholds from 0 to 1. For each threshold:
- Predicts RUL by finding where the survival curve crosses the threshold
- Computes the absolute error vs. the true times
- Parameters:
curves (list) – List of survival probability arrays.
x (np.array) – Time vector corresponding to the survival probabilities. Must be the same for all curves.
true_times (list) – Ground truth time-to-failure for each curve.
- Returns:
Threshold value (0-1) that minimizes MAE.
- Return type:
float
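Written as a formula, the objective described above is roughly the following. The symbols are introduced here for illustration only: S_i is the i-th survival curve, x_1 < ... < x_N the shared time vector, t_i the true time-to-failure, and n the number of curves. How ties between equally good thresholds are resolved is not stated in the documentation.

```latex
\hat{t}_i(\theta) =
\begin{cases}
  \min\{\, x_k : S_i(x_k) \le \theta \,\} & \text{if some } k \text{ satisfies } S_i(x_k) \le \theta, \\
  x_N & \text{otherwise,}
\end{cases}
\qquad
\theta^{*} = \arg\min_{\theta \in \{0,\, 1/500,\, \dots,\, 1\}}
  \frac{1}{n} \sum_{i=1}^{n} \bigl| \hat{t}_i(\theta) - t_i \bigr|
```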
- predicted_time(curve, x, theta)
Predict RUL by finding crossing point of survival threshold.
Finds first time point where survival probability <= threshold. Represents the predicted failure time.
- Parameters:
curve (np.array) – Survival probability curve [p1, p2, ..., pN] representing P(survive until time x[i]) for each position i.
x (np.array) – Time vector corresponding to curve. Must be sorted ascending.
theta (float) – Threshold survival probability (0-1).
- Returns:
- Predicted time of failure (RUL).
If curve never crosses threshold: returns x[-1] (latest time)
Otherwise: returns first time where curve <= threshold
- Return type:
float
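The crossing rule and its fallback can be illustrated with a short standalone function. `predicted_time_sketch` is a hypothetical name and a minimal reading of the description above, not the method's actual source.

```python
import numpy as np

def predicted_time_sketch(curve, x, theta):
    """Return the first time where curve <= theta, or x[-1] if it never crosses."""
    curve = np.asarray(curve, dtype=float)
    x = np.asarray(x, dtype=float)
    crossed = np.flatnonzero(curve <= theta)  # indices at or below the threshold
    if crossed.size == 0:
        return float(x[-1])
    return float(x[crossed[0]])

curve = np.array([0.95, 0.90, 0.80, 0.60, 0.30])
x = np.array([10, 20, 30, 40, 50])

early = predicted_time_sketch(curve, x, theta=0.9)  # crosses at the second point
late = predicted_time_sketch(curve, x, theta=0.7)   # crosses at the fourth point
never = predicted_time_sketch(curve, x, theta=0.1)  # no crossing, falls back to x[-1]
print(early, late, never)  # 20.0 40.0 50.0
```

For a non-increasing curve, a higher theta is reached earlier, so the predicted failure time moves earlier as theta grows.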