Concepts
This page explains the core ideas behind PdMLabs so you can decide how to model your use case before writing code.
Mental Model
PdMLabs is an experimentation framework for predictive maintenance.
At a high level, each run follows the same pattern:
Define a dataset in the expected PdMLabs format.
Select an experiment flavor (online, incremental, unsupervised, supervised, etc.).
Choose one or more methods.
Compose a pipeline with preprocessor, method, postprocessor, and thresholder.
Run parameter search and evaluate with PdM-oriented metrics.
This shared structure enables fair comparison across methods and modeling flavors.
Core Building Blocks
PdMLabs is built around four pluggable components:
Preprocessor: transforms raw records before scoring.
Method: produces anomaly/probability/survival-like scores.
Postprocessor: smooths or transforms scores.
Thresholder: converts scores to decision thresholds or target values.
In the codebase, this composition is represented by PdMPipeline.
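The order in which the four components are applied can be pictured with a small, framework-independent sketch. Everything in it (the `SketchPipeline` class, its `run` method, and the four stage functions) is hypothetical and only illustrates the flow from records to decisions; it is not the `PdMPipeline` API.

```python
from dataclasses import dataclass
from typing import Callable, List

Scores = List[float]


@dataclass
class SketchPipeline:
    """Hypothetical stand-in for a four-stage pipeline (not the PdMPipeline API)."""
    preprocess: Callable[[List[float]], List[float]]
    score: Callable[[List[float]], Scores]
    postprocess: Callable[[Scores], Scores]
    threshold: Callable[[Scores], float]

    def run(self, records: List[float]) -> List[bool]:
        features = self.preprocess(records)    # 1. transform raw records
        raw_scores = self.score(features)      # 2. produce anomaly scores
        scores = self.postprocess(raw_scores)  # 3. smooth or transform scores
        cutoff = self.threshold(scores)        # 4. derive a decision threshold
        return [s > cutoff for s in scores]


def center(records):
    """Preprocessor: subtract the mean of the records."""
    mean = sum(records) / len(records)
    return [r - mean for r in records]


def absolute_deviation(features):
    """Method: score each record by its distance from zero."""
    return [abs(f) for f in features]


def moving_average(scores, window=2):
    """Postprocessor: trailing moving average over up to `window` scores."""
    out = []
    for i in range(len(scores)):
        chunk = scores[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def constant_threshold(scores, value=3.0):
    """Thresholder: a fixed cutoff that ignores the scores."""
    return value


pipeline = SketchPipeline(center, absolute_deviation, moving_average, constant_threshold)
alarms = pipeline.run([1.0, 1.0, 1.0, 1.0, 11.0])
```

Because each stage is just a callable with a narrow contract, any one of them can be swapped without touching the others, which is the property that makes comparisons across methods fair.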
Experiment Flavors
PdMLabs supports multiple experiment strategies to match different data assumptions.
For an architectural breakdown of how these flavors execute internally, see Analysis of PdMLabs Experiment Flavors.
Anomaly Detection
AutoProfileSemiSupervisedPdMExperiment: builds profile windows and can re-fit after reset events.
IncrementalSemiSupervisedPdMExperiment: trains and predicts over rolling windows.
SemiSupervisedPdMExperiment: fits once on historic data, then scores target data.
UnsupervisedPdMExperiment: scores without a fitting phase.
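The practical difference between fitting once and re-fitting over rolling windows can be sketched without the framework. The functions `score_fit_once` and `score_rolling` below are hypothetical illustrations built on a simple z-score detector; they do not reproduce the PdMLabs experiment classes.

```python
from statistics import mean, pstdev


def zscore(value, reference):
    """Absolute z-score of `value` against a reference sample."""
    mu = mean(reference)
    sigma = pstdev(reference) or 1.0  # avoid division by zero on constant data
    return abs(value - mu) / sigma


def score_fit_once(historic, target):
    """Fit-once style: the reference is historic data and never changes."""
    return [zscore(x, historic) for x in target]


def score_rolling(target, window):
    """Rolling style: score each point against the `window` points before it."""
    scores = []
    for i in range(window, len(target)):
        scores.append(zscore(target[i], target[i - window:i]))
    return scores


historic = [10.0, 12.0, 10.0, 12.0]
target = [11.0, 20.0, 20.0, 20.0, 20.0]  # level shift after the first point

fit_once_scores = score_fit_once(historic, target)
rolling_scores = score_rolling(target, window=3)
```

In this toy series the fit-once detector keeps scoring the shifted level as anomalous, while the rolling detector drops to zero once its window contains only post-shift values. Which behavior is desirable depends on whether the shift is a fault or a new normal.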
Supervised / Time-to-Event
SupervisedPdMExperiment: classification-style workflow using labels.
SupervisedRULPdMExperiment: remaining useful life (RUL) workflow.
Supervised_SA_PdMExperiment: survival-analysis-oriented workflow.
Data Contract
Most framework behavior depends on a dataset dictionary with standard keys. Typical keys include:
event_data and event_preferences
historic_data, historic_sources
target_data, target_sources
dates
predictive_horizon, lead, slide, beta
max_wait_time
For supervised workflows, labels such as anomaly_labels (and in some cases
target_labels) are required.
The helper module pdmlabs.loadAnomalyDetectionDataset provides utility
functions to build and enrich dataset dictionaries.
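As a rough illustration of the contract, the sketch below checks a dataset dictionary for the typical keys named on this page. The helper `missing_keys` is hypothetical, and the placeholder values are assumptions: the real value types and which keys are mandatory depend on the experiment flavor and are not specified here.

```python
# Key names are the typical keys listed on this page.
TYPICAL_KEYS = [
    "event_data", "event_preferences",
    "historic_data", "historic_sources",
    "target_data", "target_sources",
    "dates",
    "predictive_horizon", "lead", "slide", "beta",
    "max_wait_time",
]


def missing_keys(dataset, required=TYPICAL_KEYS):
    """Return the expected keys that are absent from a dataset dictionary."""
    return [key for key in required if key not in dataset]


# Placeholder values only: the actual types are assumptions for this sketch.
dataset = {
    "historic_data": [],
    "historic_sources": [],
    "target_data": [[0.1, 0.2]],
    "target_sources": ["asset_1"],
    "dates": "timestamp",
    "predictive_horizon": "24 hours",
    "lead": "2 hours",
}

absent = missing_keys(dataset)
```

A check like this is cheap to run before launching a long parameter search, since a missing key would otherwise surface only after the experiment has started.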
Events, Failures, and Resets
PdMLabs uses event metadata to determine where failures and resets happen and which sources are affected.
The event_preferences object defines how to interpret event rows by:
description
type
source
target_sources
This is important because evaluation and some experiment flavors depend on episode boundaries and reset logic.
Evaluation Philosophy
PdMLabs evaluates results in a predictive-maintenance context, not only with generic binary metrics.
Main ideas include:
Episode-aware splitting around failure timestamps.
Predictive horizon and lead-time semantics.
Multiple AD recall variants (e.g. AD1/AD2/AD3 style behavior).
AUC-PR style summaries.
Optional range/VUS/affiliation metrics.
This helps teams evaluate whether a method gives useful early warnings in practice, not just good aggregate classification scores.
For a full list of supported metrics and instructions on how to add your own, see Evaluation & Metrics.
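To make the predictive horizon and lead-time idea concrete, here is a minimal sketch of one common reading: an alarm counts as useful only if it falls inside the predictive horizon before a failure and still leaves at least the lead time to react. The function `in_warning_window` and its inclusive boundaries are assumptions; the exact rules used by the PdMLabs metrics may differ.

```python
from datetime import datetime, timedelta


def in_warning_window(timestamp, failure_time, predictive_horizon, lead):
    """True if an alarm at `timestamp` is neither premature nor too late.

    Assumed convention: useful alarms fall in
    [failure_time - predictive_horizon, failure_time - lead].
    """
    window_start = failure_time - predictive_horizon
    window_end = failure_time - lead
    return window_start <= timestamp <= window_end


failure = datetime(2024, 1, 10, 12, 0)
horizon = timedelta(hours=24)
lead = timedelta(hours=2)

alarms = [
    datetime(2024, 1, 8, 12, 0),   # 48 h before failure: outside the horizon
    datetime(2024, 1, 10, 0, 0),   # 12 h before failure: inside the window
    datetime(2024, 1, 10, 11, 0),  # 1 h before failure: less than the lead time
]
useful = [in_warning_window(a, failure, horizon, lead) for a in alarms]
```

Under this convention only the second alarm would count toward recall, which is how an evaluation can reward early warnings instead of point-wise agreement with labels.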
Optimization, Reproducibility, and MLflow
Hyperparameter search is integrated into experiments via Mango (Bayesian or random search) and can use constraint functions to avoid invalid parameter combinations.
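A constraint function can be as small as the sketch below. It assumes a batch-style contract (a list of sampled parameter dictionaries in, one boolean per sample out); check the Mango documentation for the exact signature it expects. The parameter names `window` and `slide` are hypothetical and chosen only for illustration.

```python
def constraint(samples):
    """Return one boolean per sampled configuration; False marks it invalid.

    Hypothetical rule: the slide step must not exceed the window length.
    """
    return [s["slide"] <= s["window"] for s in samples]


candidates = [
    {"window": 50, "slide": 10},  # valid: slide fits inside the window
    {"window": 20, "slide": 40},  # invalid: slide is larger than the window
]

# Keep only the configurations the constraint accepts.
valid = [c for c, ok in zip(candidates, constraint(candidates)) if ok]
```

Filtering invalid combinations before evaluation saves search budget that would otherwise be spent on runs that are guaranteed to fail.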
MLflow logging is deeply integrated in the run lifecycle. For every successful experiment, PdMLabs logs:
All tested parameter configurations and resulting metrics.
The best, fully fitted pipeline as an MLflow pyfunc model.
This means the entire processing chain (preprocessor, method, postprocessor, and thresholder) is saved as a single object. You can later load it directly via MLflow and start making predictions:
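import mlflow

# Replace <RUN_ID> with the ID of the MLflow run that logged the pipeline.
pipeline = mlflow.pyfunc.load_model("runs:/<RUN_ID>/best_pdm_pipeline")

# new_data_df and new_event_df are placeholders for your own record and event data.
predictions = pipeline.predict({
    'target_data': new_data_df,
    'source': 'asset_1',
    'event_data': new_event_df,
})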
This enables a seamless transition from experimentation to production deployment. For more details on deployment and inference, see the User Guide.
Extensibility
You can add custom components by implementing the framework interfaces:
MethodInterface and specialized method interfaces
RecordLevelPreProcessorInterface
PostProcessorInterface
ThresholderInterface
Once implemented, they can be used with run_experiment like built-in components.
How To Use This Page
Read Our Manifesto for the high-level motivation.
Use Quickstart to run your first experiment.
Use API Reference for detailed API signatures and module docs.