pdmlabs.experiment.batch.unsupervised_experiment
Classes
UnsupervisedPdMExperiment | Unsupervised anomaly detection without any labeled data. |
- class pdmlabs.experiment.batch.unsupervised_experiment.UnsupervisedPdMExperiment(experiment_name: str, pipeline: PdMPipeline, param_space: dict, constraint_function: Callable = None, target_data: list[DataFrame] = None, target_sources: list[str] = None, historic_data: list[DataFrame] = [], historic_sources: list[str] = [], optimization_param: str = 'AD1_AUC', initial_random: int = 2, num_iteration: int = 20, batch_size: int = 1, n_jobs: int = 1, random_state: int = 42, random_n_tries: int = 3, constraint_max_retries: int = 10, historic_data_header: str = 'infer', target_data_header: str = 'infer', artifacts: str = 'artifacts', debug: bool = False, delay: float = None, log_best_scores: bool = False, maximize: bool = True, custom_evaluators: list = None)
Bases: PdMExperiment
Unsupervised anomaly detection without any labeled data.
This experiment flavor is designed for scenarios where:
- You have no labeled anomaly data
- You want the method to learn patterns from the entire dataset
- Each target scenario is evaluated independently
The approach fits the method on a per-scenario basis (similar to semi-supervised) but without leveraging any labels. The method must determine anomalies based on statistical or distributional properties alone.
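As a rough illustration of per-scenario, label-free fitting, the sketch below splits a signal at event positions and scores each segment using only that segment's own mean and standard deviation. It does not use pdmlabs. The helpers `split_scenarios` and `zscore_anomaly_scores`, the z-score detector, and the rule that an event closes its scenario are all assumptions made for the example.

```python
from statistics import mean, pstdev


def split_scenarios(values, event_indices):
    """Split a series into scenarios; each event index closes a scenario (assumed rule)."""
    scenarios, start = [], 0
    for idx in sorted(event_indices):
        scenarios.append(values[start:idx + 1])
        start = idx + 1
    if start < len(values):
        scenarios.append(values[start:])  # trailing data after the last event
    return scenarios


def zscore_anomaly_scores(scenario):
    """'Fit' on the scenario itself (no labels) and score every point by |z|."""
    mu = mean(scenario)
    sigma = pstdev(scenario) or 1.0  # fall back to 1.0 on constant data
    return [abs(x - mu) / sigma for x in scenario]


# Toy signal with one event at index 4, giving two independent scenarios.
signal = [1.0, 1.1, 0.9, 1.0, 5.0, 2.0, 2.1, 1.9, 2.0, 9.0]
scenarios = split_scenarios(signal, event_indices=[4])
scores = [zscore_anomaly_scores(s) for s in scenarios]

for i, s in enumerate(scores):
    print(f"scenario {i}: most anomalous index = {s.index(max(s))}")
```

Because each scenario is scored against its own statistics, a shift in baseline between scenarios (around 1.0 versus around 2.0 here) does not by itself register as an anomaly.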
Suitable for:
- Early-stage PdM where no failure labels exist
- Discovering unknown failure modes
- Baseline comparisons
- Raises:
IncompatibleMethodException – If method does not implement UnsupervisedMethodInterface.
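The kind of compatibility check described here can be sketched with local stand-ins. The two classes below only borrow the names `UnsupervisedMethodInterface` and `IncompatibleMethodException`; they are not the pdmlabs classes, and the `fit`/`predict` method names and the `check_method` helper are assumptions for illustration.

```python
from abc import ABC, abstractmethod


class UnsupervisedMethodInterface(ABC):
    """Local stand-in for the interface; method names are assumed."""

    @abstractmethod
    def fit(self, data): ...

    @abstractmethod
    def predict(self, data): ...


class IncompatibleMethodException(Exception):
    """Local stand-in for the exception named in the docs."""


def check_method(method):
    """Reject methods that do not implement the unsupervised interface."""
    if not isinstance(method, UnsupervisedMethodInterface):
        raise IncompatibleMethodException(
            f"{type(method).__name__} does not implement UnsupervisedMethodInterface"
        )
    return method


class MeanDistance(UnsupervisedMethodInterface):
    """Compatible toy method: distance from the fitted mean."""

    def fit(self, data):
        self.mu = sum(data) / len(data)
        return self

    def predict(self, data):
        return [abs(x - self.mu) for x in data]


class NeedsLabels:
    """Incompatible toy method: does not derive from the interface."""


check_method(MeanDistance())  # passes silently
try:
    check_method(NeedsLabels())
except IncompatibleMethodException as exc:
    print(f"rejected: {exc}")
```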
Examples
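>>> from pdmlabs.method.isolation_forest import IsolationForest
>>> from pdmlabs.experiment.batch.unsupervised_experiment import UnsupervisedPdMExperiment
>>> # `pipeline` is a PdMPipeline configured beforehand (construction not shown)
>>> experiment = UnsupervisedPdMExperiment(
...     experiment_name='unsupervised-demo',
...     pipeline=pipeline,
...     param_space={'method_contamination': [0.01, 0.05, 0.1]},
...     num_iteration=20
... )
>>> results = experiment.execute()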
- execute() → dict
Run unsupervised experiment without labeled training data.
For each parameter combination:
1. For each target scenario (segmented by reset/failure events):
   - Fits the method using only the scenario's own data (unsupervised)
   - Predicts anomaly scores
   - Applies the postprocessor and thresholder
2. Evaluates using PdM metrics to find the best parameters
- Returns:
Result dictionary with best_params, best_objective, and threshold.
- Return type:
dict
- Raises:
IncompatibleMethodException – If method is not UnsupervisedMethodInterface.