pdmlabs.experiment.batch.unsupervised_experiment

Classes

UnsupervisedPdMExperiment(experiment_name, ...)

Unsupervised anomaly detection without any labeled data.

class pdmlabs.experiment.batch.unsupervised_experiment.UnsupervisedPdMExperiment(experiment_name: str, pipeline: PdMPipeline, param_space: dict, constraint_function: Callable = None, target_data: list[DataFrame] = None, target_sources: list[str] = None, historic_data: list[DataFrame] = [], historic_sources: list[str] = [], optimization_param: str = 'AD1_AUC', initial_random: int = 2, num_iteration: int = 20, batch_size: int = 1, n_jobs: int = 1, random_state: int = 42, random_n_tries: int = 3, constraint_max_retries: int = 10, historic_data_header: str = 'infer', target_data_header: str = 'infer', artifacts: str = 'artifacts', debug: bool = False, delay: float = None, log_best_scores: bool = False, maximize: bool = True, custom_evaluators: list = None)

Bases: PdMExperiment

Unsupervised anomaly detection without any labeled data.

This experiment flavor is designed for scenarios where:

- You have no labeled anomaly data
- You want the method to learn patterns from the entire dataset
- Each target scenario is evaluated independently

The approach fits the method on a per-scenario basis (similar to semi-supervised) but without leveraging any labels. The method must determine anomalies based on statistical or distributional properties alone.
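The per-scenario idea can be illustrated with a minimal sketch that does not use pdmlabs at all. It assumes a toy z-score detector as a stand-in for a real method; the function name `score_scenario` and the scenario names are hypothetical. Each scenario is fitted and scored on its own data only, with no labels involved.

```python
from statistics import mean, pstdev


def score_scenario(values):
    """Fit on the whole scenario (no labels) and return one anomaly score per sample.

    Toy stand-in for an unsupervised method: the 'fit' step estimates the
    scenario's mean and standard deviation, the 'predict' step returns the
    absolute z-score of each sample.
    """
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # guard against a constant signal
    return [abs(v - mu) / sigma for v in values]


# Two independent scenarios (hypothetical data): statistics are never shared.
scenarios = {
    "machine_a": [1.0, 1.1, 0.9, 1.0, 5.0],
    "machine_b": [10.0, 10.2, 9.8, 10.1, 10.0],
}
scores = {name: score_scenario(vals) for name, vals in scenarios.items()}

# The sample that deviates most from its own scenario gets the highest score.
most_anomalous_a = max(range(len(scores["machine_a"])), key=scores["machine_a"].__getitem__)
```

Because the statistics are estimated per scenario, the very different operating levels of `machine_a` and `machine_b` do not influence each other's scores.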

Suitable for:

- Early-stage PdM where no failure labels exist
- Discovering unknown failure modes
- Baseline comparisons

Raises:

IncompatibleMethodException – If the method does not implement UnsupervisedMethodInterface.

Examples

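>>> from pdmlabs.experiment.batch.unsupervised_experiment import UnsupervisedPdMExperiment
>>> from pdmlabs.method.isolation_forest import IsolationForest
>>> # `pipeline` is assumed to be a PdMPipeline built beforehand (e.g. around IsolationForest)
>>> experiment = UnsupervisedPdMExperiment(
...     experiment_name='unsupervised-demo',
...     pipeline=pipeline,
...     param_space={'method_contamination': [0.01, 0.05, 0.1]},
...     num_iteration=20
... )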
>>> results = experiment.execute()

execute() → dict

Run unsupervised experiment without labeled training data.

For each parameter combination:

1. For each target scenario (segmented by reset/failure events):

   1. Fits method using only the entire scenario data (unsupervised)

   2. Predicts anomaly scores

   3. Applies postprocessor and thresholder

2. Evaluates using PdM metrics to find best parameters
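The control flow above can be sketched roughly as follows. This is not the pdmlabs implementation: it swaps in an exhaustive grid search for whatever optimizer the library uses, a z-score detector for the method, and a made-up `toy_metric` for the PdM metrics. The names `run_search`, `zscores`, `toy_metric`, and the `threshold` parameter are all hypothetical, and the returned dictionary omits the `threshold` entry mentioned in the Returns section.

```python
from itertools import product
from statistics import mean, pstdev


def zscores(values):
    """Toy unsupervised method: fit mean/std on the scenario, score every sample."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # guard against a constant signal
    return [abs(v - mu) / sigma for v in values]


def toy_metric(alarms, failure_index):
    """Hypothetical stand-in for a PdM metric.

    Returns 1.0 when only the failure sample raises an alarm, less when
    there are false alarms, and 0.0 when the failure is missed.
    """
    hit = alarms[failure_index]
    false_alarms = sum(alarms) - int(hit)
    return (1.0 if hit else 0.0) / (1 + false_alarms)


def run_search(scenarios, param_space):
    best = {"best_params": None, "best_objective": float("-inf")}
    names = sorted(param_space)
    for combo in product(*(param_space[n] for n in names)):  # each parameter combination
        params = dict(zip(names, combo))
        per_scenario = []
        for values, failure_index in scenarios:              # each target scenario
            scores = zscores(values)                          # fit + predict, no labels
            alarms = [s > params["threshold"] for s in scores]  # thresholding
            per_scenario.append(toy_metric(alarms, failure_index))
        objective = mean(per_scenario)                        # evaluate the combination
        if objective > best["best_objective"]:
            best = {"best_params": params, "best_objective": objective}
    return best


# Each scenario ends at a failure event; the index is used for evaluation only.
scenarios = [
    ([1.0, 1.1, 0.9, 1.0, 5.0], 4),
    ([10.0, 10.2, 9.8, 10.1, 14.0], 4),
]
result = run_search(scenarios, {"threshold": [0.1, 1.5, 5.0]})
```

Note that the failure position never reaches the fitting step; it is consulted only when scoring a parameter combination, which mirrors the separation between unsupervised fitting and metric-based evaluation described above.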

Returns:

Result dictionary with best_params, best_objective, and threshold.

Return type:

dict

Raises:

IncompatibleMethodException – If the method does not implement UnsupervisedMethodInterface.