pdmlabs.experiment.batch.auto_profile_semi_supervised_experiment

Classes

AutoProfileSemiSupervisedPdMExperiment(...)

Semi-supervised anomaly detection with automatic profile-based learning.

class pdmlabs.experiment.batch.auto_profile_semi_supervised_experiment.AutoProfileSemiSupervisedPdMExperiment(*args, **kwargs)

Bases: PdMExperiment

Semi-supervised anomaly detection with automatic profile-based learning.

This experiment flavor implements an “auto-profiling” semi-supervised approach:

  1. For each target scenario, uses an initial profile (first N timesteps) as normal behavior

  2. Fits the anomaly detection method only on this profile

  3. Applies the fitted method to detect anomalies in the rest of the scenario

  4. Automatically determines the profile size via hyperparameter search
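The first three steps can be illustrated with a minimal sketch. It does not use pdmlabs: the z-score detector and the names `fit_profile`, `score`, and `detect_with_profile` are hypothetical stand-ins for whatever semi-supervised method the pipeline wraps, and the fixed threshold of 3.0 is an assumption chosen for the toy data.

```python
from statistics import mean, stdev

def fit_profile(profile):
    """Summarize the profile (assumed normal) by its mean and standard deviation."""
    mu = mean(profile)
    sigma = stdev(profile) or 1e-9  # guard against a perfectly flat profile
    return mu, sigma

def score(model, values):
    """Anomaly score = absolute z-score relative to the fitted profile."""
    mu, sigma = model
    return [abs(v - mu) / sigma for v in values]

def detect_with_profile(scenario, profile_size, threshold=3.0):
    """Fit on the first profile_size points, then flag the rest of the scenario."""
    profile, rest = scenario[:profile_size], scenario[profile_size:]
    model = fit_profile(profile)      # step 2: fit only on the profile
    scores = score(model, rest)       # step 3: apply to the remaining data
    return [s > threshold for s in scores]

# Toy scenario: stable around 10, with a short excursion near the end.
scenario = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.1, 14.5, 15.2, 10.0]
flags = detect_with_profile(scenario, profile_size=5)
print(flags)  # [False, False, True, True, False]
```

Note that the sketch needs at least two points in the profile, since a standard deviation cannot be estimated from a single value.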

This is useful when:

  • You have unlabeled data with clear patterns at the start (normal operating condition)

  • You want to adapt to gradual drift without constant retraining

  • You have limited labeled anomaly examples

The “auto-profiling” optimization searches over profile_size (and optionally init_profile_size) to find the size of the normal-behavior window that yields the best performance.
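As a rough illustration of that search, the sketch below tries several candidate profile sizes on one labeled toy scenario and keeps the one with the highest F1 score. Everything here is an assumption made for the example: the z-score detector, the helper names `detect`, `f1`, and `search_profile_size`, the exhaustive loop over candidates, and the choice of F1 as the objective. The library's own search strategy and metrics may differ.

```python
from statistics import mean, stdev

def detect(scenario, profile_size, threshold=3.0):
    """Fit a z-score model on the first profile_size points, flag the rest."""
    profile = scenario[:profile_size]
    mu, sigma = mean(profile), (stdev(profile) or 1e-9)
    return [abs(v - mu) / sigma > threshold for v in scenario[profile_size:]]

def f1(predicted, actual):
    """F1 score over two equal-length lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def search_profile_size(scenario, labels, candidates):
    """Return the (profile_size, score) pair with the highest F1; ties keep the first."""
    best_size, best_score = None, -1.0
    for size in candidates:
        predicted = detect(scenario, size)
        current = f1(predicted, labels[size:])  # evaluate only past the profile
        if current > best_score:
            best_size, best_score = size, current
    return best_size, best_score

scenario = [10.0, 10.1, 10.0, 10.6, 9.5, 10.3, 9.8, 10.2, 15.0, 15.5, 10.1, 9.9]
labels = [False] * 8 + [True, True, False, False]
best_size, best_score = search_profile_size(scenario, labels, candidates=[3, 5, 8])
print(best_size, best_score)  # 5 1.0
```

In this toy data a profile of 3 points underestimates normal variability and produces false alarms, while 5 points is enough to capture it, which is the kind of trade-off the search is meant to resolve.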

pipeline

Must have ‘failure’ or ‘reset’ events to define scenario boundaries.

Type:

PdMPipeline

param_space

Must include a 'profile_size' key. Example: {'profile_size': [10, 20, 50], 'method_alpha': [0.1, 0.5, 1.0]}

Type:

dict
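To make the shape of param_space concrete, here is a small sketch that checks for the required key and expands the dictionary into individual candidate configurations. The helper `expand_param_space` is hypothetical, and the exhaustive grid expansion is an assumption: this page does not state whether the experiment enumerates every combination or samples a limited number of them.

```python
from itertools import product

def expand_param_space(param_space):
    """Expand a dict of candidate lists into one dict per combination."""
    if 'profile_size' not in param_space:
        raise ValueError("param_space must include a 'profile_size' key")
    keys = list(param_space)
    value_lists = (param_space[k] for k in keys)
    return [dict(zip(keys, values)) for values in product(*value_lists)]

param_space = {'profile_size': [10, 20, 50], 'method_alpha': [0.1, 0.5, 1.0]}
candidates = expand_param_space(param_space)
print(len(candidates))  # 9
print(candidates[0])    # {'profile_size': 10, 'method_alpha': 0.1}
```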

Raises:
  • IncompatibleMethodException – If method does not implement SemiSupervisedMethodInterface.

  • ValueError – If pipeline lacks required event definitions.

Examples

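>>> from pdmlabs.experiment.batch.auto_profile_semi_supervised_experiment import AutoProfileSemiSupervisedPdMExperiment
>>> from pdmlabs.method.isolation_forest import IsolationForest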
>>> from pdmlabs.preprocessing.no_preprocessor import NoPreprocessor
>>> # ... setup pipeline ...
>>> param_space = {
...     'profile_size': [10, 20, 50],
...     'method_alpha': [0.1, 1.0]
... }
>>> experiment = AutoProfileSemiSupervisedPdMExperiment(
...     experiment_name='auto-profile-demo',
...     pipeline=pipeline,
...     param_space=param_space,
...     num_iteration=30,
...     n_jobs=4
... )
>>> results = experiment.execute()
>>> print(f"Best profile size: {results['best_params']['profile_size']}")
Best profile size: 20

execute() → dict

Run the auto-profile semi-supervised optimization experiment.

Searches parameter space to find the best profile size and method parameters. For each combination:

  1. For each target scenario:

     a. Segments by reset/failure events

     b. Uses first N timesteps (profile_size) as normal pattern

     c. Fits method on profile

     d. Predicts on remaining data

     e. Applies postprocessor and thresholder

  2. Evaluates across all scenarios using PdM metrics

  3. Returns best parameters
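Steps 1a and 1b above can be sketched in plain Python. The helpers `split_by_events` and `split_profile` are hypothetical, and treating each event index as the inclusive end of a scenario is an assumption; the library may place the boundary differently.

```python
def split_by_events(values, event_indices):
    """Split values into scenarios, each ending at an event index (inclusive)."""
    scenarios, start = [], 0
    for idx in sorted(event_indices):
        scenarios.append(values[start:idx + 1])
        start = idx + 1
    if start < len(values):
        scenarios.append(values[start:])  # trailing data after the last event
    return scenarios

def split_profile(scenario, profile_size):
    """Return (profile, remainder): the first profile_size points and the rest."""
    return scenario[:profile_size], scenario[profile_size:]

values = list(range(10))
scenarios = split_by_events(values, event_indices=[3, 7])
print(scenarios)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

profile, remainder = split_profile(scenarios[0], profile_size=2)
print(profile, remainder)  # [0, 1] [2, 3]
```

Each scenario gets its own profile, so the method is refitted after every reset or failure rather than once for the whole series.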

Returns:

Result dictionary with:
  • 'best_params': Best found parameters (includes profile_size)

  • 'best_objective': Best metric value achieved

  • 'th': Best threshold for decision boundary

Return type:

dict

Raises:
  • IncompatibleMethodException – If method does not implement SemiSupervisedMethodInterface.

  • Exception – If pipeline setup is invalid or data processing fails.

Examples

>>> experiment = AutoProfileSemiSupervisedPdMExperiment(...)
>>> results = experiment.execute()
>>> print(results['best_params']['profile_size'])
25