pdmlabs.experiment.streaming
Streaming experiment classes for online/real-time PdM evaluation (experimental).
Status: Early-Stage / Stubs
Streaming experiments are designed for real-time, online scenarios where:
- Data arrives continuously (not all available upfront)
- Models must adapt or update as new data is seen
- Predictions are needed immediately (not retrospectively)
Current State: This module contains placeholder implementations. Streaming support is planned for future versions. For production use, prefer batch experiments.
Available Classes:
- StreamingSemiSupervisedPdMExperiment
Placeholder for online semi-supervised anomaly detection. Status: Stub (not implemented)
- StreamingUnsupervisedPdMExperiment
Placeholder for online unsupervised anomaly detection. Status: Stub (not implemented)
Future Roadmap:
- Phase 1 (Future)
Per-sample prediction interface
Streaming parameter tuning
Automated concept drift detection
- Phase 2 (Future)
Online model adaptation (no retraining needed)
Memory-efficient windowing strategies
Real-time MLflow integration
- Phase 3 (Future)
Ensemble methods for streaming
Anomaly score confidence intervals
Multi-source fusion
Recommendation:
For now, use batch experiments (pdmlabs.experiment.batch) for all production PdM applications. Revisit streaming when fully implemented.
Alternative: Use temporal cross-validation in batch experiments to simulate streaming performance (train on early data, test on later data).
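The temporal split described above can be sketched without any pdmlabs-specific API. The following is a minimal illustration, assuming samples are already sorted by time; the helper name `expanding_window_splits` is hypothetical and not part of pdmlabs. It only produces index lists, which would then be used to slice the data passed to a batch experiment.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(
    n_samples: int, n_splits: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each fold trains on everything seen so far and tests on the next block,
    so no test sample ever precedes a training sample.
    """
    if n_splits < 1:
        raise ValueError("n_splits must be at least 1")
    fold_size = n_samples // (n_splits + 1)
    if fold_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = k * fold_size
        # The last fold absorbs any remainder so that no sample is dropped.
        test_end = n_samples if k == n_splits else train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))


# Hypothetical usage: 10 time-ordered samples, 3 folds.
splits = list(expanding_window_splits(n_samples=10, n_splits=3))
for train_idx, test_idx in splits:
    print(f"train={train_idx} test={test_idx}")
```

Because the training window only ever grows forward in time, each fold approximates what a deployed model would have known at that point, which is the property that makes this a reasonable stand-in for streaming evaluation.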
See also
pdmlabs.experiment.batch: Production-ready batch experiments
pdmlabs.experiment.experiment: PdMExperiment base class
- class pdmlabs.experiment.streaming.StreamingSemiSupervisedPdMExperiment(experiment_name: str, pipeline: PdMPipeline, param_space: dict, constraint_function: Callable = None, target_data: list[DataFrame] = None, target_sources: list[str] = None, historic_data: list[DataFrame] = [], historic_sources: list[str] = [], optimization_param: str = 'AD1_AUC', initial_random: int = 2, num_iteration: int = 20, batch_size: int = 1, n_jobs: int = 1, random_state: int = 42, random_n_tries: int = 3, constraint_max_retries: int = 10, historic_data_header: str = 'infer', target_data_header: str = 'infer', artifacts: str = 'artifacts', debug: bool = False, delay: float = None, log_best_scores: bool = False, maximize: bool = True, custom_evaluators: list = None)
Bases: PdMExperiment
Streaming (online) semi-supervised anomaly detection.
Status: Experimental/Stub Implementation
This experiment flavor is designed for streaming data scenarios:
- Processes data continuously as it arrives (row-by-row or in small batches)
- Adapts models online without batch retraining
- Produces predictions in real-time
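To make "row-by-row or in small batches" concrete, here is a minimal sketch of consuming a stream in mini-batches using only the standard library. The helper `mini_batches` is a hypothetical name introduced for illustration; it does not reflect how pdmlabs reads data internally, since the streaming classes are still stubs.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def mini_batches(stream: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Lazily group a (possibly unbounded) stream into lists of batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(stream)
    while True:
        # islice pulls at most batch_size items without reading the whole stream.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical usage: a generator stands in for sensor readings arriving over time.
readings = (0.1 * i for i in range(7))
batches = list(mini_batches(readings, batch_size=3))
print([len(b) for b in batches])
```

With `batch_size=1` the same helper degenerates to row-by-row processing, so one code path can cover both modes.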
Current Implementation: This is an early-stage stub that iterates over target data but does not yet implement full streaming evaluation logic. Use batch experiments for production.
Future Work:
- Streaming parameter tuning
- Online model adaptation
- Concept drift detection
- Memory-efficient processing
- Raises:
NotImplementedError – Full streaming functionality not yet implemented.
Examples
>>> experiment = StreamingSemiSupervisedPdMExperiment(...)
>>> # Note: streaming experiments are currently stubs
>>> # Use batch experiments instead for now
- execute() → None
Execute placeholder streaming experiment (not fully implemented).
- Returns:
Nothing; streaming experiments are currently stubs.
- Return type:
None
- class pdmlabs.experiment.streaming.StreamingUnsupervisedPdMExperiment(experiment_name: str, pipeline: PdMPipeline, param_space: dict, constraint_function: Callable = None, target_data: list[DataFrame] = None, target_sources: list[str] = None, historic_data: list[DataFrame] = [], historic_sources: list[str] = [], optimization_param: str = 'AD1_AUC', initial_random: int = 2, num_iteration: int = 20, batch_size: int = 1, n_jobs: int = 1, random_state: int = 42, random_n_tries: int = 3, constraint_max_retries: int = 10, historic_data_header: str = 'infer', target_data_header: str = 'infer', artifacts: str = 'artifacts', debug: bool = False, delay: float = None, log_best_scores: bool = False, maximize: bool = True, custom_evaluators: list = None)
Bases: PdMExperiment
Streaming (online) unsupervised anomaly detection.
Status: Stub Implementation
This experiment flavor is designed for unsupervised streaming data:
- Processes continuous data streams without labels
- Adapts models in real-time
- Produces anomaly scores online
Current Implementation: This is a placeholder stub with no execution logic. Use batch experiments for full functionality. Streaming support is planned for future versions.
Design Goals:
- Minimal memory footprint for long-running applications
- Per-sample or mini-batch prediction
- Automatic concept drift handling
- No offline/batch retraining required
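As an illustration of what "minimal memory footprint" and "per-sample prediction" can look like in practice, the sketch below scores each incoming value against running statistics maintained with Welford's online algorithm. This is a generic technique chosen for the example, not the method pdmlabs implements or plans to implement; the class name `RunningZScore` is hypothetical.

```python
import math


class RunningZScore:
    """Constant-memory anomaly scorer based on Welford's online algorithm."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def score(self, value: float) -> float:
        """Return |z| of value against samples seen so far (0.0 if undefined)."""
        if self.count < 2:
            return 0.0
        std = math.sqrt(self._m2 / (self.count - 1))
        return abs(value - self.mean) / std if std > 0 else 0.0

    def update(self, value: float) -> None:
        """Fold one new sample into the running mean and variance."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)


# Hypothetical usage: score each reading first, then learn from it.
scorer = RunningZScore()
scores = []
for reading in [10.0, 10.2, 9.8, 10.1, 9.9, 25.0]:
    scores.append(scorer.score(reading))
    scorer.update(reading)
print(scores[-1] > 50)  # the final reading is far outside the earlier range
```

The scorer keeps only three numbers regardless of stream length and never needs a batch retraining pass. It does not handle concept drift; a decayed or windowed variant would be needed for that.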
- Raises:
NotImplementedError – Streaming functionality not yet implemented.
Examples
>>> # Streaming experiments are not yet implemented
>>> # Use UnsupervisedPdMExperiment (batch) instead
- execute() → None
Execute placeholder unsupervised streaming experiment.
- Returns:
Not implemented.
- Return type:
None