pdmlabs.method.DAMP

Functions

`DAMP_2_0`(time_series, subsequence_length, ...)	Computes DAMP of a time series.
`MASS_V2`([x, y])

Classes

`Damp`(event_preferences, sub_sequence_length, ...)	For multivariate data we run Damp on each dimension and then combine results using average or maximum.

pdmlabs.method.DAMP.DAMP_2_0(time_series: ndarray, subsequence_length: int, stride: int, location_to_start_processing: int) → Tuple[ndarray, float, int]

Computes the DAMP (Discord-Aware Matrix Profile) of a time series.

Website: https://sites.google.com/view/discord-aware-matrix-profile/home
Algorithm: https://drive.google.com/file/d/1FwiLHrgoOUOTHeIXHAFgy2flQ1alRSoN/view

Parameters:

time_series (np.ndarray) – Univariate time series
subsequence_length (int) – Window size
stride (int) – Window stride
location_to_start_processing (int) – Index at which the training set ends and the test set starts
lookahead (int, optional) – How far to look ahead for pruning. Defaults to 0.
enable_output (bool, optional) – Print results and save plot. Defaults to True.

Raises:

Exception – See code. Description: https://docs.google.com/presentation/d/1_-LGilUJpYRbRZpitw05EgkiOZX52kRd/edit#slide=id.p11

Returns:

Matrix profile, discord score and its corresponding position in the profile

Return type:

Tuple[np.ndarray, float, int]
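As a rough illustration of the three returned values, the sketch below computes a left matrix profile by brute force: each window is compared with every window that lies entirely in its past, and the largest nearest-neighbor distance is reported as the discord. This is not the DAMP_2_0 implementation. It ignores stride and lookahead, performs no pruning, and the names znorm, znorm_distance and brute_force_left_profile are hypothetical. Treating only non-overlapping earlier windows as "the past" is also an assumption of this sketch.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(v - mean) / std for v in window]


def znorm_distance(a, b):
    """Euclidean distance between the z-normalized forms of two windows."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(znorm(a), znorm(b))))


def brute_force_left_profile(series, m, start):
    """Return (profile, discord score, discord position).

    profile[i] is the distance from window i to its nearest neighbor among
    windows that end at or before index i. Entries before `start` stay 0.0.
    """
    n_windows = len(series) - m + 1
    profile = [0.0] * n_windows
    for i in range(start, n_windows):
        query = series[i:i + m]
        # Windows starting at j <= i - m do not overlap window i.
        past_starts = range(0, i - m + 1)
        profile[i] = min(
            (znorm_distance(query, series[j:j + m]) for j in past_starts),
            default=0.0,
        )
    score = max(profile)
    position = profile.index(score)
    return profile, score, position


# A periodic signal with one injected spike at index 22.
pattern = [0.0, 1.0, 0.0, -1.0]
series = pattern * 8
series[22] = 5.0
profile, score, position = brute_force_left_profile(series, m=4, start=8)
print(position, round(score, 3))
```

Every window that does not touch the spike has an exact copy earlier in the series, so its profile value is zero, and the discord position falls on one of the windows covering index 22.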

class pdmlabs.method.DAMP.Damp(event_preferences: EventPreferences, sub_sequence_length: int, init_length: int, stride: int = 1, aggregation_strategy='avg', *args, **kwargs)

Bases: UnsupervisedMethodInterface

For multivariate data we run Damp on each dimension and then combine results using average or maximum.

In the future, we can use multivariate sub-sequence distances.
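The per-dimension combination can be pictured with the minimal sketch below. It assumes each dimension has already produced one score per time step, and it merges them step by step. The function name combine_scores is hypothetical, and while 'avg' is the documented default for aggregation_strategy, the string 'max' is an assumed spelling for the maximum option.

```python
from statistics import mean


def combine_scores(per_dimension_scores, aggregation_strategy="avg"):
    """Merge per-dimension score lists into one score per time step."""
    if aggregation_strategy == "avg":
        reducer = mean
    elif aggregation_strategy == "max":  # assumed name for the maximum option
        reducer = max
    else:
        raise ValueError(f"unknown aggregation_strategy: {aggregation_strategy!r}")
    # zip(*...) groups the scores that belong to the same time step.
    return [reducer(step) for step in zip(*per_dimension_scores)]


# Two dimensions, three time steps each.
temperature_scores = [1.0, 4.0, 2.0]
vibration_scores = [3.0, 2.0, 6.0]
averaged = combine_scores([temperature_scores, vibration_scores], "avg")
maximum = combine_scores([temperature_scores, vibration_scores], "max")
print(averaged, maximum)
```

Averaging dampens a spike that appears in only one dimension, whereas taking the maximum lets any single dimension raise the combined score.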

get_all_models()

Return reference to internal model(s).

Returns model instances for inspection, export, or further processing. Structure depends on method - may return single model or dict of models.

Returns:

Underlying model object(s). Examples:

Single model: sklearn model instance
Multiple models: {‘bearing_1’: model1, ‘bearing_2’: model2}
None: If model is not accessible/applicable

Return type:

model or dict

get_library() → str

Return the underlying library/framework name.

Returns:

Name of library used (e.g., ‘sklearn’, ‘torch’, ‘custom’). Used for dependency tracking and method categorization.

Return type:

str

get_params() → dict

Return hyperparameters and configuration.

Returns:

Dictionary of hyperparameters (e.g., {‘n_neighbors’: 5, ‘contamination’: 0.1}). Useful for logging, model comparison, and reproducibility.

Return type:

dict

predict(target_data: DataFrame, source: str, event_data: DataFrame) → list[float]

Predict anomaly scores for batch of samples (offline mode).

Computes anomaly score for each row in target_data independently. Higher scores indicate more anomalous behavior.

Parameters:

target_data (pd.DataFrame) – Feature matrix with dates in index, features in columns. Must have same features as training data.
source (str) – Source identifier (e.g., ‘bearing_1’). Used to select source-specific model if method maintains multiple models.
event_data (pd.DataFrame) – Event log with columns ‘date’, ‘type’, ‘source’, ‘description’. Can be used for context-aware scoring.

Returns:

Anomaly scores (float) with length = target_data.shape[0]. Score range and semantics depend on method:

- Distance-based: typically [0, ∞) where higher = more anomalous
- Probability-based: typically [0, 1] or (-∞, 0] log-likelihood
- Reconstruction-based: typically [0, ∞) reconstruction error

Return type:

list

Examples

>>> method = SomeAnomalyDetector(event_preferences={...})
>>> method.fit([df_train], ['bearing_1'], events_df, labels)
>>> df_test = pd.DataFrame(feature_values, index=dates)  # feature_values, dates: placeholders
>>> scores = method.predict(df_test, 'bearing_1', events_df)
>>> print(len(scores), scores[0])  # e.g. 100 0.45

Raises:

NotImplementedError – If method hasn’t been fit (for supervised methods).

predict_one(new_sample: Series, source: str, is_event: bool) → float

Predict anomaly score for single sample (online/streaming mode).

Computes anomaly score for one observation at a time. Useful for:

- Real-time anomaly detection
- Online learning scenarios
- Memory-efficient processing

May maintain internal state (buffers, windows) for context-aware scoring.

Parameters:

new_sample (pd.Series) – Single observation with feature values. Index should contain feature names matching training data.
source (str) – Source identifier for source-specific models.
is_event (bool) – Whether this sample is from an event timestamp. Can affect how method processes the sample (e.g., special handling for known events vs normal operation).

Returns:

Anomaly score for this single sample (same scale as predict()).

Return type:

float

Examples

>>> method = SomeAnomalyDetector(event_preferences={...})
>>> method.fit([df_train], ['bearing_1'], events_df, labels)
>>>
>>> # Online scoring
>>> for idx, row in df_test.iterrows():
...     is_event = idx in event_timestamps
...     score = method.predict_one(row, 'bearing_1', is_event)
...     print(f'{idx}: {score}')

pdmlabs.method.DAMP.MASS_V2(x=None, y=None)
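No description is given for MASS_V2 here. The name suggests MASS (Mueen's Algorithm for Similarity Search), which computes the z-normalized Euclidean distance from a query to every window of a longer series, using FFT for speed. If that reading is right, the quantity it returns can be pictured with the naive sketch below, which computes the same kind of distance profile with a plain loop. The function names are hypothetical, and treating x as the series and y as the query is an assumption.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(v - mean) / std for v in window]


def naive_distance_profile(x, y):
    """Distance from query y to every length-len(y) window of series x."""
    m = len(y)
    query = znorm(y)
    profile = []
    for i in range(len(x) - m + 1):
        window = znorm(x[i:i + m])
        profile.append(math.sqrt(sum((p - q) ** 2 for p, q in zip(window, query))))
    return profile


# The query is a rising ramp; its scale and offset do not matter after z-normalization.
x = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
distances = naive_distance_profile(x, y)
print([round(d, 3) for d in distances])
```

Rising windows of x match the query almost exactly despite the different scale, while the falling window starting at index 4 is the farthest away.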
