pdmlabs.preprocessing.record_level.default

Identity/passthrough preprocessor that applies no transformations.

DefaultPreProcessor is useful for:
  • Baseline comparisons (no preprocessing).

  • Cases where the raw features are already in a usable format.

  • Testing whether preprocessing helps or hurts performance.

Classes

DefaultPreProcessor(event_preferences)

No-op preprocessor that returns data unchanged.

class pdmlabs.preprocessing.record_level.default.DefaultPreProcessor(event_preferences: EventPreferences)

Bases: RecordLevelPreProcessorInterface

No-op preprocessor that returns data unchanged.

This is an identity transformation: fit() does nothing, and transform() returns the input data as-is. It is useful for experiments that compare preprocessing against no preprocessing, or for pipelines where feature engineering happens elsewhere.

Examples

>>> from pdmlabs.preprocessing.record_level.default import DefaultPreProcessor
>>> preprocessor = DefaultPreProcessor(event_preferences={'failure': [], 'reset': []})
>>> preprocessor.fit([df_train], ['bearing_1'], events_df)
>>> df_test_transformed = preprocessor.transform(df_test, 'bearing_1', events_df)
>>> df_test_transformed.equals(df_test)  # Always True
True
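
The doctest above assumes that df_train, df_test, and events_df already exist. The following is a self-contained sketch of the same identity behavior. IdentityPreProcessor is a hypothetical stand-in written for illustration (it is not the pdmlabs class), and the column names and event-log layout are made up.

```python
import pandas as pd


class IdentityPreProcessor:
    """Hypothetical stand-in that mirrors the documented method signatures."""

    def __init__(self, event_preferences):
        self.event_preferences = event_preferences

    def fit(self, historic_data, historic_sources, event_data, anomaly_ranges=None):
        # No-op: nothing is learned from the historic data.
        return None

    def transform(self, target_data, source, event_data):
        # Identity: hand the input back untouched.
        return target_data


# Illustrative data; the column names are assumptions.
df_train = pd.DataFrame({"vibration": [0.1, 0.2, 0.3], "temperature": [40.0, 41.5, 43.0]})
df_test = pd.DataFrame({"vibration": [0.4, 0.5], "temperature": [44.0, 45.5]})
events_df = pd.DataFrame(columns=["date", "type", "source"])

preprocessor = IdentityPreProcessor(event_preferences={"failure": [], "reset": []})
fit_result = preprocessor.fit([df_train], ["bearing_1"], events_df)
df_test_transformed = preprocessor.transform(df_test, "bearing_1", events_df)

print(df_test_transformed.equals(df_test))
```

Because the stand-in never inspects event_preferences or events_df, any placeholder values work here; the real class may validate them.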
fit(historic_data: list[DataFrame], historic_sources: list[str], event_data: DataFrame, anomaly_ranges=None) → None

No-op fit (does nothing).

Parameters:
  • historic_data (list[pd.DataFrame]) – Ignored.

  • historic_sources (list[str]) – Ignored.

  • event_data (pd.DataFrame) – Ignored.

  • anomaly_ranges – Ignored.

get_params()

Return empty parameter dict (no hyperparameters).

Returns:

Empty dict {}.

Return type:

dict
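
Because get_params() returns an empty dict, a no-preprocessing baseline can be logged with the same code path as a preprocessor that does have hyperparameters. A minimal sketch, assuming a hypothetical stand-in class and a made-up record layout (describe_run is not part of pdmlabs):

```python
class IdentityPreProcessor:
    """Hypothetical stand-in; only get_params() is sketched here."""

    def get_params(self):
        # No hyperparameters to report.
        return {}


def describe_run(name, preprocessor):
    # Hypothetical helper: build a flat record for an experiment log.
    record = {"preprocessor": name}
    record.update(preprocessor.get_params())
    return record


baseline_record = describe_run("default", IdentityPreProcessor())
print(baseline_record)
```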

transform(target_data: DataFrame, source: str, event_data: DataFrame) → DataFrame

Return input unchanged.

Parameters:
  • target_data (pd.DataFrame) – Data to transform.

  • source (str) – Source identifier (ignored).

  • event_data (pd.DataFrame) – Event log (ignored).

Returns:

Same as target_data (identity transformation).

Return type:

pd.DataFrame

transform_one(new_sample: Series, source: str, is_event: bool) → Series

Return single sample unchanged.

Parameters:
  • new_sample (pd.Series) – Sample to transform.

  • source (str) – Source identifier (ignored).

  • is_event (bool) – Event flag (ignored).

Returns:

Same as new_sample.

Return type:

pd.Series
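
For an identity preprocessor, the streaming path (transform_one, row by row) and the batch path (transform) should yield the same values. The sketch below checks that with a hypothetical stand-in class that mirrors the documented signatures; the data, column names, and source label are illustrative assumptions.

```python
import pandas as pd


class IdentityPreProcessor:
    """Hypothetical stand-in that mirrors the documented signatures."""

    def transform(self, target_data, source, event_data):
        # Batch identity: return the frame as-is.
        return target_data

    def transform_one(self, new_sample, source, is_event):
        # Streaming identity: return the single sample as-is.
        return new_sample


df = pd.DataFrame({"vibration": [0.1, 0.2, 0.3], "temperature": [40.0, 41.5, 43.0]})
preprocessor = IdentityPreProcessor()

# Streaming path: feed rows one at a time, as an online pipeline might.
streamed_rows = [
    preprocessor.transform_one(row, "bearing_1", is_event=False)
    for _, row in df.iterrows()
]
streamed = pd.DataFrame(streamed_rows)

# Batch path on the same data, with an empty event log as a placeholder.
batch = preprocessor.transform(df, "bearing_1", pd.DataFrame())

print(streamed.equals(batch))
```

A check like this is a cheap sanity test when swapping the no-op baseline for a preprocessor that does transform data, since any divergence between the two paths then comes from the new preprocessor.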