pdmlabs.preprocessing.record_level.min_max_scaler
Min-Max scaling preprocessor for normalizing features to [0, 1] range.
- Min-Max scaling transforms features to a fixed range [0, 1] using:
x_scaled = (x - x_min) / (x_max - x_min)
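As a quick illustration of the formula above, here is a minimal pure-Python sketch. The helper name `min_max_scale` and its handling of constant features are assumptions made for this example and are not part of pdmlabs.

```python
def min_max_scale(values):
    """Scale a list of numbers to [0, 1] with (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Constant feature: the formula would divide by zero.
        # Mapping everything to 0.0 is a convention assumed for this sketch.
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


vibration = [10, 20, 30]
scaled = min_max_scale(vibration)
print(scaled)  # [0.0, 0.5, 1.0]
```

The smallest value maps to 0, the largest to 1, and everything else falls proportionally in between.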
Useful when:
- Features have different units/scales
- Anomaly detectors (neural nets, distance-based methods) are sensitive to scale
- You want to ensure all features contribute equally
Keeps source-specific scalers (one per device/source) to handle variations in sensor ranges across different equipment instances.
Classes
MinMaxScaler: Scale features to [0, 1] using min-max normalization (per source).
- class pdmlabs.preprocessing.record_level.min_max_scaler.MinMaxScaler(event_preferences: EventPreferences)
Bases: RecordLevelPreProcessorInterface
Scale features to [0, 1] using min-max normalization (per source).
This preprocessor maintains separate scalers for each source (device/subsystem), allowing for different value ranges across equipment. For example, 'bearing_1' might have vibration in range [0, 100] while 'bearing_2' has [0, 50].
- scaler_per_source
Maps source identifier to fitted sklearn MinMaxScaler. Populated during fit(), used in transform().
- Type:
dict
Examples
>>> from pdmlabs.preprocessing.record_level.min_max_scaler import MinMaxScaler
>>> import pandas as pd
>>>
>>> # Training and test data
>>> df_train = pd.DataFrame({'vibration': [10, 20, 30], 'temp': [50, 60, 70]})
>>> df_test = pd.DataFrame({'vibration': [15, 25], 'temp': [55, 65]})
>>> events_df = pd.DataFrame()  # placeholder event log (assumed; unused by this preprocessor)
>>>
>>> scaler = MinMaxScaler(event_preferences={'failure': [], 'reset': []})
>>> scaler.fit([df_train], ['bearing_1'], events_df)
>>> df_test_scaled = scaler.transform(df_test, 'bearing_1', events_df)
>>> # df_test_scaled now has values in [0, 1]
- fit(historic_data: list, historic_sources: list[str], event_data: DataFrame, anomaly_ranges=None) -> None
Fit scalers for each source using training data.
Computes min/max values for each source's features from the training data. Creates source-specific SKLearnMinMaxScaler instances.
- Parameters:
historic_data (list[pd.DataFrame]) - Training DataFrames, one per source.
historic_sources (list[str]) - Source identifiers (e.g., ['bearing_1', 'bearing_2']).
event_data (pd.DataFrame) - Event log (unused by this preprocessor).
anomaly_ranges - Unused.
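The per-source fitting described above can be sketched as follows. This is an illustration only: `fit_scalers` is a hypothetical stand-in, not the pdmlabs implementation, and it assumes that `SKLearnMinMaxScaler` refers to scikit-learn's `sklearn.preprocessing.MinMaxScaler`.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler as SKLearnMinMaxScaler


def fit_scalers(historic_data, historic_sources):
    """Fit one scikit-learn scaler per source (hypothetical stand-in for fit())."""
    scaler_per_source = {}
    for df, source in zip(historic_data, historic_sources):
        # Each source learns its own per-feature min and max.
        scaler_per_source[source] = SKLearnMinMaxScaler().fit(df)
    return scaler_per_source


bearing_1 = pd.DataFrame({"vibration": [0.0, 50.0, 100.0]})
bearing_2 = pd.DataFrame({"vibration": [0.0, 25.0, 50.0]})
scalers = fit_scalers([bearing_1, bearing_2], ["bearing_1", "bearing_2"])

# The learned maxima differ per source, so 50 maps to 0.5 for bearing_1
# but to 1.0 for bearing_2.
print(scalers["bearing_1"].data_max_, scalers["bearing_2"].data_max_)
```

Keeping one scaler per source means a reading is normalized against the range of the equipment that produced it rather than against a pooled range.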
- get_params()
Return hyperparameters (none for this preprocessor).
- Returns:
Empty dict {} (no hyperparameters to configure).
- Return type:
dict
- transform(target_data: DataFrame, source: str, event_data: DataFrame) -> DataFrame
Apply fitted scaler to transform test data to [0, 1].
- Parameters:
target_data (pd.DataFrame) - Test data to scale.
source (str) - Source identifier (used to select the appropriate scaler).
event_data (pd.DataFrame) - Event log (unused).
- Returns:
Scaled data in [0, 1] range. Returns the original data unchanged if source is not found in scaler_per_source (fallback for generalization).
- Return type:
pd.DataFrame
Examples
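>>> df_test_scaled = scaler.transform(df_test, 'bearing_1', events_df)
>>> print(df_test_scaled.min().min())  # at or above 0 if test values stay within the fitted range
>>> print(df_test_scaled.max().max())  # at or below 1 if test values stay within the fitted range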
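The fallback for unknown sources can be sketched as below. `transform_with_fallback` is a hypothetical helper written for this example, assuming the stored scalers are scikit-learn `MinMaxScaler` objects; the actual pdmlabs code may differ.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler as SKLearnMinMaxScaler


def transform_with_fallback(scaler_per_source, target_data, source):
    """Scale target_data with the source's scaler, or return it unchanged."""
    scaler = scaler_per_source.get(source)
    if scaler is None:
        # Unknown source: pass the data through untouched.
        return target_data
    scaled = scaler.transform(target_data)  # returns a NumPy array
    return pd.DataFrame(scaled, columns=target_data.columns, index=target_data.index)


train = pd.DataFrame({"vibration": [10.0, 20.0, 30.0]})
scalers = {"bearing_1": SKLearnMinMaxScaler().fit(train)}

test = pd.DataFrame({"vibration": [15.0, 40.0]})
known = transform_with_fallback(scalers, test, "bearing_1")
unknown = transform_with_fallback(scalers, test, "bearing_9")

# 40 lies above the fitted maximum of 30, so with scikit-learn's default
# clip=False its scaled value is greater than 1.
print(known)
print(unknown.equals(test))
```

Two things are worth noting under these assumptions: data from an unknown source keeps its raw scale, and test values outside the fitted range fall outside [0, 1] unless the scaler is configured to clip them.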
- transform_one(new_sample: Series, source: str, is_event: bool) -> Series
Scale a single sample using fitted scaler.
- Parameters:
new_sample (pd.Series) - Single row to scale.
source (str) - Source identifier.
is_event (bool) - Whether this is an event row (unused).
- Returns:
Scaled sample.
- Return type:
pd.Series
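One way to scale a single row with a scikit-learn scaler is to treat the Series as a one-row DataFrame. The sketch below assumes that approach; `scale_one` is a hypothetical helper and is not taken from the pdmlabs source.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler as SKLearnMinMaxScaler


def scale_one(scaler, new_sample):
    """Scale one pd.Series by treating it as a single-row DataFrame."""
    row = new_sample.to_frame().T  # shape (1, n_features); columns are feature names
    scaled = scaler.transform(row)[0]
    return pd.Series(scaled, index=new_sample.index, name=new_sample.name)


train = pd.DataFrame({"vibration": [10.0, 20.0, 30.0], "temp": [50.0, 60.0, 70.0]})
scaler = SKLearnMinMaxScaler().fit(train)

sample = pd.Series({"vibration": 25.0, "temp": 55.0})
scaled_sample = scale_one(scaler, sample)
print(scaled_sample)
```

The result keeps the original feature labels, so it can be consumed by downstream code the same way as a row of the batch output.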