
⚡ PowerPlant Anomaly Transformer

Self-supervised anomaly detection model for power plant predictive maintenance.

Detects sensor anomalies and predicts equipment failures across thermal units, wind turbines, solar panels, batteries, and EMS/SCADA systems — all from a single model.

Architecture

A 5.2M-parameter Transformer that combines components from recent time series anomaly detection research:

Component | Source | What it does
Self-supervised training | AnomalyBERT | Trains on normal data only; no failure labels needed
Patch-based encoding | PatchTST | Efficient 16-timestep patch tokenization
RevIN normalization | PatchTST | Handles distribution shift across sensors
Relative positional bias | AnomalyBERT | Captures local temporal patterns
Dual-head scoring | TimeRCD | Reconstruction + anomaly classification
Asset-type embeddings | Custom | Single model handles 5 asset types
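
To make the RevIN row concrete: reversible instance normalization standardizes each channel of each window by its own mean and standard deviation, and can restore them on the output. The sketch below is a minimal NumPy version that omits the learnable affine parameters RevIN normally carries. The function names and the `eps` value are assumptions for illustration, not this model's code.

```python
import numpy as np

def revin_normalize(x, eps=1e-5):
    """Standardize each channel of each window over time.

    x has shape [batch, channels, timesteps].
    """
    mean = x.mean(axis=-1, keepdims=True)
    std = np.sqrt(x.var(axis=-1, keepdims=True) + eps)
    return (x - mean) / std, (mean, std)

def revin_denormalize(y, stats):
    """Undo revin_normalize using the saved per-channel statistics."""
    mean, std = stats
    return y * std + mean

rng = np.random.default_rng(0)
# Two channels on very different scales: a temperature near 500
# and a vibration signal near 0.
x = np.stack([
    500 + 5 * rng.standard_normal(512),
    0.02 * rng.standard_normal(512),
])[None]                       # add batch dimension -> [1, 2, 512]
x_norm, stats = revin_normalize(x)
x_back = revin_denormalize(x_norm, stats)
```

After normalization both channels are centered at zero on a comparable scale, which is what lets one encoder serve sensors with very different units.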

How It Works

Raw Sensor Data [B, 64, 512]
    ↓
RevIN Normalize (handle distribution shift)
    ↓
Patch Embedding (16-step patches, stride 8 → 63 tokens)
    ↓
+ Asset Type Embedding (thermal/wind/solar/battery/scada)
    ↓
6-layer Transformer Encoder (relative position bias)
    ↓
┌─────────────────────┬──────────────────────┐
│ Reconstruction Head │ Anomaly Scoring Head │
│ (predict original)  │ (classify anomaly)   │
└─────────────────────┴──────────────────────┘
    ↓                        ↓
Combined Score = 0.6 × anomaly_score + 0.4 × recon_error
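
The last line of the diagram is a weighted blend of the two heads. The sketch below shows that arithmetic in NumPy. How the model brings the reconstruction error onto the same scale as the anomaly score is not stated here, so the per-window min-max scaling is an assumption, as is the function name.

```python
import numpy as np

def combined_score(anomaly_score, recon_error, w_anomaly=0.6, w_recon=0.4):
    """Blend per-timestep scores using the weights from the diagram."""
    anomaly_score = np.asarray(anomaly_score, dtype=float)
    recon_error = np.asarray(recon_error, dtype=float)
    # Assumption: rescale reconstruction error to [0, 1] within the window.
    span = recon_error.max() - recon_error.min()
    if span > 0:
        recon_scaled = (recon_error - recon_error.min()) / span
    else:
        recon_scaled = np.zeros_like(recon_error)
    return w_anomaly * anomaly_score + w_recon * recon_scaled

anomaly_score = np.array([0.1, 0.2, 0.9, 0.3])
recon_error = np.array([0.5, 0.5, 4.5, 1.5])
score = combined_score(anomaly_score, recon_error)
```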

Supported Assets & Sensors

🔥 Thermal Units (32 sensors)

Exhaust temperatures (×4), inlet/compressor/turbine temps, bearing temperatures (×3), lube oil temp, cooling water temp, pressures (inlet/compressor/fuel/lube), vibrations (X/Y/axial ×5), fuel/air/steam flows, RPM, power output, efficiency, emissions (NOx/CO), IGV position, humidity.

🌬️ Wind Turbines (24 sensors)

Wind speed/direction, ambient temp/humidity, rotor/generator RPM, pitch angles (×3), power output, reactive power, nacelle/gearbox/generator/bearing temps, hydraulic pressure, yaw angle, tower acceleration, blade vibration.

☀️ Solar Panels (16 sensors)

GHI/POA irradiance, ambient/panel temps (×3), DC voltage/current (×2 strings), AC power/voltage/current, inverter temp, frequency, power factor.

🔋 Batteries (8 sensors)

Voltage, current, temperature, SOC, impedance, capacity, cycle count, cell balance delta.

📡 EMS/SCADA (64 sensors)

Bus voltages (×8), line flows MW/MVAR (×4 lines), transformer taps/temps (×4), system frequency, ACE, area load/generation, tie-line flows, reserve margin, breaker status (×4), capacitor banks (×4), reactors (×4), line currents/loading, frequency deviation, ramp rate, spinning reserve, regulation signal.

Failure Patterns Detected

Pattern | Description | Power Plant Example
Spike | Sudden extreme value | Bearing failure, pressure surge, voltage transient
Drift | Gradual trend change | Heat exchanger fouling, blade erosion, battery degradation
Level Shift | Sudden offset | Sensor recalibration, operating mode change
Oscillation | Abnormal periodicity | Rotor imbalance, turbine resonance, grid oscillation
Dropout | Signal loss | Communication failure, sensor malfunction
Noise Increase | Higher variance | Bearing wear, turbulence increase, loose connection
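
One way to sanity-check a threshold before deployment is to inject these six patterns into a clean signal and confirm they are flagged. The sketch below does that with NumPy. The `inject` helper and all magnitudes are arbitrary assumptions for illustration, and this is not the degradation scheme used to train the model.

```python
import numpy as np

def inject(signal, kind, start, length, rng):
    """Copy a 1-D signal and add one synthetic anomaly in [start, start+length)."""
    x = signal.copy()
    seg = slice(start, start + length)
    scale = signal.std()
    if kind == "spike":
        x[start] += 8 * scale                         # single extreme sample
    elif kind == "drift":
        x[seg] += np.linspace(0, 3 * scale, length)   # gradual ramp
    elif kind == "level_shift":
        x[seg] += 3 * scale                           # constant offset
    elif kind == "oscillation":
        x[seg] += 2 * scale * np.sin(2 * np.pi * np.arange(length) / 8)
    elif kind == "dropout":
        x[seg] = 0.0                                  # signal lost
    elif kind == "noise":
        x[seg] += 2 * scale * rng.standard_normal(length)
    else:
        raise ValueError(f"unknown pattern: {kind}")
    return x

rng = np.random.default_rng(0)
t = np.arange(512)
clean = 50 + np.sin(2 * np.pi * t / 64) + 0.1 * rng.standard_normal(512)
kinds = ["spike", "drift", "level_shift", "oscillation", "dropout", "noise"]
corrupted = {k: inject(clean, k, start=300, length=64, rng=rng) for k in kinds}
```

Each entry of `corrupted` can be placed in one channel of a `[1, 64, 512]` window and scored; the flagged timesteps should cluster around indices 300 to 363.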

Quick Start

1. Load the Model

import torch
from model import PowerPlantAnomalyTransformer

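# Load checkpoint
# Note: weights_only=False unpickles arbitrary Python objects,
# so only load checkpoint files you trust.
checkpoint = torch.load('model.pt', map_location='cpu', weights_only=False)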
config = checkpoint['C']

model = PowerPlantAnomalyTransformer(
    n_channels=config['nc'],
    context_length=config['cl'],
    patch_length=config['pl'],
    stride=config['st'],
    d_model=config['dm'],
    num_heads=config['nh'],
    num_layers=config['nl'],
    d_ff=config['df'],
)
model.load_state_dict(checkpoint['msd'])
model.eval()

2. Score Sensor Data

# Prepare your SCADA data: [batch, channels, timesteps]
# Pad to 64 channels if fewer sensors
sensor_data = torch.randn(1, 64, 512)  # Replace with real data
asset_type = torch.tensor([0])  # 0=thermal, 1=wind, 2=solar, 3=battery, 4=scada

# Get anomaly scores (gradients are not needed for inference)
with torch.no_grad():
    scores = model.get_anomaly_scores(sensor_data, asset_type)

# Per-timestep anomaly score (0 = normal, 1 = anomaly)
anomaly_score = scores['combined_score']  # shape: [1, 512]

# Set threshold based on your plant's tolerance
threshold = 0.6
anomalies = anomaly_score > threshold
print(f"Anomalous timesteps: {anomalies.sum().item()} / 512")
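
Rather than fixing the threshold at 0.6, a common approach is to take a high quantile of the scores the model produces on known-normal data, so the expected false-alarm rate is explicit. The sketch below assumes such scores have been collected into an array. The `calibrate_threshold` helper is hypothetical and the beta-distributed scores are synthetic stand-ins.

```python
import numpy as np

def calibrate_threshold(normal_scores, false_alarm_rate=0.001):
    """Return the score exceeded by about `false_alarm_rate` of normal timesteps."""
    normal_scores = np.asarray(normal_scores, dtype=float).ravel()
    return float(np.quantile(normal_scores, 1.0 - false_alarm_rate))

rng = np.random.default_rng(0)
# Stand-in for combined_score on 200 normal windows of 512 timesteps.
normal_scores = rng.beta(2, 8, size=(200, 512))
threshold = calibrate_threshold(normal_scores, false_alarm_rate=0.001)
observed_rate = float((normal_scores > threshold).mean())
```

A lower `false_alarm_rate` raises the threshold and trades missed detections for fewer nuisance alarms.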

3. Score a Full Time Series (CSV)

python inference.py \
    --model model.pt \
    --data scada_readings.csv \
    --asset thermal \
    --threshold 0.6 \
    --output alerts.csv
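
Per-timestep flags are usually merged into intervals before they are reported as alerts. The sketch below shows one way to do that in plain Python. The `flags_to_events` helper and its `min_length` filter are assumptions, and this is not necessarily how `inference.py` builds `alerts.csv`.

```python
def flags_to_events(flags, min_length=1):
    """Merge consecutive True flags into (start, end) index pairs, end exclusive."""
    events, start = [], None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i                      # a new run begins
        elif not flagged and start is not None:
            if i - start >= min_length:    # keep runs that are long enough
                events.append((start, i))
            start = None
    # Close a run that extends to the end of the series.
    if start is not None and len(flags) - start >= min_length:
        events.append((start, len(flags)))
    return events

flags = [False, True, True, False, True, False, True, True, True]
all_events = flags_to_events(flags)
events = flags_to_events(flags, min_length=2)
```

Requiring a minimum run length filters out isolated single-sample flags, which tend to be noise rather than equipment faults.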

Fine-tuning on Your Plant's Data

This model is pre-trained via self-supervised learning. For best results, fine-tune it on normal operating data from your own plant:

from torch.optim import AdamW

# 1. Collect 1-2 weeks of NORMAL operating data
# 2. Format as windows: [num_windows, n_channels, 512]
# 3. Fine-tune (the model learns YOUR plant's normal patterns):

optimizer = AdamW(model.parameters(), lr=1e-5, weight_decay=0.01)

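model.train()
for epoch in range(10):
    epoch_loss, n_batches = 0.0, 0
    for batch in your_dataloader:
        optimizer.zero_grad()
        result = model(
            batch['data'],           # [B, channels, 512]
            batch['asset_type'],     # [B] integer 0-4
            training=True
        )
        loss = result['loss']
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        n_batches += 1
    # Report the mean loss over the epoch, not just the last batch
    print(f"Epoch {epoch+1}: loss={epoch_loss / max(n_batches, 1):.4f}")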

# Save fine-tuned model
torch.save({'msd': model.state_dict(), 'C': config}, 'model_finetuned.pt')

Data Preparation Tips

  1. Sampling rate: Match your SCADA historian (typically 1-10 second intervals)
  2. Window size: 512 timesteps = ~8.5 minutes at 1s sampling, ~85 minutes at 10s
  3. Channel ordering: Keep consistent channel ordering across windows
  4. Padding: If you have fewer sensors than the model's max (64), pad with zeros
  5. Normal data only: For fine-tuning, use ONLY data from normal operating periods
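
Tips 2 to 4 can be sketched as a single helper that cuts a long recording into fixed-length windows and zero-pads the channel axis. This is a minimal NumPy sketch: the `make_windows` name, the `[timesteps, n_sensors]` input layout and the `step` parameter are assumptions, not part of this repository.

```python
import numpy as np

def make_windows(series, window=512, step=512, max_channels=64):
    """Convert [timesteps, n_sensors] into [num_windows, max_channels, window]."""
    series = np.asarray(series, dtype=np.float32)
    n_steps, n_sensors = series.shape
    if n_sensors > max_channels:
        raise ValueError(f"{n_sensors} sensors exceeds {max_channels} channels")
    if n_steps < window:
        raise ValueError("series is shorter than one window")
    starts = range(0, n_steps - window + 1, step)
    out = np.zeros((len(starts), max_channels, window), dtype=np.float32)
    for w, s in enumerate(starts):
        # Sensors keep their column order; unused channels stay zero.
        out[w, :n_sensors] = series[s:s + window].T
    return out

# Example: 2000 samples from 8 battery sensors, windows overlapping by half.
readings = np.random.default_rng(0).standard_normal((2000, 8))
windows = make_windows(readings, step=256)
```

The result can be wrapped with `torch.from_numpy(windows)` before it is passed to the model.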

Architecture Details

Parameter | Value
Total parameters | 5,162,433
Context length | 512 timesteps
Patch length | 16
Patch stride | 8
Number of patches | 63
Transformer dimension | 256
Attention heads | 8
Encoder layers | 6
FFN dimension | 1024
Max channels | 64
Asset types | 5 (thermal, wind, solar, battery, SCADA)
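
The patch count follows from the context length, patch length and stride. The sketch below assumes patches are taken without padding the end of the window, which is the formula that reproduces the 63 listed above; the `num_patches` helper is for illustration only.

```python
def num_patches(context_length, patch_length, stride):
    """Number of full patches when sliding over a window without padding."""
    return (context_length - patch_length) // stride + 1

n = num_patches(context_length=512, patch_length=16, stride=8)
# Start index of every patch; the last one must end exactly at the window edge.
starts = [i * 8 for i in range(n)]
last_patch_end = starts[-1] + 16
```

With a stride of half the patch length, every timestep except the first and last 8 is covered by two patches.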

Integration with Existing Systems

SCADA/EMS Integration

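# Real-time scoring loop
# read_scada_historian, send_alarm, sensor_list and asset_type_id are
# placeholders for your own historian client, alerting hook and configuration.
import time

import torch

while True:
    # Read latest 512 samples from SCADA historian as [channels, 512],
    # zero-padded to 64 channels
    data = read_scada_historian(n_samples=512, channels=sensor_list)

    # Convert to tensor and add the batch dimension: [1, channels, 512]
    x = torch.tensor(data, dtype=torch.float32).unsqueeze(0)
    at = torch.tensor([asset_type_id])

    # Score
    with torch.no_grad():
        scores = model.get_anomaly_scores(x, at)
    max_score = scores['combined_score'].max().item()

    # Alert if threshold exceeded
    if max_score > 0.7:
        send_alarm(f"Anomaly detected! Score: {max_score:.2f}")

    time.sleep(60)  # Check every minute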
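
When the sampling interval is short relative to the 60-second check, consecutive 512-sample windows overlap heavily, so one anomaly can exceed the threshold on many checks in a row. A cooldown is one simple way to avoid repeated alarms. The `AlarmGate` class below is a hypothetical sketch, and the 600-second cooldown is an arbitrary choice.

```python
class AlarmGate:
    """Suppress repeat alarms until `cooldown_s` seconds have passed."""

    def __init__(self, threshold=0.7, cooldown_s=600):
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self._last_alarm = None   # timestamp of the most recent alarm

    def should_alarm(self, score, now_s):
        if score <= self.threshold:
            return False
        recently_alarmed = (
            self._last_alarm is not None
            and now_s - self._last_alarm < self.cooldown_s
        )
        if recently_alarmed:
            return False
        self._last_alarm = now_s
        return True

gate = AlarmGate(threshold=0.7, cooldown_s=600)
# (timestamp in seconds, max score) from successive checks.
checks = [(0, 0.5), (60, 0.9), (120, 0.95), (700, 0.8)]
decisions = [gate.should_alarm(score, t) for t, score in checks]
```

In the scoring loop, the alert condition would become `gate.should_alarm(max_score, time.time())`.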

With Chronos for Forecasting + Anomaly Detection

For a complementary approach using forecasting-based anomaly detection:

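# pip install chronos-forecasting
import torch
from chronos import BaseChronosPipeline

# Use Chronos for forecasting, this model for pattern anomalies
chronos = BaseChronosPipeline.from_pretrained("amazon/chronos-bolt-base")

# Forecast the next 48 values of one sensor from its last 512 samples.
# sensor_series: 1-D history; actual: NumPy array of the 48 values observed next.
context = torch.tensor(sensor_series[-512:], dtype=torch.float32)
quantiles, _ = chronos.predict_quantiles(
    context, prediction_length=48, quantile_levels=[0.1, 0.5, 0.9]
)
# quantiles has shape [batch, prediction_length, num_quantiles]
p10, p50, p90 = quantiles[0, :, 0], quantiles[0, :, 1], quantiles[0, :, 2]

# Anomaly = actual outside prediction interval OR this model flags anomaly
chronos_anomaly = (actual > p90.numpy()) | (actual < p10.numpy())
# model_scores: NumPy array of combined_score for the same 48 timesteps as `actual`
transformer_anomaly = model_scores > 0.6

# Combined detection: either method flags it
is_anomaly = chronos_anomaly | transformer_anomaly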

Research Background

This model synthesizes insights from the following papers:

  1. AnomalyBERT (Yungi Jeong et al., 2023): Self-supervised Transformer for time series anomaly detection using data degradation. F1=0.854 on SWaT industrial benchmark.

  2. PatchTST (Yuqi Nie et al., ICLR 2023): Patch-based Transformer for time series. Channel-independent processing with RevIN normalization enables handling variable sensor counts.

  3. TimeRCD (THU-SAIL, 2025): Zero-shot foundation model for anomaly detection. Dual-head (reconstruction + anomaly scoring) architecture improves anomaly-normal separation.

  4. MOMENT (CMU AutonLab, ICML 2024): Time series foundation model. Demonstrated that pre-training on diverse time series transfers well to industrial anomaly detection.

  5. THEMIS (2024): Uses Chronos encoder embeddings for zero-shot anomaly detection via spectral scoring. F1=78.8% on NASA MSL without any training.

License

Apache 2.0

Citation

@misc{powerplant-anomaly-transformer-2025,
  title={PowerPlant Anomaly Transformer: Self-supervised Predictive Maintenance},
  year={2025},
  note={Based on AnomalyBERT, PatchTST, and TimeRCD architectures},
  url={https://huggingface.co/kkonathala/powerplant-anomaly-transformer}
}