Recommendation System for Meter Replacement
ML-driven analytics system on Azure Databricks and PySpark that prioritizes water meter replacements across a network of 10M+ devices, achieving 92% prediction accuracy on historical validation data and delivering €8.7M in total annual benefit through data-driven decision-making.
Project Overview
Context & Challenge
Major utility with 10M+ meters, constrained by a limited annual replacement budget. The previous age-based strategy ignored measurement drift, revenue impact, and consumption patterns, resulting in €7.5M of undetected revenue leakage per year and 47% higher maintenance costs.
Solution & Architecture
Engineered ML-driven analytics engine on Azure Databricks using PySpark to process 300M+ historical readings. Built configurable scoring framework with dynamic weighting, enabling business-driven prioritization without code changes.
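A configurable scoring framework of this kind can be sketched as a weighted sum over normalized per-meter features, with the weights held in configuration rather than in code. This is a minimal illustration only: the feature names (`drift`, `revenue_impact`, `age`), the weight values, and the meter IDs are hypothetical and do not come from the project.

```python
from typing import Dict, List, Tuple


def score_meter(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of normalized features (each assumed to lie in [0, 1])."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Missing features contribute 0 so a sparse record never raises.
    return sum(w * features.get(name, 0.0) for name, w in weights.items()) / total


def rank_meters(
    meters: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (meter_id, score) pairs, highest replacement priority first."""
    scored = [(meter_id, score_meter(f, weights)) for meter_id, f in meters.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical configuration: in practice this could be loaded from a table or file.
WEIGHTS = {"drift": 0.5, "revenue_impact": 0.3, "age": 0.2}

# Hypothetical, already-normalized feature values for three meters.
METERS = {
    "M-001": {"drift": 0.9, "revenue_impact": 0.4, "age": 0.2},
    "M-002": {"drift": 0.1, "revenue_impact": 0.2, "age": 0.9},
    "M-003": {"drift": 0.6, "revenue_impact": 0.9, "age": 0.5},
}

ranking = rank_meters(METERS, WEIGHTS)

# Changing only the weights re-prioritizes the same meters without code changes.
AGE_HEAVY = {"drift": 0.1, "revenue_impact": 0.1, "age": 0.8}
age_ranking = rank_meters(METERS, AGE_HEAVY)
```

With the first weight set the drift-heavy and revenue-heavy meters rank ahead of the oldest one; with the age-heavy weights the order flips, which is the behavior a business user would adjust through configuration.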
Impact & Results
Generated €4.2M in additional annual revenue (327% ROI) through precision replacements. Reduced emergency maintenance by 34%, saving €850K annually. Achieved 3.25x efficiency over the age-based approach, delivering €8.7M in total annual benefit.
Key Achievements
ML-Driven Prediction Model
Developed time-series anomaly detection using ARIMA and change-point algorithms to identify meter degradation patterns, achieving 92% prediction accuracy on historical validation data.
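To illustrate the change-point side of this approach, the sketch below finds a single mean shift in a meter's reading series by choosing the split that minimizes the within-segment squared error. It is a simplified stand-in, not the project's actual algorithm: the ARIMA component is not shown, and the function name, the `min_segment` parameter, and the sample readings are assumptions made for the example.

```python
from typing import List, Optional, Tuple


def _mean(values: List[float]) -> float:
    return sum(values) / len(values)


def _sse(values: List[float]) -> float:
    """Sum of squared deviations from the segment mean."""
    m = _mean(values)
    return sum((v - m) ** 2 for v in values)


def find_change_point(
    series: List[float], min_segment: int = 3
) -> Optional[Tuple[int, float]]:
    """Return (split_index, mean_shift) for the best single split, or None.

    split_index is the first index of the second segment; mean_shift is
    the second segment's mean minus the first segment's mean.
    """
    n = len(series)
    if n < 2 * min_segment:
        return None  # too short to form two segments of the required length
    best_k, best_cost = min_segment, float("inf")
    for k in range(min_segment, n - min_segment + 1):
        cost = _sse(series[:k]) + _sse(series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    shift = _mean(series[best_k:]) - _mean(series[:best_k])
    return best_k, shift


# Hypothetical monthly readings: the meter starts under-registering at index 6.
readings = [100, 102, 99, 101, 100, 98, 90, 89, 91, 88, 90, 89]
result = find_change_point(readings)
```

A sustained negative shift of this kind is one plausible signature of meter degradation; a production system would also need a significance threshold so that ordinary seasonal variation is not flagged.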
Feature Engineering Pipeline
Engineered 120+ features with PySpark window functions to capture temporal patterns and consumption deltas, improving model performance by 43%.
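The two feature families named here, consumption deltas and temporal patterns, correspond to what window functions such as `lag` and a rolling `avg` over a per-meter, time-ordered window compute in PySpark. The sketch below reproduces that logic in plain Python so it runs without a Spark cluster. The column names, the window size of 3, and the sample rows are assumptions for illustration.

```python
from collections import defaultdict, deque
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, int, float]  # (meter_id, period, consumption)


def add_window_features(rows: List[Row], window: int = 3) -> List[Dict]:
    """Add a lag-1 delta and a trailing rolling mean per meter.

    Mirrors a window partitioned by meter_id and ordered by period.
    """
    by_meter: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        by_meter[row[0]].append(row)

    out: List[Dict] = []
    for meter_id, recs in by_meter.items():
        recs.sort(key=lambda r: r[1])  # order by period within the partition
        recent: deque = deque(maxlen=window)  # trailing values incl. current row
        prev: Optional[float] = None
        for _, period, value in recs:
            recent.append(value)
            out.append({
                "meter_id": meter_id,
                "period": period,
                "consumption": value,
                # Delta versus the previous period; None on the first row, like lag().
                "delta": None if prev is None else value - prev,
                "rolling_mean": sum(recent) / len(recent),
            })
            prev = value
    return out


# Hypothetical readings, deliberately out of order to show the sort.
rows = [
    ("A", 2, 12.0), ("A", 1, 10.0), ("A", 4, 11.0), ("A", 3, 9.0),
    ("B", 1, 5.0), ("B", 2, 5.0),
]
features = add_window_features(rows)
by_key = {(f["meter_id"], f["period"]): f for f in features}
```

The partition-then-order structure is the important part: each meter's features depend only on its own history, which is what lets the same computation scale out across partitions.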
Geospatial Optimization
Implemented clustering algorithm to optimize field crew routing while maintaining replacement prioritization, increasing operational efficiency by 68%.
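One way to combine geographic clustering with prioritization is to group meters into crew areas first and then order each area's work list by replacement score. The sketch below uses a small k-means with a deterministic farthest-point initialization; the source does not say which clustering algorithm was used, so k-means is an assumption, as are the coordinates, scores, and function names.

```python
from math import dist
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Meter = Tuple[str, Point, float]  # (meter_id, (x, y), priority_score)


def kmeans(points: List[Point], k: int, iterations: int = 20) -> List[int]:
    """Plain k-means; returns a cluster label per point."""
    # Deterministic farthest-point initialization: start from the first point,
    # then repeatedly add the point farthest from all chosen centroids.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels


def plan_crews(meters: List[Meter], k: int) -> Dict[int, List[str]]:
    """Group meters into k areas, each ordered by priority score (highest first)."""
    labels = kmeans([xy for _, xy, _ in meters], k)
    grouped: Dict[int, List[Tuple[float, str]]] = {j: [] for j in range(k)}
    for (meter_id, _, score), label in zip(meters, labels):
        grouped[label].append((score, meter_id))
    return {
        j: [meter_id for _, meter_id in sorted(items, reverse=True)]
        for j, items in grouped.items()
    }


# Hypothetical meters in two well-separated neighborhoods.
meters = [
    ("M1", (0.0, 0.0), 0.90),
    ("M2", (0.1, 0.2), 0.40),
    ("M3", (5.0, 5.0), 0.70),
    ("M4", (5.2, 4.9), 0.95),
    ("M5", (0.2, 0.1), 0.60),
]
plans = plan_crews(meters, k=2)
```

This keeps travel within compact areas while still working the highest-priority meters first inside each area. Actual routing would use road distances and crew capacity, which this sketch ignores.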
Tools & Technologies
Data Platform
- Azure Databricks
- Delta Lake
- PySpark
- SQL
Machine Learning
- scikit-learn
- MLflow
- ARIMA Models
- Statistical Analysis
Visualization & DevOps
- Power BI
- Azure DevOps
- Git
- Databricks Jobs