Inspiration
Pipeline integrity management is a beat-the-clock challenge. Operators are drowning in data, receiving massive inspection reports years apart with thousands of anomalies. The specific inspiration for Traceline came from the realization that raw data is useless without context. Knowing a pipe has 20% metal loss is one thing; knowing that same spot was only 5% loss five years ago is a critical alert. We wanted to build the "time machine" for pipeline data, a tool that instantly connects the past to the present to predict the future.
What it does
Traceline is an automated integrity management engine that ingests raw In-Line Inspection (ILI) datasets from different years (2007 to 2022) to perform high-precision anomaly matching and growth analysis.
Intelligent Alignment: It doesn't just overlay data; it mechanically aligns two different inspections using physical Girth Welds as reference anchors, correcting for "odometer drift" between runs.
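The weld-anchored correction described above can be sketched as piecewise-linear interpolation between matched girth welds. This is a minimal illustration, not the team's implementation: the function name `align_odometer` and the assumption that the weld arrays are already matched pairs (same physical weld at the same index) are hypothetical.

```python
import numpy as np

def align_odometer(distances_new, welds_new, welds_old):
    """Map odometer readings from a new run onto the old run's frame.

    Piecewise-linear interpolation between matched girth-weld anchors
    corrects cumulative odometer drift. Assumes welds_new and welds_old
    are matched pairs (same physical weld at the same index) - a
    hypothetical precondition for this sketch.
    """
    # np.interp stretches or compresses each weld-to-weld segment so the
    # new run's weld positions land exactly on the old run's positions.
    return np.interp(distances_new, welds_new, welds_old)

# Example: the later run reads 2.5 ft long by the third weld.
welds_old = np.array([0.0, 500.0, 1000.0])
welds_new = np.array([0.0, 501.0, 1002.5])
anomalies_new = np.array([250.0, 751.0])
print(align_odometer(anomalies_new, welds_new, welds_old))
```

Anchoring to welds rather than raw odometer distance is what makes the drift correction local: each segment is rescaled independently, so slippage in one joint does not contaminate the rest of the line.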
Multi-Dimensional Matching: It uses a weighted algorithm to match anomalies based on corrected axial distance, clock position, and physical dimensions, handling edge cases like pipe rotation and sensor noise.
Growth Forensics: It calculates the annualized growth rate for every defect, instantly flagging critical outliers:
Rapid Growth: Features degrading faster than safety thresholds (>3% per year).
New Anomalies: Corrosion that appeared since the last inspection.
Measurement Errors: "Negative growth" anomalies that indicate sensor calibration issues.
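The three flags above can be sketched as a single classification function. The thresholds and labels come from the text (>3% per year is rapid, negative growth signals calibration issues, an unmatched feature is new); the function name and signature are illustrative, not the actual engine.

```python
def classify_growth(depth_old, depth_new, years, rapid_threshold=3.0):
    """Classify a matched anomaly by annualized depth growth (% wall loss/yr).

    depth_old is None when the feature has no match in the earlier
    run, i.e. it appeared since the last inspection.
    """
    if depth_old is None:
        return None, "NEW_ANOMALY"
    rate = (depth_new - depth_old) / years
    if rate < 0:
        return rate, "MEASUREMENT_ERROR"  # negative growth: sensor calibration
    if rate > rapid_threshold:
        return rate, "RAPID_GROWTH"       # faster than the safety threshold
    return rate, "MONITOR"
```

For example, a defect that went from 5% to 26% metal loss over five years grows at 4.2%/yr and is flagged as rapid growth.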
How we built it
We built the core logic in Python to leverage robust data science libraries.
Pandas & NumPy: Used for high-speed data ingestion and vectorized operations to process thousands of pipeline features in milliseconds.
FastAPI: We wrapped our matching engine in a modern REST API, allowing users to upload CSVs and receive a structured JSON response suitable for dashboard visualization.
Matching Algorithm: $$ MatchScore = 0.4\left(1 - \frac{\Delta d}{T_d}\right) + 0.3\left(1 - \frac{\Delta c}{T_c}\right) + 0.3(1 - \Delta_{dim}) $$ where $\Delta d$ is the axial-distance difference after weld correction, $\Delta c$ the clock-position difference, $\Delta_{dim}$ the normalized dimensional difference, and $T_d$, $T_c$ the distance and clock tolerances.
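The weighted score translates directly into code. The weights (0.4/0.3/0.3) are from the formula above; the default tolerance values here are illustrative assumptions, not the team's tuned thresholds.

```python
def match_score(delta_d, delta_c, delta_dim, tol_d=5.0, tol_c=1.0):
    """Weighted similarity between two anomalies (1.0 = perfect match).

    delta_d:   axial distance difference (ft) after weld alignment
    delta_c:   clock-position difference (hours), wrap-around handled upstream
    delta_dim: normalized dimensional difference in [0, 1]
    tol_d, tol_c: illustrative tolerances, not the production values
    """
    return (0.4 * (1 - delta_d / tol_d)
            + 0.3 * (1 - delta_c / tol_c)
            + 0.3 * (1 - delta_dim))
```

A pair of identical anomalies scores 1.0; each term decays linearly toward zero as its difference approaches the corresponding tolerance, so the weights express how much each dimension is trusted.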
Challenges we ran into
The Odometer Problem: The biggest hurdle was that "1,000 feet" in 2015 was "1,002.5 feet" in 2022. Wheel slippage during inspections makes raw distance unreliable. Building the dynamic calibration based on Girth Welds was essential to solve this.
The Clock Wrap-Around: Matching a defect at 11:55 o'clock to one at 12:05 o'clock required implementing modular arithmetic logic to understand that these are actually close neighbors, not opposite sides of the pipe.
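The wrap-around fix reduces to taking the shorter way around a 12-hour dial. A minimal version of that modular-arithmetic logic (function name hypothetical):

```python
def clock_delta(c1, c2):
    """Smallest difference between two clock positions on a 12-hour dial.

    11:55 (~11.917 h) and 12:05 (~0.083 h) are 10 minutes apart,
    not nearly 12 hours.
    """
    raw = abs(c1 - c2) % 12.0
    return min(raw, 12.0 - raw)  # take the shorter arc around the dial
```

Without this, a naive `abs(c1 - c2)` would treat near-top-of-pipe neighbors as being on opposite sides and reject the match.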
Data Ambiguity: Distinguishing between a "New" feature and a "Missed" match required fine-tuning our tolerance thresholds. We had to iterate on our sensitivity to balance False Positives vs. False Negatives.
Accomplishments that we're proud of
The "Golden Thread": We successfully threaded the data from 2015 to 2022, proving that we could accurately track the same physical corrosion feature across seven years of operational history.
Actionable Insights: Instead of just returning a merged list, our system outputs decisions: "Inspect This," "Monitor This," or "Ignore This."
What we learned
We learned that context is king: an anomaly's depth means nothing without its history. We also learned that in pipeline data, topology matters more than absolute geometry; positions relative to girth welds are far more reliable than absolute GPS or odometer readings.
What's next for Traceline
Root Cause Analysis: correlating rapid-growth zones with external factors such as soil type, terrain elevation, and geographic location.
Built With
- fastapi
- python
- react-native
- tailwind
