Skip to content

QIanGua/DASMatrix

Repository files navigation

DASMatrix

DASMatrix Logo

Distributed Acoustic Sensing Data Processing and Analysis Framework

Python Version License CI Status Documentation δΈ­ζ–‡ζ–‡ζ‘£


πŸ“– Project Overview

DASMatrix is a high-performance Python framework specifically designed for Distributed Acoustic Sensing (DAS) data processing and analysis. This framework provides a comprehensive toolkit for reading, processing, analyzing, and visualizing DAS data, suitable for research and applications in geophysics, structural health monitoring, and security surveillance.

✨ Core Features

  • πŸš€ High-Efficiency Data Reading: Support for 12+ data formats (DAT, HDF5, PRODML, Silixa, Febus, Terra15, APSensing, ZARR, NetCDF, SEG-Y, MiniSEED, TDMS) with Lazy Loading
  • ⚑ HPC Engine: Built on Xarray and Dask for TB-level Out-of-Core processing with Numba JIT-optimized kernels and operator fusion
  • πŸ”— Fluent Chainable API: Intuitive signal processing workflows through DASFrame
  • 🧠 AI Inference Integration: Native support for PyTorch and ONNX models with high-performance inference pipelines
  • πŸ“Š Professional Signal Processing: Comprehensive tools including spectral analysis, filtering, integration, and event detection
  • πŸ€– Intelligent Agent Tools: AI Agent toolkit supporting natural-language driven deep analysis and automated discovery
  • πŸ“ˆ Scientific-Grade Visualization: Multiple plot types including time-domain waveforms, spectra, spectrograms, waterfalls
  • πŸ“ Unit System: First-class physical unit support via Pint integration
  • 🎲 Built-in Examples: Easy generation of synthetic data (sine waves, chirps, events) for testing

πŸš€ Quick Start

Installation

Option 1: Using uv (Recommended)

# Clone the repository
git clone https://github.com/QIanGua/DASMatrix.git
cd DASMatrix

# Install with uv (automatically creates virtual environment)
uv sync

Option 2: Using pip

# Clone the repository
git clone https://github.com/QIanGua/DASMatrix.git
cd DASMatrix

# Install with pip
pip install -e .

Basic Usage

1. Modern API with DASFrame (Recommended)

from DASMatrix import df

# Create DASFrame with lazy loading
frame = df.read("data.h5")

# Build processing pipeline
processed = (
    frame
    .detrend(axis="time")   # Remove trend
    .bandpass(1, 500)       # Bandpass filter
    .normalize()            # Normalize
)

# Advanced Analysis (Modern STFT API)
stft_frame = frame.stft(nperseg=1024, noverlap=512)

# Execute computation (Parallelized via Dask)
result = processed.collect()

# Optional: use HybridEngine for supported ops
# Supported ops:
# - slice
# - detrend(time)
# - demean(time)
# - abs
# - scale
# - normalize
# - bandpass
# - lowpass
# - highpass
# - notch
# - fft
# - hilbert
# - fk_filter
# - median_filter
# - stft
result_hybrid = processed.collect(engine="hybrid")

# Scientific Visualization (with Auto-Decimation Protection)
processed.plot_heatmap(title="HPC Waterfall", max_samples=2000)

API Migration Notes

The project now standardizes on snake_case APIs. Legacy CamelCase methods remain available for one compatibility cycle and emit DeprecationWarning.

Legacy API Preferred API
reader.ReadRawData(path) reader.read_raw_data(path)
processor.FKFilter(...) processor.fk_filter(...)
processor.ComputeSpectrum(...) processor.compute_spectrum(...)
processor.FindPeakFrequencies(...) processor.find_peak_frequencies(...)
DASMatrix.api.stream_func(...) DASMatrix.stream(...)

2. Legacy API

from DASMatrix.acquisition import DASReader, DataType
from DASMatrix.config import SamplingConfig

# Configure sampling parameters
config = SamplingConfig(
    fs=10000,      # Sampling frequency 10kHz
    channels=512,  # 512 channels
    wn=5.0,        # 5Hz high-pass filter
    byte_order="big"
)

# Read data
reader = DASReader(config, DataType.DAT)
raw_data = reader.read_raw_data("path/to/data.dat")

3. Visualization Example

from DASMatrix.visualization import DASVisualizer
import matplotlib.pyplot as plt

# Create visualizer
visualizer = DASVisualizer(
    output_path="./output",
    sampling_frequency=config.fs
)

# Time-domain waveform
visualizer.WaveformPlot(
    data[:, 100],          # Channel 100 data
    time_range=(0, 10),    # Show 0-10 seconds
    amplitude_range=(-0.5, 0.5),
    title="Waveform Plot",
    file_name="waveform_ch100"
)

# Spectrum plot
visualizer.SpectrumPlot(
    data[:, 100],
    title="Spectrum Plot",
    db_range=(-80, 0),
    file_name="spectrum_ch100"
)

# Spectrogram
visualizer.SpectrogramPlot(
    data[:, 100],
    freq_range=(0, 500),
    time_range=(0, 10),
    cmap="inferno",
    file_name="spectrogram_ch100"
)

# Waterfall plot (time-channel)
visualizer.WaterfallPlot(
    data,
    title="Waterfall Plot",
    colorbar_label="Amplitude",
    value_range=(-0.5, 0.5),
    file_name="waterfall"
)

plt.show()

4. AI Inference & Model Prediction

from DASMatrix.ml.model import ONNXModel

# Load optimized model
model = ONNXModel("path/to/model.onnx")

# Predict directly in processing chain
predictions = (
    df.read("data.h5")
    .bandpass(10, 100)
    .normalize()
    .predict(model) # Returns inference results
)

# Use Intelligent Agent Tools
from DASMatrix.agent import DASAgentTools
agent_tools = DASAgentTools()
# Inference orchestrated by an LLM-based Agent via natural language
result = agent_tools.run_inference(data_id="...", model_path="...")

πŸ“š Documentation

πŸ—οΈ Project Structure

DASMatrix/
β”œβ”€β”€ acquisition/           # Data acquisition module
β”‚   β”œβ”€β”€ formats/          # Format plugins
β”‚   └── das_reader.py     # DAS data reader class
β”œβ”€β”€ api/                   # Core API
β”‚   β”œβ”€β”€ dasframe.py       # DASFrame (Xarray/Dask Backend)
β”‚   └── df.py            # Functional API entry points
β”œβ”€β”€ ml/                    # [NEW] AI/Machine Learning Module
β”‚   β”œβ”€β”€ model.py          # Model Wrappers (Torch/ONNX)
β”‚   β”œβ”€β”€ pipeline.py       # Inference Pipelines
β”‚   └── exporter.py       # Model Export Utilities
β”œβ”€β”€ agent/                 # [NEW] Agent Engineering Framework
β”‚   β”œβ”€β”€ tools.py          # Intelligent Analysis Toolkit
β”‚   └── session.py        # Task Session Management
β”œβ”€β”€ config/                # Configuration module
β”‚   β”œβ”€β”€ sampling_config.py # Sampling configuration
β”‚   └── visualization_config.py  # Visualization configuration
β”œβ”€β”€ processing/            # Data processing module
β”‚   β”œβ”€β”€ das_processor.py  # DAS data processor class
β”‚   β”œβ”€β”€ numba_filters.py  # Numba-optimized filters
β”‚   └── engine.py         # Computation graph engine
β”œβ”€β”€ visualization/         # Visualization module
β”‚   └── das_visualizer.py # DAS visualization class
β”œβ”€β”€ units.py               # Unit system (Pint-based)
β”œβ”€β”€ examples.py            # Example data generation
└── utils/                 # Utility functions
    └── time.py           # Time conversion tools

πŸš€ High Performance Computing (HPC)

DASMatrix is engineered for massive DAS datasets:

  • Zero-Copy Loading: Utilizing np.memmap for binary formats to index TBs of data in milliseconds.
  • Kernel Fusion: Multiple operations (e.g., demean -> filter -> abs) are fused into a single machine-code loop via Numba, minimizing memory traffic.
  • Lazy Computation Graph: Every operation returns a lazy DASFrame. Real computation only happens when you explicitly collect() or plot().
  • Auto-Decimation: Interactive visualization of huge datasets is protected by automatic downsampling to keep your UI responsive.

πŸ”§ Development

Development Setup

# Install development dependencies
uv sync --dev

# Run tests
just test

# If tests hang at "collecting ..." due to Matplotlib font cache,
# use a writable cache directory:
MPLCONFIGDIR=/tmp/mplcache MPLBACKEND=Agg just test

# If uv crashes on macOS due to SystemConfiguration proxy access,
# ensure system proxy is configured (example uses local proxy):
sudo networksetup -setwebproxy "Wi-Fi" 127.0.0.1 7890
sudo networksetup -setsecurewebproxy "Wi-Fi" 127.0.0.1 7890
sudo networksetup -setsocksfirewallproxy "Wi-Fi" 127.0.0.1 7890

# Restore proxy settings:
sudo networksetup -setwebproxystate "Wi-Fi" off
sudo networksetup -setsecurewebproxystate "Wi-Fi" off
sudo networksetup -setsocksfirewallproxystate "Wi-Fi" off

# Run performance benchmarks
just benchmark

# Code quality checks
just check-all

# Quick fixes
just fix-all

Code Quality Tools

  • Ruff: Linting and formatting
  • MyPy: Type checking
  • Pre-commit hooks: Automatic code quality checks
  • GitHub Actions: CI/CD pipeline

🀝 Contributing

We welcome contributions, issues, and suggestions! Please participate in project development through GitHub Issues and Pull Requests.

Contributing Guidelines

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“„ License

This project is licensed under the MIT License.

🌟 Star History

Star History Chart


πŸ‡¨πŸ‡³ δΈ­ζ–‡ζ–‡ζ‘£ | πŸ‡ΊπŸ‡Έ English

About

Distributed Acoustic Sensing Data Processing and Analysis Framework

Resources

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors