Decentralized wildlife intelligence, verified by experts.
🏆 Second place submission to Hack SMU 2026
Tusk 'n Tidy is the world's first smart audio library that cleans up noisy jungle recordings so experts can verify and translate what animals are actually saying. It is a Web3-powered platform that lets anyone upload, analyze, and verify elephant field data, combining research-backed audio processing with on-chain trust.
Language research and conservation efforts rely on accurate elephant audio data, but today that data is often fragmented, unverifiable, or inaccessible. Biologists and citizen scientists collect valuable recordings, yet trust and validation remain bottlenecks.
Tusk 'n Tidy ensures that every contribution is traceable, verified, and rewarded, empowering:
- 🌿 Citizen scientists to contribute meaningful data
- 🧪 Biologists to validate findings with confidence
- 📊 Researchers to make decisions informed by elephant emotional data
Wildlife research is powerful but messy:
“How do we know this recording is real?” “Where did this data come from?” “What's happening in this audio?”
We wanted to build a system where every piece of data has proof, provenance, and expert validation.
To do this, we built the world's largest and most navigable dataset of clean, labeled elephant field recordings: 5,510 audio samples covering 29 contexts and 91 actions.
- 🧹 Multi-Stage Audio Cleaning: Combines classical DSP (spectral subtraction, Wiener filtering, NMF) with deep-learning source separation to produce clean, research-grade audio signals. See our dataflow below:
```
RAW NOISY WAV FILE
        │
        ▼
[1. PAPER IMPLEMENTATION]
https://pubs.aip.org/asa/jasa/article/141/4/2715/1059147/Automated-detection-of-low-frequency-rumbles-of
STFT with nfft=1024, hop=200, Hann window
        │ (Complex spectrogram: magnitude + phase)
        ▼
[2. PAPER IMPLEMENTATION]
https://arxiv.org/abs/2410.12082
Log-frequency axis transformation
        │ (Makes harmonic structure linear)
        ▼
[3. PAPER IMPLEMENTATION]
https://pmc.ncbi.nlm.nih.gov/articles/PMC8648737/
SPECTRAL SUBTRACTION (α=1.5, β=0.02)
        │ (Removes stationary noise: generator hum)
        ▼
[4] WIENER FILTERING
        │ (Smooths noise removal, reduces musical noise)
        ▼
[5] NMF SEPARATION
        │ (Removes tonal components: car engine, generator RPM)
        ▼
[6] U-NET MASK PREDICTION (BioCPPNet architecture)
        │ (Deep learning source separation)
        │ Outputs: mask_elephant, mask_noise
        ▼
[7] APPLY MASK: Sxx_elephant = Sxx_noisy × mask_elephant
        │
        ▼
[8] AST FRAME-LEVEL VERIFICATION (arXiv 2410.12082)
        │ (Detects exact rumble boundaries, removes non-elephant frames)
        │
        ▼
[9] INVERSE STFT → Time-domain waveform
        │ (nfft=1024, hop=200, Hann window)
        │
        ▼
[10] BAND-PASS FILTER: 8–180 Hz
        │ (Removes any residual high-freq noise, DC offset)
        │
        ▼
CLEAN ELEPHANT AUDIO RECORDING
```
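The classical-DSP backbone of the pipeline above (steps 1, 3, 9, and 10) can be sketched in a few lines. This is an illustrative re-implementation, not the project's `elephant_audio_cleaner.py`: the sample rate `sr=4000` and the quiet-frame noise estimator are assumptions, and SciPy stands in for whatever DSP library the real module uses.

```python
import numpy as np
from scipy.signal import stft, istft, butter, sosfiltfilt

def spectral_subtract(noisy, sr=4000, nfft=1024, hop=200, alpha=1.5, beta=0.02):
    """Spectral subtraction (step 3): estimate the noise floor from the
    quietest frames and over-subtract it from every frame's magnitude."""
    # Step 1 parameters: Hann window, nfft=1024, hop=200
    _, _, Z = stft(noisy, fs=sr, window="hann", nperseg=nfft, noverlap=nfft - hop)
    mag, phase = np.abs(Z), np.angle(Z)
    # Noise estimate: per-bin mean magnitude over the 10% lowest-energy frames
    energy = mag.sum(axis=0)
    quiet = energy <= np.quantile(energy, 0.10)
    noise = mag[:, quiet].mean(axis=1, keepdims=True)
    # Subtract alpha*noise; floor at beta*noise to limit musical noise
    cleaned = np.maximum(mag - alpha * noise, beta * noise)
    # Step 9: inverse STFT back to a time-domain waveform
    _, y = istft(cleaned * np.exp(1j * phase), fs=sr, window="hann",
                 nperseg=nfft, noverlap=nfft - hop)
    return y

def band_pass(y, sr=4000, lo=8.0, hi=180.0):
    """Step 10: band-pass 8-180 Hz to keep only the infrasonic rumble band."""
    sos = butter(4, [lo, hi], btype="band", fs=sr, output="sos")
    return sosfiltfilt(sos, y)
```

Chaining `band_pass(spectral_subtract(x))` gives the "bookend" stages; the Wiener, NMF, U-Net, and AST steps slot in between on the magnitude spectrogram.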
- 🧠 Research-Backed Audio Processing: Filter recordings through research-derived pipelines to detect and isolate elephant vocalizations and the motivations behind them.
- 🎤 LIVE Monitoring: Capture audio in real time and instantly run cleaning, detection, and labeling, surfacing elephant calls with live spectrograms and on-the-fly annotations for immediate insight.
- 🌐 Decentralized Uploads: Users upload raw field recordings directly to IPFS, ensuring permanent, tamper-proof storage.
- 🧪 Expert Verification: Verified biologists review uploads, cleaning, and labels, and confirm findings via blockchain-backed approvals.
- 🔍 Interactive Explorer: Browse recordings, view spectrograms, and analyze elephant audio in real time.
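To make the LIVE-monitoring idea concrete, here is a toy detector that flags audio windows whose spectral energy concentrates in the 8–180 Hz rumble band. The window length, threshold, and sample rate are illustrative placeholders, not values taken from the app.

```python
import numpy as np

def rumble_candidates(samples, sr=4000, win_s=0.5, band=(8.0, 180.0),
                      ratio_thresh=0.6):
    """Flag windows whose energy is concentrated in the rumble band.

    Returns a list of (start_time_s, band_energy_ratio) for flagged windows.
    """
    win = int(win_s * sr)
    hits = []
    for start in range(0, len(samples) - win + 1, win):
        chunk = samples[start:start + win]
        # Hann-windowed power spectrum of this chunk
        spec = np.abs(np.fft.rfft(chunk * np.hanning(win))) ** 2
        freqs = np.fft.rfftfreq(win, 1.0 / sr)
        in_band = (freqs >= band[0]) & (freqs <= band[1])
        ratio = spec[in_band].sum() / max(spec.sum(), 1e-12)
        if ratio >= ratio_thresh:  # most energy is in the infrasonic band
            hits.append((start / sr, float(ratio)))
    return hits
```

In a live setting, each microphone buffer would be fed through this gate first, and only flagged windows would proceed to the heavier cleaning and labeling stages.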
- Behavioral context: Each call is framed as multi-class prediction over ethogram-style contexts (e.g. affiliative, protest & distress, social play, movement & leadership), backed by a high-accuracy acoustic classifier on a 256-D fingerprint.
- Valence & arousal: The linguistics pipeline also learns coarse valence (positive / neutral / negative) and arousal (low / medium / high) from context, used for emotion summaries.
- Interpretation cards: Fuse cluster ID, predicted context, confidence language, and valence/arousal tags so reviewers can spot-check stories call-by-call.
- Input representation: 256 dimensions of elephant-specific features grouped into rumble-band energy, ~7.7 Hz tremor, temporal phases (onset / body / offset), and timbre (MFCCs + mel statistics), roughly a voice print for infrasonic calls.
- Model family (dashboard narrative): LightGBM-style gradient boosting on tabular acoustics (91.6% accuracy on evaluation).
- KNN: Retrieves the five recordings whose durations are closest to the cleaned recording's, then overlays them on the same UMAP, visualizing how similarity by duration differs from similarity by emotion.
- Link to behavior: UMAP maps acoustic clusters onto their dominant behavioral contexts.
- Open Science: Anyone can explore and contribute to global biodiversity data.
- Live Field Recordings: Analyze elephant emotions and actions using audio in real-time.
- Wildlife researchers collecting and validating animal recordings
- Conservation organizations tracking endangered species
- Citizen scientists contributing field data globally
- Academic institutions building open-access biological datasets
- LIVE environmental monitoring using audio-based species detection
Our Web3 Stack is as follows:
- Solana Web3.js for blockchain interaction
- Wallet authentication via Phantom/Solflare
- Anchor Framework for smart contracts
What Solana Does in Our System
- 🧾 Proof of Origin: Every audio upload is tied to a wallet signature and stored on-chain as a CID reference, creating a permanent, tamper-proof record of who submitted what.
- 🧪 Expert Verification as a Transaction: When a biologist approves a recording, it's not just a UI action, it's a verification event.
- 📈 Reputation System: Users build credibility through verified contributions, stored transparently and resistant to manipulation.
- 🪙 Incentive Alignment: Smart contracts reward both contributors and validators, ensuring high-quality data and honest reviews.
- 🔄 Real-Time Sync: A WebSocket indexer listens to on-chain events and updates the app instantly, bridging blockchain and a fast user experience.
Heavy data stored on IPFS + Trust/verification stored on Solana = efficient + scalable + verifiable
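Conceptually, that split reduces to: hash the heavy audio into a content identifier, keep the bytes on IPFS, and store only the reference plus contributor identity on-chain. The sketch below uses a bare SHA-256 hex digest as a simplified stand-in for an IPFS CID (real CIDs are multihash/CIDv1-encoded) and a plain dict in place of the actual Solana account layout.

```python
import hashlib
import time

def content_id(audio_bytes: bytes) -> str:
    """Simplified stand-in for an IPFS CID: a SHA-256 digest of the content.
    (Real CIDs wrap such a digest in multihash/CIDv1 encoding.)"""
    return hashlib.sha256(audio_bytes).hexdigest()

def make_upload_record(audio_bytes: bytes, wallet: str) -> dict:
    """What the on-chain record conceptually holds: who uploaded which content.
    The wallet signature itself is omitted from this sketch."""
    return {
        "cid": content_id(audio_bytes),  # reference to the IPFS payload
        "uploader": wallet,              # contributor's wallet address
        "timestamp": int(time.time()),
        "verified_by": [],               # appended by biologist approvals
    }
```

The key property: two identical recordings always map to the same CID, so anyone can re-hash the IPFS payload and confirm it matches the on-chain reference.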
Frontend
- HTML, Tailwind CSS, JavaScript
- Plotly.js for in-browser waveform visualization
Backend & APIs
- Python + Flask
- Audio / DSP: librosa, scipy, soundfile, matplotlib
AI & Data Processing
- Audio pipeline: STFT + spectral denoising, U-Net source separation
- Python: librosa, scipy, scikit-learn
- Gemini API: Lightweight quiz and answer-card synthesis, grounded to the retrieved transcript span.
- Google Cloud Text-to-Speech: Reads back care instructions in a clear, natural voice.
Media & Data
- Largest custom dataset of its kind: 5,510+ segmented elephant field recordings covering 29 contexts and 91 actions.
- Handling large audio uploads with efficient browser chunking + IPFS storage
- Designing a robust non-AI processing pipeline and bridging it with AI for noisy real-world field recordings
- Creating a UX that balances scientific depth with accessibility
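The browser-side chunking lives in JavaScript under `frontend/js/`; as a language-neutral sketch in Python, splitting an upload into fixed-size, individually verifiable chunks might look like this (the 1 MiB default chunk size is an assumption, not the app's actual value).

```python
import hashlib

def chunk_with_digests(data: bytes, chunk_size: int = 1024 * 1024):
    """Split a payload into fixed-size chunks, each paired with its SHA-256
    digest so the receiving side can verify every piece before pinning it.
    The final chunk may be shorter than chunk_size."""
    return [
        (data[i:i + chunk_size],
         hashlib.sha256(data[i:i + chunk_size]).hexdigest())
        for i in range(0, len(data), chunk_size)
    ]
```

Reassembly is simple concatenation in order, and a corrupted or dropped chunk is caught by its digest without re-uploading the whole recording.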
- End-to-End Pipeline: Upload → IPFS → Research-backed processing → Expert verification → On-chain record
- AI-Powered Bioacoustics: Accurate emotion analysis of wildlife sounds
- Decentralized Trust Layer: Every contribution backed by verifiable blockchain transactions
- Interactive Explorer: Real-time browsing of verified biological data
- Incentive System: Contributors and experts rewarded fairly via smart contracts
- Real-world data (like wildlife audio) is messy, and signal processing matters
- UX is critical, even in technical platforms, to drive adoption for biologists
- Build a mobile field-recording app for easier data collection
- Add DAO governance for community-driven validation standards
- Integrate with global biodiversity databases (e.g., GBIF)
- Introduce real-time alerts for species safety concerns
Tusk 'n Tidy transforms biological data into a trusted knowledge network. We're cleaning the noise, proving the truth, and protecting the giants.
Let's change conservation together, one recording at a time! 🎙️
```
hacksmu26/
├── frontend/                     # Web dashboard
│   ├── index.html                # Landing page
│   ├── analysis.html             # Analysis dashboard
│   ├── cleanup.html              # Audio cleanup terminal
│   ├── css/styles.css            # Dark institutional theme
│   └── js/                       # Frontend scripts
├── backend/
│   ├── elephant_linguistics/     # Call analysis pipeline
│   └── elephant_ethogram/        # Ethogram data processing
├── app.py                        # Flask API for audio cleanup
├── elephant_audio_cleaner.py     # Audio cleaning module
└── requirements.txt              # Python dependencies
```
```bash
cd hacksmu26

# Install root dependencies (for audio cleanup)
pip install -r requirements.txt

# Install linguistics pipeline dependencies
pip install -r backend/elephant_linguistics/requirements.txt
```

```bash
cd backend/elephant_linguistics

# Generate sample data (optional, for testing)
python generate_sample_data.py

# Run the full analysis pipeline
python run_from_csv.py --csv sample_data/features.csv
```

This will:
- Analyze elephant calls and cluster them into call types
- Train context classifiers
- Generate visualizations and export data to the frontend
```bash
cd frontend

# Start a local HTTP server
python -m http.server 8080
```

Then open http://localhost:8080 in your browser.
```bash
cd hacksmu26

# Start the Flask API server
python app.py
```

The audio cleanup API will run at http://127.0.0.1:5000.
Then navigate to http://localhost:8080/cleanup.html to use the audio cleanup feature.
See /backend/elephant_training for more details on the audio processing pipeline and how it can be run independently.