ZK Attestations vs Traditional KYC: A Practical Guide
Why most KYC flows only need a boolean answer, and how ZK-SNARKs (such as Groth16) and STARKs can replace raw data sharing, with implementation tradeoffs.
ZK Proofs
Currently at Saptang Labs, where I build ML inference pipelines, embedding search, and transformer-based NLP services. Before this, I ran a fintech company (DIPP-registered) building ZK infrastructure for proof generation — replacing data-heavy workflows with boolean attestations. I also spend time on independent work: GLOF prediction from satellite imagery, quantum circuits for intrusion detection, real-time fact-checking on live broadcasts. Most of these were hackathon entries that ended up winning.
June 2024 - Present
Built the ML layer for a large-scale multimodal analytics platform — face recognition (InsightFace, DBSCAN clustering, Qdrant), object detection (Florence-2), multilingual OCR across 14 languages, speaker diarization, and video classification. Architected as 5 microservices across 7 databases. Also wrote 11 transformer-based NLP packages (entity extraction, sentiment, topic classification) served via Gemma 3 12B on vLLM, processing ~10K posts/day.
August 2023 - June 2024 (On Hold)
DIPP-registered fintech company (DIPP-143482). Built ZK infrastructure for proof generation — the premise being that most financial workflows consuming raw personal data only need a boolean answer. Designed Circom circuits and Groth16-based proving systems to replace data-heavy KYC and compliance checks with verifiable boolean attestations.
November 2022 - April 2026
Studied biotechnology — genomics, proteomics, metabolic pathways. Taught myself systems programming, ML, and cryptography alongside. Most applied experience comes from production work at Saptang Labs and the independent projects listed below.
2023
CS50 AI from Harvard (Cambridge, MA). Economics of Banking & Finance Markets and Operations & Supply Chain Management, both from IIT Kanpur.
Here is some of my recent work — production systems, hackathon builds, and independent research.

End-to-end pipeline for glacial lake outburst flood (GLOF) prediction. Sentinel-1 SAR imagery is processed to track surface deformation; temporal sequences are modeled through an LSTM to capture ice-melt dynamics. XGBoost classifies risk levels on fused feature vectors. An ESP32 sensor array (water level, temperature, flow rate) provides real-time ground truth calibration. A downstream module simulates water flow to compute evacuation corridors and time-to-impact estimates. Won the C-DAC HimaShield Grand Challenge 2025 (Rs 5,00,000). Covered in a Press Information Bureau release.

Hybrid quantum-classical intrusion detection system. A 4-qubit variational quantum circuit (PennyLane, angle embedding, strongly entangling layers) is trained on CICIDS NetFlow features and benchmarked against XGBoost and LightGBM. The objective was to evaluate whether parameterized quantum circuits offer any advantage on non-linear decision boundaries in high-dimensional traffic data. VirusTotal API handles IOC enrichment and threat correlation. Stack: Next.js frontend, FastAPI backend. 3rd Prize, Indian Army Terrier Cyber Quest 2025.
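
To make the circuit structure concrete, here is a minimal sketch of a 4-qubit variational circuit written as a hand-rolled statevector simulation in plain Python. It is not the project's PennyLane code: features are embedded as RX rotation angles, and each trainable layer is simplified to RY rotations followed by a ring of CNOTs, which is a reduced stand-in for strongly entangling layers. The function names and the layer layout are assumptions made for illustration.

```python
import math

N_QUBITS = 4  # matches the 4-qubit circuit described above


def rx(theta):
    """2x2 matrix for a rotation about the X axis."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def ry(theta):
    """2x2 matrix for a rotation about the Y axis."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def apply_single(state, gate, wire):
    """Apply a one-qubit gate to `wire` (wire 0 is the most significant bit)."""
    bit = 1 << (N_QUBITS - 1 - wire)
    new = list(state)
    for i in range(len(state)):
        if (i & bit) == 0:
            a, b = state[i], state[i | bit]
            new[i] = gate[0][0] * a + gate[0][1] * b
            new[i | bit] = gate[1][0] * a + gate[1][1] * b
    return new


def apply_cnot(state, control, target):
    """Flip `target` on every basis state where `control` is 1."""
    cbit = 1 << (N_QUBITS - 1 - control)
    tbit = 1 << (N_QUBITS - 1 - target)
    new = list(state)
    for i in range(len(state)):
        if (i & cbit) and not (i & tbit):
            new[i], new[i | tbit] = state[i | tbit], state[i]
    return new


def circuit(features, weights):
    """Return the expectation value of Z on wire 0 for one input sample.

    features: 4 floats, used as RX angles (angle embedding).
    weights:  list of layers, each a list of 4 RY angles (trainable).
    """
    state = [0j] * (2 ** N_QUBITS)
    state[0] = 1 + 0j  # start in |0000>
    for wire, x in enumerate(features):
        state = apply_single(state, rx(x), wire)
    for layer in weights:
        for wire, theta in enumerate(layer):
            state = apply_single(state, ry(theta), wire)
        for wire in range(N_QUBITS):
            state = apply_cnot(state, wire, (wire + 1) % N_QUBITS)
    top_bit = 1 << (N_QUBITS - 1)
    return sum((-1 if i & top_bit else 1) * abs(amp) ** 2
               for i, amp in enumerate(state))
```

The output lies in [-1, 1] and can be thresholded or passed through a loss function for binary classification; training would adjust `weights` with a gradient-based optimizer, which this sketch leaves out.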

Streaming pipeline for misinformation detection in live broadcasts. FFmpeg ingests TV/radio feeds, Whisper handles ASR with timestamp alignment, DistilBERT extracts verifiable claims from transcript segments, and the Google Fact Check API cross-references them against indexed databases. Claims are scored with confidence metrics and surfaced on a Streamlit dashboard with source attribution. Designed for newsroom and regulatory deployment. Winner, NFSU Forensic Hackathon 2025 (Rs 1,00,000).

Privacy-preserving identity verification over Aadhaar credentials using ZK-SNARKs. Circom arithmetic circuits with Groth16 proving system let users attest to predicates (age ≥ 18, valid document, state residency) without revealing the underlying data. Proof generation in-browser via WASM in ~500ms, server-side verification in ~10–50ms, proof size < 1KB. The goal is to replace KYC flows that needlessly consume private data with a single boolean attestation. Designed around DPDP Act 2023 requirements.
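
A rough sketch of the data flow, in plain Python rather than Circom: the witness stays with the user, and the verifier only receives public signals that include a boolean. This is not a zero-knowledge proof. In the actual design a Circom circuit would enforce the same relation and Groth16 would prove it without exposing the witness. All names here (`PrivateWitness`, `PublicSignals`, `evaluate_predicate`) are hypothetical, and SHA-256 is a stand-in for whatever circuit-friendly hash the circuits use.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PrivateWitness:
    """Data that never leaves the user's device (hypothetical fields)."""
    birth_date: date
    salt: bytes


@dataclass(frozen=True)
class PublicSignals:
    """Everything the verifier sees: the outcome, but no birth date."""
    commitment: str      # binds the witness to a previously issued credential
    as_of: date          # date on which the predicate is evaluated
    min_age: int
    is_satisfied: bool


def age_in_years(birth_date: date, as_of: date) -> int:
    """Whole years elapsed between birth_date and as_of."""
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    return as_of.year - birth_date.year - (0 if had_birthday else 1)


def commit(witness: PrivateWitness) -> str:
    """Salted hash of the private data (SHA-256 used only for illustration)."""
    payload = witness.salt + witness.birth_date.isoformat().encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def evaluate_predicate(witness: PrivateWitness, as_of: date,
                       min_age: int = 18) -> PublicSignals:
    """Compute the relation a circuit would constrain: age >= min_age."""
    return PublicSignals(
        commitment=commit(witness),
        as_of=as_of,
        min_age=min_age,
        is_satisfied=age_in_years(witness.birth_date, as_of) >= min_age,
    )
```

The point of the split is that the relying party's code only ever handles `PublicSignals`; anything it would need from `PrivateWitness` has to be expressed as another predicate.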

Consent management layer built on the same principle as Xylem: if a workflow only needs a yes/no answer, it should not require raw personal data. Implements field-level access control, versioned consent records with cryptographic audit trails, and automated DPDP Act compliance reporting. API-first architecture — consent policies are defined as code with pluggable storage backends.
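
One common way to get versioned records with a tamper-evident audit trail is a hash chain, where each consent version stores the hash of the previous one. The sketch below shows that idea for a single data subject with an in-memory list. It is an illustration under assumptions, not the project's storage format: `ConsentLedger`, its method names, and the record fields are hypothetical, and timestamps and signatures are left out to keep it short.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def record_hash(payload: dict) -> str:
    """Hash a record's canonical JSON form (sorted keys, no extra spaces)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class ConsentLedger:
    """Append-only consent history for one data subject."""
    subject_id: str
    records: list = field(default_factory=list)

    def append(self, allowed_fields, purpose: str) -> dict:
        """Add a new consent version that links back to the previous one."""
        prev_hash = self.records[-1]["hash"] if self.records else GENESIS_HASH
        payload = {
            "version": len(self.records) + 1,
            "subject_id": self.subject_id,
            "allowed_fields": sorted(allowed_fields),
            "purpose": purpose,
            "prev_hash": prev_hash,
        }
        record = dict(payload, hash=record_hash(payload))
        self.records.append(record)
        return record

    def is_allowed(self, field_name: str) -> bool:
        """Field-level check against the latest consent version only."""
        return bool(self.records) and field_name in self.records[-1]["allowed_fields"]

    def verify(self) -> bool:
        """Recompute every hash and link; False if any record was altered."""
        prev_hash = GENESIS_HASH
        for record in self.records:
            payload = {k: v for k, v in record.items() if k != "hash"}
            if payload["prev_hash"] != prev_hash:
                return False
            if record_hash(payload) != record["hash"]:
                return False
            prev_hash = record["hash"]
        return True
```

Because each record commits to its predecessor, editing an old version invalidates every later hash, which is what makes the history auditable.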

Binary classifier for the GST Hackathon — trained on ~700K taxpayer records with 21 attributes to detect fraudulent filings. Pipeline: missing-value imputation, feature scaling, RandomizedSearchCV over 100 iterations for hyperparameter tuning. Evaluated on accuracy, precision, recall, F1, AUC-ROC, log loss, calibration curves, and cumulative gain/lift charts. SHAP feature importance analysis identifies which attributes are most predictive of fraud.
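
Two of the evaluation quantities above are easy to misread, so here is a small plain-Python sketch of how precision/recall/F1 and cumulative gain/lift can be computed from labels and predicted scores. It is an illustration, not the project's evaluation code; the function names and the 0.5 threshold are assumptions.

```python
def precision_recall_f1(y_true, y_score, threshold=0.5):
    """Precision, recall and F1 for the positive (fraud) class."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def gain_and_lift(y_true, y_score, fraction):
    """Cumulative gain and lift when reviewing the top `fraction` by score.

    gain: share of all positives found in the reviewed slice.
    lift: gain divided by the share of records reviewed (1.0 = random order).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, round(fraction * len(ranked)))
    captured = sum(label for _, label in ranked[:cutoff])
    total_positives = sum(y_true)
    gain = captured / total_positives if total_positives else 0.0
    lift = gain / (cutoff / len(ranked))
    return gain, lift
```

For fraud detection, lift at a small fraction is often the more useful number: it says how much better than random an auditor does by reviewing only the highest-scored filings.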

Daily open source contribution discovery and workflow tool targeting ZKP and Rust ML repositories. Scans 10+ repos across arkworks, SP1, EZKL, Candle, and Burn — fetches open issues via GitHub CLI, scores them by contributor-friendliness (labels, keyword matches, comment count, assignee status, recency), and generates ranked markdown digests. Includes a shell-based workflow helper that automates fork, clone, branch creation, and draft PR submission. Designed to reduce friction for sustained open source contributions.
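
The scoring step could look roughly like the sketch below, which combines the five signals listed above into one number and renders a ranked markdown digest. The weights, label set, keyword list, and all names (`Issue`, `score_issue`, `render_digest`) are assumptions chosen for illustration, not the tool's actual values, and fetching issues is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Assumed label and keyword lists; the real tool's lists may differ.
FRIENDLY_LABELS = {"good first issue", "help wanted"}
KEYWORDS = ("docs", "test", "example")


@dataclass
class Issue:
    repo: str
    number: int
    title: str
    labels: list
    comments: int
    assignees: list
    updated_at: datetime


def score_issue(issue: Issue, now: datetime) -> float:
    """Higher means friendlier to a new contributor (illustrative weights)."""
    labels = {label.lower() for label in issue.labels}
    title = issue.title.lower()
    score = 3.0 * len(FRIENDLY_LABELS & labels)        # labels
    score += 1.0 * sum(kw in title for kw in KEYWORDS)  # keyword matches
    score += 1.0 if issue.comments <= 3 else 0.0        # quiet threads
    score += 2.0 if not issue.assignees else -2.0       # unassigned is better
    age_days = (now - issue.updated_at).days
    score += max(0.0, 2.0 - age_days / 30)              # recency bonus decays
    return score


def render_digest(issues, now: datetime, top_n: int = 5) -> str:
    """Return a markdown list of the top_n issues, best first."""
    ranked = sorted(issues, key=lambda i: score_issue(i, now), reverse=True)
    lines = ["# Daily digest", ""]
    for rank, issue in enumerate(ranked[:top_n], start=1):
        lines.append(
            f"{rank}. {issue.repo}#{issue.number} {issue.title} "
            f"(score {score_issue(issue, now):.1f})"
        )
    return "\n".join(lines)
```

Keeping the score a plain sum of named terms makes it easy to tune weights per repository without touching the ranking or rendering code.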
Notes on things I've built, broken, and figured out.
If you want to talk about AI, ML, or cryptography — reach out.