I work on observability engineering for production-scale distributed systems, helping teams understand system behavior through metrics, logs, and traces. My focus is on improving reliability, visibility, and cost-efficiency of observability platforms.
Alongside my professional work, I'm deeply interested in AI forensics and digital investigation: exploring how intelligent systems fail, how telemetry can act as evidence, and how forensic thinking can improve trust and accountability in modern software.
- Scaled Observability SDK adoption across 100+ applications, enabling structured logging, metrics, and distributed tracing
- Deployed Jaeger in production, driving **observability cost reduction**
- Migrated observability platforms (Honeycomb → Jaeger, Datadog → Prometheus + Grafana)
- Built observability scorecards to track adoption and maturity across services
- Developed parity audit tools to validate migrations and enable data-driven cost optimization
- Designed dashboards and alerts for BOSH storage workloads
- Supported production incidents, tool upgrades, and observability governance
- Built a diffusion-based text-to-3D generation pipeline using PyTorch and NLP
- Improved model fidelity by 46% compared to baseline approaches
- Reduced training time by 43% through NeRF optimizations
- Enhanced semantic accuracy using LLMs (LLaMA 2) for complex prompt understanding
- Metrics, Logs, Traces: Prometheus, Grafana, Thanos, OpenTelemetry, Jaeger
- Production Observability: Distributed tracing, alerting, dashboards, telemetry cost optimization
- Platform Migrations: Honeycomb → Jaeger, Datadog → Prometheus + Grafana
- Governance: Observability scorecards, standards, parity audits
- Container Orchestration: Kubernetes
- Cloud & Infra: AWS, BOSH
- CI/CD & GitOps: ArgoCD, Helm, CI pipelines, SonarQube
- Automation: Python, Bash
- Backend: Java (Spring Boot), Python
- Event-Driven Systems: Kafka (AWS MSK), consumer lag analysis, offset management
- Quality Engineering: Unit, integration, and component testing
- Design Focus: Reliability, scalability, failure-driven system design
- Digital Forensics Tools: Autopsy, FTK Imager, EnCase, X-Ways, Zimmerman tools
- Foundations: Evidence handling, threat analysis, forensic workflows
- Exploration Areas: AI system failures, telemetry as forensic evidence, accountability in AI systems
- Languages: Python, Java, C++, Go
- Systems: Linux (CSI Linux, Kali), Windows
- Version Control: Git, GitHub
I'm also a screenwriter, interested in storytelling around intent, conflict, and truth, a perspective that naturally complements investigative and forensic thinking.
Understanding systems by studying how they fail.

