I'm Anshuman, a CS undergrad at UPES, India. I work on GPU programming, distributed systems, and ML infrastructure: the layer between the model and the metal.
Right now I'm contributing to HPX (an HPC framework for parallel and distributed applications) through GSoC 2026, building hierarchical collective operations and benchmarking them on a DGX H100 cluster.
I'm also working through every kernel in the ML systems stack from scratch — CUDA tiled GEMM, Triton fused softmax, INT4 quantization, FlashAttention — and documenting it all in a public wiki.
I'm applying for an MS in CS (Fall 2027) with a focus on systems for ML.
Built from scratch with Python (FastAPI), vanilla HTML/CSS/JS, SQLite, and WebSockets. Procedural rain and ambient audio via Web Audio API. Real-time cursor presence via WebSocket.