ASQUARE
About me

I'm anshuman, a CS undergrad at UPES, India. I work on GPU programming, distributed systems, and ML infrastructure — the layer between the model and the metal.

Right now I'm contributing to HPX (an HPC framework for parallel and distributed applications) through GSoC 2026, building hierarchical collective operations and benchmarking them on a DGX H100 cluster.

I'm also working through every kernel in the ML systems stack from scratch — CUDA tiled GEMM, Triton fused softmax, INT4 quantization, FlashAttention — and documenting it all in a public wiki.

I'm applying for an MS in CS (Fall 2027) with a focus on systems for ML.

Interests
  • GPU Kernel Engineering — CUDA C++, Triton, fused operators, Tensor Cores, mixed precision.
  • Distributed Systems — MPI, NCCL, collective communication, HPX, FSDP.
  • ML Infrastructure — quantization (INT4/INT8, GPTQ, QLoRA), inference serving, vLLM.
  • Open Source — STEllAR-GROUP/HPX contributor, GSoC 2026.
  • Coffee at night —> always.

Find me
  • GitHub: where the code lives
  • Twitter: occasional thoughts
  • Pinterest: visual references
  • Email: the reliable way

This website

Built from scratch with Python (FastAPI), vanilla HTML/CSS/JS, SQLite, and WebSockets. Procedural rain and ambient audio via Web Audio API. Real-time cursor presence via WebSocket.

Source on GitHub.

Right Now
building
this café
coffee today
☕ ☕ ☕
✧ Current status
anshuman • just now
somewhere between awake and elsewhere.
(c) 2026 anshuman. built by hand. no templates. just html and coffee.