Electrical Engineering student at UC Irvine with a focus on computer architecture, RTL design, and hardware acceleration. I work at the intersection of chip design and machine learning — turning research ideas into silicon-level solutions. My work has been published at IEEE/ACM DATE 2026.
Currently getting into ASIC back-end flow — synthesis, P&R, timing closure. Still a lot to learn, but that's the direction.
The problem: ternary LLMs are memory-efficient (8× smaller than FP16), but state-of-the-art CPU kernels run them through lookup tables (LUTs) of precomputed partial results that stand in for ternary multiplications, and fetching those LUTs becomes the bottleneck: they occupy less than 0.01% of RAM yet account for 87.6% of all memory transactions and 91.6% of execution time. T-SAR fixes this by repurposing SIMD vector registers to generate the LUTs on the fly instead of loading them from memory, turning a bandwidth-bound problem into a compute-bound one, with only minimal ISA extensions and no new ALUs.
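To make the idea concrete, here is a toy SystemVerilog sketch of the principle rather than the T-SAR datapath itself: in LUT-based ternary kernels, an entry is essentially a signed combination of a small group of activations, so it can be produced by a few adders next to the vector registers instead of being precomputed and streamed in from DRAM. The module name, the group size of 4, and the 2-bit weight encoding below are illustrative assumptions, not details from the paper.

```systemverilog
// Toy illustration only (not the published T-SAR design): one "LUT entry"
// for a group of G ternary weights is a signed sum of activations, so it can
// be generated on demand instead of stored in and fetched from memory.
// Assumed encoding: 2 bits per weight, 00 = 0, 01 = +1, 10 = -1.
module ternary_group_psum #(
  parameter int G = 4,      // weights per group (assumed)
  parameter int W = 8       // activation width in bits (assumed)
)(
  input  logic signed [W-1:0]           act [G],  // activation group
  input  logic        [2*G-1:0]         wcode,    // packed ternary weight codes
  output logic signed [W+$clog2(G)-1:0] psum      // partial sum, i.e. one LUT entry
);
  always_comb begin
    psum = '0;
    for (int i = 0; i < G; i++) begin
      case (wcode[2*i +: 2])
        2'b01:   psum += act[i];   // weight = +1
        2'b10:   psum -= act[i];   // weight = -1
        default: ;                 // weight = 0 (code 11 unused)
      endcase
    end
  end
endmodule
```

In the memory-bound baseline these values are precomputed into tables and fetched over and over, which is where the 87.6% of memory traffic comes from; producing them in-register is what removes those fetches.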
Results: 5.6–24.5× GEMM latency reduction and 1.1–86.2× GEMV throughput improvement over state-of-the-art CPU baselines, across models from 125M to 100B parameters. Memory request volume drops by 8.7–13.8×. On a mobile CPU, a 7B model prefill goes from >20s to under 1.7s. Hardware cost is minimal: only +1.4% area and +3.2% power on a 256-bit SIMD slice (synthesized at TSMC 28nm). Energy efficiency is 2.5–4.9× better than NVIDIA Jetson AGX Orin on the same workloads.
Synthesized RTL accelerator designs onto an FPGA and worked through the timing and utilization issues that only show up once you're on real hardware. Learned a lot about the gap between simulation and actual deployment.
Implemented a Verilog module for hardware-efficient softmax approximation in Q8 fixed-point. Spent a fair amount of time getting the value scaling right — small errors in fixed-point arithmetic compound quickly.
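For flavor, here is a minimal sketch of the exponent stage of such a softmax, assuming "Q8" means a Q8.8 format (8 integer, 8 fractional bits); the module name, the format split, and the linear 2^frac approximation are my assumptions, not the actual implementation.

```systemverilog
// Minimal sketch of the exponent stage of a fixed-point softmax.
// Assumptions (not the original module): "Q8" = Q8.8, inputs are already
// max-subtracted (x <= 0), and e^x is handled by pre-scaling x with log2(e)
// upstream, so this stage only approximates 2^x.
module softmax_exp2_q8_8 (
  input  logic signed [15:0] x,  // max-subtracted logit, Q8.8, x <= 0
  output logic        [15:0] e   // approximate 2^x, Q8.8
);
  logic signed [7:0]  xi;     // integer part: floor(x)
  logic        [8:0]  m;      // 2^frac(x) ~= 1 + frac(x), Q1.8, in [1.0, 2.0)
  logic        [15:0] m_ext;

  always_comb begin
    xi    = x[15:8];           // two's complement: top byte is floor(x)
    m     = {1'b1, x[7:0]};    // low byte is the non-negative fractional part
    m_ext = {7'b0, m};
    if (xi >= 0)
      e = m_ext;               // x == 0: the largest logit maps to ~1.0
    else if (xi < -8'sd9)
      e = 16'd0;               // 2^x underflows Q8.8 precision
    else
      e = m_ext >> (-xi);      // apply 2^xi as a right shift
  end
endmodule
```

One common follow-on is to pre-multiply logits by a Q8.8 constant for log2(e) (around 16'd369) so this block effectively computes e^x, then accumulate the outputs and divide to normalize; most of the scaling pitfalls live in choosing those constants and intermediate widths.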
Designed and verified UART TX/RX modules in SystemVerilog — baud-rate generation, control logic, and testbench simulation. A foundational project that made me much more careful about timing constraints.
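For a sense of scale, here is a minimal TX-side sketch in that spirit, assuming 8N1 framing and no FIFO; the module name, port list, and default clock and baud parameters are placeholders rather than the project's actual interface.

```systemverilog
// Minimal UART transmitter sketch (8N1: 1 start bit, 8 data bits LSB-first,
// 1 stop bit). Parameters and ports are illustrative assumptions.
module uart_tx #(
  parameter int unsigned CLK_FREQ_HZ = 50_000_000,  // assumed system clock
  parameter int unsigned BAUD        = 115_200      // assumed baud rate
)(
  input  logic       clk,
  input  logic       rst_n,
  input  logic       tx_start,   // pulse high for one cycle to send tx_data
  input  logic [7:0] tx_data,
  output logic       tx,         // serial line, idles high
  output logic       busy
);
  // Baud-rate generation: count system clocks per bit period.
  localparam int unsigned CLKS_PER_BIT = CLK_FREQ_HZ / BAUD;

  typedef enum logic [1:0] {IDLE, START, DATA, STOP} state_t;
  state_t state;

  logic [$clog2(CLKS_PER_BIT)-1:0] clk_cnt;
  logic [2:0] bit_idx;
  logic [7:0] shift;

  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      state <= IDLE; tx <= 1'b1; busy <= 1'b0;
      clk_cnt <= '0; bit_idx <= '0; shift <= '0;
    end else begin
      case (state)
        IDLE: begin
          tx   <= 1'b1;
          busy <= 1'b0;
          if (tx_start) begin
            shift   <= tx_data;
            busy    <= 1'b1;
            clk_cnt <= '0;
            state   <= START;
          end
        end
        START: begin
          tx <= 1'b0;                          // start bit
          if (clk_cnt == CLKS_PER_BIT - 1) begin
            clk_cnt <= '0;
            bit_idx <= '0;
            state   <= DATA;
          end else clk_cnt <= clk_cnt + 1;
        end
        DATA: begin
          tx <= shift[bit_idx];                // data bits, LSB first
          if (clk_cnt == CLKS_PER_BIT - 1) begin
            clk_cnt <= '0;
            if (bit_idx == 3'd7) state <= STOP;
            else                 bit_idx <= bit_idx + 1;
          end else clk_cnt <= clk_cnt + 1;
        end
        STOP: begin
          tx <= 1'b1;                          // stop bit, then return to idle
          if (clk_cnt == CLKS_PER_BIT - 1) begin
            clk_cnt <= '0;
            state   <= IDLE;
          end else clk_cnt <= clk_cnt + 1;
        end
        default: state <= IDLE;
      endcase
    end
  end
endmodule
```

The RX side typically mirrors this with mid-bit sampling (waiting roughly CLKS_PER_BIT/2 after the falling start edge before sampling), which is where most of the careful timing work goes.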
Interested in hardware, architecture, or research collaboration?
Let's talk.