Industry

Senior Engineer, Architecture – Tenstorrent, Santa Clara, CA
September 2025 – present

  • Benchmark development for emerging agentic LLM and server platform workloads.
  • Competitive analysis using industry-standard and internal benchmarks to identify performance gaps.
  • Collaboration with architects and RTL designers on micro-architectural features for performance and power efficiency.
  • Performance models and EDA frameworks to characterize workloads and predict performance under realistic scenarios.

Engineer – Wave Computing, Santa Clara, CA
February 2018 – July 2019

Research Intern – DEVCOM Army Research Lab (DEVCOM ARL), Marina del Rey, CA
May 2022 – August 2022

Academic

Research Assistant – Ming Hsieh Department of Electrical and Computer Engineering, USC
August 2019 – August 2025

Research highlights (supervised by Prof. Viktor Prasanna):

  • Tensor decomposition on CPU, GPU, and FPGA: Parallel algorithms and novel tensor formats to reduce sparse MTTKRP (matricized tensor times Khatri-Rao product) execution time, with custom designs for each platform.
  • Photonic SRAM / SPRINT: Hardware-algorithm co-design mapping MTTKRP to SPRINT; performance models for peak and sustained performance; collaboration with the hardware team on bottlenecks.
  • GNN for SAR-ATR: Explainable models for GNNs on SAR imagery; human-in-the-loop feedback to improve accuracy.

Shared memory controller (2020 – 2021): Memory controller for heterogeneous platforms handling irregular traffic from CPU, GPU, and FPGA via routing/mapping.

Prior (University of Moratuwa): Reconfigurable co-processor with limited precision for DNNs; SDN switch on FPGA (OpenFlow); HEVC SCC extension (Intra Block Copy, Palette Coding) at ParaQum Technologies.