Ph.D., Georgia Institute of Technology
Focusing on Speech-Language Alignment and Scaling Laws. Prior to joining NVIDIA, I worked at Amazon AGI (ex-ASR Language Modeling), WA; Google (now Gemini Audio at DeepMind), CA, USA; and Hitachi Central Research Laboratory, Tokyo, Japan.
Exploring semantic and non-semantic alignment for LLMs.
Developing sample-efficient and cross-modal inference.
Building robust evaluation frameworks and intervention-resilient architectures.
A comprehensive tutorial on integrating LLMs with speech recognition systems, covering task-activating prompting and cross-modal alignment techniques.
Introduction to parameter-efficient adaptation methods for speech models, including prompt-tuning and in-context learning approaches.