Huck Yang

Sr. Research Scientist, NVIDIA Research

Ph.D., Georgia Institute of Technology

About

I focus on πŸ—£οΈ speech-language alignment and scaling laws. Before joining NVIDIA, I worked full-time at Amazon AGI with Andreas Stolcke on Ivan Bulyko's team, and interned as a Research Scientist with the Google Speech & Brain teams (now DeepMind), co-hosted by Bo Li and Yu Zhang on Tara N. Sainath's team.

πŸŽ“ My Ph.D. topic is on noise-robust voice model adaptation (now post-training), advised by Prof. Chin-Hui Lee.

🧬 Before starting my Ph.D., I visited Prof. Jesper Tegnér's group, working on self-evolutionary algorithms, and interned at TSMC in mixed-signal IC design.



Latest News

Jan 25, 2025

Six ICLR 2025 papers and one EMNLP 2025 tutorial accepted

Oct 2, 2024

Three EMNLP 2024 papers and one NeurIPS 2024 paper accepted

May 2, 2024

One ACL 2024 paper (oral) and one US patent accepted


Selected Publications

DCASE 2025

Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning

Chao-Han Huck Yang, Sreyan Ghosh, Qing Wang, Jaeyeon Kim, Hengyi Hong, Sonal Kumar, Guirui Zhong, Zhifeng Kong, et al.

SLT 2024

LLM Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Chao-Han Huck Yang, Taejin Park, Yuan Gong, Yuanchao Li, Zhehuai Chen, Yen-Ting Lin, Chen Chen, Yuchen Hu, Kunal Dhawan, et al.

ICLR 2024

It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition

Chen Chen, Ruizhe Li, Yuchen Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, Engsiong Chng, Chao-Han Huck Yang

ASRU 2023

Generative Speech Recognition Error Correction with Large Language Models and Task-activating Prompting

Chao-Han Huck Yang, Yile Gu, Yi-Chieh Liu, Shalini Ghosh, Ivan Bulyko, Andreas Stolcke

AAAI 2022

Training a Resilient Q-Network against Observational Interference

Chao-Han Huck Yang, I-Te Danny Hung, Yi Ouyang, Pin-Yu Chen

ICML 2021

Voice2series: Reprogramming Acoustic Models for Time Series Classification

Chao-Han Huck Yang, Yun-Yun Tsai, Pin-Yu Chen

Research Areas

Speech-Language Alignment

Exploring semantic and non-semantic alignment for LLMs.

LLM · ASR and Translation · Cross-Modal

Test-Time Scaling and Reasoning

Developing sample-efficient and cross-modal inference.

Scaling Laws · Reward Modeling · Decoding

Robust Evaluation and Causality

Building robust evaluation frameworks and intervention-resilient architectures.

Causal Inference · Robustness · Privacy

Tutorials

EMNLP 2025

Spoken Conversational Agents with Large Language Models

A comprehensive tutorial on integrating LLMs with speech recognition systems, covering task-activating prompting and cross-modal alignment techniques.

Interspeech 2025

Efficient Adaptation in Speech Language Modeling

Introduction to parameter-efficient adaptation methods for speech models, including prompt-tuning and in-context learning approaches.

Interspeech 2023

Cross-Modal Alignment for Voice Foundational Models

Overview of robust speech recognition techniques using large language models, focusing on noise-resilient architectures.