Huck C.-H. Yang
Sr. Research Scientist, NVIDIA Research
Creux du Van, Switzerland
Hello! I am a Senior Research Scientist at NVIDIA Research, currently working on robust sequence modeling and multilingual audio-text alignment.
I obtained my Ph.D. from Georgia Institute of Technology, advised by Prof. Chin-Hui Lee. I was a research intern at Google Speech & Brain, co-hosted by Bo Li and Yu Zhang in Tara Sainath’s team, and worked full-time at Amazon AGI with Andreas Stolcke in Ivan Bulyko’s team. I have also been working with Prof. Jesper Tegner on bio-sequence modeling since 2017.
My primary research lies in the area of Acoustic Model Alignments and Speech-Language Modeling. Specifically:
- Speech-Language Model Alignment: I explore new cross-modal alignment algorithms (task-activating prompting, whispering-LLaMA, LLM-ASR) for adapting large language models (LLMs) to noise-robust speech recognition, audio captioning, and generative error correction.
- Parameter-Efficient Sequence Modeling: I study new in-context learning, prompt-tuning, and adapter methods, along with their theoretical justifications (Voice2Series), to improve large-scale acoustic model adaptation (TIH) and general time series understanding.
- Data Privacy and Robust Evaluation: My earlier work includes developing privacy-preserving, intervention-resilient algorithms (Causal-Inference Q-Network) and a benchmark (HyPoradise) for audio and general deep reinforcement learning that comply with data protection regulations, aimed at human-oriented interaction with conversational signals.
Tutorial Presenter: IJCAI 2021, ICASSP 2022, 2023, 2024, Interspeech 2023, ASRU 2023.
Reviewer or Program Committee: ICASSP (2019-24), NeurIPS (2020-24), AAAI (2020-23), ICML (2021-24), ASRU 2023
Technical Committee: Applied Signal Processing Systems, IEEE Signal Processing Society