News Archive - Huck C.-H. Yang

Jan 25, 2025

ICLR 2025:
"Audio Large Language Models Can Be Descriptive Speech Quality Evaluators"
📄 Paper
ICLR 2025:
"Towards Neural Scaling Laws for Time Series Foundation Models"
📄 Paper 💻 Code
ICLR 2025:
"A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning"
📄 Paper
ICLR 2025:
"UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation"
📄 Paper 🔊 Demo
ICLR 2025:
"Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks"
📄 Paper
ICLR 2025:
"Fugatto 1 - Foundational Generative Audio Transformer Opus 1"
📄 Paper

Oct 2, 2024

EMNLP 2024 Main Track:
"From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment"
Collaboration with U Osaka and NVIDIA Research 📄 Paper 💻 Code
EMNLP 2024 Main Track:
"Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities"
Collaboration with Tsinghua University 📄 Paper
EMNLP 2024 Industry Track:
"FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model"
Collaboration with CMU 📄 Paper 💻 Code
NeurIPS 2024:
"Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models"
Collaboration with Nanyang Tech 📄 Paper 💻 Code

May 2, 2024

ACL 2024 Oral:
"GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators"
📄 Paper 💻 Code
US Patent:
"Parameter-efficient model reprogramming for cross-lingual speech recognition"

Filed by Google LLC. Priority to US18/490,808
Work completed at Google Speech/Brain (now part of Google DeepMind in Gemini Core) 📄 Patent Document

Feb 2, 2024

ICLR 2024 Spotlight:
"Large Language Models are Efficient Learners of Noise-Robust Speech Recognition"
📄 Paper 💻 Code
ICASSP 2024 Oral (Best Paper Candidate):
"Can Whisper perform speech-based in-context learning?"
📄 Paper

Oct 2, 2023

EMNLP 2023 Oral:
"Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition"
📄 Paper 💻 Code
NeurIPS 2023:
"HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models"
📄 Paper
UAI 2023:
"Pessimistic Model Selection for Offline Deep Reinforcement Learning"
📄 Paper