News Archive

Jan 25, 2025

Six ICLR 25 Papers and EMNLP 25 Tutorial Accepted

  • ICLR 2025:

    "Audio Large Language Models Can Be Descriptive Speech Quality Evaluators"

    📄 Paper
  • ICLR 2025:

    "Towards Neural Scaling Laws for Time Series Foundation Models"

    📄 Paper 💻 Code
  • ICLR 2025:

    "A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning"

    📄 Paper
  • ICLR 2025:

    "UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation"

    📄 Paper 🔊 Demo
  • ICLR 2025:

    "Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks"

    📄 Paper
  • ICLR 2025:

    "Fugatto 1 - Foundational Generative Audio Transformer Opus 1"

    📄 Paper
Oct 2, 2024

Three EMNLP 24 and One NeurIPS 24 Papers Accepted

  • EMNLP 2024 Main Track:

    "From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment"

    Collaboration with U Osaka and NVIDIA Research 📄 Paper 💻 Code
  • EMNLP 2024 Main Track:

    "Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities"

    Collaboration with Tsinghua University 📄 Paper
  • EMNLP 2024 Industry Track:

    "FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model"

    Collaboration with CMU 📄 Paper 💻 Code
  • NeurIPS 2024:

    "Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models"

    Collaboration with Nanyang Technological University 📄 Paper 💻 Code
May 2, 2024

ACL 2024 Oral Presentation and US Patent Granted

  • ACL 2024 Oral:

    "GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators"

    📄 Paper 💻 Code
  • US Patent:

    "Parameter-efficient model reprogramming for cross-lingual speech recognition"

    Filed by Google LLC. Priority to US18/490,808

    Work completed at Google Speech/Brain (now part of Google DeepMind in Gemini Core) 📄 Patent Document
Feb 2, 2024

ICLR 24 (Spotlight), ASRU 23, and ICASSP 24 (Oral) Papers Accepted

  • ICLR 2024 Spotlight:

    "Large Language Models are Efficient Learners of Noise-Robust Speech Recognition"

    📄 Paper 💻 Code
  • ICASSP 2024 Oral (Best Paper Candidate):

    "Can Whisper Perform Speech-Based In-Context Learning?"

    📄 Paper
Oct 2, 2023

UAI 23, EMNLP 23 (Oral), and NeurIPS 23 Papers Accepted

  • EMNLP 2023 Oral:

    "Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition"

    📄 Paper 💻 Code
  • NeurIPS 2023:

    "HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models"

    📄 Paper
  • UAI 2023:

    "Pessimistic Model Selection for Offline Deep Reinforcement Learning"

    📄 Paper