Workshop Program

Date & Time: April 7, 2:00 pm – 5:30 pm

Venue: Hyderabad International Convention Centre (HICC)

Room: MR1.04

Detailed Program

Overview Talk: 15 mins (2:00 – 2:15 pm)

Invited Talk: Oriol Nieto – 25 mins (2:15 – 2:45 pm)

Invited Talk: Bhuvana Ramabhadran – 25 mins (2:45 – 3:15 pm)

Invited Talk: Zhuo Chen – 25 mins (3:15 – 3:45 pm)

Tea Break: 15 mins (3:45 – 4:00 pm)

Poster Session: 90 mins (4:00 – 5:30 pm)

List of Accepted Papers

  1. Performance evaluation of SLAM-ASR: The Good, The Bad, The Ugly, and the Way Forward
    Authors: Shashi Kumar, Iuliia Thorbecke, Sergio Burdisso, Esaú Villatoro-Tello, Manjunath K E, Kadri Hacioğlu, Pradeep Rangappa, Petr Motlicek, Aravind Ganapathiraju, Andreas Stolcke
  2. StableTTS: Towards Efficient Denoising Acoustic Decoder for Text to Speech Synthesis with Consistency Flow Matching
    Authors: Zhiyong Chen, Xinnuo Li, Shuhang Wu, Zhi Yang, Zhiqi Ai, Shugong Xu
  3. USMID: A Unimodal Speaker-Level Membership Inference Detector for Contrastive Pretraining
    Authors: Ruoxi Cheng, Yizhong Ding, Shuirong Cao, Shitong Shao, Zhiqiang Wang
  4. MACE: Leveraging Audio for Evaluating Audio Captioning Systems
    Authors: Satvik Dixit, Soham Deshmukh, Bhiksha Raj
  5. Musimple: A Simplified Music Generation System With Diffusion Transformer
    Authors: Zheqi Dai, Haolin He, Qiuqiang Kong
  6. Discrete Speech Unit Extraction via Independent Component Analysis
    Authors: Tomohiko Nakamura, Kwanghee Choi, Keigo Hojo, Yoshiaki Bando, Satoru Fukayama, Shinji Watanabe
  7. PAWS: A Physical Acoustic Wave Simulation Dataset for Sound Modeling and Rendering
    Authors: Tianming Yin, Yiyang Zhou, Xuzhou Ye, Qiuqiang Kong
  8. Indics2ST: Indian Multilingual Translation Corpus For Evaluating Speech-Large Language Models
    Authors: Sanket Shah, Kavya Saxena, Kancharana Manideep Bharadwaj, Sharath Adavanne, Nagaraj Adiga
  9. Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection
    Authors: Han Yin, Yang Xiao, Jisheng Bai, Rohan Kumar Das
  10. Closing the Loop on Speech to Music Translation: Automatically Generating Synthetic Percussive Sequences on the Mridangam from Konnakol
    Authors: Gopika Krishnan, Julia Drabek, Akshay Anantapadmanabhan, Kaustuv Kanti Ganguli, Carlos Guedes
  11. TSPE: Task-Specific Prompt Ensemble for Improved Zero-Shot Audio Classification
    Authors: Nishit Anand, Ashish Seth, Ramani Duraiswami, Dinesh Manocha
  12. A Suite for Acoustic Language Model Evaluation
    Authors: Gallil Maimon, Amit Roth, Yossi Adi