Peer-Reviewed Articles

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) · 2026

A Generative-First Neural Audio Autoencoder

Jonah Casebeer, Ge Zhu, Zhepei Wang, Nicholas J. Bryan

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) · 2026

Rethinking Music Captioning with Music Metadata LLMs

Irmak Bukey, Zhepei Wang, Chris Donahue, Nicholas J. Bryan

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) · 2025

On Class Separability Pitfalls In Audio-Text Contrastive Zero-Shot Learning

Tiago Tavares, Fabio Ayres, Zhepei Wang, Paris Smaragdis

In Interspeech · 2024

Audio Editing with Non-Rigid Text Prompts

Francesco Paissan, Luca Della Libera, Zhepei Wang, Paris Smaragdis, Mirco Ravanelli, Yusuf Cem Subakan

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) · 2023

A Framework for Unified Real-time Personalized and Non-Personalized Speech Enhancement

Zhepei Wang, Ritwik Giri, Devansh Shah, Jean-Marc Valin, Michael M. Goodwin, Paris Smaragdis

In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) · 2023

Unsupervised Improvement of Audio-Text Cross-Modal Representations

Zhepei Wang, Yusuf Cem Subakan, Krishna Subramani, Junkai Wu, Tiago Tavares, Fabio Ayres, Paris Smaragdis

In Journal of Signal Processing Systems · 2022

Compute and Memory Efficient Universal Sound Source Separation

Efthymios Tzinis, Zhepei Wang, Xilin Jiang, Paris Smaragdis

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) · 2022

Improved Singing Voice Separation with Chromagram-Based Pitch-Aware Remixing

Siyuan Yuan, Zhepei Wang, Umut Isik, Ritwik Giri, Jean-Marc Valin, Michael M. Goodwin, Arvindh Krishnaswamy

In IEEE Signal Processing Letters · 2022

Learning Representations for New Sound Classes With Continual Self-Supervised Learning

Zhepei Wang, Yusuf Cem Subakan, Xilin Jiang, Junkai Wu, Efthymios Tzinis, Mirco Ravanelli, Paris Smaragdis

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) · 2021

Semi-Supervised Singing Voice Separation With Noisy Self-Training

Zhepei Wang, Ritwik Giri, Umut Isik, Jean-Marc Valin, Arvindh Krishnaswamy

In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) · 2021

Separate But Together: Unsupervised Federated Learning for Speech Enhancement from Non-IID Data

Efthymios Tzinis, Jonah Casebeer, Zhepei Wang, Paris Smaragdis

In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) · 2021

Sound Event Detection with Adaptive Frequency Selection

Zhepei Wang, Jonah Casebeer, Adam Clemmitt, Efthymios Tzinis, Paris Smaragdis

In IEEE International Workshop on Machine Learning for Signal Processing (MLSP) · 2020

Sudo rm -rf: Efficient Networks for Universal Audio Source Separation

Efthymios Tzinis, Zhepei Wang, Paris Smaragdis

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) · 2020

Two-Step Sound Source Separation: Training On Learned Latent Targets

Efthymios Tzinis, Shrikant Venkataramani, Zhepei Wang, Yusuf Cem Subakan, Paris Smaragdis

In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) · 2019

Continual Learning of New Sound Classes Using Generative Replay

Zhepei Wang, Yusuf Cem Subakan, Efthymios Tzinis, Paris Smaragdis, Laurent Charlin

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) · 2019

Multi-View Networks For Multi-Channel Audio Classification

Jonah Casebeer, Zhepei Wang, Paris Smaragdis

Preprints

arXiv preprint arXiv:2602.15766 · 2026

Timestamped Audio Captioning

Sonal Kumar, Prem Seetharaman, Ke Chen, Oriol Nieto, Jiaqi Su, Zhepei Wang, Rithesh Kumar, Dinesh Manocha, Nicholas J. Bryan, Zeyu Jin, Justin Salamon

preprint · 2022

Semi-supervised Time Domain Target Speaker Extraction with Attention

Zhepei Wang, Ritwik Giri, Shrikant Venkataramani, Umut Isik, Jean-Marc Valin, Paris Smaragdis, Michael M. Goodwin, Arvindh Krishnaswamy

Patents and Patent Applications

U.S. Patent 12,531,067 · 2026

Semi-supervised Training of a Machine Learning Model for Target Speaker Audio Enhancement

Ritwik Giri, Michael Mark Goodwin, Arvindh Krishnaswamy, Mehmet Umut Isik, Jean-Marc Valin, Zhepei Wang, Shrikant Venkataramani, Paris Smaragdis

U.S. Patent Application 2025/0111857 · 2025

Unified Audio Suppression Model

Ritwik Giri, Zhepei Wang, Devansh Shah, Jean-Marc Valin, Michael Mark Goodwin

Reports

Clinic (Senior Capstone Project) Report in Computer Science, Harvey Mudd College · 2018

Image-Text Classification to Correct the Amazon PrimeNow Search Experience

Zhepei Wang, Alex Mitchell, Kofi Sekyi-Appiah, Tina Zhu

Thesis

Doctoral Dissertation in Computer Science, University of Illinois Urbana-Champaign · 2023

Data-Efficient Approaches for Audio Classification and Separation

Zhepei Wang