Shitong Xu

Ph.D. Candidate in Computer Science
University of Oxford

shitong.xu [at] cs.ox.ac.uk


GitHub | LinkedIn | Google Scholar

About Me

I am a third-year Ph.D. candidate in Computer Science at the University of Oxford, supervised by Prof. Niki Trigoni and Prof. Andrew Markham. I obtained my Bachelor's and Master's degrees from Imperial College London, where I completed my Master's thesis under the supervision of Prof. Ben Glocker.

Research Interests

My Ph.D. research focuses on developing deep learning models that integrate information across multiple modalities, including audio, text, and electroglottographic (EGG) signals, to understand the spatial, temporal, and characteristic features of real-world sound sources. More broadly, my research interests include:

  • Audio Signal Processing
  • Multimodal Large Language Models
  • Physics-Informed Machine Learning

In my previous projects, I explored encoding the target speaker's speech characteristics from noisy audio enrollment pairs, addressing the scarcity of clean enrollment audio in real-world target speaker extraction scenarios. I also collaborated with labmates on spatial audio processing projects, including reverberation effect editing and sound source localization with distributed microphones. Building on this experience, I am now extending large audio-language models (LALMs) to perform complex audio processing tasks such as audio retrieval and spatial audio understanding.

I'm actively looking for internship and collaboration opportunities. Feel free to reach out via email (shitong.xu [at] cs.ox.ac.uk).

Research Experience

Target Speaker Extraction through Comparing Noisy Positive and Negative Audio Enrollments.
Shitong Xu, Yiyuan Yang, Niki Trigoni, Andrew Markham.
Neural Information Processing Systems (NeurIPS), 2025

Efficient and Microphone-Fault-Tolerant 3D Sound Source Localization.
Yiyuan Yang, Shitong Xu, Niki Trigoni, Andrew Markham.
Interspeech 2025

SPEAR: Receiver-to-Receiver Acoustic Neural Warping Field.
Yuhang He*, Shitong Xu*, Jiaxing Zhong, Sangyun Shin, Niki Trigoni, Andrew Markham.
2024

CLIP-Diffusion-LM: Apply Diffusion Model on Image Captioning.
Shitong Xu
2022

Education and Internship Experience

Education

University of Oxford
Oct 2023 - Present
Ph.D. in Computer Science

Imperial College London
Oct 2019 - Aug 2023
MEng (Integrated Bachelor's and Master's) in Computing
Graduated with First Class Honours

Internship

Software Engineer Intern
Cisco Systems, Apr 2022 - Sep 2022
Advised by Shubham Bakshi, Arthur Drozdov

Teaching Assistant

Deep Learning for Healthcare
Physics Informed Machine Learning