Zexu Pan

Orcid: 0000-0002-8106-1176

According to our database, Zexu Pan authored at least 22 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Hierarchical Edge Refinement Network for Guided Depth Map Super-Resolution.
IEEE Trans. Computational Imaging, 2024

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization.
CoRR, 2024

Restoring Speaking Lips from Occlusion for Audio-Visual Speech Recognition.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Time-Domain Speech Separation Networks With Graph Encoding Auxiliary.
IEEE Signal Process. Lett., 2023

NeuroHeed+: Improving Neuro-steered Speaker Extraction with Joint Auditory Attention Detection.
CoRR, 2023

Generation or Replication: Auscultating Audio Latent Diffusion Models.
CoRR, 2023

LocSelect: Target Speaker Localization with an Auditory Selective Hearing Mechanism.
CoRR, 2023

Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech.
CoRR, 2023

Rethinking the visual cues in audio-visual speaker extraction.
CoRR, 2023

Target Active Speaker Detection with Audio-visual Cues.
CoRR, 2023

ImagineNet: Target Speaker Extraction with Intermittent Visual Cue Through Embedding Inpainting.
Proceedings of the IEEE International Conference on Acoustics, 2023

Scenario-Aware Audio-Visual TF-Gridnet for Target Speech Extraction.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Selective Listening by Synchronizing Speech With Lips.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

USEV: Universal Speaker Extraction With Visual Cue.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Speaker Extraction With Co-Speech Gestures Cue.
IEEE Signal Process. Lett., 2022

Towards End-to-end Speaker Diarization in the Wild.
CoRR, 2022

A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction.
Proceedings of the Interspeech 2022, 2022

VCSE: Time-Domain Visual-Contextual Speaker Extraction Network.
Proceedings of the Interspeech 2022, 2022

2021
Is Someone Speaking?: Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Multi-Target DoA Estimation with an Audio-Visual Fusion Mechanism.
Proceedings of the IEEE International Conference on Acoustics, 2021

Muse: Multi-Modal Target Speaker Extraction with Visual Cues.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Multi-Modal Attention for Speech Emotion Recognition.
Proceedings of the Interspeech 2020, 2020
