Peiwen Sun

Orcid: 0009-0005-3016-8554

According to our database, Peiwen Sun authored at least 22 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling.

CoRR, April, 2026

AURA: Always-On Understanding and Real-Time Assistance via Video Streams.

CoRR, April, 2026

PhoStream: Benchmarking Real-World Streaming for Omnimodal Assistants in Mobile Scenarios.

CoRR, January, 2026

2025

OneThinker: All-in-one Reasoning Model for Image and Video.

CoRR, December, 2025

PrismAudio: Decomposed Chain-of-Thoughts and Multi-dimensional Rewards for Video-to-Audio Generation.

CoRR, November, 2025

SpaceVista: All-Scale Visual Spatial Reasoning from mm to km.

CoRR, October, 2025

OmniAudio: Generating Spatial Audio from 360-Degree Video.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

FusionINN: Invertible Image Fusion for Brain Tumor Monitoring.

CoRR, 2024

FusionINN: Decomposable Image Fusion for Brain Tumor Monitoring.

Proceedings of the Trustworthy Artificial Intelligence for Healthcare, 2024

FlashSpeech: Efficient Zero-Shot Speech Synthesis.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Unveiling and Mitigating Bias in Audio Visual Segmentation.

Peiwen Sun

Honggang Zhang

Di Hu

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Enhancing Few-shot Classification through Token Selection for Balanced Learning.

Wangding Zeng

Peiwen Sun

Honggang Zhang

Proceedings of the International Joint Conference on Neural Networks, 2024

Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes.

Proceedings of the Computer Vision - ECCV 2024, 2024

Can Textual Semantics Mitigate Sounding Object Segmentation Preference?

Proceedings of the Computer Vision - ECCV 2024, 2024

Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation.

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

More than Vanilla Fusion: a Simple, Decoupling-free, Attention Module for Multimodal Fusion Based on Signal Theory.

CoRR, 2023

Predicting Central Cervical Lymph Node Metastasis of Papillary Thyroid Carcinomas Using Multi-view Ultrasound Images.

Proceedings of 2023 International Conference on Medical Imaging and Computer-Aided Diagnosis, 2023

A Method of Audio-Visual Person Verification by Mining Connections between Time Series.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022

Learning Audio-Visual embedding for Wild Person Verification.

CoRR, 2022

2019

A New Type of ROS-Based Pedagogical Robot for Kids' Mathematics Education.

Mahmood Al-khassaweneh

Proceedings of the 2019 IEEE International Conference on Electro Information Technology, 2019
