Yuhang Cao

Orcid: 0009-0008-3627-590X

According to our database, Yuhang Cao authored at least 25 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models.
CoRR, 2024

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model.
CoRR, 2024

2023
PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System.
CoRR, 2023

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition.
CoRR, 2023

DiaCorrect: Error Correction Back-end For Speaker Diarization.
CoRR, 2023

V3Det: Vast Vocabulary Visual Detection Dataset.
CoRR, 2023

Exploring the Power of Cross-Contextual Large Language Model in Mimic Emotion Prediction.
Proceedings of the 4th on Multimodal Sentiment Analysis Challenge and Workshop: Mimicked Emotions, 2023

Multimodal Cross-Lingual Features and Weight Fusion for Cross-Cultural Humor Detection.
Proceedings of the 4th on Multimodal Sentiment Analysis Challenge and Workshop: Mimicked Emotions, 2023

V3Det: Vast Vocabulary Visual Detection Dataset.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

A Dynamic Partial Reconfigurable CGRA Framework for Multi-Kernel Applications.
Proceedings of the International Conference on Field Programmable Technology, 2023

E²-ACE: An Energy-Efficient Reconfigurable Crypto-Accelerator with Agile End-to-End Toolchain.
Proceedings of the International Conference on Field Programmable Technology, 2023

PP-MET: A Real-World Personalized Prompt Based Meeting Transcription System.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
MINI: Mining Implicit Novel Instances for Few-Shot Object Detection.
CoRR, 2022

The USTC-Ximalaya System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription (M2met) Challenge.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

TRAM: An Open-Source Template-based Reconfigurable Architecture Modeling Framework.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

2021
WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection.
CoRR, 2021

Few-Shot Object Detection via Association and DIscrimination.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Seesaw Loss for Long-Tailed Instance Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Feature Pyramid Grids.
CoRR, 2020

Side-Aware Boundary Localization for More Precise Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

Prime Sample Attention in Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Speaker Direction-of-Arrival Estimation Based on Orthogonal Dipoles.
Circuits Syst. Signal Process., 2019

MMDetection: Open MMLab Detection Toolbox and Benchmark.
CoRR, 2019

Investigation of Cost Function for Supervised Monaural Speech Separation.
Proceedings of the Interspeech 2019, 2019

2017
Speaker Direction-of-Arrival Estimation Based on Frequency-Independent Beampattern.
Proceedings of the Interspeech 2017, 2017

