Jiaqi Wang

Orcid: 0000-0001-6877-5353

Affiliations:
  • Shanghai Artificial Intelligence Laboratory, China


According to our database, Jiaqi Wang authored at least 45 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
Long-CLIP: Unlocking the Long-Text Capability of CLIP.
CoRR, 2024

RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition.
CoRR, 2024

SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation.
CoRR, 2024

DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models.
CoRR, 2024

SepRep-Net: Multi-source Free Domain Adaptation via Model Separation And Reparameterization.
CoRR, 2024

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model.
CoRR, 2024

VIGC: Visual Instruction Generation and Correction.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
OneLLM: One Framework to Align All Modalities with Language.
CoRR, 2023

GPT4Point: A Unified Framework for Point-Language Understanding and Generation.
CoRR, 2023

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation.
CoRR, 2023

Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization.
CoRR, 2023

ShareGPT4V: Improving Large Multi-Modal Models with Better Captions.
CoRR, 2023

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition.
CoRR, 2023

MLLM-DataEngine: An Iterative Refinement Approach for MLLM.
CoRR, 2023

WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models.
CoRR, 2023

MMBench: Is Your Multi-modal Model an All-around Player?
CoRR, 2023

OCBEV: Object-Centric BEV Transformer for Multi-View 3D Object Detection.
CoRR, 2023

CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.
CoRR, 2023

V3Det: Vast Vocabulary Visual Detection Dataset.
CoRR, 2023

HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image.
Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023

Zero-shot Skeleton-based Action Recognition via Mutual Information Estimation and Maximization.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
Proceedings of the International Conference on Machine Learning, 2023

Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Dense Distinct Query for End-to-End Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Multi-Level Logit Distillation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Self-Supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
CARAFE++: Unified Content-Aware ReAssembly of FEatures.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition.
CoRR, 2022

What Are Expected Queries in End-to-End Object Detection?
CoRR, 2022

MINI: Mining Implicit Novel Instances for Few-Shot Object Detection.
CoRR, 2022

Semi-Supervised Semantic Segmentation via Gentle Teaching Assistant.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

PYSKL: Towards Good Practices for Skeleton Action Recognition.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

LAVT: Language-Aware Vision Transformer for Referring Image Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Texture Memory-Augmented Deep Patch-Based Image Inpainting.
IEEE Trans. Image Process., 2021

Few-Shot Object Detection via Association and DIscrimination.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

MMFashion: An Open-Source Toolbox for Visual Fashion Analysis.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Seesaw Loss for Long-Tailed Instance Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Side-Aware Boundary Localization for More Precise Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
MMDetection: Open MMLab Detection Toolbox and Benchmark.
CoRR, 2019

CARAFE: Content-Aware ReAssembly of FEatures.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Region Proposal by Guided Anchoring.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Hybrid Task Cascade for Instance Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Optimizing Video Object Detection via a Scale-Time Lattice.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

