Chun Yuan

Orcid: 0000-0002-3590-6676

Affiliations:
  • Tsinghua University, Shenzhen, China (PhD 2003)


According to our database, Chun Yuan authored at least 305 papers between 2003 and 2026.

Bibliography

2026
AgentCVR: Active Multi-Agent Cross-Video Reasoning via Script-Simulated Reinforcement Learning.
CoRR, May, 2026

VCap: Hypergeometric Rewards for Weak-to-Strong Visual Captioning.
CoRR, May, 2026

Memory Grafting: Scaling Language Model Pre-training via Offline Conditional Memory.
CoRR, May, 2026

Love Me, Love My Label: Rethinking the Role of Labels in Prompt Retrieval for Visual In-Context Learning.
CoRR, April, 2026

SelaVPR++: Towards Seamless Adaptation of Foundation Models for Efficient Place Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2026

FCL-COD: Weakly Supervised Camouflaged Object Detection with Frequency-aware and Contrastive Learning.
CoRR, March, 2026

SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing.
CoRR, March, 2026

PromptHub: Enhancing Multi-Prompt Visual In-Context Learning with Locality-Aware Fusion, Concentration and Alignment.
CoRR, March, 2026

AdaMem: Adaptive User-Centric Memory for Long-Horizon Dialogue Agents.
CoRR, March, 2026

Enhancing Geometric Perception in VLMs via Translator-Guided Reinforcement Learning.
CoRR, February, 2026

ContextRL: Enhancing MLLM's Knowledge Discovery Efficiency with Context-Augmented RL.
CoRR, February, 2026

Learning Cross-View Object Correspondence via Cycle-Consistent Mask Prediction.
CoRR, February, 2026

PRIME: A Process-Outcome Alignment Benchmark for Verifiable Reasoning in Mathematics and Engineering.
CoRR, February, 2026

NarraScore: Bridging Visual Narrative and Musical Dynamics via Hierarchical Affective Control.
CoRR, February, 2026

R-Align: Enhancing Generative Reward Models through Rationale-Centric Meta-Judging.
CoRR, February, 2026

Task-Distributionally Robust Data-Free Meta-Learning.
IEEE Trans. Pattern Anal. Mach. Intell., January, 2026

Generation Enhances Understanding in Unified Multimodal Models via Multi-Representation Generation.
CoRR, January, 2026

What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study.
CoRR, January, 2026

High-Frequency Prioritized Sparse Attention Network for Image Restoration.
IEEE Trans. Multim., 2026

M3Time: LLM-Enhanced Multi-Modal, Multi-Scale, and Multi-Frequency Multivariate Time Series Forecasting.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Prune&Comp: Free Lunch for Layer-Pruned LLMs via Iterative Pruning with Magnitude Compensation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
VACoT: Rethinking Visual Data Augmentation with VLMs.
CoRR, December, 2025

ChartPoint: Guiding MLLMs with Grounding Reflection for Chart Reasoning.
CoRR, December, 2025

Accelerating Zero-Shot NAS With Feature Map-Based Proxy and Operation Scoring Function.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2025

Visual Generation Tuning.
CoRR, November, 2025

VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction.
CoRR, November, 2025

Look Where It Matters: Training-Free Ultra-HR Remote Sensing VQA via Adaptive Zoom Search.
CoRR, November, 2025

OmniAlpha: A Sequence-to-Sequence Framework for Unified Multi-Task RGBA Generation.
CoRR, November, 2025

Learning to Pose Problems: Reasoning-Driven and Solver-Adaptive Data Synthesis for Large Reasoning Models.
CoRR, November, 2025

Towards Implicit Aggregation: Robust Image Representation for Place Recognition in the Transformer Era.
CoRR, November, 2025

TextIR: A Simple Framework for Text-Based Editable Image Restoration.
IEEE Trans. Vis. Comput. Graph., October, 2025

FlashVSR: Towards Real-Time Diffusion-Based Streaming Video Super-Resolution.
CoRR, October, 2025

Mixture of Neuron Experts.
CoRR, October, 2025

DiffuSpec: Unlocking Diffusion Language Models for Speculative Decoding.
CoRR, October, 2025

EDTformer: An Efficient Decoder Transformer for Visual Place Recognition.
IEEE Trans. Circuits Syst. Video Technol., September, 2025

UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios.
CoRR, September, 2025

AlphaVAE: Unified End-to-End RGBA Image Reconstruction and Generation with Alpha-Aware Representation Learning.
CoRR, July, 2025

ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices.
CoRR, June, 2025

SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought.
CoRR, May, 2025

CLIP-AE: CLIP-assisted Cross-view Audio-Visual Enhancement for Unsupervised Temporal Action Localization.
CoRR, May, 2025

Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging.
CoRR, May, 2025

Boosting Neural Language Inference via Cascaded Interactive Reasoning.
CoRR, May, 2025

InstructEngine: Instruction-driven Text-to-Image Alignment.
CoRR, April, 2025

Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models.
CoRR, April, 2025

MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs.
CoRR, March, 2025

MIGA: Mutual Information-Guided Attack on Denoising Models for Semantic Manipulation.
CoRR, March, 2025

Zero Token-Driven Deep Thinking in LLMs: Unlocking the Full Potential of Existing Parameters via Cyclic Refinement.
CoRR, February, 2025

DiffoRA: Enabling Parameter-Efficient LLM Fine-Tuning via Differential Low-Rank Matrix Adaptation.
CoRR, February, 2025

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent.
CoRR, January, 2025

Boosting Long-Tailed Recognition With Label Descriptor and Beyond.
IEEE Trans. Multim., 2025

Unsupervised Domain Adaptive Visual Question Answering in the Era of Multi-Modal Large Language Models.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Cobra: Efficient Line Art COlorization with BRoAder References.
Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference, 2025

Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

NaDRO: Leveraging Dual-Reward Strategies for LLMs Training on Noisy Data.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

A Simple Linear Patch Revives Layer-Pruned Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

FlatQuant: Flatness Matters for LLM Quantization.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Preference Optimization for Combinatorial Optimization Problems.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Enhancing Logits Distillation with Plug&Play Kendall's τ Ranking Loss.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Whoever Started the Interference Should End It: Guiding Data-Free Model Merging via Task Vectors.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

EAV-Mamba: Efficient Audio-Visual Representation Learning for Weakly-Supervised Temporal Action Localization.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Semantic Alignment and Hard Sample Retraining for Visible-Infrared Person Re-Identification.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Open-Vocabulary Customization from CLIP via Data-Free Knowledge Distillation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

KPEE: A Two-Stage Proposal-Based Reformulation of Event Extraction.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2025

Using External Knowledge to Enhanced PLM for Semantic Matching.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2025

Visual-Semantic Dual Calibration Network for Zero-Shot Learning.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2025

ChartPoint: Guiding MLLMs with Grounding Reflection for Chart Reasoning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

UniGlyph: Unified Segmentation-Conditioned Diffusion for Precise Visual Text Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

SparseFlex: High-Resolution and Arbitrary-Topology 3D Shape Modeling.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Text-Guided Visual Prompt DINO for Generic Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

VPR-Cloak: A First Look at Privacy Cloak Against Visual Place Recognition.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

TextureDiffusion: Target Prompt Disentangled Editing for Various Texture Transfer.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Weakly Supervised Temporal Action Localization via Dual-Prior Collaborative Learning Guided by Multimodal Large Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

LoRA Recycle: Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Rethinking Pseudo-Label Guided Learning for Weakly Supervised Temporal Action Localization from the Perspective of Noise Correction.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Aligning Composed Query with Image via Discriminative Perception from Negative Correspondences.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Meta-Learning Without Data via Unconditional Diffusion Models.
IEEE Trans. Circuits Syst. Video Technol., November, 2024

PatchNet: Maximize the Exploration of Congeneric Semantics for Weakly Supervised Semantic Segmentation.
IEEE Trans. Neural Networks Learn. Syst., August, 2024

The Fittest Wins: A Multistage Framework Achieving New SOTA in ViZDoom Competition.
IEEE Trans. Games, March, 2024

Towards Effective Collaborative Learning in Long-Tailed Recognition.
IEEE Trans. Multim., 2024

Negative-Sensitive Framework With Semantic Enhancement for Composed Image Retrieval.
IEEE Trans. Multim., 2024

Low-Rank Correlation Learning for Unsupervised Domain Adaptation.
IEEE Trans. Multim., 2024

Efficiently Adapt to New Dynamic via Meta-Model.
J. Artif. Intell. Res., 2024

ColorFlow: Retrieval-Augmented Image Sequence Colorization.
CoRR, 2024

ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts.
CoRR, 2024

Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs.
CoRR, 2024

Multi-Task Model Merging via Adaptive Weight Disentanglement.
CoRR, 2024

Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment.
CoRR, 2024

Kendall's τ Coefficient for Logits Distillation.
CoRR, 2024

Learning to Compress Contexts for Efficient Knowledge-based Visual Question Answering.
CoRR, 2024

ChartMoE: Mixture of Expert Connector for Advanced Chart Understanding.
CoRR, 2024

Learn To Learn More Precisely.
CoRR, 2024

Supervised Fine-tuning in turn Improves Visual Foundation Models.
CoRR, 2024

Solving Continual Offline Reinforcement Learning with Decision Transformer.
CoRR, 2024

SuperVLAD: Compact and Robust Image Descriptors for Visual Place Recognition.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

CustomNet: Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Semantic Distillation from Neighborhood for Composed Image Retrieval.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Stereo Matching Method with Integrated Geometric Encoding for Disparity Refinement.
Proceedings of the International Joint Conference on Neural Networks, 2024

Noise Weighting Phased Prompt Image Editing.
Proceedings of the International Joint Conference on Neural Networks, 2024

Integrating Local & Global Features for Estimating Shortest-path Distance in Large-scale Graphs.
Proceedings of the International Joint Conference on Neural Networks, 2024

MMR: Multi-scale Motion Retargeting between Skeleton-agnostic Characters.
Proceedings of the International Joint Conference on Neural Networks, 2024

Error Bound Based Noise Schedule Design in Diffusion Models.
Proceedings of the International Joint Conference on Neural Networks, 2024

AMC-OA: Adaptive Multi-Scale Convolutional Networks with Optimized Attention for Temporal Action Localization.
Proceedings of the Pattern Recognition - 27th International Conference, 2024

Benchmarking AI in Mental Health: A Critical Examination of LLMs Across Key Performance and Ethical Metrics.
Proceedings of the Pattern Recognition - 27th International Conference, 2024

Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

DFD: Distilling the Feature Disparity Differently for Detectors.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Sparse Model Inversion: Efficient Inversion of Vision Transformers for Data-Free Applications.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Managing the Personality of NPCs with Your Interactions: A Game Design System Based on Large Language Models.
Proceedings of the HCI in Games, 2024

A Task Is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting.
Proceedings of the Computer Vision - ECCV 2024, 2024

Boosting Pose Estimators via Cross-Representation Distillation.
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections.
Proceedings of the Computer Vision - ECCV 2024, 2024

GVGEN: Text-to-3D Generation with Volumetric Representation.
Proceedings of the Computer Vision - ECCV 2024, 2024

DreamDiffusion: High-Quality EEG-to-Image Generation with Temporal Masked Signal Modeling and CLIP Alignment.
Proceedings of the Computer Vision - ECCV 2024, 2024

Distilling Semantic Priors from SAM to Efficient Image Restoration Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ViTKD: Feature-based Knowledge Distillation for Vision Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Free: Faster and Better Data-Free Meta-Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

CricaVPR: Cross-Image Correlation-Aware Representation Learning for Visual Place Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Efficient Conditional Diffusion Model with Probability Flow Sampling for Image Super-resolution.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Mean Teacher DETR with Masked Feature Alignment: A Robust Domain Adaptive Detection Transformer Framework.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Deep Homography Estimation for Visual Place Recognition.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Blind Face Restoration under Extreme Conditions: Leveraging 3D-2D Prior Fusion for Superior Structural and Texture Recovery.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Task-Adaptive Feature Disentanglement and Hallucination for Few-Shot Classification.
IEEE Trans. Circuits Syst. Video Technol., August, 2023

Design of Fatigue Driving Behavior Detection Based on Circle Hough Transform.
Big Data, February, 2023

Weakly Supervised Instance Segmentation by Exploring Entire Object Regions.
IEEE Trans. Multim., 2023

HHF: Hashing-Guided Hinge Function for Deep Hashing Retrieval.
IEEE Trans. Multim., 2023

StrokeNet: Stroke Assisted and Hierarchical Graph Reasoning Networks.
IEEE Trans. Multim., 2023

High-Frequency Normalizing Flow for Image Rescaling.
IEEE Trans. Image Process., 2023

Efficient Multi-Goal Reinforcement Learning via Value Consistency Prioritization.
J. Artif. Intell. Res., 2023

ChartBench: A Benchmark for Complex Visual Reasoning in Charts.
CoRR, 2023

CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models.
CoRR, 2023

DreamDiffusion: Generating High-Quality Images from Brain EEG Signals.
CoRR, 2023

Neural Machine Translation with Dynamic Graph Convolutional Decoder.
CoRR, 2023

HMC: Hierarchical Mesh Coarsening for Skeleton-free Motion Retargeting.
CoRR, 2023

ITstyler: Image-optimized Text-based Style Transfer.
CoRR, 2023

Towards Arbitrary Text-driven Image Manipulation via Space Alignment.
CoRR, 2023

MeGraph: Capturing Long-Range Interactions by Alternating Local and Hierarchical Aggregation on Multi-Scaled Graph Hierarchy.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Adaptive Contrastive Learning for Learning Robust Representations under Label Noise.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Enhanced Image Deblurring: An Efficient Frequency Exploitation and Preservation Network.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

DFVSR: Directional Frequency Video Super-Resolution via Asymmetric and Enhancement Alignment Network.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

AANet: Aggregation and Alignment Network with Semi-hard Positive Sample Mining for Hierarchical Place Recognition.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
Proceedings of the International Conference on Machine Learning, 2023

Learning to Learn from APIs: Black-Box Data-Free Meta-Learning.
Proceedings of the International Conference on Machine Learning, 2023

MA-NeRF: Motion-Assisted Neural Radiance Fields for Face Synthesis from Sparse Images.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

PS-NeRV: Patch-Wise Stylized Neural Representations for Videos.
Proceedings of the IEEE International Conference on Image Processing, 2023

Effective Whole-body Pose Estimation with Two-stages Distillation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

When Noisy Labels Meet Long Tail Dilemmas: A Representation Calibration Method.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Accurate 3D Face Reconstruction with Facial Component Tokens.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Rethink Long-Tailed Recognition with Vision Transforms.
Proceedings of the IEEE International Conference on Acoustics, 2023

Frequency Reciprocal Action and Fusion for Single Image Super-Resolution.
Proceedings of the IEEE International Conference on Acoustics, 2023

Learning Imbalanced Data with Vision Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

High-fidelity Facial Avatar Reconstruction from Monocular Video with Generative Priors.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

LET: Leveraging Error Type Information for Grammatical Error Correction.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Darwinian Model Upgrades: Model Evolving with Selective Compatibility.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Truncate-Split-Contrast: A Framework for Learning from Mislabeled Videos.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Learning with Noisily-labeled Class-imbalanced Data.
CoRR, 2022

DCE: Offline Reinforcement Learning With Double Conservative Estimates.
CoRR, 2022

ViTKD: Practical Guidelines for ViT feature knowledge distillation.
CoRR, 2022

Rethinking Knowledge Distillation via Cross-Entropy.
CoRR, 2022

Improving the Latent Space of Image Style Transfer.
CoRR, 2022

Privacy-Preserving Model Upgrades with Bidirectional Compatible Training in Image Retrieval.
CoRR, 2022

Hot-Refresh Model Upgrades with Regression-Alleviating Compatible Training in Image Retrieval.
CoRR, 2022

One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulations.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Federated Knowledge Transfer for Heterogeneous Visual Models.
Proceedings of the 4th ACM International Conference on Multimedia in Asia, 2022

Tackling Instance-Dependent Label Noise with Dynamic Distribution Calibration.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

HyP² Loss: Beyond Hypersphere Metric Space for Multi-label Image Retrieval.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

DeViT: Deformed Vision Transformers in Video Inpainting.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

ACEs: Unsupervised Multi-label Aspect Detection with Aspect-category Experts.
Proceedings of the International Joint Conference on Neural Networks, 2022

PTS: A Prompt-based Teacher-Student Network for Weakly Supervised Aspect Detection.
Proceedings of the International Joint Conference on Neural Networks, 2022

Contrastive Learning in Wavelet Domain for Image Dehazing.
Proceedings of the International Joint Conference on Neural Networks, 2022

Towards Universal Backward-Compatible Representation Learning.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Cross-modal Representation Learning and Relation Reasoning for Bidirectional Adaptive Manipulation.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

CMS-LSTM: Context Embedding and Multi-Scale Spatiotemporal Expression LSTM for Predictive Learning.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Hot-Refresh Model Upgrades with Regression-Free Compatible Training in Image Retrieval.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Heuristic Dropout: An Efficient Regularization Method for Medical Image Segmentation Models.
Proceedings of the IEEE International Conference on Acoustics, 2022

Modernn: Towards Fine-Grained Motion Details for Spatiotemporal Predictive Learning.
Proceedings of the IEEE International Conference on Acoustics, 2022

Masked Generative Distillation.
Proceedings of the Computer Vision - ECCV 2022, 2022

Diverse Image Inpainting with Normalizing Flow.
Proceedings of the Computer Vision - ECCV 2022, 2022

Super-Resolution by Predicting Offsets: An Ultra-Efficient Super-Resolution Network for Rasterized Images.
Proceedings of the Computer Vision - ECCV 2022, 2022

REALY: Rethinking the Evaluation of 3D Face Reconstruction.
Proceedings of the Computer Vision - ECCV 2022, 2022

Semantic-Sparse Colorization Network for Deep Exemplar-Based Colorization.
Proceedings of the Computer Vision - ECCV 2022, 2022

Focal and Global Knowledge Distillation for Detectors.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

SWEM: Towards Real-Time Video Object Segmentation with Sequential Weighted Expectation-Maximization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Structural Supervision for Word Alignment and Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
Manifold Transfer Learning via Discriminant Regression Analysis.
IEEE Trans. Multim., 2021

Guiding Query Position and Performing Similar Attention for Transformer-Based Detection Heads.
CoRR, 2021

CMS-LSTM: Context-Embedding and Multi-Scale Spatiotemporal-Expression LSTM for Video Prediction.
CoRR, 2021

Reducing the Annotation Effort for Video Object Segmentation Datasets.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Towards Calibrated Model for Long-Tailed Visual Recognition from Prior Perspective.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

VQMG: Hierarchical Vector Quantised and Multi-hops Graph Reasoning for Explicit Representation Learning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Deep Interactive Video Inpainting: An Invisibility Cloak for Harry Potter.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Optimizing Enhanced Cost Per Click via Reinforcement Learning Without Exploration.
Proceedings of the International Joint Conference on Neural Networks, 2021

Explore Hierarchical Relations Reasoning and Global Information Aggregation.
Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

2020
A Simple Yet Effective Method for Video Temporal Grounding with Cross-Modality Attention.
CoRR, 2020

Using an ensemble color space model to tackle adversarial examples.
CoRR, 2020

StegColNet: Steganalysis Based on an Ensemble Colorspace Approach.
Proceedings of the Structural, Syntactic, and Statistical Pattern Recognition, 2020

An Inverse Mapping with Manifold Alignment for Zero-Shot Learning.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

TranSlider: Transfer Ensemble Learning from Exploitation to Exploration.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

HAF-SVG: Hierarchical Stochastic Video Generation with Aligned Features.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Feature Augmented Memory with Global Attention Network for VideoQA.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Double Shot: Preserve and Erase Based Class Attention Networks for Weakly Supervised Localization (Peca-Net).
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Texture and Shape Biased Two-Stream Networks for Clothing Classification and Attribute Recognition.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Bridge the Gap: High-level Semantic Planning for Image Captioning.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Logic Enhanced Commonsense Inference with Chain Transformer.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

Self-Attention ConvLSTM for Spatiotemporal Prediction.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Learning Attentional Recurrent Neural Network for Visual Tracking.
IEEE Trans. Multim., 2019

Learning Deep Conditional Neural Network for Image Segmentation.
IEEE Trans. Multim., 2019

Low-Rank 2-D Neighborhood Preserving Projection for Enhanced Robust Image Representation.
IEEE Trans. Cybern., 2019

Horizontal and Vertical Nuclear Norm-Based 2DLDA for Image Representation.
IEEE Trans. Circuits Syst. Video Technol., 2019

Structurally Incoherent Low-Rank 2DLPP for Image Classification.
IEEE Trans. Circuits Syst. Video Technol., 2019

Multi-Frame Content Integration with a Spatio-Temporal Attention Mechanism for Person Video Motion Transfer.
CoRR, 2019

Image-to-Tree: A Tree-Structured Decoder for Image Captioning.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Stochastic Video Generation with Disentangled Representations.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Fast Registration for Cross-Source Point Clouds by using Weak Regional Affinity and Pixel-Wise Refinement.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Narratology-Based Interaction Design of 3D Reconstructed Cultural Relics.
Proceedings of the Fourth IEEE International Conference on Data Science in Cyberspace, 2019

Multi-Scale Visual Semantics Aggregation with Self-Attention for End-to-End Image-Text Matching.
Proceedings of The 11th Asian Conference on Machine Learning, 2019

Self-Supervised Mixture-of-Experts by Uncertainty Estimation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Structurally Incoherent Low-Rank Nonnegative Matrix Factorization for Image Classification.
IEEE Trans. Image Process., 2018

Learning Parts-Based and Global Representation for Image Classification.
IEEE Trans. Circuits Syst. Video Technol., 2018

A Coarse-to-Fine Algorithm for Matching and Registration in 3D Cross-Source Point Clouds.
IEEE Trans. Circuits Syst. Video Technol., 2018

FPGA-based Acceleration System for Visual Tracking.
CoRR, 2018

Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments.
CoRR, 2018

Self-Adaptive Double Bootstrapped DDPG.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

TreeNet: Learning Sentence Representations with Unconstrained Tree Structure.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Getting Rid of Night: Thermal Image Classification Based on Feature Fusion.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Densely Stacked Generative Adversarial Networks.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Hierarchical Context Encoding for Events Captioning in Videos.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Robust Visual Tracking in Low-Resolution Sequence.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Conditional Kronecker Batch Normalization for Compositional Reasoning.
Proceedings of the British Machine Vision Conference 2018, 2018

Efficient Multi-level Correlating for Visual Tracking.
Proceedings of the Computer Vision - ACCV 2018, 2018

ColorNet: Investigating the Importance of Color Spaces for Image Classification.
Proceedings of the Computer Vision - ACCV 2018, 2018

2017
Nuclear Norm-Based 2DLPP for Image Classification.
IEEE Trans. Multim., 2017

A Systematic Approach for Cross-Source Point Cloud Registration by Preserving Macro and Micro Structures.
IEEE Trans. Image Process., 2017

Superimposed Sparse Parameter Classifiers for Face Recognition.
IEEE Trans. Cybern., 2017

Nonnegative Discriminant Matrix Factorization.
IEEE Trans. Circuits Syst. Video Technol., 2017

Depth map super-resolution via low-resolution depth guided joint trilateral up-sampling.
J. Vis. Commun. Image Represent., 2017

Improving Object Detection with Convolutional Neural Network via Iterative Mechanism.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

Learning attentional recurrent neural network for visual tracking.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Gate function based structure-aware convolution for scene semantic segmentation.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

2016
Low-Rank Preserving Projections.
IEEE Trans. Cybern., 2016

Fast Intra prediction algorithm for quality scalable video coding.
Signal Image Video Process., 2016

基于卷积神经网络的多标签图像自动标注 (Multi-label Image Annotation Based on Convolutional Neural Network).
计算机科学 (Computer Science), 2016

Projective robust nonnegative factorization.
Inf. Sci., 2016

Enhanced Joint Trilateral Up-sampling for Super-Resolution.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Object Detection Based on Scene Understanding and Enhanced Proposals.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Facial Expression Recognition with Multi-scale Convolution Neural Network.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Automatic Image Annotation Using Adaptive Weighted Distance in Improved K Nearest Neighbors Framework.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

A Very Deep Sequences Learning Approach for Human Action Recognition.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Learning Boltzmann machine with EM-like method.
Proceedings of the 2016 International Joint Conference on Neural Networks, 2016

Deep conditional neural network for image segmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

Video object segmentation by Multi-Scale Pyramidal Multi-Dimensional LSTM with generated depth context.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Convolutional neural network using multi-scale information for stereo matching cost computation.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

A Coarse-to-Fine Algorithm for Registration in 3D Street-View Cross-Source Point Clouds.
Proceedings of the 2016 International Conference on Digital Image Computing: Techniques and Applications, 2016

Real Time Complete Dense Depth Reconstruction for a Monocular Camera.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016

Multi-linear regression coefficient classifier for recognition.
Proceedings of the IEEE Congress on Evolutionary Computation, 2016

2015
Saliency Detection based on Depth and Sparse Features.
Proceedings of the VISAPP 2015, 2015

Human Parsing via Shape Boltzmann Machine Networks.
Proceedings of the Advances in Multimedia Information Processing - PCM 2015, 2015

Graph Cuts Stereo Matching Based on Patch-Match and Ground Control Points Constraint.
Proceedings of the Advances in Multimedia Information Processing - PCM 2015, 2015

FANet: Factor Analysis Neural Network.
Proceedings of the Neural Information Processing - 22nd International Conference, 2015

Weighted-PCANet for Face Recognition.
Proceedings of the Neural Information Processing - 22nd International Conference, 2015

Center-based weighted kernel linear regression for image classification.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Dense Correspondence Using Non-Local DAISY Forest.
Proceedings of the 2015 International Conference on Digital Image Computing: Techniques and Applications, 2015

2014
A fast mode decision algorithm applied to Coarse-Grain quality Scalable Video Coding.
J. Vis. Commun. Image Represent., 2014

Fast Mode and Depth Decision Algorithm for Intra Prediction of Quality SHVC.
Proceedings of the Intelligent Computing Theory - 10th International Conference, 2014

2013
Online Allocation of Communication and Computation Resources for Real-Time Multimedia Services.
IEEE Trans. Multim., 2013

Fast and Robust Edge-Guided Exemplar-Based Image Inpainting.
Proceedings of the Image Analysis and Processing - ICIAP 2013, 2013

2012
Virtual mixer: Real-time audio mixing across clients and the cloud for multiparty conferencing.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Segmentation of Color Image Based on Partial Differential Equations.
Proceedings of the 4th International Symposium on Computational Intelligence and Design, 2011

An Efficient Peer-to-Peer Digital Resource Management System for Video Content.
Proceedings of the 4th International Symposium on Computational Intelligence and Design, 2011

Fast mode decision algorithm for enhancement layer of spatial and CGS scalable video coding.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

2010
Differentially Private Data Release through Multidimensional Partitioning.
Proceedings of the Secure Data Management, 7th VLDB Workshop, SDM 2010, Singapore, 2010

Novel Early Mode Decision Algorithm for Enhancement Layers in H.264 Scalable Video Coding.
Proceedings of the Advances in Multimedia Information Processing - PCM 2010, 2010

2009
Trace and revoke systems with short ciphertexts.
Proceedings of the 2nd International Conference on Security of Information and Networks, 2009

A Classified P2P Overlay Scheme Using SVC for Video Streaming.
Proceedings of the Advances in Multimedia Information Processing, 2009

Broadcast Encryption-Based P2P DRM without Central License Server.
Proceedings of the Advances in Multimedia Information Processing, 2009

Reducing the Motion-Compensated Temporal Interpolation Noise of DVC Side Information by ODWT.
Proceedings of the Advances in Multimedia Information Processing, 2009

Feature Extraction Methods for Handwritten Character Recognition Based on Acceleration.
Proceedings of the 2009 International Conference on Image Processing, 2009

Non-subsampled Contourlet Transform Based Seismic Signal De-noising.
Proceedings of the CSIE 2009, 2009 WRI World Congress on Computer Science and Information Engineering, March 31, 2009

2008
A Novel Hierarchical Mode Selection Algorithm for P-Slices in H.264/AVC.
Proceedings of the Advances in Multimedia Information Processing, 2008

Handwritten character recognition using orientation quantization based on 3D accelerometer.
Proceedings of the 5th Annual International Conference on Mobile and Ubiquitous Systems: Computing, 2008

Self-Defined Gesture Recognition on Keyless Handheld Devices using MEMS 3D Accelerometer.
Proceedings of the Fourth International Conference on Natural Computation, 2008

Reliable and Efficient Adaptive Streaming Mechanism for Multi-user SVC VoD System over GPRS/EDGE Network.
Proceedings of the International Conference on Computer Science and Software Engineering, 2008

A GOP-Adaptive Priority-Based Rate-Distortion Optimization Bitstream Extraction Algorithm for Scalable Video Coding.
Proceedings of the International Conference on Computer Science and Software Engineering, 2008

2007
Implementing DRM over Peer-to-Peer Networks with Broadcast Encryption.
Proceedings of the Advances in Multimedia Information Processing, 2007

Implementing Digital Right Management in P2P Content Sharing System.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2007

2006
2D/3D Web Visualization on Mobile Devices.
Proceedings of the Web Information Systems, 2006

A Quality-Controllable Encryption for H.264/AVC Video Coding.
Proceedings of the Advances in Multimedia Information Processing, 2006

2005
Scalable protection for MPEG-4 fine granularity scalability.
IEEE Trans. Multim., 2005

2003
Efficient and fully scalable encryption for MPEG-4 FGS.
Proceedings of the 2003 International Symposium on Circuits and Systems, 2003

Layered access control for MPEG-4 FGS video.
Proceedings of the 2003 International Conference on Image Processing, 2003

