Yuxin Peng

This is a disambiguation page: it contains multiple papers by persons with the same or a similar name.

Bibliography

2025
SPHERE: Semantic-PHysical Engaged REpresentation for 3D Semantic Scene Completion.
CoRR, September, 2025

Identity-Preserving Text-to-Video Generation via Training-Free Prompt, Image, and Guidance Enhancement.
CoRR, September, 2025

TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring.
CoRR, August, 2025

Hierarchical Event Memory for Accurate and Low-latency Online Video Temporal Grounding.
CoRR, August, 2025

Open-Vocabulary HOI Detection with Interaction-aware Prompt and Concept Calibration.
CoRR, August, 2025

UPP: Unified Point-Level Prompting for Robust Point Cloud Analysis.
CoRR, July, 2025

Balancing Preservation and Modification: A Region and Semantic Aware Metric for Instruction-Based Image Editing.
CoRR, June, 2025

SphereDrag: Spherical Geometry-Aware Panoramic Image Editing.
CoRR, June, 2025

Evaluating for Evidence of Sociodemographic Bias in Conversational AI for Mental Health Support.
Cyberpsychology Behav. Soc. Netw., 2025

Scan-and-Print: Patch-level Data Summarization and Augmentation for Content-aware Layout Generation in Poster Design.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MAI: A Multi-turn Aggregation-Iteration Model for Composed Image Retrieval.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SCAP: Transductive Test-Time Adaptation via Supportive Clique-based Attribute Prompting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

DyFo: A Training-Free Dynamic Focus Visual Search for Enhancing LMMs in Fine-Grained Visual Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

DKC: Differentiated Knowledge Consolidation for Cloth-Hybrid Lifelong Person Re-identification.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Selective Visual Prompting in Vision Mamba.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

DASK: Distribution Rehearsing via Adaptive Style Kernel Learning for Exemplar-Free Lifelong Person Re-Identification.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Compositional Prompting for Anti-Forgetting in Domain Incremental Learning.
Int. J. Comput. Vis., December, 2024

Exemplar-Free Lifelong Person Re-identification via Prompt-Guided Adaptive Knowledge Consolidation.
Int. J. Comput. Vis., November, 2024

SIA-OVD: Shape-Invariant Adapter for Bridging the Image-Region Gap in Open-Vocabulary Detection.
CoRR, 2024

An Interference Mitigation Method via Variable Separation Angle between LEO Constellations.
Proceedings of the 16th International Conference on Wireless Communications and Signal Processing, 2024

CountMamba: Exploring Multi-directional Selective State-Space Models for Plant Counting.
Proceedings of the Pattern Recognition and Computer Vision - 7th Chinese Conference, 2024

Self-supervised Edge Structure Learning for Multi-view Stereo and Parallel Optimization.
Proceedings of the MultiMedia Modeling - 30th International Conference, 2024

FE-VAD: High-Low Frequency Enhanced Weakly Supervised Video Anomaly Detection.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

FineSports: A Multi-Person Hierarchical Sports Video Dataset for Fine-Grained Action Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

FineParser: A Fine-Grained Spatio-Temporal Action Parser for Human-Centric Action Quality Assessment.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

FCS: Feature Calibration and Separation for Non-Exemplar Class Incremental Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Learning Continual Compatible Representation for Re-indexing Free Lifelong Person Re-identification.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Continual Vision-Language Retrieval via Dynamic Knowledge Rectification.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Smart Public Transportation Sensing: Enhancing Perception and Data Management for Efficient and Safety Operations.
Sensors, November, 2023

Multi Hybrid Extractor Network for 3D Human Pose Estimation.
Proceedings of the IEEE International Conference on Image Processing, 2023

Uncover the Body: Occluded Person Re-identification via Masked Image Modeling.
Proceedings of the Image and Graphics - 12th International Conference, 2023

2022
Learning conditional photometric stereo with high-resolution features.
Comput. Vis. Media, 2022

Global Contextual Complementary Network for Multi-View Stereo.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021
Hierarchical Visual-Textual Knowledge Distillation for Life-Long Correlation Learning.
Int. J. Comput. Vis., 2021

2020
Sequential Cross-Modal Hashing Learning via Multi-scale Correlation Mining.
ACM Trans. Multim. Comput. Commun. Appl., 2020

Guest Editorial Introduction to the Special Section on Representation Learning for Visual Content Understanding.
IEEE Trans. Circuits Syst. Video Technol., 2020

A Smart User Authentication Approach using Sensing Seat.
Proceedings of the 16th IEEE International Conference on Automation Science and Engineering, 2020

2019
CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning.
ACM Trans. Multim. Comput. Commun. Appl., 2019

2018
Modality-Specific Cross-Modal Similarity Measurement With Recurrent Attention Network.
IEEE Trans. Image Process., 2018

IEEE Access Special Section Editorial: Recent Advantages of Computer Vision.
IEEE Access, 2018

Cost-Sensitive Deep Metric Learning for Fine-Grained Image Classification.
Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

Recursive Pyramid Network with Joint Attention for Cross-Media Retrieval.
Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

Text-to-image Synthesis via Symmetrical Distillation Networks.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Multi-Scale Correlation for Sequential Cross-modal Hashing Learning.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Life-long Cross-media Correlation Learning.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Coarse Label Refined Knowledge Reasoning for Fine-Grained Visual Categorization.
Proceedings of the Intelligence Science and Big Data Engineering, 2018

Better and Faster: Knowledge Transfer from Multiple Self-supervised Learning Tasks via Graph Distillation for Video Classification.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Visual Data Synthesis via GAN for Zero-Shot Video Classification.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Cross-media Multi-level Alignment with Relation Attention Network.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Cross-modal Bidirectional Translation via Reinforcement Learning.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Dual Adversarial Networks for Zero-shot Cross-media Retrieval.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Stacking VAE and GAN for Context-aware Text-to-Image Generation.
Proceedings of the Fourth IEEE International Conference on Multimedia Big Data, 2018

2017
Cross-media analysis and reasoning: advances and directions.
Frontiers Inf. Technol. Electron. Eng., 2017

Discriminative latent semantic feature learning for pedestrian detection.
Neurocomputing, 2017

Exploiting distinctive topological constraint of local feature matching for logo image recognition.
Neurocomputing, 2017

Cross-media retrieval by exploiting fine-grained correlation at entity level.
Neurocomputing, 2017

CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning.
CoRR, 2017

Saliency-guided video classification via adaptively weighted learning.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

2016
Logo Recognition via Improved Topological Constraint.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Cross-Media Retrieval via Semantic Entity Projection.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Group Cost-Sensitive Boosting for Multi-Resolution Pedestrian Detection.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
The application of two-level attention models in deep convolutional neural network for fine-grained image classification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

A Boosted Multi-Task Model for Pedestrian Detection with Occlusion Handling.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

Adaptive Sampling with Optimal Cost for Class-Imbalance Learning.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Learning Cross-Media Joint Representation With Sparse and Semisupervised Regularization.
IEEE Trans. Circuits Syst. Video Technol., 2014

Graph-based multimodal semi-supervised image classification.
Neurocomputing, 2014

Weakly-Supervised Image Parsing via Constructing Semantic Graphs and Hypergraphs.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Error-Driven Incremental Learning in Deep Convolutional Neural Network for Large-Scale Image Classification.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Semantic Graph Construction for Weakly-Supervised Image Parsing.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

Cross-View Feature Learning for Scalable Social Image Analysis.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
Exploiting Semantic and Visual Context for Effective Video Annotation.
IEEE Trans. Multim., 2013

Latent semantic learning with structured sparse representation for human action recognition.
Pattern Recognit., 2013

Cross-media retrieval by intra-media and inter-media correlation mining.
Multim. Syst., 2013

L1-graph construction using structured sparsity.
Neurocomputing, 2013

Vocabulary hierarchy optimisation based on spatial context and category information.
Int. J. Multim. Intell. Secur., 2013

Exhaustive and Efficient Constraint Propagation: A Graph-Based Learning Approach and Its Applications.
Int. J. Comput. Vis., 2013

A temporal context model for boosting video annotation.
Sci. China Inf. Sci., 2013

Learning Descriptive Visual Representation by Semantic Regularized Matrix Factorization.
Proceedings of the IJCAI 2013, 2013

Multimodal semi-supervised image classification by combining tag refinement, graph-based learning and support vector regression.
Proceedings of the IEEE International Conference on Image Processing, 2013

Cross-media retrieval by cluster-based correlation analysis.
Proceedings of the IEEE International Conference on Image Processing, 2013

Heterogeneous Metric Learning with Joint Graph Regularization for Cross-Media Retrieval.
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

Unified Constraint Propagation on Multi-View Data.
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

2012
Effective Heterogeneous Similarity Measure with Nearest Neighbors for Cross-Media Retrieval.
Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

Visual Vocabulary Optimization with Spatial Context for Image Annotation and Classification.
Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

PDSS: patch-descriptor-similarity space for effective face verification.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Image annotation by semantic sparse recoding of visual content.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Tri-space and ranking based heterogeneous similarity measure for cross-media retrieval.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

Heterogeneous Constraint Propagation with Constrained Sparse Representation.
Proceedings of the 12th IEEE International Conference on Data Mining, 2012

Cross-modality correlation propagation for cross-media retrieval.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Contextual Kernel and Spectral Methods for Learning the Semantics of Images.
IEEE Trans. Image Process., 2011

Combining multiple clusterings using fast simulated annealing.
Pattern Recognit. Lett., 2011

Robust Image Analysis by L1-Norm Semi-supervised Learning.
CoRR, 2011

Exhaustive and Efficient Constraint Propagation: A Semi-Supervised Learning Perspective and Its Applications.
CoRR, 2011

Mining concept relationship in temporal context for effective video annotation.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Combining latent semantic learning and reduced hypergraph learning for semi-supervised image categorization.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Web video search by mutual boosting between the inside and outside text of video.
Proceedings of the 2011 Joint International Conference on Digital Libraries, 2011

Spectral learning of latent semantics for action recognition.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Latent Semantic Learning by Efficient Sparse Coding with Hypergraph Regularization.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

Symmetric Graph Regularized Constraint Propagation.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

2010
Gaussian mixture learning via robust competitive agglomeration.
Pattern Recognit. Lett., 2010

Image categorization via robust pLSA.
Pattern Recognit. Lett., 2010

Story-Based Retrieval by Learning and Measuring the Concept-Based and Content-Based Similarity.
Proceedings of the Advances in Multimedia Modeling, 2010

Refining video annotation by exploiting inter-shot context.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

AdaOUBoost: adaptive over-sampling and under-sampling to boost the concept learning in large scale imbalanced data sets.
Proceedings of the 11th ACM SIGMM International Conference on Multimedia Information Retrieval, 2010

Effective Multi-level Image Representation for Image Categorization.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

2009
PKU-ICST at TRECVID2009: High Level Feature Extraction and Search.
Proceedings of the TRECVID 2009 workshop participants notebook papers, 2009

Semantic concept annotation based on audio PLSA model.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Audio retrieval by segment-based manifold-ranking.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Using Multiple Frame Integration for the Text Recognition of Video.
Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009

2008
A Semi-supervised Learning Algorithm on Gaussian Mixture with Automatic Model Selection.
Neural Process. Lett., 2008

Peking University at TRECVID 2008: High Level Feature Extraction.
Proceedings of the TRECVID 2008 workshop participants notebook papers, 2008

Unsupervised learning of finite mixtures using entropy regularization and its application to image segmentation.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

From Comparing Clusterings to Combining Clusterings.
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

2007
OM-based video shot retrieval by one-to-one matching.
Multim. Tools Appl., 2007

Color-Based Text Extraction for the Image.
Proceedings of the Advances in Multimedia Information Processing, 2007

Color-based clustering for text detection and extraction in image.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

A Theoretical Approach to Construct Highly Discriminative Features with Application in AdaBoost.
Proceedings of the Computer Vision, 2007

2006
Clip-based similarity measure for query-dependent clip retrieval and video summarization.
IEEE Trans. Circuits Syst. Video Technol., 2006

Using Earth Mover's Distance for Audio Clip Retrieval.
Proceedings of the Advances in Multimedia Information Processing, 2006

Audio similarity measure by graph modeling and matching.
Proceedings of the 14th ACM International Conference on Multimedia, 2006

2005
A New Retrieval Model Based on TextTiling for Document Similarity Search.
J. Comput. Sci. Technol., 2005

A New Re-ranking Method for Generic Chinese Text Summarization and Its Evaluation.
Proceedings of the Digital Libraries: Implementing Strategies and Sharing Experiences, 2005

Hot Event Detection and Summarization by Graph Modeling and Matching.
Proceedings of the Image and Video Retrieval, 4th International Conference, 2005

EMD-Based Video Clip Retrieval by Many-to-Many Matching.
Proceedings of the Image and Video Retrieval, 4th International Conference, 2005

The earth mover's distance as a semantic measure for document similarity.
Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management, Bremen, Germany, October 31, 2005

2004
Clip-based similarity measure for hierarchical video retrieval.
Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2004

A Measure Based on Optimal Matching in Graph Theory for Document Similarity.
Proceedings of the Information Retrieval Technology, Asia Information Retrieval Symposium, 2004

2003
Video clip retrieval by maximal matching and optimal matching in graph theory.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

