We stand with Ukraine

Yuxin Peng

ORCID: 0000-0001-7658-3845

Affiliations:

Peking University, Beijing, China

According to our database¹, Yuxin Peng authored at least 143 papers between 2012 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Bibliography

2025

Distribution-Aware Knowledge Aligning and Prototyping for Non-Exemplar Lifelong Person Re-Identification.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2025

Long Short-Term Knowledge Decomposition and Consolidation for Lifelong Person Re-Identification.

IEEE Trans. Pattern Anal. Mach. Intell., September, 2025

Identity-Preserving Text-to-Video Generation via Training-Free Prompt, Image, and Guidance Enhancement.

CoRR, September, 2025

Human-Centric Fine-Grained Action Quality Assessment.

IEEE Trans. Pattern Anal. Mach. Intell., August, 2025

Interact-Custom: Customized Human Object Interaction Image Generation.

CoRR, August, 2025

Investigating Domain Gaps for Indoor 3D Object Detection.

CoRR, August, 2025

Advancing 3D Scene Understanding with MV-ScanQA Multi-View Reasoning Evaluation and TripAlign Pre-training Dataset.

CoRR, August, 2025

TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring.

CoRR, August, 2025

Hierarchical Event Memory for Accurate and Low-latency Online Video Temporal Grounding.

CoRR, August, 2025

Open-Vocabulary HOI Detection with Interaction-aware Prompt and Concept Calibration.

CoRR, August, 2025

RaT2IGen: Relation-aware Text-to-image Generation via Learnable Prompt.

ACM Trans. Multim. Comput. Commun. Appl., May, 2025

Learning Comprehensive Visual Grounding for Video Captioning.

IEEE Trans. Circuits Syst. Video Technol., April, 2025

Benchmarking Large Vision-Language Models on Fine-Grained Image Tasks: A Comprehensive Evaluation.

Serge J. Belongie

CoRR, April, 2025

Learning Guided Implicit Depth Function With Scale-Aware Feature Fusion.

IEEE Trans. Image Process., 2025

Balancing Preservation and Modification: A Region and Semantic Aware Metric for Instruction-Based Image Editing.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

EOGT: Video Anomaly Detection with Enhanced Object Information and Global Temporal Dependency.

ACM Trans. Multim. Comput. Commun. Appl., October, 2024

SPIRIT: Style-guided Patch Interaction for Fashion Image Retrieval with Text Feedback.

ACM Trans. Multim. Comput. Commun. Appl., June, 2024

I2C: Invertible Continuous Codec for High-Fidelity Variable-Rate Image Compression.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2024

Negatives Make a Positive: An Embarrassingly Simple Approach to Semi-Supervised Few-Shot Learning.

IEEE Trans. Pattern Anal. Mach. Intell., April, 2024

Decoupled domain-specific and domain-conditional representation learning for cross-domain recommendation.

Inf. Process. Manag., March, 2024

MAAN: Memory-Augmented Auto-Regressive Network for Text-Driven 3D Indoor Scene Generation.

IEEE Trans. Multim., 2024

HCL: Hierarchical Consistency Learning for Webly Supervised Fine-Grained Recognition.

IEEE Trans. Multim., 2024

Image Super-Resolution via Efficient Transformer Embedding Frequency Decomposition With Restart.

IEEE Trans. Image Process., 2024

Toward Video Anomaly Retrieval From Video Anomaly Detection: New Benchmarks and Model.

IEEE Trans. Image Process., 2024

MECOM: A Meta-Completion Network for Fine-Grained Recognition With Incomplete Multi-Modalities.

IEEE Trans. Image Process., 2024

SIM-OFE: Structure Information Mining and Object-Aware Feature Enhancement for Fine-Grained Visual Categorization.

IEEE Trans. Image Process., 2024

DMA: Dual Modality-Aware Alignment for Visible-Infrared Person Re-Identification.

IEEE Trans. Inf. Forensics Secur., 2024

ResVG: Enhancing Relation and Semantic Understanding in Multiple Instances for Visual Grounding.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

RelScene: A Benchmark and baseline for Spatial Relations in text-driven 3D Scene Generation.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Mitigate Catastrophic Remembering via Continual Knowledge Purification for Noisy Lifelong Person Re-Identification.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

SIA-OVD: Shape-Invariant Adapter for Bridging the Image-Region Gap in Open-Vocabulary Detection.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

InsVP: Efficient Instance Visual Prompting from Image Itself.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Progressive Prototype Evolving for Dual-Forgetting Mitigation in Non-Exemplar Online Continual Learning.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

FineFMPL: Fine-grained Feature Mining Prompt Learning for Few-Shot Class Incremental Learning.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Semantic-Aware Human Object Interaction Image Generation.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Firzen: Firing Strict Cold-Start Items with Frozen Heterogeneous and Homogeneous Graphs for Recommendation.

Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

Training-Free Video Temporal Grounding Using Large-Scale Pre-trained Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

Exploring Conditional Multi-modal Prompts for Zero-Shot HOI Detection.

Proceedings of the Computer Vision - ECCV 2024, 2024

Distribution-Aware Knowledge Prototyping for Non-Exemplar Lifelong Person Re-Identification.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DART: Dual-Modal Adaptive Online Prompting and Knowledge Retention for Test-Time Adaptation.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Comprehensive Visual Grounding for Video Description.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

FashionERN: Enhance-and-Refine Network for Composed Fashion Image Retrieval.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

LFR-GAN: Local Feature Refinement based Generative Adversarial Network for Text-to-Image Generation.

ACM Trans. Multim. Comput. Commun. Appl., November, 2023

Attribute-Aware Deep Hashing With Self-Consistency for Large-Scale Fine-Grained Image Retrieval.

IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Disentangled Graph Neural Networks for Session-Based Recommendation.

IEEE Trans. Knowl. Data Eng., August, 2023

DCR-ReID: Deep Component Reconstruction for Cloth-Changing Person Re-Identification.

IEEE Trans. Circuits Syst. Video Technol., August, 2023

CAT: a coarse-to-fine attention tree for semantic change detection.

Vis. Intell., 2023

MKVSE: Multimodal Knowledge Enhanced Visual-semantic Embedding for Image-text Retrieval.

ACM Trans. Multim. Comput. Commun. Appl., 2023

Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model.

CoRR, 2023

MB-HGCN: A Hierarchical Graph Convolutional Network for Multi-behavior Recommendation.

CoRR, 2023

Enhancing Unsupervised Audio Representation Learning via Adversarial Sample Generation.

CoRR, 2023

Multi-Behavior Recommendation with Cascading Graph Convolution Networks.

Proceedings of the ACM Web Conference 2023, 2023

MV-Diffusion: Motion-aware Video Diffusion Model.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Efficiency-optimized Video Diffusion Models.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Real20M: A Large-scale E-commerce Dataset for Cross-domain Retrieval.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Fine-Grained Visual Prompt Learning of Vision-Language Models for Image Recognition.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

DensityLayout: Density-Conditioned Layout GAN for Visual-Textual Presentation Designs.

Proceedings of the Image and Graphics - 12th International Conference, 2023

Masked Retraining Teacher-Student Framework for Domain Adaptive Object Detection.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Confidence-aware Pseudo-label Learning for Weakly Supervised Visual Grounding.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

PosterLayout: A New Benchmark and Approach for Content-Aware Visual-Textual Presentation Layout.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Generating Structured Pseudo Labels for Noise-resistant Zero-shot Video Sentence Localization.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Phrase-Level Temporal Relationship Mining for Temporal Sentence Localization.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Dual-View 3D Reconstruction via Learning Correspondence and Dependency of Point Cloud Regions.

IEEE Trans. Image Process., 2022

Unsupervised Visual-Textual Correlation Learning With Fine-Grained Semantic Alignment.

IEEE Trans. Cybern., 2022

MARS: Learning Modality-Agnostic Representation for Scalable Cross-Media Retrieval.

IEEE Trans. Circuits Syst. Video Technol., 2022

Fine-Grained Image Analysis With Deep Learning: A Survey.

Oisin Mac Aodha

Serge J. Belongie

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Semantic association enhancement transformer with relative position for image captioning.

Multim. Tools Appl., 2022

Team PKU-WICT-MIPL PIC Makeup Temporal Video Grounding Challenge 2022 Technical Report.

CoRR, 2022

Prototype-based classifier learning for long-tailed visual recognition.

Sci. China Inf. Sci., 2022

Learn from Unlabeled Videos for Near-duplicate Video Retrieval.

Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Weakly Supervised Video Anomaly Detection with Temporal and Abnormal Information.

Proceedings of the Pattern Recognition and Computer Vision - 5th Chinese Conference, 2022

An Embarrassingly Simple Approach to Semi-Supervised Few-Shot Learning.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Weakly Supervised Temporal Sentence Grounding with Gaussian-based Contrastive Proposal Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Visual-Textual Hybrid Sequence Matching for Joint Reasoning.

IEEE Trans. Cybern., 2021

Multi-Level Knowledge Injecting for Visual Commonsense Reasoning.

IEEE Trans. Circuits Syst. Video Technol., 2021

2020

RCE-HIL: Recognizing Cross-media Entailment with Heterogeneous Interactive Learning.

ACM Trans. Multim. Comput. Commun. Appl., 2020

Multi-Pathway Generative Adversarial Hashing for Unsupervised Cross-Modal Retrieval.

IEEE Trans. Multim., 2020

CKD: Cross-Task Knowledge Distillation for Text-to-Image Synthesis.

IEEE Trans. Multim., 2020

Deep Reinforcement Learning for Image Hashing.

IEEE Trans. Multim., 2020

Video Captioning With Object-Aware Spatio-Temporal Correlation and Aggregation.

IEEE Trans. Image Process., 2020

MAVA: Multi-Level Adaptive Visual-Textual Alignment by Cross-Media Bi-Attention Mechanism.

IEEE Trans. Image Process., 2020

SCH-GAN: Semi-Supervised Cross-Modal Hashing by Generative Adversarial Network.

IEEE Trans. Cybern., 2020

MHTN: Modal-Adversarial Hybrid Transfer Network for Cross-Modal Retrieval.

IEEE Trans. Cybern., 2020

Bridge-GAN: Interpretable Representation Learning for Text-to-Image Synthesis.

IEEE Trans. Circuits Syst. Video Technol., 2020

Quintuple-Media Joint Correlation Learning With Deep Compression and Regularization.

IEEE Trans. Circuits Syst. Video Technol., 2020

Reinforced Cross-Media Correlation Learning by Context-Aware Bidirectional Translation.

IEEE Trans. Circuits Syst. Video Technol., 2020

Unsupervised Cross-Media Retrieval Using Domain Adaptation With Scene Graph.

IEEE Trans. Circuits Syst. Video Technol., 2020

Fine-Grained Visual-Textual Representation Learning.

IEEE Trans. Circuits Syst. Video Technol., 2020

Zero-Shot Cross-Media Embedding Learning With Dual Adversarial Distribution Network.

IEEE Trans. Circuits Syst. Video Technol., 2020

DV-Net: Dual-view network for 3D reconstruction by fusing multiple sets of gated control point clouds.

Pattern Recognit. Lett., 2020

Attribute hierarchy based multi-task learning for fine-grained image classification.

Neurocomputing, 2020

PKU_WICT at TRECVID 2020: Instance Search Task.

Proceedings of the 2020 TREC Video Retrieval Evaluation, 2020

2019

Show and Tell in the Loop: Cross-Modal Circular Correlation Learning.

IEEE Trans. Multim., 2019

TPCKT: Two-Level Progressive Cross-Media Knowledge Transfer.

IEEE Trans. Multim., 2019

SSDH: Semi-Supervised Deep Hashing for Large Scale Image Retrieval.

IEEE Trans. Circuits Syst. Video Technol., 2019

Two-Stream Collaborative Learning With Spatial-Temporal Attention for Video Classification.

IEEE Trans. Circuits Syst. Video Technol., 2019

Fast Fine-Grained Image Classification via Weakly Supervised Discriminative Localization.

IEEE Trans. Circuits Syst. Video Technol., 2019

Which and How Many Regions to Gaze: Focus Discriminative Regions for Fine-Grained Visual Categorization.

Int. J. Comput. Vis., 2019

PKU_ICST at TRECVID 2019: Instance Search Task.

Proceedings of the 2019 TREC Video Retrieval Evaluation, 2019

Hierarchical Vision-Language Alignment for Video Captioning.

Proceedings of the MultiMedia Modeling - 25th International Conference, 2019

A New Benchmark and Approach for Fine-grained Cross-media Retrieval.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

IRC-GAN: Introspective Recurrent Convolutional GAN for Text-to-video Generation.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Object-Aware Aggregation With Bidirectional Temporal Graph for Video Captioning.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Query-Adaptive Image Retrieval by Deep-Weighted Hashing.

IEEE Trans. Multim., 2018

CCL: Cross-modal Correlation Learning With Multigrained Fusion by Hierarchical Network.

IEEE Trans. Multim., 2018

Object-Part Attention Model for Fine-Grained Image Classification.

IEEE Trans. Image Process., 2018

An Overview of Cross-Media Retrieval: Concepts, Methodologies, Benchmarks, and Challenges.

IEEE Trans. Circuits Syst. Video Technol., 2018

IEEE Access Special Section Editorial: Recent Advantages of Computer Vision.

Timothy M. Hospedales

IEEE Access, 2018

PKU_ICST at TRECVID 2018: Instance Search Task.

Proceedings of the 2018 TREC Video Retrieval Evaluation, 2018

Multi-attention Guided Activation Propagation in CNNs.

Proceedings of the Pattern Recognition and Computer Vision - First Chinese Conference, 2018

Only Learn One Sample: Fine-Grained Visual Categorization with One Sample Training.

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

StackDRL: Stacked Deep Reinforcement Learning for Fine-grained Visual Categorization.

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Progressive Cross-Media Correlation Learning.

Proceedings of the Image and Graphics Technologies and Applications, 2018

Deep Cross-Media Knowledge Transfer.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Unsupervised Generative Adversarial Cross-Modal Hashing.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Cross-media similarity metric learning with unified deep networks.

Multim. Tools Appl., 2017

Visual-textual Attention Driven Fine-grained Representation Learning.

CoRR, 2017

CCL: Cross-modal Correlation Learning with Multi-grained Fusion by Hierarchical Network.

CoRR, 2017

Object-Part Attention Driven Discriminative Localization for Fine-grained Image Classification.

CoRR, 2017

Fine-graind Image Classification via Combining Vision and Language.

CoRR, 2017

PKU_ICST at TRECVID 2017: Instance Search Task.

Proceedings of the 2017 TREC Video Retrieval Evaluation, 2017

Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN.

Proceedings of the 2017 ACM on Multimedia Conference, 2017

Cross-modal Common Representation Learning by Hybrid Transfer Network.

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Cross-modal deep metric learning with multi-task regularization.

Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Zero-Shot Cross-Media Retrieval with External Knowledge.

Proceedings of the Internet Multimedia Computing and Service, 2017

Attention-Sharing Correlation Learning for Cross-Media Retrieval.

Proceedings of the Image and Graphics - 9th International Conference, 2017

Fine-Grained Image Classification via Combining Vision and Language.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Weakly Supervised Learning of Part Selection Model with Spatial Constraints for Fine-Grained Image Classification.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Semi-Supervised Cross-Media Feature Learning With Unified Patch Graph Regularization.

IEEE Trans. Circuits Syst. Video Technol., 2016

Query-adaptive Image Retrieval by Deep Weighted Hashing.

CoRR, 2016

SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval.

CoRR, 2016

PKU-ICST at TRECVID 2016: Instance Search Task.

Proceedings of the 2016 TREC Video Retrieval Evaluation, 2016

Cross-Media Shared Representation by Hierarchical Learning with Multiple Deep Networks.

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Cross-Media Retrieval by Multimodal Representation Fusion with Deep Networks.

Proceedings of the Digital TV and Wireless Multimedia Communication, 2016

2015

PKU-ICST at TRECVID 2015: Instance Search Task.

Proceedings of the 2015 TREC Video Retrieval Evaluation, 2015

2014

PKU-ICST at TRECVID 2014: Instance Search Task.

Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

2013

PKU_ICST at TRECVID2013 : Instance Search Task.

Proceedings of the 2013 TREC Video Retrieval Evaluation, 2013

2012

PKU-ICST @TRECVID2012: Known-item Search Task.

Proceedings of the 2012 TREC Video Retrieval Evaluation, 2012
