Yalong Bai

ORCID: 0000-0002-8416-9027

According to our database, Yalong Bai authored at least 50 papers between 2013 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

MRT: Masked Region Transformer for Layered Image Generation and Editing at Scale.

CoRR, May, 2026

Pareto-Guided Optimal Transport for Multi-Reward Alignment.

CoRR, May, 2026

Premier: Personalized Preference Modulation with Learnable User Embedding in Text-to-Image Generation.

CoRR, March, 2026

2025

PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation.

CoRR, December, 2025

PreferThinker: Reasoning-based Personalized Image Preference Assessment.

CoRR, November, 2025

Interactive Conversational Head Generation.

IEEE Trans. Pattern Anal. Mach. Intell., August, 2025

Learning User Preferences for Image Generation Model.

CoRR, August, 2025

StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models.

ACM Trans. Multim. Comput. Commun. Appl., May, 2025

Teaching Masked Autoencoder With Strong Augmentations.

IEEE Trans. Neural Networks Learn. Syst., May, 2025

V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation.

CoRR, March, 2025

Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Enhancing Reward Models for High-Quality Image Generation: Beyond Text-Image Alignment.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

2024

Visualizing and Understanding Patch Interactions in Vision Transformer.

IEEE Trans. Neural Networks Learn. Syst., October, 2024

STAR: Scale-wise Text-to-image generation via Auto-Regressive representations.

CoRR, 2024

StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models.

Yalong Bai

Mohan Zhou

Qing Yang

CoRR, 2024

CAMEL: CAusal Motion Enhancement Tailored for Lifting Text-Driven Video Editing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Dynamic Prompt Optimizing for Text-to-Image Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Augmentation Pathways Network for Visual Recognition.

IEEE Trans. Pattern Anal. Mach. Intell., August, 2023

Boosting Generic Visual-Linguistic Representation With Dynamic Contexts.

IEEE Trans. Multim., 2023

Deep Equilibrium Multimodal Fusion.

CoRR, 2023

Visual-Aware Text-to-Speech.

CoRR, 2023

Learning and Evaluating Human Preferences for Conversational Head Generation.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Visual-Aware Text-to-Speech*.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Freeform Body Motion Generation from Speech.

CoRR, 2022

Responsive Listening Head Generation: A Benchmark Dataset and Baseline.

Proceedings of the Computer Vision - ECCV 2022, 2022

Directional Self-supervised Learning for Heavy Image Augmentations.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Responsive Listening Head Generation: A Benchmark Dataset and Baseline.

CoRR, 2021

Directional Self-supervised Learning for Risky Image Augmentations.

CoRR, 2021

Augmentation Pathways Network for Visual Recognition.

CoRR, 2021

Flat and Shallow: Understanding Fake Image Detection Models by Architecture Profiling.

Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

Exploiting Relationship for Complex-scene Image Generation.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Products-10K: A Large-scale Product Recognition Dataset.

CoRR, 2020

Look-Into-Object: Self-Supervised Structure Modeling for Object Recognition.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Rethinking Visual Relationships for High-level Image Understanding.

CoRR, 2019

VrR-VG: Refocusing Visually-Relevant Relationships.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Destruction and Construction Learning for Fine-Grained Image Recognition.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Automatic Data Augmentation from Massive Web Images for Deep Visual Recognition.

ACM Trans. Multim. Comput. Commun. Appl., 2018

Deep Attention Neural Tensor Network for Visual Question Answering.

Proceedings of the Computer Vision - ECCV 2018, 2018

2017

Automatic Dataset Augmentation.

CoRR, 2017

Convolutional neural networks for posed and spontaneous expression recognition.

Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

2016

Improve dog recognition by mining more information from both click-through logs and pre-trained models.

Proceedings of the 2016 IEEE International Conference on Multimedia & Expo Workshops, 2016

2015

Learning Cross Space Mapping via DNN Using Large Scale Click-Through Logs.

IEEE Trans. Multim., 2015

Automatic Image Dataset Construction from Click-through Logs Using Deep Neural Network.

Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

2014

Visualizing and Comparing Convolutional Neural Networks.

CoRR, 2014

Learning High-level Image Representation for Image Retrieval via Multi-Task DNN using Clickthrough Data.

Proceedings of the 2nd International Conference on Learning Representations, 2014

Bag-of-Words Based Deep Neural Network for Image Retrieval.

Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

RC-NET: A General Framework for Incorporating Knowledge into Word Representations.

Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, 2014

DNN Flow: DNN Feature Pyramid based Image Matching.

Proceedings of the British Machine Vision Conference, 2014

2013

Learning Domain Differences Automatically for Dependency Parsing Adaptation.

Mo Yu

Tiejun Zhao

Yalong Bai

Proceedings of the IJCAI 2013, 2013

Cross-lingual Projections between Languages from Different Families.

Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013
