Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

NoteLLM-2: Multimodal Large Representation Models for Recommendation.

Chao Zhang

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective.

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

SNS-Bench: Defining, Building, and Assessing Capabilities of Large Language Models in Social Networking Services.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

A Sanity Check for AI-generated Image Detection.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

DynaPrompt: Dynamic Test-Time Prompt Tuning.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

ZigZagKV: Dynamic KV Cache Compression for Long-context Modeling based on Layer Uncertainty.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

Towards the Law of Capacity Gap in Distilling Language Models.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

OV-VIS: Open-Vocabulary Video Instance Segmentation.

Int. J. Comput. Vis., November, 2024

OV-DAR: Open-Vocabulary Object Detection and Attributes Recognition.

Int. J. Comput. Vis., November, 2024

TOMGPT: Reliable Text-Only Training Approach for Cost-Effective Multi-modal Large Language Model.

ACM Trans. Knowl. Discov. Data, August, 2024

PiClick: Picking the desired mask from multiple candidates in click-based interactive segmentation.

Neurocomputing, 2024

Single Trajectory Distillation for Accelerating Image and Video Style Transfer.

CoRR, 2024

LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant.

CoRR, 2024

ScalingNote: Scaling up Retrievers with Large Language Models for Real-World Dense Retrieval.

CoRR, 2024

GPRec: Bi-level User Modeling for Deep Recommenders.

CoRR, 2024

Benchmarking Large Language Models for Conversational Question Answering in Multi-instructional Documents.

CoRR, 2024

P4Q: Learning to Prompt for Quantization in Visual-language Models.

CoRR, 2024

Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance.

CoRR, 2024

Mining Open Semantics from CLIP: A Relation Transition Perspective for Few-Shot Learning.

CoRR, 2024

From a Social Cognitive Perspective: Context-aware Visual Social Relationship Recognition.

CoRR, 2024

NoteLLM-2: Multimodal Large Representation Models for Recommendation.

CoRR, 2024

From Image to Video, what do we need in multimodal LLMs?

CoRR, 2024

Agent Group Chat: An Interactive Group Chat Simulacra For Better Eliciting Collective Emergent Behavior.

CoRR, 2024

StableGarment: Garment-Centric Generation via Stable Diffusion.

CoRR, 2024

Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model.

CoRR, 2024

NoteLLM: A Retrievable Large Language Model for Note Recommendation.

CoRR, 2024

NoteLLM: A Retrievable Large Language Model for Note Recommendation.

Companion Proceedings of the ACM on Web Conference 2024, 2024

Vript: A Video Is Worth Thousands of Words.

Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation.

Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Small-loss Adaptive Regret for Online Convex Optimization.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Efficient Stochastic Approximation of Minimax Excess Risk Optimization.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Knowledge-Enhanced Multi-perspective Incongruity Perception Network for Multimodal Sarcasm Detection.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Caseg: Clip-Based Action Segmentation With Learnable Text Prompt.

Proceedings of the IEEE International Conference on Image Processing, 2024

Bi-Level User Modeling for Deep Recommenders.

Proceedings of the IEEE International Conference on Data Mining, 2024

VISA: Reasoning Video Object Segmentation via Large Language Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ZONE: Zero-Shot Instruction-Guided Local Editing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Controllable Mind Visual Diffusion Model.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Optimizing traffic efficiency via a reinforcement learning approach based on time allocation.

Int. J. Mach. Learn. Cybern., October, 2023

ZONE: Zero-Shot Instruction-Guided Local Editing.

CoRR, 2023

PiClick: Picking the desired mask in click-based interactive segmentation.

CoRR, 2023

MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation.

CoRR, 2023

Towards Open-Vocabulary Video Instance Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2INER: Instructive and In-Context Learning on Few-Shot Named Entity Recognition.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

OvarNet: Towards Open-Vocabulary Object Attribute Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Horizontal-to-Vertical Video Conversion.

IEEE Trans. Multim., 2022

End-to-End Temporal Action Detection With Transformer.

IEEE Trans. Image Process., 2022

Occluded Video Instance Segmentation: A Benchmark.

Int. J. Comput. Vis., 2022

Non-stationary Dueling Bandits for Online Learning to Rank.

Proceedings of the Web and Big Data - 6th International Joint Conference, 2022

2021

Socializing the Videos: A Multimodal Approach for Social Relation Recognition.

ACM Trans. Multim. Comput. Commun. Appl., 2021

End-to-end Temporal Action Detection with Transformer.

CoRR, 2021

Occluded Video Instance Segmentation.

CoRR, 2021

Pyramid Self-attention for Semantic Segmentation.

Proceedings of the Pattern Recognition and Computer Vision - 4th Chinese Conference, 2021

Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge.

Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Linking the Characters: Video-oriented Social Graph Generation via Hierarchical-cumulative GCN.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Decoupled IoU Regression for Object Detection.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Deep Interactive Video Inpainting: An Invisibility Cloak for Harry Potter.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Salient Object Ranking with Position-Preserved Attention.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

SwiftNet: Real-Time Video Object Segmentation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Multi-Shot Temporal Event Localization: A Benchmark.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Stochastic Bandits with Graph Feedback in Non-Stationary Environments.

Shiyin Lu

Yao Hu

Lijun Zhang

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

LAMP: Label Augmented Multimodal Pretraining.

CoRR, 2020

Spherical Knowledge Distillation.

CoRR, 2020

Adapting to Smoothness: A More Universal Algorithm for Online Convex Optimization.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Correlation Maximized Structural Similarity Loss for Semantic Segmentation.

CoRR, 2019

Multi-Objective Generalized Linear Bandits.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Optimal Algorithms for Lipschitz Bandits with Heavy-tailed Rewards.

Proceedings of the 36th International Conference on Machine Learning, 2019

2016

Atom Decomposition with Adaptive Basis Selection Strategy for Matrix Completion.

ACM Trans. Multim. Comput. Commun. Appl., 2016

Online robust principal component analysis via truncated nuclear norm regularization.

Neurocomputing, 2016

Atom Decomposition Based Subgradient Descent for matrix classification.

Neurocomputing, 2016

2015

Large scale multi-class classification with truncated nuclear norm regularization.

Neurocomputing, 2015

Event Recovery by Faster Truncated Nuclear Norm Minimization.

Proceedings of the Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques, 2015

2014

Fast and Accurate Hashing Via Iterative Nearest Neighbors Expansion.

IEEE Trans. Cybern., 2014

Matrix Completion for Cross-view Pairwise Constraint Propagation.

Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Iterative Multi-View Hashing for Cross Media Indexing.

Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Sparse Learning for Stochastic Composite Optimization.

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013

Fast and Accurate Matrix Completion via Truncated Nuclear Norm Regularization.

IEEE Trans. Pattern Anal. Mach. Intell., 2013

Salient Object Detection via Fast Iterative Truncated Nuclear Norm Recovery.

Proceedings of the Intelligence Science and Big Data Engineering, 2013

A Unified Approximate Nearest Neighbor Search Scheme by Combining Data Structure and Hashing.

Proceedings of the IJCAI 2013, 2013

Active Learning Based on Local Representation.

Proceedings of the IJCAI 2013, 2013

Complementary Projection Hashing.

Proceedings of the IEEE International Conference on Computer Vision, 2013

2012

Accelerated singular value thresholding for matrix completion.

Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012

Matrix completion by Truncated Nuclear Norm Regularization.

Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Yao Hu
