Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

CogLM: Tracking Cognitive Development of Large Language Models.

Xinglin Wang

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Single Trajectory Distillation for Accelerating Image and Video Style Transfer.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

NoteLLM-2: Multimodal Large Representation Models for Recommendation.

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective.

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

SNS-Bench: Defining, Building, and Assessing Capabilities of Large Language Models in Social Networking Services.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

UniCBE: An Uniformity-driven Comparing Based Evaluation Framework with Unified Multi-Objective Optimization.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

A Sanity Check for AI-generated Image Detection.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

DynaPrompt: Dynamic Test-Time Prompt Tuning.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Object-Centric Video Question Answering with Visual Grounding and Referring.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

DynamicFace: High-Quality and Consistent Face Swapping for Image and Video Using Composable 3D Facial Priors.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world Scenarios.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Think-Search-Patch: A Retrieval-Augmented Reasoning Framework for Repository-Level Code Repair.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

InsBank: Evolving Instruction Subset for Ongoing Alignment.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

EcoLANG: Efficient and Effective Agent Communication Language Induction for Social Simulation.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Speculative Decoding for Multi-Sample Inference.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

DecEx-RAG: Boosting Agentic Retrieval-Augmented Generation with Decision and Execution Optimization via Process Supervision.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

SelfRACG: Enabling LLMs to Self-Express and Retrieve for Code Generation.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

ZigZagKV: Dynamic KV Cache Compression for Long-context Modeling based on Layer Uncertainty.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

Towards the Law of Capacity Gap in Distilling Language Models.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Beyond One-Size-Fits-All: Tailored Benchmarks for Efficient Evaluation.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MarkerGen.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

iPET: An Interactive Emotional Companion Dialogue System with LLM-Powered Virtual Pet World Simulation.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), 2025

Revisiting Self-Consistency from Dynamic Distributional Alignment Perspective on Answer Aggregation.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

OV-VIS: Open-Vocabulary Video Instance Segmentation.

Int. J. Comput. Vis., November, 2024

OV-DAR: Open-Vocabulary Object Detection and Attributes Recognition.

Int. J. Comput. Vis., November, 2024

TOMGPT: Reliable Text-Only Training Approach for Cost-Effective Multi-modal Large Language Model.

ACM Trans. Knowl. Discov. Data, August, 2024

PiClick: Picking the desired mask from multiple candidates in click-based interactive segmentation.

Neurocomputing, 2024

LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant.

CoRR, 2024

ScalingNote: Scaling up Retrievers with Large Language Models for Real-World Dense Retrieval.

CoRR, 2024

GPRec: Bi-level User Modeling for Deep Recommenders.

CoRR, 2024

Benchmarking Large Language Models for Conversational Question Answering in Multi-instructional Documents.

CoRR, 2024

P4Q: Learning to Prompt for Quantization in Visual-language Models.

CoRR, 2024

Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance.

CoRR, 2024

Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning.

CoRR, 2024

Mining Open Semantics from CLIP: A Relation Transition Perspective for Few-Shot Learning.

CoRR, 2024

From a Social Cognitive Perspective: Context-aware Visual Social Relationship Recognition.

CoRR, 2024

NoteLLM-2: Multimodal Large Representation Models for Recommendation.

CoRR, 2024

From Image to Video, what do we need in multimodal LLMs?

CoRR, 2024

Agent Group Chat: An Interactive Group Chat Simulacra For Better Eliciting Collective Emergent Behavior.

CoRR, 2024

StableGarment: Garment-Centric Generation via Stable Diffusion.

CoRR, 2024

Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model.

CoRR, 2024

NoteLLM: A Retrievable Large Language Model for Note Recommendation.

CoRR, 2024

NoteLLM: A Retrievable Large Language Model for Note Recommendation.

Proceedings of the Companion Proceedings of the ACM on Web Conference 2024, 2024

Vript: A Video Is Worth Thousands of Words.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Instruction Embedding: Latent Representations of Instructions Towards Task Identification.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Small-loss Adaptive Regret for Online Convex Optimization.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Efficient Stochastic Approximation of Minimax Excess Risk Optimization.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Knowledge-Enhanced Multi-perspective Incongruity Perception Network for Multimodal Sarcasm Detection.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Caseg: Clip-Based Action Segmentation With Learnable Text Prompt.

Proceedings of the IEEE International Conference on Image Processing, 2024

Bi-Level User Modeling for Deep Recommenders.

Proceedings of the IEEE International Conference on Data Mining, 2024

Focused Large Language Models are Stable Many-Shot Learners.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

VISA: Reasoning Video Object Segmentation via Large Language Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ZONE: Zero-Shot Instruction-Guided Local Editing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

BatchEval: Towards Human-like Text Evaluation.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Poor-Supervised Evaluation for SuperLLM via Mutual Consistency.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Controllable Mind Visual Diffusion Model.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Optimizing traffic efficiency via a reinforcement learning approach based on time allocation.

Int. J. Mach. Learn. Cybern., October, 2023

ZONE: Zero-Shot Instruction-Guided Local Editing.

CoRR, 2023

PiClick: Picking the desired mask in click-based interactive segmentation.

CoRR, 2023

MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation.

CoRR, 2023

Towards Open-Vocabulary Video Instance Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2INER: Instructive and In-Context Learning on Few-Shot Named Entity Recognition.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

OvarNet: Towards Open-Vocabulary Object Attribute Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Horizontal-to-Vertical Video Conversion.

IEEE Trans. Multim., 2022

End-to-End Temporal Action Detection With Transformer.

IEEE Trans. Image Process., 2022

Occluded Video Instance Segmentation: A Benchmark.

Int. J. Comput. Vis., 2022

Non-stationary Dueling Bandits for Online Learning to Rank.

Proceedings of the Web and Big Data - 6th International Joint Conference, 2022

2021

Socializing the Videos: A Multimodal Approach for Social Relation Recognition.

ACM Trans. Multim. Comput. Commun. Appl., 2021

Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph.

CoRR, 2021

End-to-end Temporal Action Detection with Transformer.

CoRR, 2021

Occluded Video Instance Segmentation.

CoRR, 2021

Pyramid Self-attention for Semantic Segmentation.

Proceedings of the Pattern Recognition and Computer Vision - 4th Chinese Conference, 2021

Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge.

Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Linking the Characters: Video-oriented Social Graph Generation via Hierarchical-cumulative GCN.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Decoupled IoU Regression for Object Detection.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Deep Interactive Video Inpainting: An Invisibility Cloak for Harry Potter.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Salient Object Ranking with Position-Preserved Attention.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

SwiftNet: Real-Time Video Object Segmentation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Multi-Shot Temporal Event Localization: A Benchmark.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Stochastic Bandits with Graph Feedback in Non-Stationary Environments.

Shiyin Lu

Yao Hu

Lijun Zhang

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Spatial-temporal Causal Inference for Partial Image-to-video Adaptation.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

LAMP: Label Augmented Multimodal Pretraining.

CoRR, 2020

Spherical Knowledge Distillation.

CoRR, 2020

Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge.

CoRR, 2020

Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework.

Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, 2020

Feature-Induced Manifold Disambiguation for Multi-View Partial Multi-label Learning.

Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Adapting to Smoothness: A More Universal Algorithm for Online Convex Optimization.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Deep Time-Stream Framework for Click-through Rate Prediction by Tracking Interest Evolution.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Uncertainty Aware Graph Gaussian Process for Semi-Supervised Learning.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Multi-View Partial Multi-Label Learning with Graph-Based Disambiguation.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Correlation Maximized Structural Similarity Loss for Semantic Segmentation.

CoRR, 2019

Multi-View Multi-Label Learning with View-Specific Information Extraction.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Multi-Objective Generalized Linear Bandits.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Multi-View Active Learning for Video Recommendation.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Optimal Algorithms for Lipschitz Bandits with Heavy-tailed Rewards.

Proceedings of the 36th International Conference on Machine Learning, 2019

2016

Atom Decomposition with Adaptive Basis Selection Strategy for Matrix Completion.

ACM Trans. Multim. Comput. Commun. Appl., 2016

Online robust principal component analysis via truncated nuclear norm regularization.

Neurocomputing, 2016

Atom Decomposition Based Subgradient Descent for matrix classification.

Neurocomputing, 2016

2015

Large scale multi-class classification with truncated nuclear norm regularization.

Neurocomputing, 2015

Event Recovery by Faster Truncated Nuclear Norm Minimization.

Proceedings of the Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques, 2015

2014

Fast and Accurate Hashing Via Iterative Nearest Neighbors Expansion.

IEEE Trans. Cybern., 2014

Matrix Completion for Cross-view Pairwise Constraint Propagation.

Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Iterative Multi-View Hashing for Cross Media Indexing.

Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Sparse Learning for Stochastic Composite Optimization.

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013

Fast and Accurate Matrix Completion via Truncated Nuclear Norm Regularization.

IEEE Trans. Pattern Anal. Mach. Intell., 2013

Salient Object Detection via Fast Iterative Truncated Nuclear Norm Recovery.

Proceedings of the Intelligence Science and Big Data Engineering, 2013

A Unified Approximate Nearest Neighbor Search Scheme by Combining Data Structure and Hashing.

Proceedings of the IJCAI 2013, 2013

Active Learning Based on Local Representation.

Proceedings of the IJCAI 2013, 2013

Complementary Projection Hashing.

Proceedings of the IEEE International Conference on Computer Vision, 2013

2012

Accelerated singular value thresholding for matrix completion.

Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012

Matrix completion by Truncated Nuclear Norm Regularization.

Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Yao Hu
