Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward.

Ruohong Zhang

Alexander G. Hauptmann

Yonatan Bisk

Yiming Yang

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Longhorn: State Space Models are Amortized Online Learners.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Structured Policy Optimization: Enhance Large Vision-Language Model via Self-Referenced Dialogue.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Text2Data: Low-Resource Data Generation with Textual Control.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding.

CoRR, 2024

xLAM: A Family of Large Action Models to Empower AI Agent Systems.

CoRR, 2024

Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents.

CoRR, 2024

APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets.

CoRR, 2024

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning.

Tulika Manoj Awalgaonkar

CoRR, 2024

APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Datasets.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Demand Prediction by Incorporating Internet-of-Things Data: A Case of Automobile Repair and Maintenance Service.

Jieyi Zhang

Cenying Yang

Yihao Feng

Proceedings of the 57th Hawaii International Conference on System Sciences, 2024

xGen-VideoSyn-1: High-Fidelity Text-to-Video Synthesis with Compressed Representations.

Can Qin

Congying Xia

Krithika Ramakrishnan

Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

HIVE: Harnessing Human Feedback for Instructional Visual Editing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents.

CoRR, 2023

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization.

CoRR, 2023

REX: Rapid Exploration and eXploitation for AI Agents.

CoRR, 2023

Preference-grounded Token-level Guidance for Language Model Fine-tuning.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

FAMO: Fast Adaptive Multitask Optimization.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Metric Residual Network for Sample Efficient Goal-Conditioned Reinforcement Learning.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Metric Residual Networks for Sample Efficient Goal-conditioned Reinforcement Learning.

CoRR, 2022

A Regularized Implicit Policy for Offline Reinforcement Learning.

CoRR, 2022

Operator Deep Q-Learning: Zero-Shot Reward Transferring in Reinforcement Learning.

Ziyang Tang

Yihao Feng

Qiang Liu

CoRR, 2022

A Unified Framework for Alternating Offline Model Training and Policy Learning.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning.

Proceedings of the International Conference on Machine Learning, 2022

2021

Incremental Few-shot Text Classification with Multi-round New Classes: Formulation, Dataset and System.

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds.

Proceedings of the 9th International Conference on Learning Representations, 2021

Unsupervised Out-of-Domain Detection via Pre-trained Transformers.

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020

Off-Policy Interval Estimation with Lipschitz Value Iteration.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Accountable Off-Policy Evaluation With Kernel Bellman Statistics.
