Aviral Kumar

According to our database¹, Aviral Kumar authored at least 65 papers between 2017 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Timeline

Legend:

Book

In proceedings

Article

PhD thesis

Dataset

Other

Links

On csauthors.net:

Bibliography

2024

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters.

[BibT_eX]

[DOI]

CoRR, 2024

Recursive Introspection: Teaching Language Model Agents How to Self-Improve.

[BibT_eX]

[DOI]

CoRR, 2024

RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold.

[BibT_eX]

[DOI]

CoRR, 2024

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning.

[BibT_eX]

[DOI]

CoRR, 2024

Is Value Learning Really the Main Bottleneck in Offline RL?

[BibT_eX]

[DOI]

CoRR, 2024

Unfamiliar Finetuning Examples Control How Language Models Hallucinate.

[BibT_eX]

[DOI]

CoRR, 2024

Vision-Language Models Provide Promptable Representations for Reinforcement Learning.

[BibT_eX]

[DOI]

CoRR, 2024

Robotic Offline RL from Internet Videos via Value-Function Learning.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL.

[BibT_eX]

[DOI]

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data.

[BibT_eX]

[DOI]

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL.

[BibT_eX]

[DOI]

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Zero-Shot Robotic Manipulation with Pre-Trained Image-Editing Diffusion Models.

[BibT_eX]

[DOI]

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models.

[BibT_eX]

[DOI]

CoRR, 2023

Latent Conservative Objective Models for Data-Driven Crystal Structure Prediction.

[BibT_eX]

[DOI]

CoRR, 2023

Robotic Offline RL from Internet Videos via Value-Function Pre-Training.

[BibT_eX]

[DOI]

CoRR, 2023

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning.

[BibT_eX]

[DOI]

CoRR, 2023

Pre-Training for Robots: Offline RL Enables Learning New Tasks in a Handful of Trials.

[BibT_eX]

[DOI]

Proceedings of the Robotics: Science and Systems XIX, Daegu, 2023

ReDS: Offline RL With Heteroskedastic Datasets via Support Constraints.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Efficient Deep Reinforcement Learning Requires Regulating Overfitting.

[BibT_eX]

[DOI]

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes.

[BibT_eX]

[DOI]

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Confidence-Conditioned Value Functions for Offline Reinforcement Learning.

[BibT_eX]

[DOI]

Joey Hong

Aviral Kumar

Sergey Levine

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning.

[BibT_eX]

[DOI]

Proceedings of the Conference on Robot Learning, 2023

Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions.

[BibT_eX]

[DOI]

Proceedings of the Conference on Robot Learning, 2023

2022

Dual Generator Offline Reinforcement Learning.

[BibT_eX]

[DOI]

CoRR, 2022

Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints.

[BibT_eX]

[DOI]

CoRR, 2022

Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials.

[BibT_eX]

[DOI]

CoRR, 2022

When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?

[BibT_eX]

[DOI]

CoRR, 2022

Off-Policy Actor-critic for Recommender Systems.

[BibT_eX]

[DOI]

Proceedings of the RecSys '22: Sixteenth ACM Conference on Recommender Systems, Seattle, WA, USA, September 18, 2022

DASCO: Dual-Generator Adversarial Support Constrained Offline Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Data-Driven Offline Decision-Making via Invariant Representation Learning.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

How to Leverage Unlabeled Data in Offline Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Machine Learning, 2022

Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Machine Learning, 2022

Data-Driven Offline Optimization for Architecting Hardware Accelerators.

[BibT_eX]

[DOI]

Proceedings of the Tenth International Conference on Learning Representations, 2022

Should I Run Offline Reinforcement Learning or Behavioral Cloning?

[BibT_eX]

[DOI]

Proceedings of the Tenth International Conference on Learning Representations, 2022

DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization.

[BibT_eX]

[DOI]

Proceedings of the Tenth International Conference on Learning Representations, 2022

Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the Conference on Robot Learning, 2022

2021

COMBO: Conservative Offline Model-Based Policy Optimization.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Conservative Data Sharing for Multi-Task Offline Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Conservative Objective Models for Effective Offline Model-Based Optimization.

[BibT_eX]

[DOI]

Proceedings of the 38th International Conference on Machine Learning, 2021

Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the 9th International Conference on Learning Representations, 2021

Benchmarks for Deep Off-Policy Evaluation.

[BibT_eX]

[DOI]

Proceedings of the 9th International Conference on Learning Representations, 2021

Conservative Safety Critics for Exploration.

[BibT_eX]

[DOI]

Proceedings of the 9th International Conference on Learning Representations, 2021

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the 9th International Conference on Learning Representations, 2021

A Workflow for Offline Model-Free Robotic Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK., 2021

2020

COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning.

[BibT_eX]

[DOI]

CoRR, 2020

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems.

[BibT_eX]

[DOI]

CoRR, 2020

D4RL: Datasets for Deep Data-Driven Reinforcement Learning.

[BibT_eX]

[DOI]

CoRR, 2020

Conservative Q-Learning for Offline Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Model Inversion Networks for Model-Based Optimization.

[BibT_eX]

[DOI]

Aviral Kumar

Sergey Levine

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction.

[BibT_eX]

[DOI]

Aviral Kumar

Abhishek Gupta

Sergey Levine

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Chaining Behaviors from Data with Model-Free Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the 4th Conference on Robot Learning, 2020

2019

Reward-Conditioned Policies.

[BibT_eX]

[DOI]

Aviral Kumar

Xue Bin Peng

Sergey Levine

CoRR, 2019

Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning.

[BibT_eX]

[DOI]

CoRR, 2019

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction.

[BibT_eX]

[DOI]

CoRR, 2019

Calibration of Encoder Decoder Models for Neural Machine Translation.

[BibT_eX]

[DOI]

Aviral Kumar

Sunita Sarawagi

CoRR, 2019

Graph Normalizing Flows.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Diagnosing Bottlenecks in Deep Q-learning Algorithms.

[BibT_eX]

[DOI]

Proceedings of the 36th International Conference on Machine Learning, 2019

2018

Trainable Calibration Measures For Neural Networks From Kernel Mean Embeddings.

[BibT_eX]

[DOI]

Aviral Kumar

Sunita Sarawagi

Ujjwal Jain

Proceedings of the 35th International Conference on Machine Learning, 2018

2017

Challenges and Tool Implementation of Hybrid Rapidly-Exploring Random Trees.

[BibT_eX]

[DOI]

Proceedings of the Numerical Software Verification - 10th International Workshop, 2017

The Reach-Avoid Problem for Constant-Rate Multi-mode Systems.

[BibT_eX]

[DOI]

Shankara Narayanan Krishna

Proceedings of the Automated Technology for Verification and Analysis, 2017

Aviral Kumar

Timeline

Legend:

Links

On csauthors.net:

Bibliography

Loading...