Yuandong Tian

ORCID: 0000-0003-4202-4847

Affiliations:
  • Carnegie Mellon University


According to our database, Yuandong Tian authored at least 137 papers between 2006 and 2024.

Bibliography

2024
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection.
CoRR, 2024

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases.
CoRR, 2024

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping.
CoRR, 2024

Diffusion World Model.
CoRR, 2024

TravelPlanner: A Benchmark for Real-World Planning with Language Agents.
CoRR, 2024

Image Classifier Based Generative Method for Planar Antenna Design.
CoRR, 2024

2023
H-GAP: Humanoid Control with a Generalist Planner.
CoRR, 2023

End-to-end Story Plot Generator.
CoRR, 2023

Learning Personalized Story Evaluation.
CoRR, 2023

GenCO: Generating Diverse Solutions to Design Problems with Combinatorial Nature.
CoRR, 2023

JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention.
CoRR, 2023

Efficient Streaming Language Models with Attention Sinks.
CoRR, 2023

RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment.
CoRR, 2023

Extending Context Window of Large Language Models via Positional Interpolation.
CoRR, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
CoRR, 2023

Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models.
CoRR, 2023

A Cookbook of Self-Supervised Learning.
CoRR, 2023

Modeling Scattering Coefficients using Self-Attentive Complex Polynomials with Image-based Representation.
CoRR, 2023

Klotski: Efficient and Safe Network Migration of Large Production Datacenters.
Proceedings of the ACM SIGCOMM 2023 Conference, 2023

DyFormer: A Scalable Dynamic Graph Transformer with Provable Benefits on Generalization Ability.
Proceedings of the 2023 SIAM International Conference on Data Mining, 2023

Landscape Surrogate: Learning Decision Losses for Mathematical Optimization Under Partial Information.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time.
Proceedings of the International Conference on Machine Learning, 2023

Learning Compiler Pass Orders using Coreset and Normalized Value Prediction.
Proceedings of the International Conference on Machine Learning, 2023

Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning.
Proceedings of the International Conference on Machine Learning, 2023

SurCo: Learning Linear SURrogates for COmbinatorial Nonlinear Optimization Problems.
Proceedings of the International Conference on Machine Learning, 2023

Understanding the Role of Nonlinearity in Training Dynamics of Contrastive Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Efficient Planning in a Compact Latent Action Space.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

MACTA: A Multi-agent Reinforcement Learning Approach for Cache Timing Attacks and Detection.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

AutoCAT: Reinforcement Learning for Automated Exploration of Cache-Timing Attacks.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

Local Branching Relaxation Heuristics for Integer Linear Programs.
Proceedings of the Integration of Constraint Programming, Artificial Intelligence, and Operations Research, 2023

DOC: Improving Long Story Coherence With Detailed Outline Control.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Sample-Efficient Neural Architecture Search by Learning Actions for Monte Carlo Tree Search.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

EurNet: Efficient Multi-Range Relational Modeling of Spatial Multi-Relational Data.
CoRR, 2022

AutoCAT: Reinforcement Learning for Automated Exploration of Cache Timing-Channel Attacks.
CoRR, 2022

Understanding Curriculum Learning in Policy Optimization for Solving Combinatorial Optimization Problems.
CoRR, 2022

Deep Contrastive Learning is Provably (almost) Principal Component Analysis.
CoRR, 2022

DreamShard: Generalizable Embedding Table Placement for Recommender Systems.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Understanding Deep Contrastive Learning via Coordinate-wise Optimization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

AutoShard: Automated Embedding Table Sharding for Recommender Systems.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Denoised MDPs: Learning World Models Better Than the World Itself.
Proceedings of the International Conference on Machine Learning, 2022

Multi-objective Optimization by Learning Space Partition.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Understanding Dimensional Collapse in Contrastive Self-supervised Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Re3: Generating Longer Stories With Recursive Reprompting and Revision.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

On the Importance of Asymmetry for Siamese Representation Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022

Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Learning Bounded Context-Free-Grammar via LSTM and the Transformer: Difference and the Explanations.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Q-gym: An Equality Saturation Framework for DNN Inference Exploiting Weight Repetition.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021
Planning in Learned Latent Action Spaces for Generalizable Legged Locomotion.
IEEE Robotics Autom. Lett., 2021

Dynamic Graph Representation Learning via Graph Transformer Networks.
CoRR, 2021

Towards Demystifying Representation Learning with Non-contrastive Self-supervision.
CoRR, 2021

Multi-objective Optimization by Learning Space Partitions.
CoRR, 2021

Provably Efficient Policy Gradient Methods for Two-Player Zero-Sum Markov Games.
CoRR, 2021

Network planning with deep reinforcement learning.
Proceedings of the ACM SIGCOMM 2021 Conference, Virtual Event, USA, August 23-27, 2021

NovelD: A Simple yet Effective Exploration Criterion.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

MADE: Exploration via Maximizing Deviation from Explored Regions.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning Space Partitions for Path Planning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Few-Shot Neural Architecture Search.
Proceedings of the 38th International Conference on Machine Learning, 2021

Understanding self-supervised learning dynamics without contrastive pairs.
Proceedings of the 38th International Conference on Machine Learning, 2021

Learn-to-Share: A Hardware-friendly Transfer Learning Framework Exploiting Computation and Parameter Sharing.
Proceedings of the 38th International Conference on Machine Learning, 2021

FP-NAS: Fast Probabilistic Neural Architecture Search.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

FBNetV3: Joint Architecture-Recipe Search Using Predictor Pretraining.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Understanding Robustness in Teacher-Student Setting: A New Perspective.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
Enhancing Model Parallelism in Neural Architecture Search for Multidevice System.
IEEE Micro, 2020

BeBold: Exploration Beyond the Boundary of Explored Regions.
CoRR, 2020

Multi-Agent Collaboration via Reward Attribution Decomposition.
CoRR, 2020

Understanding Self-supervised Learning with Dual Deep Networks.
CoRR, 2020

Real-world Video Adaptation with Reinforcement Learning.
CoRR, 2020

Joint Policy Search for Multi-agent Collaboration with Imperfect Information.
CoRR, 2020

FBNetV3: Joint Architecture-Recipe Search using Neural Acquisition Function.
CoRR, 2020

Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Joint Policy Search for Multi-agent Collaboration with Imperfect Information.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Towards Automated Neural Interaction Discovery for Click-Through Rate Prediction.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Student Specialization in Deep Rectified Networks With Finite Width and Input Dimension.
Proceedings of the 37th International Conference on Machine Learning, 2020

Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP.
Proceedings of the 8th International Conference on Learning Representations, 2020

Deep Symbolic Superoptimization Without Human Knowledge.
Proceedings of the 8th International Conference on Learning Representations, 2020

FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Neural Architecture Search Using Deep Neural Networks and Monte Carlo Tree Search.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Over-parameterization as a Catalyst for Better Generalization of Deep ReLU network.
CoRR, 2019

A Neural-based Program Decompiler.
CoRR, 2019

Sample-Efficient Neural Architecture Search by Learning Action Space.
CoRR, 2019

Luck Matters: Understanding Training Dynamics of Deep ReLU Networks.
CoRR, 2019

AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search.
CoRR, 2019

One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Hierarchical Decision Making by Generating and Following Natural Language Instructions.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Coda: An End-to-End Neural Program Decompiler.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Learning to Perform Local Rewriting for Combinatorial Optimization.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

ELF OpenGo: an analysis and open reimplementation of AlphaZero.
Proceedings of the 36th International Conference on Machine Learning, 2019

M^3RL: Mind-aware Multi-agent Management Reinforcement Learning.
Proceedings of the 7th International Conference on Learning Representations, 2019

Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees.
Proceedings of the 7th International Conference on Learning Representations, 2019

Bayesian Relational Memory for Semantic Visual Navigation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Guest Editorial Special Issue on Deep/Reinforcement Learning and Games.
IEEE Trans. Games, 2018

3D Interpreter Networks for Viewer-Centered Wireframe Modeling.
Int. J. Comput. Vis., 2018

Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search.
CoRR, 2018

Learning to Progressively Plan.
CoRR, 2018

Learning and Planning with a Semantic Model.
CoRR, 2018

A theoretical framework for deep locally connected ReLU network.
CoRR, 2018

Algorithmic Framework for Model-based Reinforcement Learning with Theoretical Guarantees.
CoRR, 2018

Channel-Recurrent Autoencoding for Image Modeling.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima.
Proceedings of the 35th International Conference on Machine Learning, 2018

Building Generalizable Agents with a Realistic and Rich 3D Environment.
Proceedings of the 6th International Conference on Learning Representations, 2018

When is a Convolutional Filter Easy to Learn?
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
CoDraw: Visual Dialog for Collaborative Drawing.
CoRR, 2017

Channel-Recurrent Variational Autoencoders.
CoRR, 2017

ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis.
Proceedings of the 34th International Conference on Machine Learning, 2017

Training Agent for First-Person Shooter Game with Actor-Critic Curriculum Learning.
Proceedings of the 5th International Conference on Learning Representations, 2017

Symmetry-Breaking Convergence Analysis of Certain Two-layered Neural Networks with ReLU nonlinearity.
Proceedings of the 5th International Conference on Learning Representations, 2017

Semantic Amodal Segmentation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Better Computer Go Player with Neural Network and Long-term Prediction.
Proceedings of the 4th International Conference on Learning Representations, 2016

Single Image 3D Interpreter Network.
Proceedings of the Computer Vision - ECCV 2016, 2016

2015
Theory and Practice of Hierarchical Data-driven Descent for Optimal Deformation Estimation.
Int. J. Comput. Vis., 2015

Simple Baseline for Visual Question Answering.
CoRR, 2015

Scale-invariant learning and convolutional networks.
CoRR, 2015

2013
Theory and Practice of Globally Optimal Deformation Estimation.
PhD thesis, 2013

Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Integrating Perceptual Learning with External World Knowledge in a Simulated Student.
Proceedings of the Artificial Intelligence in Education - 16th International Conference, 2013

2012
Globally Optimal Estimation of Nonrigid Image Distortion.
Int. J. Comput. Vis., 2012

A Combined Theory of Defocused Illumination and Global Light Transport.
Int. J. Comput. Vis., 2012

Learning from crowds in the presence of schools of thought.
Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012

Exploring the Spatial Hierarchy of Mixture Models for Human Pose Estimation.
Proceedings of the Computer Vision - ECCV 2012, 2012

Depth from optical turbulence.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
Rectification and 3D reconstruction of curved document images.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Local isomorphism to solve the pre-image problem in kernel methods.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010
A globally optimal data-driven approach for image distortion estimation.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009
Seeing through water: Image restoration using model-based tracking.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

(De) focusing on global light transport for active scene recovery.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

2008
Easytoon: an easy and quick tool to personalize a cartoon storyboard using family photo album.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

2007
A Face Annotation Framework with Partial Clustering and Interactive Labeling.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

EasyAlbum: an interactive photo annotation system based on face clustering and re-ranking.
Proceedings of the 2007 Conference on Human Factors in Computing Systems, 2007

2006
Joint Boosting Feature Selection for Robust Face Recognition.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006

