Denny Zhou

According to our database, Denny Zhou authored at least 71 papers between 2015 and 2024.

Bibliography

2024
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems.
CoRR, 2024

Chain-of-Thought Reasoning Without Prompting.
CoRR, 2024

Transformers Can Achieve Length Generalization But Not Robustly.
CoRR, 2024

Premise Order Matters in Reasoning with Large Language Models.
CoRR, 2024

Self-Discover: Large Language Models Self-Compose Reasoning Structures.
CoRR, 2024

2023
PaLM: Scaling Language Modeling with Pathways.
J. Mach. Learn. Res., 2023

Universal Self-Consistency for Large Language Model Generation.
CoRR, 2023

Instruction-Following Evaluation for Large Language Models.
CoRR, 2023

Large Language Models can Learn Rules.
CoRR, 2023

Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models.
CoRR, 2023

FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation.
CoRR, 2023

Large Language Models Cannot Self-Correct Reasoning Yet.
CoRR, 2023

Large Language Models as Analogical Reasoners.
CoRR, 2023

Large Language Models as Optimizers.
CoRR, 2023

Simple synthetic data reduces sycophancy in large language models.
CoRR, 2023

Large Language Models as Tool Makers.
CoRR, 2023

Training Socially Aligned Language Models in Simulated Human Society.
CoRR, 2023

Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts.
CoRR, 2023

A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity.
CoRR, 2023

Teaching Large Language Models to Self-Debug.
CoRR, 2023

Larger language models do in-context learning differently.
CoRR, 2023

Large Language Models Can Be Easily Distracted by Irrelevant Context.
Proceedings of the International Conference on Machine Learning, 2023

Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization.
Proceedings of the International Conference on Machine Learning, 2023

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning.
Proceedings of the International Conference on Machine Learning, 2023

Least-to-Most Prompting Enables Complex Reasoning in Large Language Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

TEMPERA: Test-Time Prompt Editing via Reinforcement Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

UL2: Unifying Language Learning Paradigms.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Recitation-Augmented Language Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Language models are multilingual chain-of-thought reasoners.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Mind's Eye: Grounded Language Model Reasoning through Simulation.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Compositional Semantic Parsing with Large Language Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

What learning algorithm is in-context learning? Investigations with linear models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Self-Consistency Improves Chain of Thought Reasoning in Language Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Symbol tuning improves in-context learning in language models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Transcending Scaling Laws with 0.1% Extra Compute.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Emergent Abilities of Large Language Models.
Trans. Mach. Learn. Res., 2022

TEMPERA: Test-Time Prompting via Reinforcement Learning.
CoRR, 2022

Scaling Instruction-Finetuned Language Models.
CoRR, 2022

Rationale-Augmented Ensembles in Language Models.
CoRR, 2022

Least-to-Most Prompting Enables Complex Reasoning in Large Language Models.
CoRR, 2022

Self-Consistency Improves Chain of Thought Reasoning in Language Models.
CoRR, 2022

DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection.
CoRR, 2022

Chain of Thought Prompting Elicits Reasoning in Large Language Models.
CoRR, 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Back Razor: Memory-Efficient Transfer Learning by Self-Sparsified Backpropagation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022

Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance.
Proceedings of the International Conference on Machine Learning, 2022

Auto-scaling Vision Transformers without Training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Token Dropping for Efficient BERT Pretraining.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation.
CoRR, 2021

Speeding up Deep Model Training by Sharing Weights and Then Unsharing.
CoRR, 2021

LEGO: Latent Execution-Guided Reasoning for Multi-Hop Question Answering on Knowledge Graphs.
Proceedings of the 38th International Conference on Machine Learning, 2021

SpreadsheetCoder: Formula Prediction from Semi-structured Context.
Proceedings of the 38th International Conference on Machine Learning, 2021

Fast WordPiece Tokenization.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Extremely Small BERT Models from Mixed-Vocabulary Training.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

2020
Linear-Time WordPiece Tokenization.
CoRR, 2020

Compositional Generalization via Neural-Symbolic Stack Machines.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Deep State-Space Generative Model For Correlated Time-to-Event Predictions.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Go Wide, Then Narrow: Efficient Training of Deep Thin Networks.
Proceedings of the 37th International Conference on Machine Learning, 2020

Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection.
Proceedings of the 37th International Conference on Machine Learning, 2020

Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension.
Proceedings of the 8th International Conference on Learning Representations, 2020

MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Deep Physiological State Space Model for Clinical Forecasting.
CoRR, 2019

Extreme Language Model Compression with Optimal Subwords and Shared Projections.
CoRR, 2019

Doubly Sparse: Sparse Mixture of Sparse Experts for Efficient Softmax Inference.
CoRR, 2019

Neural Logic Machines.
Proceedings of the 7th International Conference on Learning Representations, 2019

2015
Double or Nothing: Multiplicative Incentive Mechanisms for Crowdsourcing.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015
