Di He

Affiliations:

Peking University, School of Intelligence Science and Technology, National Key Laboratory of General Artificial Intelligence, Beijing, China
Microsoft Research Asia, Machine Learning Group, Beijing, China
Peking University, School of EECS, Key Laboratory of Machine Perception, Beijing, China (PhD)

According to our database, Di He authored at least 93 papers between 2013 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Do Efficient Transformers Really Save Computation?

CoRR, 2024

Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks.

CoRR, 2024

DOF: Accelerating High-order Differential Operators with Forward Propagation.

CoRR, 2024

Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation.

CoRR, 2024

Beyond Weisfeiler-Lehman: A Quantitative Framework for GNN Expressiveness.

CoRR, 2024

End-to-End Crystal Structure Prediction from Powder X-Ray Diffraction.

CoRR, 2024

2023

REST: Retrieval-Based Speculative Decoding.

CoRR, 2023

CORE: Common Random Reconstruction for Distributed Optimization with Provable Low Communication Complexity.

CoRR, 2023

Forward Laplacian: A New Computational Framework for Neural Network-based Variational Monte Carlo.

CoRR, 2023

Highly Accurate Quantum Chemical Property Prediction with Uni-Mol+.

CoRR, 2023

3D Molecular Generation via Virtual Dynamics.

CoRR, 2023

Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers.

Krzysztof Marcin Choromanski, Shanda Li, Valerii Likhosherstov

CoRR, 2023

Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests.

Proceedings of the International Conference on Machine Learning, 2023

Rethinking the Expressive Power of GNNs via Graph Biconnectivity.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Denoising Masked Autoencoders Help Robust Classification.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

One Transformer Can Understand Both 2D & 3D Molecular Data.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Learning Physics-Informed Neural Networks without Stacked Back-propagation.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Robustness-Aware Word Embedding Improves Certified Robustness to Adversarial Word Substitutions.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022

Denoising Masked AutoEncoders are Certifiable Robust Vision Learners.

CoRR, 2022

Is L2 Physics-Informed Loss Always Suitable for Training Physics-Informed Neural Network?

CoRR, 2022

METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals.

CoRR, 2022

Benchmarking Graphormer on Large-Scale Molecular Modeling Datasets.

CoRR, 2022

Rethinking Lipschitz Neural Networks and Certified Robustness: A Boolean Function Perspective.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Online Training Through Time for Spiking Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Is $L^2$ Physics Informed Loss Always Suitable for Training Physics Informed Neural Network?

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Your Transformer May Not be as Powerful as You Expect.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

HousE: Knowledge Graph Embedding with Householder Parameterization.

Proceedings of the International Conference on Machine Learning, 2022

Boosting the Certified Robustness of L-infinity Distance Nets.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Two Coupled Rejection Metrics Can Tell Adversarial Examples Apart.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Finding the Dominant Winning Ticket in Pre-Trained Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021

Can Vision Transformers Perform Convolution?

CoRR, 2021

First Place Solution of KDD Cup 2021 & OGB Large-Scale Challenge Graph Prediction Track.

CoRR, 2021

Do Transformers Really Perform Bad for Graph Representation?

CoRR, 2021

Adversarial Training with Rectified Rejection.

CoRR, 2021

Transformers with Competitive Ensembles of Independent Mechanisms.

CoRR, 2021

LazyFormer: Self Attention with Lazy Update.

CoRR, 2021

Less is More: Pre-training a Strong Siamese Encoder Using a Weak Decoder.

CoRR, 2021

Revisiting Language Encoding in Learning Multilingual Representations.

CoRR, 2021

Towards Certifying 𝓁∞ Robustness using Neural Networks with 𝓁∞-dist Neurons.

CoRR, 2021

Do Transformers Really Perform Badly for Graph Representation?

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

The Open Catalyst Challenge 2021: Competition Report.

Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, 2021

Towards Certifying L-infinity Robustness using Neural Networks with L-inf-dist Neurons.

Proceedings of the 38th International Conference on Machine Learning, 2021

How could Neural Networks understand Programs?

Proceedings of the 38th International Conference on Machine Learning, 2021

GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training.

Proceedings of the 38th International Conference on Machine Learning, 2021

Rethinking Positional Encoding in Language Pre-training.

Guolin Ke, Di He, Tie-Yan Liu

Proceedings of the 9th International Conference on Learning Representations, 2021

Taking Notes on the Fly Helps Language Pre-Training.

Proceedings of the 9th International Conference on Learning Representations, 2021

Less is More: Pretrain a Strong Siamese Encoder for Dense Text Retrieval Using a Weak Decoder.

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

2020

Taking Notes on the Fly Helps BERT Pre-training.

CoRR, 2020

Transferred Discrepancy: Quantifying the Difference Between Representations.

CoRR, 2020

MC-BERT: Efficient Language Pre-Training via a Meta Controller.

CoRR, 2020

I4R: Promoting Deep Reinforcement Learning by the Indicator for Expressive Representations.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

On Layer Normalization in the Transformer Architecture.

Proceedings of the 37th International Conference on Machine Learning, 2020

Incorporating BERT into Neural Machine Translation.

Proceedings of the 8th International Conference on Learning Representations, 2020

MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius.

Proceedings of the 8th International Conference on Learning Representations, 2020

Invertible Image Rescaling.

Proceedings of the Computer Vision - ECCV 2020, 2020

2019

Defective Convolutional Layers Learn Robust CNNs.

CoRR, 2019

On the Anomalous Generalization of GANs.

CoRR, 2019

Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View.

CoRR, 2019

Adversarially Robust Generalization Just Requires More Unlabeled Data.

CoRR, 2019

A Gram-Gauss-Newton Method Learning Overparameterized Deep Neural Networks for Regression Problems.

CoRR, 2019

Microsoft Research Asia's Systems for WMT19.

Proceedings of the Fourth Conference on Machine Translation, 2019

Fast Structured Decoding for Sequence Models.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Deliberation Learning for Image-to-Image Translation.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Towards a Deep and Unified Understanding of Deep Neural Models in NLP.

Proceedings of the 36th International Conference on Machine Learning, 2019

Efficient Training of BERT by Progressively Stacking.

Proceedings of the 36th International Conference on Machine Learning, 2019

Multilingual Neural Machine Translation with Knowledge Distillation.

Proceedings of the 7th International Conference on Learning Representations, 2019

Representation Degeneration Problem in Training Natural Language Generation Models.

Proceedings of the 7th International Conference on Learning Representations, 2019

Machine Translation With Weakly Paired Documents.

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Multilingual Neural Machine Translation with Language Clustering.

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Hint-Based Training for Non-Autoregressive Machine Translation.

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Tied Transformers: Neural Machine Translation with Shared Encoder and Decoder.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Non-Autoregressive Machine Translation with Auxiliary Regularization.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Sentence-Wise Smooth Regularization for Sequence to Sequence Learning.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

FRAGE: Frequency-Agnostic Word Representation.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Dense Information Flow for Neural Machine Translation.

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Towards Binary-Valued Gates for Robust LSTM Training.

Proceedings of the 35th International Conference on Machine Learning, 2018

Beyond Error Propagation in Neural Machine Translation: Characteristics of Language Also Matter.

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Double Path Networks for Sequence to Sequence Learning.

Proceedings of the 27th International Conference on Computational Linguistics, 2018

2017

Scale Effects in Web Search.

Proceedings of the Web and Internet Economics - 13th International Conference, 2017

Decoding with Value Networks for Neural Machine Translation.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

2016

Sentence Level Recurrent Topic Model: Letting Topics Speak for Themselves.

CoRR, 2016

Dual Learning for Machine Translation.

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

2014

Generalized second price auction with probabilistic broad match.

Proceedings of the ACM Conference on Economics and Computation, 2014

2013

Online learning for auction mechanism in bandit setting.

Decis. Support Syst., 2013

A Theoretical Analysis of NDCG Type Ranking Measures.

CoRR, 2013

A Game-Theoretic Machine Learning Approach for Revenue Maximization in Sponsored Search.

Proceedings of the IJCAI 2013, 2013

A Theoretical Analysis of NDCG Type Ranking Measures.

Proceedings of the COLT 2013, 2013
