Weizhu Chen

According to our database, Weizhu Chen authored at least 98 papers between 2007 and 2022.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2022
CERT: Continual Pre-Training on Sketches for Library-Oriented Code Generation.
CoRR, 2022

On the Advance of Making Language Models Better Reasoners.
CoRR, 2022

Diffusion-GAN: Training GANs with Diffusion.
CoRR, 2022

A Self-Paced Mixed Distillation Method for Non-Autoregressive Generation.
CoRR, 2022

ALLSH: Active Learning Guided by Local Sensitivity and Hardness.
CoRR, 2022

MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation.
CoRR, 2022

Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer.
CoRR, 2022

Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models.
CoRR, 2022

Truncated Diffusion Probabilistic Models.
CoRR, 2022

Mixing and Shifting: Exploiting Global and Local Dependencies in Vision MLPs.
CoRR, 2022

No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models.
CoRR, 2022

Reasoning Like Program Executors.
CoRR, 2022

CodeRetriever: Unimodal and Bimodal Contrastive Learning.
CoRR, 2022

Virtual information core optimization for collaborative filtering recommendation based on clustering and evolutionary algorithms.
Appl. Soft Comput., 2022

Controllable Natural Language Generation with Contrastive Prefixes.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Finding the Dominant Winning Ticket in Pre-Trained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

What Makes Good In-Context Examples for GPT-3?
Proceedings of Deep Learning Inside Out: The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, 2022

2021
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing.
CoRR, 2021

DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models.
CoRR, 2021

Adversarial Retriever-Ranker for dense text retrieval.
CoRR, 2021

XLM-K: Improving Cross-Lingual Language Model Pre-Training with Multilingual Knowledge.
CoRR, 2021

LoRA: Low-Rank Adaptation of Large Language Models.
CoRR, 2021

HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalization.
CoRR, 2021

A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation.
CoRR, 2021

Adversarial Training as Stackelberg Game: An Unrolled Optimization Approach.
CoRR, 2021

Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Contextual Bandit Applications in a Customer Support Bot.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Poolingformer: Long Document Modeling with Pooling Attention.
Proceedings of the 38th International Conference on Machine Learning, 2021

BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining.
Proceedings of the 38th International Conference on Machine Learning, 2021

CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding.
Proceedings of the 9th International Conference on Learning Representations, 2021

MixKD: Towards Efficient Distillation of Large-scale Language Models.
Proceedings of the 9th International Conference on Learning Representations, 2021

DeBERTa: Decoding-enhanced BERT with Disentangled Attention.
Proceedings of the 9th International Conference on Learning Representations, 2021

Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

ARCH: Efficient Adversarial Regularized Training with Caching.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Token-wise Curriculum Learning for Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Finetuning Pretrained Transformers into RNNs.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Few-Shot Named Entity Recognition: An Empirical Baseline Study.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Memory-Efficient Differentiable Transformer Architecture Search.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Reader-Guided Passage Reranking for Open-Domain Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Generation-Augmented Retrieval for Open-Domain Question Answering.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

GLGE: A New General Language Generation Evaluation Benchmark.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalizability.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

UnitedQA: A Hybrid Approach for Open Domain Question Answering.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Few-Shot Named Entity Recognition: A Comprehensive Study.
CoRR, 2020

Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model.
CoRR, 2020

A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation.
CoRR, 2020

Example-Based Named Entity Recognition.
CoRR, 2020

Adversarial Training for Large Neural Language Models.
CoRR, 2020

Conditional Self-Attention for Query-based Summarization.
CoRR, 2020


On the Variance of the Adaptive Learning Rate and Beyond.
Proceedings of the 8th International Conference on Learning Representations, 2020

Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Understanding the Difficulty of Training Transformers.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization.
J. Mach. Learn. Res., 2019

X-SQL: reinforce schema representation with context.
CoRR, 2019

A Hybrid Neural Network Model for Commonsense Reasoning.
CoRR, 2019

Lessons from Real-World Reinforcement Learning in a Customer Support Bot.
CoRR, 2019

Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding.
CoRR, 2019

Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Parameter-free Sentence Embedding via Orthogonal Basis.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Multi-Task Deep Neural Networks for Natural Language Understanding.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Zero-training Sentence Embedding via Orthogonal Basis.
CoRR, 2018

IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles.
CoRR, 2018

Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Scientific Question Answering.
CoRR, 2018

FusionNet: Fusing via Fully-aware Attention with Application to Machine Comprehension.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Limited-memory Common-directions Method for Distributed Optimization and its Application on Empirical Risk Minimization.
Proceedings of the 2017 SIAM International Conference on Data Mining, 2017

ReasoNet: Learning to Stop Reading in Machine Comprehension.
Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13, 2017

2014
Large-scale L-BFGS using MapReduce.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Transfer Understanding from Head Queries to Tail Queries.
Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, 2014

2012
Personalized click model through collaborative filtering.
Proceedings of the Fifth International Conference on Web Search and Web Data Mining, 2012

A noise-aware click model for web search.
Proceedings of the Fifth International Conference on Web Search and Web Data Mining, 2012

Beyond ten blue links: enabling user click modeling in federated web search.
Proceedings of the Fifth International Conference on Web Search and Web Data Mining, 2012

2011
Characterizing search intent diversity into click models.
Proceedings of the 20th International Conference on World Wide Web, 2011

Action prediction and identification from mining temporal user behaviors.
Proceedings of the Fourth International Conference on Web Search and Web Data Mining, 2011

User-click modeling for understanding and predicting search-behavior.
Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2011

Short Text Conceptualization Using a Probabilistic Knowledgebase.
Proceedings of the IJCAI 2011, 2011

Characterizing Inverse Time Dependency in Multi-class Learning.
Proceedings of the 11th IEEE International Conference on Data Mining, 2011

A Whole Page Click Model to Better Interpret Search Engine Click Data.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

2010
Co-optimization of multiple relevance metrics in web search.
Proceedings of the 19th International Conference on World Wide Web, 2010

A novel click model and its applications to online advertising.
Proceedings of the Third International Conference on Web Search and Web Data Mining, 2010

Incorporating post-click behaviors into a click model.
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

Learning click models via probit bayesian inference.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

Explore click models for search ranking.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

2009
Inverse Time Dependency in Convex Regularized Learning.
Proceedings of the ICDM 2009, 2009

P-packSVM: Parallel Primal grAdient desCent Kernel SVM.
Proceedings of the ICDM 2009, 2009

A general magnitude-preserving boosting algorithm for search ranking.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

To divide and conquer search ranking by learning query difficulty.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

2008
Web query translation via web log mining.
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2008

Mining Translations of Web Queries from Web Click-through Data.
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

2007
Document Transformation for Multi-label Feature Selection in Text Categorization.
Proceedings of the 7th IEEE International Conference on Data Mining (ICDM 2007), 2007

Mining Web Query Hierarchies from Clickthrough Data.
Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

