Wenhui Wang

Affiliations:
  • Microsoft Research, Beijing, China


According to our database, Wenhui Wang authored at least 32 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Fine-tuning pretrained transformer encoders for sequence-to-sequence learning.
Int. J. Mach. Learn. Cybern., May 2024

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits.
CoRR, 2024

2023
When an Image is Worth 1,024 × 1,024 Words: A Case Study in Computational Pathology.
CoRR, 2023

Kosmos-2.5: A Multimodal Literate Model.
CoRR, 2023

LongNet: Scaling Transformers to 1,000,000,000 Tokens.
CoRR, 2023

Kosmos-2: Grounding Multimodal Large Language Models to the World.
CoRR, 2023

Language Is Not All You Need: Aligning Perception with Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Magneto: A Foundation Transformer.
Proceedings of the International Conference on Machine Learning, 2023

Image as a Foreign Language: BEiT Pretraining for Vision and Vision-Language Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
TorchScale: Transformers at Scale.
CoRR, 2022

Foundation Transformers.
CoRR, 2022

Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks.
CoRR, 2022

Language Models are General-Purpose Interfaces.
CoRR, 2022

VL-BEiT: Generative Vision-Language Pretraining.
CoRR, 2022

AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models.
CoRR, 2022

VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Distilled Dual-Encoder Model for Vision-Language Understanding.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts.
CoRR, 2021

s2s-ft: Fine-Tuning Pretrained Transformer Encoders for Sequence-to-Sequence Learning.
CoRR, 2021

InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Consistency Regularization for Cross-Lingual Fine-Tuning.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training.
Proceedings of the 37th International Conference on Machine Learning, 2020

Harvesting and Refining Question-Answer Pairs for Unsupervised QA.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Cross-Lingual Natural Language Generation via Pre-Training.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Unified Language Model Pre-training for Natural Language Understanding and Generation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Learning to Ask Unanswerable Questions for Machine Reading Comprehension.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension.
Proceedings of the 2nd Workshop on Machine Reading for Question Answering, 2019

2018
Multiway Attention Networks for Modeling Sentence Pairs.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

2017
Gated Self-Matching Networks for Reading Comprehension and Question Answering.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

