Sheng Shen

Affiliations:
  • University of California Berkeley, CA, USA
  • Peking University, MoE Key Laboratory of High Confidence Software Technologies, Beijing, China (former)


According to our database, Sheng Shen authored at least 51 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Efficient and Scalable Large Multimodal Models
PhD thesis, 2024

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions.
CoRR, 2024

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens.
CoRR, 2024

Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
CoRR, 2024

RAFT: Adapting Language Model to Domain Specific RAG.
CoRR, 2024

Multitask Vision-Language Prompt Tuning.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

SqueezeLLM: Dense-and-Sparse Quantization.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

AgentBench: Evaluating LLMs as Agents.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Aligning Large Multimodal Models with Factually Augmented RLHF.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
From Text to Tactic: Evaluating LLMs Playing the Game of Avalon.
CoRR, 2023

HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption.
CoRR, 2023

Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts.
CoRR, 2023

Large Language Models are Visual Reasoning Coordinators.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Poisoning Language Models During Instruction Tuning.
Proceedings of the International Conference on Machine Learning, 2023

Scaling Vision-Language Models with Sparse Mixture of Experts.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Crosslingual Generalization through Multitask Finetuning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
K-LITE: Learning Transferable Visual Models with External Knowledge.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Exposing the Limits of Video-Text Models through Contrast Sets.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Learned Token Pruning for Transformers.
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '22), 2022

Staged Training for Transformer Language Models.
Proceedings of the International Conference on Machine Learning, 2022

How Much Can CLIP Benefit Vision-and-Language Tasks?
Proceedings of the Tenth International Conference on Learning Representations, 2022

What Language Model to Train if You Have One Million GPU Hours?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Multitask Prompted Training Enables Zero-Shot Task Generalization.
CoRR, 2021

Learned Token Pruning for Transformers.
CoRR, 2021

MLPruning: A Multilevel Structured Pruning Framework for Transformer-based Models.
CoRR, 2021

Noisy Self-Knowledge Distillation for Text Summarization.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Discovering Non-monotonic Autoregressive Orderings with Variational Inference.
Proceedings of the 9th International Conference on Learning Representations, 2021

What's Hidden in a One-layer Randomly Weighted Transformer?
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Reservoir Transformers.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Reservoir Transformer.
CoRR, 2020

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning.
CoRR, 2020

Rethinking Batch Normalization in Transformers.
CoRR, 2020

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.
CoRR, 2020

Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification (Extended Abstract).
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

PowerNorm: Rethinking Batch Normalization in Transformers.
Proceedings of the 37th International Conference on Machine Learning, 2020

Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.
Proceedings of the 37th International Conference on Machine Learning, 2020

MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

On the Generation of Medical Question-Answer Pairs.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification.
Proceedings of the World Wide Web Conference, 2019

Pragmatically Informative Text Generation.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

An annotated dataset of literary entities.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

2018
Ermes: Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification.
CoRR, 2018

2017
Towards Release Strategy Optimization for Apps in Google Play.
CoRR, 2017

Towards Release Strategy Optimization for Apps in Google Play.
Proceedings of the 9th Asia-Pacific Symposium on Internetware, 2017
