Sheng Shen

Affiliations:
  • University of California Berkeley, CA, USA
  • Peking University, MoE Key Laboratory of High Confidence Software Technologies, Beijing, China (former)


According to our database, Sheng Shen authored at least 51 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Efficient and Scalable Large Multimodal Models
PhD thesis, 2024

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions.
CoRR, 2024

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens.
CoRR, 2024

Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
CoRR, 2024

RAFT: Adapting Language Model to Domain Specific RAG.
CoRR, 2024

Multitask Vision-Language Prompt Tuning.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

SqueezeLLM: Dense-and-Sparse Quantization.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

AgentBench: Evaluating LLMs as Agents.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Aligning Large Multimodal Models with Factually Augmented RLHF.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
From Text to Tactic: Evaluating LLMs Playing the Game of Avalon.
CoRR, 2023

HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption.
CoRR, 2023

Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts.
CoRR, 2023

Large Language Models are Visual Reasoning Coordinators.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Poisoning Language Models During Instruction Tuning.
Proceedings of the International Conference on Machine Learning, 2023

Scaling Vision-Language Models with Sparse Mixture of Experts.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Crosslingual Generalization through Multitask Finetuning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
K-LITE: Learning Transferable Visual Models with External Knowledge.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Exposing the Limits of Video-Text Models through Contrast Sets.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Learned Token Pruning for Transformers.
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '22), 2022

Staged Training for Transformer Language Models.
Proceedings of the International Conference on Machine Learning, 2022

How Much Can CLIP Benefit Vision-and-Language Tasks?
Proceedings of the Tenth International Conference on Learning Representations, 2022

What Language Model to Train if You Have One Million GPU Hours?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Multitask Prompted Training Enables Zero-Shot Task Generalization.
CoRR, 2021

Learned Token Pruning for Transformers.
CoRR, 2021

MLPruning: A Multilevel Structured Pruning Framework for Transformer-based Models.
CoRR, 2021

Noisy Self-Knowledge Distillation for Text Summarization.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Discovering Non-monotonic Autoregressive Orderings with Variational Inference.
Proceedings of the 9th International Conference on Learning Representations, 2021

What's Hidden in a One-layer Randomly Weighted Transformer?
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Reservoir Transformers.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Reservoir Transformer.
CoRR, 2020

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning.
CoRR, 2020

Rethinking Batch Normalization in Transformers.
CoRR, 2020

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.
CoRR, 2020

Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification (Extended Abstract).
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

PowerNorm: Rethinking Batch Normalization in Transformers.
Proceedings of the 37th International Conference on Machine Learning, 2020

Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.
Proceedings of the 37th International Conference on Machine Learning, 2020

MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

On the Generation of Medical Question-Answer Pairs.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification.
Proceedings of the World Wide Web Conference, 2019

Pragmatically Informative Text Generation.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

An annotated dataset of literary entities.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

2018
Ermes: Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification.
CoRR, 2018

2017
Towards Release Strategy Optimization for Apps in Google Play.
CoRR, 2017

Towards Release Strategy Optimization for Apps in Google Play.
Proceedings of the 9th Asia-Pacific Symposium on Internetware, 2017
