Lei Cui

Affiliations:
  • Microsoft Research, Beijing, China
  • Harbin Institute of Technology, Harbin, China (former)


According to our database, Lei Cui authored at least 48 papers between 2010 and 2023.

Bibliography

2023
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering.
CoRR, 2023

Kosmos-2.5: A Multimodal Literate Model.
CoRR, 2023

Language Is Not All You Need: Aligning Perception with Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

TextDiffuser: Diffusion Models as Text Painters.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

TrOCR: Transformer-Based Optical Character Recognition with Pre-trained Models.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
DiT: Self-supervised Pre-training for Document Image Transformer.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

A Simple yet Effective Learnable Positional Encoding Method for Improving Document Transformer Model.
Proceedings of the Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022

XDoc: Unified Pre-training for Cross-Format Document Understanding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

XFUND: A Benchmark Dataset for Multilingual Visually Rich Form Understanding.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

MarkupLM: Pre-training of Text and Markup Language for Visually Rich Document Understanding.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Document AI: Benchmarks, Models and Applications.
CoRR, 2021

TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models.
CoRR, 2021

VT-SSum: A Benchmark Dataset for Video Transcript Segmentation and Summarization.
CoRR, 2021

LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding.
CoRR, 2021

LayoutReader: Pre-training of Text and Layout for Reading Order Detection.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
TableBank: Table Benchmark for Image-based Table Detection and Recognition.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

LayoutLM: Pre-training of Text and Layout for Document Image Understanding.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Multimodal Matching Transformer for Live Commenting.
Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, Santiago de Compostela, Spain, August 29 - September 8, 2020

DocBank: A Benchmark Dataset for Document Layout Analysis.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Unsupervised Fine-tuning for Text Clustering.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

2019
Neural Melody Composition from Lyrics.
Proceedings of the Natural Language Processing and Chinese Computing, 2019

Retrieval-Enhanced Adversarial Training for Neural Response Generation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension.
Proceedings of the 2nd Workshop on Machine Reading for Question Answering, 2019

LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Unsupervised Machine Commenting with Neural Variational Topic Model.
CoRR, 2018

Retrieval-Enhanced Adversarial Training for Neural Response Generation.
CoRR, 2018

SeRI: A Dataset for Sub-event Relation Inference from an Encyclopedia.
Proceedings of the Natural Language Processing and Chinese Computing, 2018

EventWiki: A Knowledge Base of Major Events.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Attention-Fused Deep Matching Network for Natural Language Inference.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Fine-grained Coordinated Cross-lingual Text Stream Alignment for Endless Language Knowledge Acquisition.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Neural Open Information Extraction.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
SuperAgent: A Customer Service Chatbot for E-commerce Websites.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Aligning Coordinated Text Streams through Burst Information Network Construction and Decipherment.
CoRR, 2016

Discovering Concept-Level Event Associations from a Text Stream.
Proceedings of the Natural Language Understanding and Intelligent Applications, 2016

News Stream Summarization using Burst Information Networks.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Event Detection with Burst Information Networks.
Proceedings of the COLING 2016, 2016

2014
Learning Topic Representation for SMT with Neural Networks.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Machine Translation with Real-Time Web Search.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
Collective Corpus Weighting and Phrase Scoring for SMT Using Graph-Based Random Walk.
Proceedings of the Natural Language Processing and Chinese Computing, 2013

Multi-Domain Adaptation for SMT Using Multi-Task Learning.
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

Bilingual Data Cleaning for SMT using Graph-based Random Walk.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

2011
Improving Phrase Extraction via MBR Phrase Scoring and Pruning.
Proceedings of Machine Translation Summit XIII: Papers, 2011

Function Word Generation in Statistical Machine Translation Systems.
Proceedings of Machine Translation Summit XIII: Papers, 2011

2010
The MSRA machine translation system for IWSLT 2010.
Proceedings of the 2010 International Workshop on Spoken Language Translation, 2010

Hybrid Decoding: Decoding with Partial Hypotheses Combination over Multiple SMT Systems.
Proceedings of the COLING 2010, 2010

A Joint Rule Selection Model for Hierarchical Phrase-Based Translation.
Proceedings of the ACL 2010, 2010

