Damai Dai

According to our database, Damai Dai authored at least 41 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Large Language Models Are Unconscious of Unreasonability in Math Problems.
CoRR, 2024

PeriodicLoRA: Breaking the Low-Rank Bottleneck in LoRA Optimization.
CoRR, 2024

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models.
CoRR, 2024

Language Models Understand Numbers, at Least Partially.
CoRR, 2024

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism.
CoRR, 2024

2023
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations.
CoRR, 2023

Bi-Drop: Generalizable Fine-tuning for Pre-trained Language Models via Adaptive Subnetwork Optimization.
CoRR, 2023

A Survey for In-context Learning.
CoRR, 2023

Coarse-to-Fine Entity Representations for Document-Level Relation Extraction.
Proceedings of the Natural Language Processing and Chinese Computing, 2023

Mixture-of-Experts for Biomedical Question Answering.
Proceedings of the Natural Language Processing and Chinese Computing, 2023

Neural Knowledge Bank for Pretrained Transformers.
Proceedings of the Natural Language Processing and Chinese Computing, 2023

Not All Demonstration Examples are Equally Beneficial: Reweighting Demonstration Examples for In-Context Learning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Bi-Drop: Enhancing Fine-tuning Generalization via Synchronous sub-net Estimation and Optimization.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Denoising Bottleneck with Mutual Information Maximization for Video Multimodal Fusion.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers.
CoRR, 2022

Calibrating Factual Knowledge in Pretrained Language Models.
CoRR, 2022

Neural Knowledge Bank for Pretrained Transformers.
CoRR, 2022

On the Representation Collapse of Sparse Mixture of Experts.
CoRR, 2022

Mixture of Experts for Biomedical Question Answering.
CoRR, 2022

Plug-and-Play Module for Commonsense Reasoning in Machine Reading Comprehension.
Proceedings of the Natural Language Processing and Chinese Computing, 2022

On the Representation Collapse of Sparse Mixture of Experts.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Robust Fine-tuning via Perturbation and Interpolation from In-batch Instances.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Calibrating Factual Knowledge in Pretrained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Hierarchical Curriculum Learning for AMR Parsing.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2022

Knowledge Neurons in Pretrained Transformers.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

StableMoE: Stable Routing Strategy for Mixture of Experts.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Explicit Interaction Network for Aspect Sentiment Triplet Extraction.
CoRR, 2021

Knowledge Neurons in Pretrained Transformers.
CoRR, 2021

Incorporating Connections Beyond Knowledge Embeddings: A Plug-and-Play Module to Enhance Commonsense Reasoning in Machine Reading Comprehension.
CoRR, 2021

Inductively Representing Out-of-Knowledge-Graph Entities by Optimal Estimation Under Translational Assumptions.
Proceedings of the 6th Workshop on Representation Learning for NLP, 2021

Decompose, Fuse and Generate: A Formation-Informed Method for Chinese Definition Generation.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Leveraging Word-Formation Knowledge for Chinese Word Sense Disambiguation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Behind the Scenes: An Exploration of Trigger Biases Problem in Few-Shot Event Classification.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

2020
Inductively Representing Out-of-Knowledge-Graph Entities by Optimal Estimation Under Translational Assumptions.
CoRR, 2020

2019
Learning to Control the Fine-grained Sentiment for Story Ending Generation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Sememe Prediction: Learning Semantic Knowledge from Unstructured Textual Wiki Descriptions.
CoRR, 2018

Live Video Comment Generation Based on Surrounding Frames and Live Comments.
CoRR, 2018

2017
FISF: Better User Experience using Smaller Bandwidth for Panoramic Virtual Reality Video.
CoRR, 2017
