Philipp Dufter

According to our database, Philipp Dufter authored at least 25 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training.
CoRR, 2024

2022
Position Information in Transformers: An Overview.
Comput. Linguistics, 2022

An Information-Theoretic Approach and Dataset for Probing Gender Stereotypes in Multilingual Masked Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Towards a Broad Coverage Named Entity Resource: A Data-Efficient Approach for Many Diverse Languages.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

2021
BERT Cannot Align Characters.
CoRR, 2021

Locating Language-Specific Information in Contextualized Embeddings.
CoRR, 2021

Semantic Text Segment Classification of Structured Technical Content.
Proceedings of the Natural Language Processing and Information Systems, 2021

Static Embeddings as Efficient Knowledge Bases?
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Wine is not v i n. On the Compatibility of Tokenizations across Languages.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Graph Algorithms for Multiparallel Word Alignment.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

ParCourE: A Parallel Corpus Explorer for a Massively Multilingual Corpus.
Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Subword Sampling for Low Resource Word Alignment.
CoRR, 2020

Modeling Graph Structure via Relative Position for Better Text Generation from Knowledge Graphs.
CoRR, 2020

Identifying Necessary Elements for BERT's Multilinguality.
CoRR, 2020

SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings.
CoRR, 2020

Quantifying the Contextualization of Word Representations with Semantic Class Probing.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Identifying Elements Essential for BERT's Multilinguality.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Monolingual and Multilingual Reduction of Gender Bias in Contextualized Representations.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Increasing Learning Efficiency of Self-Attention Networks through Direct Position Interactions, Learnable Temperature, and Convoluted Attention.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

2019
Analytical Methods for Interpretable Ultradense Word Embeddings.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018
A Stronger Baseline for Multilingual Word Embeddings.
CoRR, 2018

A Universal Semantic Space.
CoRR, 2018

Embedding Learning Through Multilingual Concept Induction.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

