Serge Sharoff

ORCID: 0000-0002-4877-0210

According to our database, Serge Sharoff authored at least 85 papers between 1996 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Fine-tuning language models to recognize semantic relations.
Lang. Resour. Evaluation, December, 2023

Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation.
CoRR, 2023

Syntactic Knowledge via Graph Attention with BERT in Machine Translation.
CoRR, 2023

GATology for Linguistics: What Syntactic Dependencies It Knows.
CoRR, 2023

FTD at SemEval-2023 Task 3: News Genre and Propaganda Detection by Comparing Mono- and Multilingual Models with Fine-tuning on Additional Data.
Proceedings of the 17th International Workshop on Semantic Evaluation, 2023

BERT Goes Off-Topic: Investigating the Domain Transfer Challenge using Genre Classification.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

2022
Estimating Confidence of Predictions of Individual Classifiers and Their Ensembles for the Genre Classification Task.
CoRR, 2022

Towards Arabic Sentence Simplification via Classification and Generative Approaches.
CoRR, 2022

Towards Arabic Sentence Simplification via Classification and Generative Approaches.
Proceedings of the Seventh Arabic Natural Language Processing Workshop, 2022

Multimodal Pipeline for Collection of Misinformation Data from Telegram.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Estimating Confidence of Predictions of Individual Classifiers and Their Ensembles for the Genre Classification Task.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

BERTology for Machine Translation: What BERT Knows about Linguistic Difficulties for Translation.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Applying Natural Annotation and Curriculum Learning to Named Entity Recognition for Under-Resourced Languages.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021
Experiments with adversarial attacks on text genres.
CoRR, 2021

Automatic Difficulty Classification of Arabic Sentences.
Proceedings of the Sixth Arabic Natural Language Processing Workshop, 2021

2020
Finding next of kin: Cross-lingual embedding spaces for related languages.
Nat. Lang. Eng., 2020

Sentence Level Human Translation Quality Estimation with Attention-based Neural Networks.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Know thy Corpus! Robust Methods for Digital Curation of Web corpora.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Recognizing Semantic Relations by Combining Transformers and Fully Connected Models.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Recognizing Semantic Relations: Attention-Based Transformers vs. Recurrent Models.
Proceedings of the Advances in Information Retrieval, 2020

Overview of the Fourth BUCC Shared Task: Bilingual Dictionary Induction from Comparable Corpora.
Proceedings of the 13th Workshop on Building and Using Comparable Corpora, 2020

2019
New Areas of Application of Comparable Corpora.
Proceedings of the Using Comparable Corpora for Under-Resourced Areas of Machine Translation, 2019

Towards Functionally Similar Corpus Resources for Translation.
Proceedings of the International Conference on Recent Advances in Natural Language Processing, 2019

2018
Overview of the Third BUCC Shared Task: Spotting Parallel Sentences in Comparable Corpora.
Proceedings of the 11th Workshop on Building and Using Comparable Corpora, 2018

A Multilingual Dataset for Evaluating Parallel Sentence Extraction from Comparable Corpora.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Investigating the Influence of Bilingual MWU on Trainee Translation Quality.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Cross-lingual Terminology Extraction for Translation Quality Estimation.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Language adaptation experiments via cross-lingual embeddings for related languages.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

2017
Overview of the Second BUCC Shared Task: Spotting Parallel Sentences in Comparable Corpora.
Proceedings of the 10th Workshop on Building and Using Comparable Corpora, 2017

Toward Pan-Slavic NLP: Some Experiments with Language Adaptation.
Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing, 2017

2016
Language Adaptation for Extending Post-Editing Estimates for Closely Related Languages.
Prague Bull. Math. Linguistics, 2016

Recent advances in machine translation using comparable corpora.
Nat. Lang. Eng., 2016

Preface.
Nat. Lang. Eng., 2016

Crowdsourcing for web genre annotation.
Lang. Resour. Evaluation, 2016

MoBiL: A Hybrid Feature Set for Automatic Human Translation Quality Assessment.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Adam Kilgarriff's Legacy to Computational Linguistics and Beyond.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2016

Genre classification for a corpus of academic webpages.
Proceedings of the 10th Web as Corpus Workshop, 2016

2015
Web Corpus Construction Roland Schäfer and Felix Bildhauer (Freie Universität Berlin) Morgan & Claypool (Synthesis Lectures on Human Language Technologies, edited by Graeme Hirst, volume 22), 2013, 145 pages, paper-bound, ISBN 9781608459834, doi: 10.2200/S00508ED1V01Y201305HLT022.
Comput. Linguistics, 2015

Large Scale Translation Quality Estimation.
Proceedings of the 1st Deep Machine Translation Workshop, 2015

BUCC Shared Task: Cross-Language Document Similarity.
Proceedings of the Eighth Workshop on Building and Using Comparable Corpora, 2015

Obtaining SMT dictionaries for related languages.
Proceedings of the Eighth Workshop on Building and Using Comparable Corpora, 2015

Applying Multi-Dimensional Analysis to a Russian Webcorpus: Searching for Evidence of Genres.
Proceedings of the 5th Workshop on Balto-Slavic Natural Language Processing, 2015

2014
Introduction to the special issue on Resources and Tools for Language Learners.
Lang. Resour. Evaluation, 2014

Corpus-based vocabulary lists for language learners for nine languages.
Lang. Resour. Evaluation, 2014

Document dissimilarity within and across languages: A benchmarking study.
Lit. Linguistic Comput., 2014

Semi-supervised Graph-based Genre Classification for Web Pages.
Proceedings of TextGraphs@EMNLP 2014: the 9th Workshop on Graph-based Methods for Natural Language Processing, 2014

Designing and Evaluating a Reliable Corpus of Web Genres via Crowd-Sourcing.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Extracting Multiword Translations from Aligned Comparable Documents.
Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation, 2014

Multiple views as aid to linguistic annotation error analysis.
Proceedings of the 8th Linguistic Annotation Workshop, 2014

2013
SentiML: functional annotation for multilingual sentiment analysis.
Proceedings of the 1st International Workshop on Collaborative Annotations in Shared Environment, 2013

English-to-Russian MT evaluation campaign.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

Overviewing Important Aspects of the Last Twenty Years of Research in Comparable Corpora.
Proceedings of the Building and Using Comparable Corpora, 2013

Measuring the Distance Between Comparable Corpora Between Languages.
Proceedings of the Building and Using Comparable Corpora, 2013

2012
Identifying Word Translations from Comparable Documents Without a Seed Lexicon.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Design of a hybrid high quality machine translation system.
Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation HyTra@EACL 2012, 2012

2011
In the Garden and in the Jungle.
Proceedings of the Genres on the Web, 2011

Any Land in Sight?
Proceedings of the Genres on the Web, 2011

Riding the Rough Waves of Genre on the Web.
Proceedings of the Genres on the Web, 2011

2010
Multiword expressions: hard going or plain sailing?
Lang. Resour. Evaluation, 2010

Using an integrated feature set to generalize and justify the Chinese-to-English transferring rule of the 'ZHE' aspect.
J. Zhejiang Univ. Sci. C, 2010

Advanced Corpus Solutions for Humanities Researchers.
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation, 2010

The Web Library of Babel: evaluating genre collections.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Fine-Grained Genre Classification Using Structural Learning Algorithms.
Proceedings of the ACL 2010, 2010

2009
'Irrefragable answers' using comparable corpora to retrieve translation equivalents.
Lang. Resour. Evaluation, 2009

Web Genre Benchmark Under Construction.
J. Lang. Technol. Comput. Linguistics, 2009

Evaluation-Guided Pre-Editing of Source Text: Improving MT-Tractability of Light Verb Constructions.
Proceedings of the 13th Annual conference of the European Association for Machine Translation, 2009

2008
Designing and Evaluating a Russian Tagset.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

Corpus-Based Tools for Computer-Assisted Acquisition of Reading Abilities in Cognate Languages.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

Cleaneval: a Competition for Cleaning Web Pages.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

Generalising Lexical Translation Strategies for MT Using Comparable Corpora.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

2007
Linguistic support for concept selection decisions.
Artif. Intell. Eng. Des. Anal. Manuf., 2007

Translating from under-resourced languages: comparing direct transfer against pivot translation.
Proceedings of Machine Translation Summit XI: Papers, 2007

Assisting Translators in Indirect Lexical Transfer.
Proceedings of the ACL 2007, 2007

2006
Using collocations from comparable corpora to find translation equivalents.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

A Uniform Interface to Large-Scale Linguistic Resources.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Using Richly Annotated Trilingual Language Resources for Acquiring Reading Skills in a Foreign Language.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

ASSIST: Automated Semantic Assistance for Translators.
Proceedings of the EACL 2006, 2006

Using Comparable Corpora to Solve Problems Difficult for Human Translators.
Proceedings of the ACL 2006, 2006

2004
Towards Basic Categories for Describing Properties of Texts in a Corpus.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

2002
Meaning as use: exploitation of aligned corpora for the contrastive study of lexical semantics.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

2001
Concordancing for parallel spoken language corpora.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000
Resources for Multilingual Text Generation in Three Slavic Languages.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

Multilinguality in a Text Generation System For Three Slavic Languages.
Proceedings of the COLING 2000, 18th International Conference on Computational Linguistics, Proceedings of the Conference, 2 Volumes, July 31, 2000

1999
Register-domain Separation as a Methodology for Development of Natural Language Interfaces to Databases.
Proceedings of the Human-Computer Interaction INTERACT '99: IFIP TC13 International Conference on Human-Computer Interaction, 1999

1996
Understanding Short Texts with Integration of Knowledge Representation Methods.
Proceedings of the Perspectives of System Informatics, 1996

