Raj Dabre

ORCID: 0000-0003-0664-3421

According to our database, Raj Dabre authored at least 100 papers between 2012 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese.
CoRR, 2024

IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages.
CoRR, 2024

Airavata: Introducing Hindi Instruction-tuned LLM.
CoRR, 2024

RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization.
CoRR, 2024

MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction.
CoRR, 2024

An Empirical Analysis of In-context Learning Abilities of LLMs for MT.
CoRR, 2024

PUB: A Pragmatics Understanding Benchmark for Assessing LLMs' Pragmatics Capabilities.
CoRR, 2024

Natural Language Processing for Dialects of a Language: A Survey.
CoRR, 2024

A Comprehensive Analysis of Adapter Efficiency.
Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD), 2024

2023
SelfSeg: A Self-supervised Sub-word Segmentation Method for Neural Machine Translation.
ACM Trans. Asian Low Resour. Lang. Inf. Process., August, 2023

Low-resource Multilingual Neural Translation Using Linguistic Feature-based Relevance Mechanisms.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2023

Bilingual Corpus Mining and Multistage Fine-Tuning for Improving Machine Translation of Lecture Transcripts.
CoRR, 2023

CreoleVal: Multilingual Multitask Benchmarks for Creoles.
CoRR, 2023

Turning Whisper into Real-Time Transcription System.
CoRR, 2023

IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages.
CoRR, 2023

In-context Example Selection for Machine Translation Using Multiple Features.
CoRR, 2023

Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models.
CoRR, 2023

Variable-length Neural Interlingua Representations for Zero-shot Neural Machine Translation.
CoRR, 2023

NICT-AI4B's Submission to the Indic MT Shared Task in WMT 2023.
Proceedings of the Eighth Conference on Machine Translation, 2023

MT Metrics Correlate with Human Ratings of Simultaneous Speech Translation.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023

DecoMT: Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

CTQScorer: Combining Multiple Features for In-context Example Selection for Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

An Empirical Study of Leveraging Knowledge Distillation for Compressing Multilingual Neural Machine Translation Models.
Proceedings of the 24th Annual Conference of the European Association for Machine Translation, 2023

Exploring the Impact of Layer Normalization for Zero-shot Neural Machine Translation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

Robustness of Multi-Source MT to Transcription Errors.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

YANMTT: Yet Another Neural Machine Translation Toolkit.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation Metrics for Indian Languages.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

SciCap+: A Knowledge Augmented Dataset to Study the Challenges of Scientific Figure Captioning.
Proceedings of the Workshop on Scientific Document Understanding co-located with 37th AAAI Conference on Artificial Intelligence (AAAI 2023), 2023

2022
MorisienMT: A Dataset for Mauritian Creole Machine Translation.
CoRR, 2022

IndicNLG Suite: Multilingual Datasets for Diverse NLG Tasks in Indic Languages.
CoRR, 2022

NICT at MixMT 2022: Synthetic Code-Mixed Pre-training and Multi-way Fine-tuning for Hinglish-English Translation.
Proceedings of the Seventh Conference on Machine Translation, 2022

When do Contrastive Word Alignments Improve Many-to-many Neural Machine Translation?
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Fusion of Self-supervised Learned Models for MOS Prediction.
Proceedings of the Interspeech 2022, 2022

BERTSeg: BERT Based Unsupervised Subword Segmentation for Neural Machine Translation.
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022

KreolMorisienMT: A Dataset for Mauritian Creole Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022

A Multilingual Multiway Evaluation Data Set for Structured Document Translation of Asian Languages.
Proceedings of the Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022

IndicNLG Benchmark: Multilingual Datasets for Diverse NLG Tasks in Indic Languages.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

FeatureBART: Feature Based Sequence-to-Sequence Pre-Training for Low-Resource NMT.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Overview of the 9th Workshop on Asian Translation.
Proceedings of the 9th Workshop on Asian Translation, 2022

NICT's Submission to the WAT 2022 Structured Document Translation Task.
Proceedings of the 9th Workshop on Asian Translation, 2022

IndicBART: A Pre-trained Model for Indic Natural Language Generation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
A Survey of Multilingual Neural Machine Translation.
ACM Comput. Surv., 2021

IndicBART: A Pre-trained Model for Natural Language Generation of Indic Languages.
CoRR, 2021

YANMTT: Yet Another Neural Machine Translation Toolkit.
CoRR, 2021

Recurrent Stacking of Layers in Neural Networks: An Application to Neural Machine Translation.
CoRR, 2021

Simultaneous Multi-Pivot Neural Machine Translation.
CoRR, 2021

Studying The Impact Of Document-level Context On Simultaneous Neural Machine Translation.
Proceedings of the 18th Biennial Machine Translation Summit - Volume 1: Research Track, 2021

Investigating Softmax Tempering for Training Neural Machine Translation Models.
Proceedings of the 18th Biennial Machine Translation Summit - Volume 1: Research Track, 2021


NICT-5's Submission To WAT 2021: MBART Pre-training And In-Domain Fine Tuning For Indic Languages.
Proceedings of the 8th Workshop on Asian Translation, 2021

2020
Extremely low-resource neural machine translation for Asian languages.
Mach. Transl., 2020

Softmax Tempering for Training Neural Machine Translation Models.
CoRR, 2020

Pre-training via Leveraging Assisting Languages and Data Selection for Neural Machine Translation.
CoRR, 2020

A Comprehensive Survey of Multilingual Neural Machine Translation.
CoRR, 2020

Combining Sequence Distillation and Transfer Learning for Efficient Low-Resource Neural Machine Translation Models.
Proceedings of the Fifth Conference on Machine Translation, 2020

Joint Training End-to-End Speech Recognition Systems with Speaker Attributes.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Harnessing Cross-lingual Features to Improve Cognate Detection for Low-resource Languages.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Multilingual Neural Machine Translation.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Overview of the 7th Workshop on Asian Translation.
Proceedings of the 7th Workshop on Asian Translation, 2020

NICT's Submission To WAT 2020: How Effective Are Simple Many-To-Many Neural Machine Translation Models?
Proceedings of the 7th Workshop on Asian Translation, 2020

Balancing Cost and Benefit with Tied-Multi Transformers.
Proceedings of the Fourth Workshop on Neural Generation and Translation, 2020

Pre-training via Leveraging Assisting Languages for Neural Machine Translation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, 2020

2019
Multi-Layer Softmaxing during Training Neural Machine Translation for Flexible Decoding with Fewer Layers.
CoRR, 2019

Multilingual Multi-Domain Adaptation Approaches for Neural Machine Translation.
CoRR, 2019

NICT's Machine Translation Systems for the WMT19 Similar Language Translation Task.
Proceedings of the Fourth Conference on Machine Translation, 2019

NICT's Supervised Neural Machine Translation Systems for the WMT19 Translation Robustness Task.
Proceedings of the Fourth Conference on Machine Translation, 2019

NICT's Supervised Neural Machine Translation Systems for the WMT19 News Translation Task.
Proceedings of the Fourth Conference on Machine Translation, 2019

Exploiting Out-of-Domain Parallel Data through Multilingual Transfer Learning for Low-Resource Neural Machine Translation.
Proceedings of Machine Translation Summit XVII Volume 1: Research Track, 2019

Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation.
Proceedings of the Interspeech 2019, 2019

Exploiting Multilingualism through Multistage Fine-Tuning for Low-Resource Neural Machine Translation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Overview of the 6th Workshop on Asian Translation.
Proceedings of the 6th Workshop on Asian Translation, 2019

NICT's participation to WAT 2019: Multilingualism and Multi-step Fine-Tuning for Low Resource NMT.
Proceedings of the 6th Workshop on Asian Translation, 2019

Recurrent Stacking of Layers for Compact Neural Machine Translation Models.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Exploiting Multilingual Corpora Simply and Efficiently in Neural Machine Translation.
J. Inf. Process., 2018

A Comprehensive Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation.
J. Inf. Process., 2018

Overview of the 5th Workshop on Asian Translation.
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation, 2018

NICT's Participation in WAT 2018: Approaches Using Multilingualism and Recurrently Stacked Layers.
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation, 2018

2017
MMCR4NLP: Multilingual Multiway Corpora Repository for Natural Language Processing.
CoRR, 2017

An Empirical Comparison of Simple Domain Adaptation Methods for Neural Machine Translation.
CoRR, 2017

An Empirical Study of Language Relatedness for Transfer Learning in Neural Machine Translation.
Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation, 2017

Enabling Multi-Source Neural Machine Translation By Concatenating Source Sentences In Multiple Languages.
Proceedings of Machine Translation Summit XVI, Volume 1: Research Track, 2017

Kyoto University MT System Description for IWSLT 2017.
Proceedings of the 14th International Conference on Spoken Language Translation, 2017

Neural Machine Translation: Basics, Practical Aspects and Recent Trends.
Proceedings of the IJCNLP 2017, Taipei, Taiwan, November 27, 2017

Kyoto University Participation to WAT 2017.
Proceedings of the 4th Workshop on Asian Translation, 2017

An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Sophisticated Lexical Databases - Simplified Usage: Mobile Applications and Browser Plugins For Wordnets.
Proceedings of the 8th Global WordNet Conference, 2016

The Kyoto University Cross-Lingual Pronoun Translation System.
Proceedings of the First Conference on Machine Translation, 2016

Parallel Sentence Extraction from Comparable Corpora with Neural Network Features.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

2015
Large-scale Dictionary Construction via Pivot-based Statistical Machine Translation with Significance Pruning and Neural Network Features.
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation, 2015

Leveraging Small Multilingual Corpora for SMT Using Many Pivot Languages.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Augmenting Pivot based SMT with word segmentation.
Proceedings of the 12th International Conference on Natural Language Processing, 2015

KyotoEBMT System Description for the 2nd Workshop on Asian Translation.
Proceedings of the 2nd Workshop on Asian Translation, 2015

2014
Do not do processing, when you can look up: Towards a Discrimination Net for WSD.
Proceedings of the Seventh Global Wordnet Conference, 2014

PaCMan: Parallel Corpus Management Workbench.
Proceedings of the 11th International Conference on Natural Language Processing, 2014

Anou Tradir: Experiences In Building Statistical Machine Translation Systems For Mauritian Languages - Creole, English, French.
Proceedings of the 11th International Conference on Natural Language Processing, 2014

Tackling Close Cousins: Experiences In Developing Statistical Machine Translation Systems For Marathi And Hindi.
Proceedings of the 11th International Conference on Natural Language Processing, 2014

2012
Morphological Analyzer for Affix Stacking Languages: A Case Study of Marathi.
Proceedings of the COLING 2012, 2012

