Nizar Habash

ORCID: 0000-0002-1831-3457

According to our database, Nizar Habash authored at least 236 papers between 1998 and 2024.

Bibliography

2024
ZAEBUC-Spoken: A Multilingual Multidialectal Arabic-English Speech Corpus.
CoRR, 2024

ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic.
CoRR, 2024

M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection.
CoRR, 2024

M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

Cross-Lingual Transfer from Related Languages: Treating Low-Resource Maltese as Multilingual Code-Switching.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

Computational Morphology and Lexicography Modeling of Modern Standard Arabic Nominals.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

2023
Data Augmentation Techniques for Machine Translation of Code-Switched Texts: A Comparative Study.
CoRR, 2023

Perception, performance, and detectability of conversational artificial intelligence across 32 university courses.
CoRR, 2023

CamelParser2.0: A State-of-the-Art Dependency Parser for Arabic.
Proceedings of ArabicNLP 2023, Singapore (Hybrid), December 7, 2023, 2023

NADI 2023: The Fourth Nuanced Arabic Dialect Identification Shared Task.
Proceedings of ArabicNLP 2023, Singapore (Hybrid), December 7, 2023, 2023

Boundless Conversations: AI-Powered Video Interactions across Domains, Languages, and Time.
Proceedings of the SIGGRAPH Asia 2023 Emerging Technologies, 2023

Tell Me More, Tell Me More: AI-Generated Question Suggestions for the Creation of Interactive Video Recordings.
Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023

Data Augmentation Techniques for Machine Translation of Code-Switched Texts: A Comparative Study.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Exploring Segmentation Approaches for Neural Machine Translation of Code-Switched Egyptian Arabic-English Text.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

2022
Unsupervised Arabic dialect segmentation for machine translation.
Nat. Lang. Eng., 2022

The Shared Task on Gender Rewriting.
CoRR, 2022

The User-Aware Arabic Gender Rewriter.
CoRR, 2022

Investigating Lexical Replacements for Arabic-English Code-Switched Data Augmentation.
CoRR, 2022

ArzEn-ST: A Three-way Speech Translation Corpus for Code-Switched Egyptian Arabic-English.
Proceedings of the Seventh Arabic Natural Language Processing Workshop, 2022

AraBART: a Pretrained Arabic Sequence-to-Sequence Model for Abstractive Summarization.
Proceedings of the Seventh Arabic Natural Language Processing Workshop, 2022

Maknuune: A Large Open Palestinian Arabic Lexicon.
Proceedings of the Seventh Arabic Natural Language Processing Workshop, 2022

NADI 2022: The Third Nuanced Arabic Dialect Identification Shared Task.
Proceedings of the Seventh Arabic Natural Language Processing Workshop, 2022

Benchmarking Evaluation Metrics for Code-Switching Automatic Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

User-Centric Gender Rewriting.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

ZAEBUC: An Annotated Arabic-English Bilingual Writer Corpus.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Camel Treebank: An Open Multi-genre Arabic Dependency Treebank.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

UniMorph 4.0: Universal Morphology.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Hierarchical Aggregation of Dialectal Data for Arabic Dialect Identification.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

The Arabic Parallel Gender Corpus 2.0: Extensions and Analyses.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

The Bahrain Corpus: A Multi-genre Corpus of Bahraini Arabic.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Camelira: An Arabic Multi-Dialect Morphological Disambiguator.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Arabic Word-level Readability Visualization for Assisted Text Simplification.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Morphosyntactic Tagging with Pre-trained Language Models for Arabic and its Dialects.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
A panoramic survey of natural language processing in the Arab world.
Commun. ACM, 2021

The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models.
Proceedings of the Sixth Arabic Natural Language Processing Workshop, 2021

Automatic Romanization of Arabic Bibliographic Records.
Proceedings of the Sixth Arabic Natural Language Processing Workshop, 2021

NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task.
Proceedings of the Sixth Arabic Natural Language Processing Workshop, 2021

A Cloud-based User-Centered Time-Offset Interaction Application.
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2021

Towards Automatic Narrative Coherence Prediction.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

Automatic Error Type Annotation for Arabic.
Proceedings of the 25th Conference on Computational Natural Language Learning, 2021

2020
A Link Prediction Approach for Accurately Mapping a Large-scale Arabic Lexical Resource to English WordNet.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2020

A Unified Model for Arabizi Detection and Transliteration using Sequence-to-Sequence Models.
Proceedings of the Fifth Arabic Natural Language Processing Workshop, 2020

NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task.
Proceedings of the Fifth Arabic Natural Language Processing Workshop, 2020

CAMeL Tools: An Open Source Python Toolkit for Arabic Natural Language Processing.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Morphological Analysis and Disambiguation for Gulf Arabic: The Interplay between Resources and Methods.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

A Spelling Correction Corpus for Multiple Arabic Dialects.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

The Margarita Dialogue Corpus: A Data Set for Time-Offset Interactions and Unstructured Dialogue Systems.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

A Large-Scale Leveled Readability Lexicon for Standard Arabic.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Utilizing Subword Entities in Character-Level Sequence-to-Sequence Lemmatization Models.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Multitask Easy-First Dependency Parsing: Exploiting Complementarities of Different Dependency Representations.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

An Online Readability Leveled Arabic Thesaurus.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Lost & Found in Translation: Impact of Machine Translated Results on Translingual Information Retrieval.
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers, 2020

Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

The Paradigm Discovery Problem.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
A Survey of Opinion Mining in Arabic: A Comprehensive System Perspective Covering Challenges and Advances in Tools, Resources, Models, Applications, and Visualizations.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2019

Simple Automatic Post-editing for Arabic-Japanese Machine Translation.
CoRR, 2019

An Arabic Dependency Treebank in the Travel Domain.
CoRR, 2019

The MADAR Shared Task on Arabic Fine-Grained Dialect Identification.
Proceedings of the Fourth Arabic Natural Language Processing Workshop, 2019

Morphologically Annotated Corpora for Seven Arabic Dialects: Taizi, Sanaani, Najdi, Jordanian, Syrian, Iraqi and Moroccan.
Proceedings of the Fourth Arabic Natural Language Processing Workshop, 2019

ADIDA: Automatic Dialect Identification for Arabic.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

The Impact of Preprocessing on Arabic-English Statistical and Neural Machine Translation.
Proceedings of Machine Translation Summit XVII Volume 1: Research Track, 2019

Towards Variability Resistant Dialectal Speech Evaluation.
Proceedings of the Interspeech 2019, 2019

Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

The Effectiveness of Simple Hybrid Systems for Hypernym Discovery.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
A Bilingual Interactive Human Avatar Dialogue System.
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue, 2018

Noise-Robust Morphological Disambiguation for Dialectal Arabic.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

A Morphologically Annotated Corpus of Emirati Arabic.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Palmyra: A Platform Independent Dependency Annotation Tool for Morphologically Rich Languages.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

A Parallel Corpus of Arabic-Japanese News Articles.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Unified Guidelines and Resources for Arabic Dialect Orthography.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

The MADAR Arabic Dialect Corpus and Lexicon.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

A Leveled Reading Corpus of Modern Standard Arabic.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Utilizing Character and Word Embeddings for Text Normalization with Sequence-to-Sequence Models.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

A Cross-lingual Messenger with Keyword Searchable Phrases for the Travel Domain.
Proceedings of the COLING 2018, 2018

Improving Domain Independent Question Parsing with Synthetic Treebanks.
Proceedings of the Joint Workshop on Linguistic Annotation, 2018

Addressing Noise in Multidialectal Word Embeddings.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Feature Optimization for Predicting Readability of Arabic L1 and L2.
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications, 2018

2017
A Sentiment Treebank and Morphologically Enriched Recursive Deep Models for Effective Sentiment Analysis in Arabic.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2017

Optimizing Tokenization Choice for Machine Translation across Multiple Target Languages.
Prague Bull. Math. Linguistics, 2017

Curras: an annotated corpus for the Palestinian Arabic dialect.
Lang. Resour. Evaluation, 2017

Robust Dictionary Lookup in Multiple Noisy Orthographies.
Proceedings of the Third Arabic Natural Language Processing Workshop, 2017

Universal Dependencies for Arabic.
Proceedings of the Third Arabic Natural Language Processing Workshop, 2017

A Morphological Analyzer for Gulf Arabic Verbs.
Proceedings of the Third Arabic Natural Language Processing Workshop, 2017

A Characterization Study of Arabic Twitter Data with a Benchmarking for State-of-the-Art Opinion Mining Models.
Proceedings of the Third Arabic Natural Language Processing Workshop, 2017

OMAM at SemEval-2017 Task 4: English Sentiment Analysis with Conditional Random Fields.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

OMAM at SemEval-2017 Task 4: Evaluation of English State-of-the-Art Sentiment Analysis Models for Arabic and a New Topic-based Model.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

Low Resourced Machine Translation via Morpho-syntactic Modeling: The Case of Dialectal Arabic.
Proceedings of Machine Translation Summit XVI, Volume 1: Research Track, 2017

Don't Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

A Parallel Corpus for Evaluating Machine Translation between Arabic and European Languages.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Simplification of Arabic Masterpieces for Extensive Reading: A Project Overview.
Proceedings of the Third International Conference On Arabic Computational Linguistics, 2017

2016
Egyptian Arabic to English Statistical Machine Translation System for NIST OpenMT'2015.
CoRR, 2016

First Result on Arabic Neural Machine Translation.
CoRR, 2016

The Columbia University - New York University Abu Dhabi SIGMORPHON 2016 Morphological Reinflection Shared Task Submission.
Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, 2016

Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Arabic Corpora for Credibility Analysis.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Applying the Cognitive Machine Translation Evaluation Approach to Arabic.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

A Large Scale Corpus of Gulf Arabic.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

DALILA: The Dialectal Arabic Linguistic Learning Assistant.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Exploiting Arabic Diacritization for High Quality Automatic Annotation.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Morphologically Annotated Corpora and Morphological Analyzers for Moroccan and Sanaani Yemeni Arabic.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

SPLIT: Smart Preprocessing (Quasi) Language Independent Tool.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

CamelParser: A system for Arabic Syntactic Analysis and Morphological Disambiguation.
Proceedings of the COLING 2016, 2016

YAMAMA: Yet Another Multi-Dialect Arabic Morphological Analyzer.
Proceedings of the COLING 2016, 2016

Machine Translation Evaluation for Arabic using Morphologically-enriched Embeddings.
Proceedings of the COLING 2016, 2016

Creating Resources for Dialectal Arabic from a Single Annotation: A Case Study on Egyptian and Levantine.
Proceedings of the COLING 2016, 2016

Botta: An Arabic Dialect Chatbot.
Proceedings of the COLING 2016, 2016

Analysis of Foreign Language Teaching Methods: An Automatic Readability Approach.
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications, 2016

2015
A Conventional Orthography for Algerian Arabic.
Proceedings of the Second Workshop on Arabic Natural Language Processing, 2015

The Second QALB Shared Task on Automatic Text Correction for Arabic.
Proceedings of the Second Workshop on Arabic Natural Language Processing, 2015

POS-tagging of Tunisian Dialect Using Standard Arabic Resources and Tools.
Proceedings of the Second Workshop on Arabic Natural Language Processing, 2015

Annotating Targets of Opinions in Arabic using Crowdsourcing.
Proceedings of the Second Workshop on Arabic Natural Language Processing, 2015

Morphological constraints for phrase pivot statistical machine translation.
Proceedings of Machine Translation Summit XV: Papers, 2015

Improving Arabic Diacritization through Syntactic Analysis.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Predicting the Structure of Cooking Recipes.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Arabic Transliteration of Romanized Tunisian Dialect Text: A Preliminary Investigation.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2015

Correction Annotation for Non-Native Arabic Texts: Guidelines and Corpus.
Proceedings of The 9th Linguistic Annotation Workshop, 2015

2014
Linguistic Introduction: The Orthography, Morphology and Syntax of Semitic Languages.
Proceedings of the Natural Language Processing of Semitic Languages, 2014

ADAM: Analyzer for Dialectal Arabic Morphology.
J. King Saud Univ. Comput. Inf. Sci., 2014

arTenTen: Arabic Corpus and Word Sketches.
J. King Saud Univ. Comput. Inf. Sci., 2014

A Pipeline Approach to Supervised Error Correction for the QALB-2014 Shared Task.
Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing, 2014

The Columbia System in the QALB-2014 Shared Task on Arabic Error Correction.
Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing, 2014

The First QALB Shared Task on Automatic Text Correction for Arabic.
Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing, 2014

Domain and Dialect Adaptation for Machine Translation into Egyptian Arabic.
Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing, 2014

Building a Corpus for Palestinian Arabic: a Preliminary Study.
Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing, 2014

Transliteration of Arabizi into Arabic Orthography: Developing a Parallel Annotated Arabizi-Arabic Script SMS/Chat Corpus.
Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing, 2014

A Large Scale Arabic Sentiment Lexicon for Arabic Opinion Mining.
Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing, 2014

A Conventional Orthography for Tunisian Arabic.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Large Scale Arabic Error Annotation: Guidelines and Framework.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

MADAMIRA: A Fast, Comprehensive Tool for Morphological Analysis and Disambiguation of Arabic.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

A Corpus and Phonetic Dictionary for Tunisian Arabic Speech Recognition.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Developing an Egyptian Arabic Treebank: Impact of Dialectal Morphology on Annotation and Tool Development.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Tharwa: A Large Scale Dialectal Arabic - Standard Arabic - English Lexicon.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

A Multidialectal Parallel Corpus of Arabic.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Authorship Analysis of Inspire Magazine through Stylometric and Psychological Features.
Proceedings of the IEEE Joint Intelligence and Security Informatics Conference, 2014

Improving deep neural network acoustic modeling for audio corpus indexing under the IARPA babel program.
Proceedings of the INTERSPEECH 2014, 2014

Alignment symmetrisation optimization targeting phrase pivot statistical machine translation.
Proceedings of the 17th Annual conference of the European Association for Machine Translation, 2014

The Illinois-Columbia System in the CoNLL-2014 Shared Task.
Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task, 2014

Automatic Transliteration of Romanized Dialectal Arabic.
Proceedings of the Eighteenth Conference on Computational Natural Language Learning, 2014

Sentence Level Dialect Identification for Machine Translation System Selection.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Unsupervised Morphology-Based Vocabulary Expansion.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Generalized Character-Level Spelling Error Correction.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Foreign Words and the Automatic Processing of Arabic Social Media Text Written in Roman Script.
Proceedings of the First Workshop on Computational Approaches to Code Switching@EMNLP 2014, 2014

2013
Supervised collaboration for syntactic annotation of Quranic Arabic.
Lang. Resour. Evaluation, 2013

LDC Arabic Treebanks and Associated Corpora: Data Divisions Manual.
CoRR, 2013

Dependency Parsing of Modern Standard Arabic with Lexical and Inflectional Features.
Comput. Linguistics, 2013

Translating verbs between MSA and arabic dialects through deep morphological analysis (Un système de traduction de verbes entre arabe standard et arabe dialectal par analyse morphologique profonde) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2013

Dialectal Arabic to English Machine Translation: Pivoting through Modern Standard Arabic.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Morphological Analysis and Disambiguation for Dialectal Arabic.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Processing Spontaneous Orthography.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Automatic Morphological Enrichment of a Morphologically Underspecified Treebank.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

The Effects of Factorizing Root and Pattern Mapping in Bidirectional Tunisian - Standard Arabic Machine Translation.
Proceedings of Machine Translation Summit XIV: Papers, 2013

Orthographic and Morphological Processing for Persian-to-English Statistical Machine Translation.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

DIRA: Dialectal Arabic Information Retrieval Assistant.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

A Web-based Annotation Framework For Large-Scale Text Correction.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

Selective Combination of Pivot and Direct Statistical Machine Translation Models.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

Automatic Extraction of Morphological Lexicons from Morphologically Annotated Corpora.
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

Automatic Correction and Extension of Morphological Annotations.
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, 2013

Reranking with Linguistic and Semantic Features for Arabic Optical Character Recognition.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages.
Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages, 2013

SPMRL'13 Shared Task System: The CADIM Arabic Dependency Parser.
Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages, 2013

2012
Machine translation between Hebrew and Arabic.
Mach. Transl., 2012

Orthographic and morphological processing for English-Arabic statistical machine translation.
Mach. Transl., 2012

Special issue on Machine Translation for Arabic: Preface.
Mach. Transl., 2012

Improved Arabic-to-English statistical machine translation by reordering post-verbal subjects for word alignment.
Mach. Transl., 2012

A Morphological Analyzer for Egyptian Arabic.
Proceedings of the Twelfth Meeting of the Special Interest Group on Computational Morphology and Phonology, 2012

Conventional Orthography for Dialectal Arabic.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Rich Morphology Generation Using Statistical Machine Translation.
Proceedings of the INLG 2012 - Proceedings of the Seventh International Natural Language Generation Conference, 30 May 2012, 2012

Hebrew Morphological Preprocessing for Statistical Machine Translation.
Proceedings of the 16th Annual conference of the European Association for Machine Translation, 2012

Can Automatic Post-Editing Make MT More Meaningful?
Proceedings of the 16th Annual conference of the European Association for Machine Translation, 2012

Translate, Predict or Generate: Modeling Rich Morphology in Statistical Machine Translation.
Proceedings of the 16th Annual conference of the European Association for Machine Translation, 2012

Identifying Broken Plurals, Irregular Gender, and Rationality in Arabic Text.
Proceedings of the EACL 2012, 2012

Elissa: A Dialectal to Standard Arabic Machine Translation System.
Proceedings of the COLING 2012, 2012

2011
Filtering Antonymous, Trend-Contrasting, and Polarity-Dissimilar Distributional Paraphrases for Improving Statistical Machine Translation.
Proceedings of the Sixth Workshop on Statistical Machine Translation, 2011

Fuzzy Syntactic Reordering for Phrase-based Statistical Machine Translation.
Proceedings of the Sixth Workshop on Statistical Machine Translation, 2011

Automatic Error Analysis for Morphologically Rich Languages.
Proceedings of Machine Translation Summit XIII: Papers, 2011

One-Step Statistical Parsing of Hybrid Dependency-Constituency Syntactic Representations.
Proceedings of the 12th International Conference on Parsing Technologies, 2011

Fast Yet Rich Morphological Analysis.
Proceedings of the Finite-State Methods and Natural Language Processing, 2011

Improving Arabic Dependency Parsing with Form-based and Functional Morphological Features.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

Using Deep Morphology to Improve Automatic Error Detection in Arabic Handwriting Recognition.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

A Corpus for Modeling Morpho-Syntactic Agreement in Arabic: Gender, Number and Rationality.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011

2010
Introduction to Arabic Natural Language Processing
Synthesis Lectures on Human Language Technologies, Morgan & Claypool Publishers, ISBN: 978-3-031-02139-8, 2010

Interlingual annotation of parallel text corpora: a new framework for annotation and evaluation.
Nat. Lang. Eng., 2010

Reordering Matrix Post-verbal Subjects for Arabic-to-English SMT.
Proceedings of the Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2010

Morphological Annotation of Quranic Arabic.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Morphological Analysis and Generation of Arabic Nouns: A Morphemic Functional Approach.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Improving Arabic-to-English Statistical Machine Translation by Reordering Post-Verbal Subjects for Alignment.
Proceedings of the ACL 2010, 2010

Improving Arabic Dependency Parsing with Lexical and Inflectional Morphological Features.
Proceedings of the First Workshop on Statistical Parsing of Morphologically-Rich Languages, 2010

2009
Symbolic-to-statistical hybridization: extending generation-heavy machine translation.
Mach. Transl., 2009

Improving Arabic-Chinese Statistical Machine Translation using English as Pivot Language.
Proceedings of the Fourth Workshop on Statistical Machine Translation, 2009

Improving the Arabic Pronunciation Dictionary for Phone and Word Recognition with Linguistically-Based Pronunciation Rules.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009

CATiB: The Columbia Arabic Treebank.
Proceedings of the ACL 2009, 2009

Syntactic Reordering for English-Arabic Phrase-Based Machine Translation.
Proceedings of the Workshop on Computational Approaches to Semitic Languages, 2009

Spoken Arabic Dialect Identification Using Phonotactic Modeling.
Proceedings of the Workshop on Computational Approaches to Semitic Languages, 2009

2008
Using Shallow Syntax Information to Improve Word Alignment and Reordering for SMT.
Proceedings of the Third Workshop on Statistical Machine Translation, 2008

Identification of Naturally Occurring Numerical Expressions in Arabic.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

Improving NER in Arabic Using a Morphological Tagger.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

Automatic Learning of Morphological Variations for Handling Out-of-Vocabulary Terms in Urdu-English MT.
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers, 2008

Arabic Morphological Tagging, Diacritization, and Lemmatization Using Lexeme Models and Feature Ranking.
Proceedings of the ACL 2008, 2008

Four Techniques for Online Handling of Out-of-Vocabulary Words in Arabic-English Statistical Machine Translation.
Proceedings of the ACL 2008, 2008

2007
Arabic Diacritization through Full Morphological Tagging.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

Combination of Statistical Word Alignments Based on Multiple Preprocessing Schemes.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

Arabic Dialect Processing Tutorial.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

Semi-automatic error analysis for large-scale statistical machine translation.
Proceedings of Machine Translation Summit XI: Papers, 2007

Syntactic preprocessing for statistical machine translation.
Proceedings of Machine Translation Summit XI: Papers, 2007

Arabic diacritization in the context of statistical machine translation.
Proceedings of Machine Translation Summit XI: Papers, 2007

Determining Case in Arabic: Learning Complex Linguistic Behavior Requires Complex Linguistic Features.
Proceedings of the EMNLP-CoNLL 2007, 2007

2006
Arabic Preprocessing Schemes for Statistical Machine Translation.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2006

Parallel Syntactic Annotation of Multiple Languages.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Inter-annotator Agreement on a Multilingual Semantic Annotation Task.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Developing and Using a Pilot Dialectal Arabic Treebank.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Design, Construction and Validation of an Arabic-English Conceptual Interlingua for Cross-lingual Information Retrieval.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Parsing Arabic Dialects.
Proceedings of the EACL 2006, 2006

Challenges in Building an Arabic-English GHMT System with SMT Components.
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers, 2006

Combination of Arabic Preprocessing Schemes for Statistical Machine Translation.
Proceedings of the ACL 2006, 2006

MAGEAD: A Morphological Analyzer and Generator for the Arabic Dialects.
Proceedings of the ACL 2006, 2006

2005
Arabic Tokenization, Part-of-Speech Tagging and Morphological Disambiguation in One Fell Swoop.
Proceedings of the ACL 2005, 2005

Morphological Analysis and Generation for Arabic Dialects.
Proceedings of the Workshop on Computational Approaches to Semitic Languages, 2005

2004
Interlingual Annotation of Multilingual Text Corpora.
Proceedings of the Workshop Frontiers in Corpus Annotation@HLT-NAACL 2004, 2004

The Use of a Structural N-gram Language Model in Generation-Heavy Hybrid Machine Translation.
Proceedings of the Natural Language Generation, Third International Conference, 2004

Interlingual Annotation for MT Development.
Proceedings of the Machine Translation: From Real Users to Research, 2004

Multi-align: Combining Linguistic and Statistical Techniques to Improve Alignments for Adaptable MT.
Proceedings of the Machine Translation: From Real Users to Research, 2004

2003
Rapid porting of DUSTer to Hindi.
ACM Trans. Asian Lang. Inf. Process., 2003

Hybrid Natural Language Generation from Lexical Conceptual Structures.
Mach. Transl., 2003

A Categorial Variation Database for English.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, 2003

Matador: a large-scale Spanish-English GHMT system.
Proceedings of Machine Translation Summit IX: Papers, 2003

2002
Generation-Heavy Hybrid Machine Translation.
Proceedings of the International Natural Language Generation Conference, 2002

Handling Translation Divergences: Combining Statistical and Symbolic Techniques in Generation-Heavy Machine Translation.
Proceedings of the Machine Translation: From Research to Real Users, 2002

DUSTer: A Method for Unraveling Cross-Language Divergences for Statistical Word-Level Alignment.
Proceedings of the Machine Translation: From Research to Real Users, 2002

2001
Large scale language independent generation using thematic hierarchies.
Proceedings of Machine Translation Summit VIII, 2001

2000
Oxygen: A Language Independent Linearization Engine.
Proceedings of the Envisioning Machine Translation in the Information Future, 2000

1998
A Thematic Hierarchy for Efficient Generation from Lexical-Conceptual Structure.
Proceedings of the Machine Translation and the Information Soup, 1998

