Alexander M. Rush

ORCID: 0000-0002-9900-1606

According to our database, Alexander M. Rush authored at least 140 papers between 2006 and 2024.

Bibliography

2024
MambaByte: Token-free Selective State Space Model.
CoRR, 2024

2023
A 16-nm SoC for Noise-Robust Speech and NLP Edge AI Inference With Bayesian Sound Source Separation and Attention-Based DNNs.
IEEE J. Solid State Circuits, February, 2023

End-to-end learning of multiple sequence alignments with differentiable Smith-Waterman.
Bioinform., January, 2023

Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models.
IEEE Trans. Vis. Comput. Graph., 2023

Named Tensor Notation.
Trans. Mach. Learn. Res., 2023

Diffusion Models Without Attention.
CoRR, 2023

Language Model Inversion.
CoRR, 2023

On What Basis? Predicting Text Preference Via Structured Comparative Reasoning.
CoRR, 2023

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling.
CoRR, 2023

Zephyr: Direct Distillation of LM Alignment.
CoRR, 2023

Guess & Sketch: Language Model Guided Transpilation.
CoRR, 2023

OBELISC: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents.
CoRR, 2023

Teal: Learning-Accelerated Optimization of WAN Traffic Engineering.
Proceedings of the ACM SIGCOMM 2023 Conference, 2023

Scaling Data-Constrained Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A 12nm 18.1TFLOPs/W Sparse Transformer Processor with Entropy-Based Early Exit, Mixed-Precision Predication and Fine-Grained Power Management.
Proceedings of the IEEE International Solid-State Circuits Conference, 2023

Markup-to-Image Diffusion Models with Scheduled Sampling.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Hop, Union, Generate: Explainable Multi-hop Reasoning without Rationale Supervision.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Pretraining Without Attention.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Tree Prompting: Efficient Task Adaptation without Fine-Tuning.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

MiniChain: A Small Library for Coding with Large Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Text Embeddings Reveal (Almost) As Much As Text.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Symbolic Planning and Code Generation for Grounded Dialogue.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Explaining Data Patterns in Natural Language with Language Models.
Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, 2023

Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
GenNI: Human-AI Collaboration for Data-Backed Text Generation.
IEEE Trans. Vis. Comput. Graph., 2022

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
CoRR, 2022

Teal: Learning-Accelerated Optimization of Traffic Engineering.
CoRR, 2022

ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition.
CoRR, 2022

Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements.
CoRR, 2022

Explaining Patterns in Data with Language Models via Interpretable Autoprompting.
CoRR, 2022

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts.
CoRR, 2022


Unsupervised Text Deidentification.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Model Criticism for Long-Form Text Generation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Xatu: boosting existing DDoS detection systems using auxiliary signals.
Proceedings of the 18th International Conference on emerging Networking EXperiments and Technologies, 2022


2021
Multitask Prompted Training Enables Zero-Shot Task Generalization.
CoRR, 2021

Low-Complexity Probing via Finding Subnetworks.
CoRR, 2021

Low-Rank Constraints for Fast Inference in Structured Models.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

How many data points is a prompt worth?
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Low-Complexity Probing via Finding Subnetworks.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Template Filling with Generative Transformers.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Developmental Stage Classification of Embryos Using Two-Stream Neural Network with Linear-Chain Conditional Random Field.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021

9.8 A 25mm² SoC for IoT Devices with 18ms Noise-Robust Speech-to-Text Latency via Bayesian Speech Denoising and Attention-Based Sequence-to-Sequence DNN Speech Recognition in 16nm FinFET.
Proceedings of the IEEE International Solid-State Circuits Conference, 2021

Learning from others' mistakes: Avoiding dataset biases without modeling them.
Proceedings of the 9th International Conference on Learning Representations, 2021

SM6: A 16nm System-on-Chip for Accurate and Noise-Robust Attention-Based NLP Applications: The 33rd Hot Chips Symposium - August 22-24, 2021.
Proceedings of the IEEE Hot Chips 33 Symposium, 2021

Rationales for Sequential Predictions.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021


Block Pruning For Faster Transformers.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Sequence-to-Lattice Models for Fast Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

GRIT: Generative Role-filler Transformers for Document-level Event Entity Extraction.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Parameter-Efficient Transfer Learning with Diff Pruning.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Visual Interaction with Deep Learning Models through Collaborative Semantic Inference.
IEEE Trans. Vis. Comput. Graph., 2020

LAN: A Materials Notation for Two-Dimensional Layered Assemblies.
J. Chem. Inf. Model., 2020

EdgeBERT: Optimizing On-Chip Inference for Multi-Task NLP.
CoRR, 2020

Pre-trained Summarization Distillation.
CoRR, 2020

Document-level Event-based Extraction Using Generative Template-filling Transformers.
CoRR, 2020

MiniConf - A Virtual Conference Framework.
CoRR, 2020

Automating Botnet Detection with Graph Neural Networks.
CoRR, 2020

Movement Pruning: Adaptive Sparsity by Fine-Tuning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Latent Template Induction with Gumbel-CRFs.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Cascaded Text Generation with Markov Transformers.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Improving Event Duration Prediction via Time-aware Pre-training.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Transformers: State-of-the-Art Natural Language Processing.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2020

Adversarial Semantic Collisions.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Sequence-Level Mixed Sample Data Augmentation.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Scaling Hidden Markov Language Models.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Algorithm-Hardware Co-Design of Adaptive Floating-Point Encodings for Resilient Deep Learning Inference.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Torch-Struct: Deep Structured Prediction Library.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

Posterior Control of Blackbox Generation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

What is Learned in Visually Grounded Neural Syntax Acquisition.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Seq2seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models.
IEEE Trans. Vis. Comput. Graph., 2019

A Hierarchy of Graph Neural Networks Based on Learnable Local Features.
CoRR, 2019

AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference.
CoRR, 2019

Encoder-Agnostic Adaptation for Conditional Language Generation.
CoRR, 2019

On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference.
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics, 2019

Unsupervised Recurrent Neural Network Grammars.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Generating Abstractive Summaries with Finetuned Language Models.
Proceedings of the 12th International Conference on Natural Language Generation, 2019

Latent Normalizing Flows for Discrete Sequences.
Proceedings of the 36th International Conference on Machine Learning, 2019

Tensor Variable Elimination for Plated Factor Graphs.
Proceedings of the 36th International Conference on Machine Learning, 2019

Neural Linguistic Steganography.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Commonsense Knowledge Mining from Pretrained Models.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Avoiding Latent Variable Collapse with Generative Skip Models.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Simple Unsupervised Summarization by Contextual Matching.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Compound Probabilistic Context-Free Grammars for Grammar Induction.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

GLTR: Statistical Detection and Visualization of Generated Text.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Don't Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

MASR: A Modular Accelerator for Sparse RNNs.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
LSTMVis: A Tool for Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks.
IEEE Trans. Vis. Comput. Graph., 2018

A Tutorial on Deep Latent Variable Models of Natural Language.
CoRR, 2018

Latent Alignment and Variational Attention.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

End-to-End Content and Plan Selection for Data-to-Text Generation.
Proceedings of the 11th International Conference on Natural Language Generation, 2018

Adversarially Regularized Autoencoders.
Proceedings of the 35th International Conference on Machine Learning, 2018

Semi-Amortized Variational Autoencoders.
Proceedings of the 35th International Conference on Machine Learning, 2018

Weightless: Lossy weight encoding for deep neural network compression.
Proceedings of the 6th International Conference on Learning Representations, 2018

Learning Neural Templates for Text Generation.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Debugging Sequence-to-Sequence Models with Seq2Seq-Vis.
Proceedings of the Workshop: Analyzing and Interpreting Neural Networks for NLP, 2018

Training for Diversity in Image Paragraph Captioning.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Entity Tracking Improves Cloze-style Reading Comprehension.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Bottom-Up Abstractive Summarization.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

OpenNMT: Neural Machine Translation Toolkit.
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas, 2018

OpenNMT System Description for WNMT 2018: 800 words/sec on a single-core CPU.
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, 2018

2017
OpenNMT: Open-source Toolkit for Neural Machine Translation.
CoRR, 2017

Adversarially Regularized Autoencoders for Generating Discrete Structures.
CoRR, 2017

Image-to-Markup Generation with Coarse-to-Fine Attention.
Proceedings of the 34th International Conference on Machine Learning, 2017

Lie-Access Neural Turing Machines.
Proceedings of the 5th International Conference on Learning Representations, 2017

Structured Attention Networks.
Proceedings of the 5th International Conference on Learning Representations, 2017

Challenges in Data-to-Document Generation.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Adapting Sequence Models for Sentence Correction.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Coarse-to-Fine Attention Models for Document Summarization.
Proceedings of the Workshop on New Frontiers in Summarization, 2017

OpenNMT: Open-Source Toolkit for Neural Machine Translation.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks.
CoRR, 2016

What You Get Is What You See: A Visual Markup Decompiler.
CoRR, 2016

Antecedent Prediction Without a Pipeline.
Proceedings of the Workshop on Coreference Resolution Beyond OntoNotes, 2016

Learning Global Features for Coreference Resolution.
Proceedings of the NAACL HLT 2016, 2016

Abstractive Sentence Summarization with Attentive Recurrent Neural Networks.
Proceedings of the NAACL HLT 2016, 2016

Sequence-to-Sequence Learning as Beam-Search Optimization.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Word Ordering Without Syntax.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

An Embedding Model for Predicting Roll-Call Votes.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Sequence-Level Knowledge Distillation.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction.
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications, 2016

Character-Aware Neural Language Models.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Transforming Dependencies into Phrase Structures.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

A Fast Variational Approach for Learning Markov Random Field Language Models.
Proceedings of the 32nd International Conference on Machine Learning, 2015

A Neural Attention Model for Abstractive Sentence Summarization.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Learning Anaphoricity and Antecedent Ranking Features for Coreference Resolution.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Lagrangian relaxation for natural language decoding.
PhD thesis, 2014

A Constrained Viterbi Relaxation for Bidirectional Word Alignment.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2013
Optimal Beam Search for Machine Translation.
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

Spectral Learning of Refinement HMMs.
Proceedings of the Seventeenth Conference on Computational Natural Language Learning, 2013

2012
A Tutorial on Dual Decomposition and Lagrangian Relaxation for Inference in Natural Language Processing.
J. Artif. Intell. Res., 2012

Vine Pruning for Efficient Multi-Pass Dependency Parsing.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2012

Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

2011
Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

Dual Decomposition for Natural Language Processing.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011

2010
On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing.
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010

Dual Decomposition for Parsing with Non-Projective Head Automata.
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010

2006
Induction of Probabilistic Synchronous Tree-Insertion Grammars for Machine Translation.
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers, 2006

