Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Understanding and Improving Length Generalization in Recurrent Models.

Ricardo Buitrago Ruiz

Albert Gu

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism.

Aviv Bick

Eric P. Xing

Albert Gu

Proceedings of the Forty-second International Conference on Machine Learning, 2025

On the Benefits of Memory for Modeling Time-Dependent PDEs.

Ricardo Buitrago Ruiz

Tanya Marwah

Albert Gu

Andrej Risteski

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers.

CoRR, 2024

An Empirical Study of Mamba-based Language Models.

CoRR, 2024

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models.

George-Cristian Muraru

CoRR, 2024

Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality.

Tri Dao

Albert Gu

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Augmenting Conformers With Structured State-Space Sequence Models For Online Speech Recognition.

Krzysztof Choromanski

Tara N. Sainath

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

Mamba: Linear-Time Sequence Modeling with Selective State Spaces.

Albert Gu

Tri Dao

CoRR, 2023

Augmenting conformers with structured state space models for online speech recognition.

Krzysztof Choromanski

Tara N. Sainath

CoRR, 2023

Structured State Space Models for In-Context Reinforcement Learning.

Feryal M. P. Behbahani

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Resurrecting Recurrent Neural Networks for Long Sequences.

Proceedings of the International Conference on Machine Learning, 2023

Modelling Long Range Dependencies in $N$D: From Task-Specific to a General Purpose CNN.

Jakub Mikolaj Tomczak

Mark Hoogendoorn

Jan-Jakob Sonke

Proceedings of the Eleventh International Conference on Learning Representations, 2023

How to Train your HIPPO: State Space Models with Generalized Orthogonal Basis Projections.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Pretraining Without Attention.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

2022

S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces.

CoRR, 2022

Towards a General Purpose CNN for Long Range Dependencies in ND.

CoRR, 2022

S4ND: Modeling Images and Videos as Multidimensional Signals with State Spaces.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

On the Parameterization and Initialization of Diagonal State Space Models.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Diagonal State Spaces are as Effective as Structured State Spaces.

Ankit Gupta

Albert Gu

Jonathan Berant

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

It's Raw! Audio Generation with State-Space Models.

Proceedings of the International Conference on Machine Learning, 2022

Efficiently Modeling Long Sequences with Structured State Spaces.

Albert Gu

Karan Goel

Christopher Ré

Proceedings of the Tenth International Conference on Learning Representations, 2022

2021

Combining Recurrent, Convolutional, and Continuous-time Models with Linear State Space Layers.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Catformer: Designing Stable Transformers via Sensitivity Analysis.

Jared Quincy Davis

Albert Gu

Krzysztof Choromanski

Proceedings of the 38th International Conference on Machine Learning, 2021

HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections.

Proceedings of the 38th International Conference on Machine Learning, 2021

Model Patching: Closing the Subgroup Performance Gap with Data Augmentation.

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

HiPPO: Recurrent Memory with Optimal Polynomial Projections.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Improving the Gating Mechanism of Recurrent Neural Networks.

Proceedings of the 37th International Conference on Machine Learning, 2020

Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps.

Proceedings of the 8th International Conference on Learning Representations, 2020

Sparse Recovery for Orthogonal Polynomial Transforms.

Proceedings of the 47th International Colloquium on Automata, Languages, and Programming, 2020

2019

Improving the Gating Mechanism of Recurrent Neural Networks.

CoRR, 2019

A Kernel Theory of Modern Data Augmentation.

Proceedings of the 36th International Conference on Machine Learning, 2019

Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations.

Proceedings of the 36th International Conference on Machine Learning, 2019

Learning Mixed-Curvature Representations in Product Spaces.

Proceedings of the 7th International Conference on Learning Representations, 2019

2018

A Two-pronged Progress in Structured Dense Matrix Vector Multiplication.

Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, 2018

Learning Compressed Transforms with Low Displacement Rank.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Representation Tradeoffs for Hyperbolic Embeddings.

Proceedings of the 35th International Conference on Machine Learning, 2018

Learning Invariance with Compact Transforms.

Proceedings of the 6th International Conference on Learning Representations, 2018

2016

Recurrence Width for Structured Dense Matrix Vector Multiplication.

CoRR, 2016

2015

Sprague-Grundy Values of the $\mathcal{R}$-Wythoff Game.

Albert Gu

Electron. J. Comb., 2015

2013

The power of deferral: maintaining a constant-competitive Steiner tree online.

Albert Gu

Anupam Gupta

Amit Kumar

Proceedings of the Symposium on Theory of Computing Conference, 2013

Albert Gu
