Razvan Pascanu

Martin Jaggi

CoRR, 2024

No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO.

CoRR, 2024

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models.

George-Cristian Muraru

CoRR, 2024

Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons.

CoRR, 2024

Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models.

CoRR, 2024

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models.

George-Cristian Muraru

CoRR, 2024

Disentangling the Causes of Plasticity Loss in Neural Networks.

CoRR, 2024

Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem.

CoRR, 2024

Improving fine-grained understanding in image-text pre-training.

CoRR, 2024

2023

Discovering modular solutions that generalize compositionally.

CoRR, 2023

Continual Learning: Applications and the Road Forward.

CoRR, 2023

Uncovering mesa-optimization algorithms in Transformers.

Blaise Agüera y Arcas

Max Vladymyrov

Reza Babanezhad Harikandeh

João Sacramento

CoRR, 2023

On the Universality of Linear Recurrences Followed by Nonlinear Projections.

CoRR, 2023

Promoting Exploration in Memory-Augmented Adam using Critical Momenta.

Pranshu Malviya

Gonçalo Mordido

Aristide Baratin

CoRR, 2023

Towards Robust and Efficient Continual Language Learning.

CoRR, 2023

Kalman Filter for Online Classification of Non-Stationary Data.

CoRR, 2023

Towards Compute-Optimal Transfer Learning.

CoRR, 2023

Learning to Modulate pre-trained Models in RL.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Deep Reinforcement Learning with Plasticity Injection.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

The Tunnel Effect: Building Data Representations in Deep Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Latent Space Representations of Neural Algorithmic Reasoners.

Vladimir V. Mirjanic

Petar Velickovic

Proceedings of the Learning on Graphs Conference, 27-30 November 2023, Virtual Event, 2023

Asynchronous Algorithmic Alignment With Cocycles.

Proceedings of the Learning on Graphs Conference, 27-30 November 2023, Virtual Event, 2023

Resurrecting Recurrent Neural Networks for Long Sequences.

Proceedings of the International Conference on Machine Learning, 2023

Understanding Plasticity in Neural Networks.

Proceedings of the International Conference on Machine Learning, 2023

Pre-training via Denoising for Molecular Property Prediction.

Sheheryar Zaidi

Michael Schaarschmidt

James Martens

Hyunjik Kim

Alvaro Sanchez-Gonzalez

Peter W. Battaglia

Jonathan Godwin

Proceedings of the Eleventh International Conference on Learning Representations, 2023

SemPPL: Predicting Pseudo-Labels for Better Contrastive Representations.

Matko Bosnjak

Pierre Harvey Richemond

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Continually learning representations at scale.

Proceedings of the Conference on Lifelong Learning Agents, 2023

2022

An empirical study of implicit regularization in deep offline RL.

Trans. Mach. Learn. Res., 2022

Behavior Priors for Efficient Reinforcement Learning.

Arun Ahuja

Florina-Cristina Calnegru

Nicolas Heess

J. Mach. Learn. Res., 2022

NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research.

CoRR, 2022

Architecture Matters in Continual Learning.

CoRR, 2022

Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet?

CoRR, 2022

Disentangling Transfer in Continual Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Reasoning-Modulated Representations.

Proceedings of the Learning on Graphs Conference, 2022

The First Learning on Graphs Conference: Preface.

Proceedings of the Learning on Graphs Conference, 2022

Correlation Based Semantic Transfer with Application to Domain Adaptation.

John Shawe-Taylor

Iasonas Kokkinos

Proceedings of the Neural Information Processing - 29th International Conference, 2022

The CLRS Algorithmic Reasoning Benchmark.

Petar Velickovic

Proceedings of the International Conference on Machine Learning, 2022

Wide Neural Networks Forget Less Catastrophically.

Proceedings of the International Conference on Machine Learning, 2022

When Does Re-initialization Work?

Proceedings on "I Can't Believe It's Not Better!", 2022

Probing Transfer in Deep Reinforcement Learning without Task Engineering.

Proceedings of the Conference on Lifelong Learning Agents, 2022

Test Sample Accuracy Scales with Training Sample Density in Neural Networks.

Xu Ji

R. Devon Hjelm

Andrea Vedaldi

Proceedings of the Conference on Lifelong Learning Agents, 2022

2021

Wide Neural Networks Forget Less Catastrophically.

CoRR, 2021

Task-agnostic Continual Learning with Hybrid Probabilistic Models.

Polina Kirichenko

Mehrdad Farajtabar

Dushyant Rao

CoRR, 2021

Predicting Unreliable Predictions by Shattering a Neural Network.

CoRR, 2021

A study on the plasticity of neural networks.

CoRR, 2021

Drawing Multiple Augmentation Samples Per Image During Training Efficiently Decreases Test Error.

CoRR, 2021

Regularized Behavior Value Estimation.

Çaglar Gülçehre

Sergio Gómez Colmenarejo

CoRR, 2021

Continual World: A Robotic Benchmark For Continual Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Powerpropagation: A sparsity inducing weight reparameterisation.

Peter E. Latham

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On the Role of Optimization in Double Descent: A Least Squares Study.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

LiRo: Benchmark and leaderboard for Romanian language tasks.

Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective.

Proceedings of the 38th International Conference on Machine Learning, 2021

Linear Mode Connectivity in Multitask and Continual Learning.

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

BYOL works even without batch statistics.

CoRR, 2020

Temporal Difference Uncertainties as a Signal for Exploration.

CoRR, 2020

Pointer Graph Networks.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Understanding the Role of Training Regimes in Continual Learning.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Top-KAST: Top-K Always Sparse Training.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Stabilizing Transformers for Reinforcement Learning.

Max Jaderberg

Raphaël Lopez Kaufman

Proceedings of the 37th International Conference on Machine Learning, 2020

Improving the Gating Mechanism of Recurrent Neural Networks.

Proceedings of the 37th International Conference on Machine Learning, 2020

Functional Regularisation for Continual Learning with Gaussian Processes.

Michalis K. Titsias

Alexander G. de G. Matthews

Proceedings of the 8th International Conference on Learning Representations, 2020

Multiplicative Interactions and Where to Find Them.

Proceedings of the 8th International Conference on Learning Representations, 2020

Meta-Learning with Warped Gradient Descent.

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

A Deep Neural Network's Loss Surface Contains Every Low-dimensional Pattern.

Simon Osindero

Max Jaderberg

CoRR, 2019

Improving the Gating Mechanism of Recurrent Neural Networks.

CoRR, 2019

Meta-Learning with Warped Gradient Descent.

CoRR, 2019

Task Agnostic Continual Learning via Meta Learning.

CoRR, 2019

Meta-learning of Sequential Strategies.

CoRR, 2019

Ray Interference: a Source of Plateaus in Deep Reinforcement Learning.

CoRR, 2019

Exploiting Hierarchy for Learning and Transfer in KL-regularized RL.

CoRR, 2019

Functional Regularisation for Continual Learning using Gaussian Processes.

Michalis K. Titsias

Alexander G. de G. Matthews

CoRR, 2019

Continual Unsupervised Representation Learning.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Deep reinforcement learning with relational inductive biases.

Vinícius Flores Zambaldi

Proceedings of the 7th International Conference on Learning Representations, 2019

Meta-Learning with Latent Embedding Optimization.

Proceedings of the 7th International Conference on Learning Representations, 2019

Hyperbolic Attention Networks.

Proceedings of the 7th International Conference on Learning Representations, 2019

Information asymmetry in KL-regularized RL.

Alexandre Galashov

Nicolas Heess

Proceedings of the 7th International Conference on Learning Representations, 2019

A RAD approach to deep mixture models.

Laurent Dinh

Jascha Sohl-Dickstein

Hugo Larochelle

Proceedings of the Deep Generative Models for Highly Structured Data, 2019

Distilling Policy Distillation.

Simon Osindero

Grzegorz Swirszcz

Max Jaderberg

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018

Vector-based navigation using grid-like representations in artificial agents.

Nat., 2018

Adapting Auxiliary Losses Using Gradient Similarity.

Yunshu Du

CoRR, 2018

Relational Deep Reinforcement Learning.

Vinícius Flores Zambaldi

CoRR, 2018

Relational inductive biases, deep learning, and graph networks.

CoRR, 2018

Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery.

Thomas S. Stepleton

Will Dabney

Hubert Soyer

Rémi Munos

CoRR, 2018

Block Mean Approximation for Efficient Second Order Optimization.

CoRR, 2018

Learning Deep Generative Models of Graphs.

CoRR, 2018

Relational recurrent neural networks.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Progress & Compress: A scalable framework for continual learning.

Agnieszka Grabska-Barwinska

Wojciech Czarnecki

Jelena Luketina

Raia Hadsell

Proceedings of the 35th International Conference on Machine Learning, 2018

Been There, Done That: Meta-Learning with Episodic Recall.

Samuel Ritter

Jane X. Wang

Zeb Kurth-Nelson

Charles Blundell

Matthew M. Botvinick

Proceedings of the 35th International Conference on Machine Learning, 2018

Mix & Match Agent Curricula for Reinforcement Learning.

Proceedings of the 35th International Conference on Machine Learning, 2018

Memory-based Parameter Adaptation.

Pablo Sprechmann

Jack W. Rae

Alexander Pritzel

Proceedings of the 6th International Conference on Learning Representations, 2018

Model compression via distillation and quantization.

Antonio Polino

Dan Alistarh

Proceedings of the 6th International Conference on Learning Representations, 2018

2017

Imagination-Augmented Agents for Deep Reinforcement Learning.

Danilo Jimenez Rezende

CoRR, 2017

Visual Interaction Networks.

CoRR, 2017

Learning model-based planning from scratch.

CoRR, 2017

Visual Interaction Networks: Learning a Physics Simulator from Video.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Distral: Robust multitask reinforcement learning.

Victor Bapst

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

A simple neural network module for relational reasoning.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Imagination-Augmented Agents for Deep Reinforcement Learning.

Danilo Jimenez Rezende

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Sobolev Training for Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Sharp Minima Can Generalize For Deep Nets.

Proceedings of the 34th International Conference on Machine Learning, 2017

Discovering objects and their relations from entangled scene representations.

Proceedings of the 5th International Conference on Learning Representations, 2017

Learning to Navigate in Complex Environments.

Proceedings of the 5th International Conference on Learning Representations, 2017

Metacontrol for Adaptive Imagination-Based Optimization.

Proceedings of the 5th International Conference on Learning Representations, 2017

Sim-to-Real Robot Learning from Pixels with Progressive Nets.

Proceedings of the 1st Annual Conference on Robot Learning, CoRL 2017, Mountain View, 2017

2016

Local minima in training of deep networks.

Grzegorz Swirszcz

Agnieszka Grabska-Barwinska

CoRR, 2016

Progressive Neural Networks.

CoRR, 2016

Policy Distillation.

Andrei A. Rusu

Sergio Gómez Colmenarejo

Proceedings of the 4th International Conference on Learning Representations, 2016

Overcoming catastrophic forgetting in neural networks.

CoRR, 2016

Theano: A Python framework for fast computation of mathematical expressions.

Xavier Bouthillier

Alexandre de Brébisson

Samira Ebrahimi Kahou

Pierre-Antoine Manzagol

Christopher Joseph Pal

S. Ramana Subramanyam

CoRR, 2016

Interaction Networks for Learning about Objects, Relations and Physics.

Peter W. Battaglia

Matthew Lai

Danilo Jimenez Rezende

Koray Kavukcuoglu

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

2015

Natural Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Malware classification with recurrent networks.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Revisiting Natural Gradient for Deep Networks.

Proceedings of the 2nd International Conference on Learning Representations, 2014

On the number of inference regions of deep feed forward networks with piece-wise linear activations.

Guido Montúfar

Proceedings of the 2nd International Conference on Learning Representations, 2014

How to Construct Deep Recurrent Neural Networks.

Proceedings of the 2nd International Conference on Learning Representations, 2014

On the saddle point problem for non-convex optimization.

CoRR, 2014

Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks.

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

On the Number of Linear Regions of Deep Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization.

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013

Natural Gradient Revisited.

Proceedings of the 1st International Conference on Learning Representations, 2013

Metric-Free Natural Gradient for Joint-Training of Boltzmann Machines.

Proceedings of the 1st International Conference on Learning Representations, 2013

Learned-norm pooling for deep neural networks.

CoRR, 2013

Pylearn2: a machine learning research library.

CoRR, 2013

On the difficulty of training recurrent neural networks.

Tomás Mikolov

Proceedings of the 30th International Conference on Machine Learning, 2013

Combining modality specific deep neural networks for emotion recognition in video.

Proceedings of the 2013 International Conference on Multimodal Interaction, 2013

Advances in optimizing recurrent networks.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Learning Algorithms for the Classification Restricted Boltzmann Machine.

J. Mach. Learn. Res., 2012

Theano: new features and speed improvements.

CoRR, 2012

Understanding the exploding gradient problem.

Tomás Mikolov

CoRR, 2012

2011

Contextual tag inference.

ACM Trans. Multim. Comput. Commun. Appl., 2011

A neurodynamical model for working memory.

Herbert Jaeger

Neural Networks, 2011

Deep Learners Benefit More from Out-of-Distribution Examples.

Frédéric Bastien

Arnaud Bergeron

Sylvain Pannetier Lebeuf

Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

Autotagging music with conditional restricted Boltzmann machines.

CoRR, 2011

2010

Deep Self-Taught Learning for Handwritten Character Recognition.

Frédéric Bastien

Arnaud Bergeron