James Martens

According to our database, James Martens authored at least 38 papers between 2010 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

Orthogonal Self-Attention.

,

CoRR, February, 2026

2025

Cutting the Skip: Training Residual-Free Transformers.

,

,

,

,

Peyman Moghadam

,

,

Hemanth Saratchandran

,

CoRR, October, 2025

Optimizers Qualitatively Alter Solutions And We Should Leverage This.

,

,

Ionut-Vlad Modoranu

,

Naima Elosegui Borras

,

,

Petar Velickovic

,

,

,

CoRR, July, 2025

2024

Normalization and effective learning rates in reinforcement learning.

,

,

Khimya Khetarpal

,

,

Hado Philip van Hasselt

,

,

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Disentangling the Causes of Plasticity Loss in Neural Networks.

,

,

Khimya Khetarpal

,

Hado van Hasselt

,

,

,

Proceedings of the Conference on Lifelong Learning Agents, 2024

2023

Pre-training via Denoising for Molecular Property Prediction.

Sheheryar Zaidi

,

Michael Schaarschmidt

,

,

,

,

Alvaro Sanchez-Gonzalez

,

Peter W. Battaglia

,

,

Jonathan Godwin

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation.

,

,

,

Aleksandar Botev

,

,

Samuel L. Smith

,

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers.

,

Aleksandar Botev

,

Proceedings of the Tenth International Conference on Learning Representations, 2022

2021

Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping.

,

,

Guillaume Desjardins

,

Grzegorz Swirszcz

,

Valentin Dalibard

,

Jascha Sohl-Dickstein

,

Samuel S. Schoenholz

CoRR, 2021

On the validity of kernel approximations for orthogonally-initialized neural networks.

CoRR, 2021

2020

New Insights and Perspectives on the Natural Gradient Method.

J. Mach. Learn. Res., 2020

Blockchain-based Verifiable Credential Sharing with Selective Disclosure.

,

,

,

,

Salil S. Kanhere

Proceedings of the 19th IEEE International Conference on Trust, 2020

2019

Differentiable Game Mechanics.

Alistair Letcher

,

,

Sébastien Racanière

,

,

Jakob N. Foerster

,

,

J. Mach. Learn. Res., 2019

Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks.

,

,

Roger B. Grosse

CoRR, 2019

On the Variance of Unbiased Online Recurrent Optimization.

,

CoRR, 2019

Fast Convergence of Natural Gradient Descent for Over-Parameterized Neural Networks.

,

,

Roger B. Grosse

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model.

,

,

,

,

Sushant Sachdeva

,

,

Christopher J. Shallue

,

Roger B. Grosse

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Adversarial Robustness through Local Linearization.

,

,

,

,

Krishnamurthy Dvijotham

,

Alhussein Fawzi

,

,

Robert Stanforth

,

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

2018

The Mechanics of n-Player Differentiable Games.

,

Sébastien Racanière

,

,

Jakob N. Foerster

,

,

Proceedings of the 35th International Conference on Machine Learning, 2018

Stochastic Gradient Langevin dynamics that Exploit Neural Network Structure.

,

,

Roger B. Grosse

,

,

,

Proceedings of the 6th International Conference on Learning Representations, 2018

Kronecker-factored Curvature Approximations for Recurrent Neural Networks.

,

,

Proceedings of the 6th International Conference on Learning Representations, 2018

2017

Distributed Second-Order Optimization using Kronecker-Factored Approximations.

,

Roger B. Grosse

,

Proceedings of the 5th International Conference on Learning Representations, 2017

2016

Second-order Optimization for Neural Networks.

PhD thesis, 2016

A Kronecker-factored approximate Fisher matrix for convolution layers.

Roger B. Grosse

,

Proceedings of the 33rd International Conference on Machine Learning, 2016

2015

Adding Gradient Noise Improves Learning for Very Deep Networks.

Arvind Neelakantan

,

,

,

,

,

,

CoRR, 2015

Optimizing Neural Networks with Kronecker-factored Approximate Curvature.

,

Roger B. Grosse

Proceedings of the 32nd International Conference on Machine Learning, 2015

2014

On the Expressive Efficiency of Sum Product Networks.

,

Venkatesh Medabalimi

CoRR, 2014

New perspectives on the natural gradient method.

CoRR, 2014

2013

On the Expressive Power of Restricted Boltzmann Machines.

,

Arkadev Chattopadhyay

,

Toniann Pitassi

,

Richard S. Zemel

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

On the importance of initialization and momentum in deep learning.

,

,

,

Geoffrey E. Hinton

Proceedings of the 30th International Conference on Machine Learning, 2013

2012

Training Deep and Recurrent Networks with Hessian-Free Optimization.

,

Proceedings of the Neural Networks: Tricks of the Trade - Second Edition, 2012

Estimating the Hessian by Back-propagating Curvature.

,

,

Proceedings of the 29th International Conference on Machine Learning, 2012

2011

Normalization for probabilistic inference with neurons.

Chris Eliasmith

,

Biol. Cybern., 2011

Generating Text with Recurrent Neural Networks.

,

,

Geoffrey E. Hinton

Proceedings of the 28th International Conference on Machine Learning, 2011

Learning Recurrent Neural Networks with Hessian-Free Optimization.

,

Proceedings of the 28th International Conference on Machine Learning, 2011

2010

Parallelizable Sampling of Markov Random Fields.

,

Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Learning the Linear Dynamical System with ASOS.

Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Deep learning via Hessian-free optimization.

Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010
