Yoshua Bengio

According to our database, Yoshua Bengio authored at least 615 papers between 1988 and 2020.

Awards

Turing Award recipient

Turing Award 2018, "For conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing." Awarded to Yoshua Bengio, Geoffrey E. Hinton, and Yann LeCun.

Bibliography

2020
On the Morality of Artificial Intelligence [Commentary].
IEEE Technol. Soc. Mag., 2020

Toward Training Recurrent Neural Networks for Lifelong Learning.
Neural Computation, 2020

Deriving Differential Target Propagation from Iterating Approximate Inverses.
CoRR, 2020

BabyAI 1.1.
CoRR, 2020

Revisiting Fundamentals of Experience Replay.
CoRR, 2020

S2RMs: Spatially Structured Recurrent Modules.
CoRR, 2020

Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules.
CoRR, 2020

Object Files and Schemata: Factorizing Declarative and Procedural Knowledge in Dynamical Systems.
CoRR, 2020

Hybrid Models for Learning to Branch.
CoRR, 2020

Rethinking Distributional Matching Based Domain Adaptation.
CoRR, 2020

Image-to-image Mapping with Many Domains by Sparse Attribute Transfer.
CoRR, 2020

HNHN: Hypergraph Networks with Hyperedge Neurons.
CoRR, 2020

Untangling tradeoffs between recurrence and self-attention in neural networks.
CoRR, 2020

Learning Causal Models Online.
CoRR, 2020

Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing its Gradient Estimator Bias.
CoRR, 2020

Training End-to-End Analog Neural Networks with Equilibrium Propagation.
CoRR, 2020

Predicting COVID-19 Pneumonia Severity on Chest X-ray with Deep Learning.
CoRR, 2020

An Analysis of the Adaptation Speed of Causal Models.
CoRR, 2020

COVI White Paper.
CoRR, 2020

Continual Weight Updates and Convolutional Architectures for Equilibrium Propagation.
CoRR, 2020

Equilibrium Propagation with Continual Weight Updates.
CoRR, 2020

DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning.
CoRR, 2020

Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning.
CoRR, 2020

Experience Grounds Language.
CoRR, 2020

Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims.
CoRR, 2020

Object-Centric Image Generation from Layouts.
CoRR, 2020

Your GAN is Secretly an Energy-based Model and You Should use Discriminator Driven Latent Sampling.
CoRR, 2020

Continuous Domain Adaptation with Variational Domain-Agnostic Feature Replay.
CoRR, 2020

Benchmarking Graph Neural Networks.
CoRR, 2020

On Catastrophic Interference in Atari 2600 Games.
CoRR, 2020

Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning.
CoRR, 2020

HighRes-net: Recursive Fusion for Multi-Frame Super-Resolution of Satellite Imagery.
CoRR, 2020

Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies.
CoRR, 2020

BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization.
CoRR, 2020

Meta-learning framework with applications to zero-shot time-series forecasting.
CoRR, 2020

Combating False Negatives in Adversarial Imitation Learning.
CoRR, 2020

Using Simulated Data to Generate Images of Climate Change.
CoRR, 2020

Universal Successor Features for Transfer Reinforcement Learning.
CoRR, 2020

Learning from Learning Machines: Optimisation, Rules, and Social Norms.
CoRR, 2020

Learning the Arrow of Time for Problems in Reinforcement Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

N-BEATS: Neural basis expansion analysis for interpretable time series forecasting.
Proceedings of the 8th International Conference on Learning Representations, 2020

Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives.
Proceedings of the 8th International Conference on Learning Representations, 2020

The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget.
Proceedings of the 8th International Conference on Learning Representations, 2020

A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms.
Proceedings of the 8th International Conference on Learning Representations, 2020

Multi-Task Self-Supervised Learning for Robust Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Multi-Image Super-Resolution for Remote Sensing using Deep Recurrent Networks.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

On the interplay between noise and curvature and its effect on optimization and generalization.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM.
Proceedings of the Artificial Intelligence in Education - 21st International Conference, 2020

Compositional Generalization by Factorizing Alignment and Translation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, 2020

Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Combating False Negatives in Adversarial Imitation Learning (Student Abstract).
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Depth with nonlinearity creates no bad local minima in ResNets.
Neural Networks, 2019

Equivalence of Equilibrium Propagation and Recurrent Backpropagation.
Neural Computation, 2019

Gated Orthogonal Recurrent Units: On Learning to Forget.
Neural Computation, 2019

On the Morality of Artificial Intelligence.
CoRR, 2019

A learning-based algorithm to quickly compute good primal solutions for Stochastic Integer Programs.
CoRR, 2019

Joint Learning of Generative Translator and Classifier for Visually Similar Classes.
CoRR, 2019

CLOSURE: Assessing Systematic Generalization of CLEVR Models.
CoRR, 2019

The effect of task and training on intermediate representations in convolutional neural networks revealed with modified RV similarity analysis.
CoRR, 2019

Applying Knowledge Transfer for Water Body Segmentation in Peru.
CoRR, 2019

Automated curriculum generation for Policy Gradients from Demonstrations.
CoRR, 2019

Ghost Units Yield Biologically Plausible Backprop in Deep Neural Networks.
CoRR, 2019

Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models.
CoRR, 2019

Small-GAN: Speeding Up GAN Training Using Core-sets.
CoRR, 2019

Establishing an Evaluation Metric to Quantify Climate Change Image Realism.
CoRR, 2019

Icentia11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery.
CoRR, 2019

Predicting ice flow using machine learning.
CoRR, 2019

Learning Neural Causal Models from Unknown Interventions.
CoRR, 2019

Underwhelming Generalization Improvements From Controlling Feature Attribution.
CoRR, 2019

GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning.
CoRR, 2019

Avoidance Learning Using Observational Reinforcement Learning.
CoRR, 2019

Recurrent Independent Mechanisms.
CoRR, 2019

Torchmeta: A Meta-Learning library for PyTorch.
CoRR, 2019

Data-Driven Approach to Encoding and Decoding 3-D Crystal Structures.
CoRR, 2019

Weakly-supervised Knowledge Graph Alignment with Adversarial Learning.
CoRR, 2019

Learning the Arrow of Time.
CoRR, 2019

Information matrices and generalization.
CoRR, 2019

Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Accuracy.
CoRR, 2019

Conditional Computation for Continual Learning.
CoRR, 2019

Tackling Climate Change with Machine Learning.
CoRR, 2019

Learning Powerful Policies by Using Consistent Dynamics Model.
CoRR, 2019

Attention Based Pruning for Shift Networks.
CoRR, 2019

State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations.
CoRR, 2019

The Journey is the Reward: Unsupervised Learning of Influential Trajectories.
CoRR, 2019

Visualizing the Consequences of Climate Change Using Cycle-Consistent Adversarial Networks.
CoRR, 2019

Compositional generalization in a deep seq2seq model by separating syntax and semantics.
CoRR, 2019

GradMask: Reduce Overfitting by Regularizing Saliency.
CoRR, 2019

Reinforced Imitation in Heterogeneous Action Space.
CoRR, 2019

Towards Standardization of Data Licenses: The Montreal Data License.
CoRR, 2019

Online continual learning with no task boundaries.
CoRR, 2019

Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future.
CoRR, 2019

Hyperbolic Discounting and Learning over Multiple Horizons.
CoRR, 2019

Maximum Entropy Generators for Energy-Based Models.
CoRR, 2019

The Benefits of Over-parameterization at Initialization in Deep ReLU Networks.
CoRR, 2019

Wasserstein Dependency Measure for Representation Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Variational Temporal Abstraction.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Updates of Equilibrium Prop Match Gradients of Backprop Through Time in an RNN with Static Input.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

On Adversarial Mixup Resynthesis.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

How to Initialize your Network? Robust Initialization for WeightNorm & ResNets.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Unsupervised State Representation Learning in Atari.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Gradient based sample selection for online continual learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

InfoMask: Masked Variational Latent Representation to Localize Chest Disease.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Learning Speaker Representations with Mutual Information.
Proceedings of the Interspeech 2019, 2019

Learning Problem-Agnostic Speech Representations from Multiple Self-Supervised Tasks.
Proceedings of the Interspeech 2019, 2019

Speech Model Pre-Training for End-to-End Spoken Language Understanding.
Proceedings of the Interspeech 2019, 2019

Interpolation Consistency Training for Semi-supervised Learning.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

A Data-Efficient Framework for Training and Sim-to-Real Transfer of Navigation Policies.
Proceedings of the International Conference on Robotics and Automation, 2019

Manifold Mixup: Better Representations by Interpolating Hidden States.
Proceedings of the 36th International Conference on Machine Learning, 2019

On the Spectral Bias of Neural Networks.
Proceedings of the 36th International Conference on Machine Learning, 2019

GMNN: Graph Markov Neural Networks.
Proceedings of the 36th International Conference on Machine Learning, 2019

State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations.
Proceedings of the 36th International Conference on Machine Learning, 2019

Perceptual Generative Autoencoders.
Proceedings of the Deep Generative Models for Highly Structured Data, 2019

Deep Graph Infomax.
Proceedings of the 7th International Conference on Learning Representations, 2019

An Empirical Study of Example Forgetting during Deep Neural Network Learning.
Proceedings of the 7th International Conference on Learning Representations, 2019

Probabilistic Planning with Sequential Monte Carlo methods.
Proceedings of the 7th International Conference on Learning Representations, 2019

Quaternion Recurrent Neural Networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

Modeling the Long Term Future in Model-Based Reinforcement Learning.
Proceedings of the 7th International Conference on Learning Representations, 2019

h-detach: Modifying the LSTM Gradient Towards Better Optimization.
Proceedings of the 7th International Conference on Learning Representations, 2019

On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length.
Proceedings of the 7th International Conference on Learning Representations, 2019

Learning deep representations by mutual information estimation and maximization.
Proceedings of the 7th International Conference on Learning Representations, 2019

InfoBot: Transfer and Exploration via the Information Bottleneck.
Proceedings of the 7th International Conference on Learning Representations, 2019

Recall Traces: Backtracking Models for Efficient Reinforcement Learning.
Proceedings of the 7th International Conference on Learning Representations, 2019

Adversarial Domain Adaptation for Stable Brain-Machine Interfaces.
Proceedings of the 7th International Conference on Learning Representations, 2019

BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning.
Proceedings of the 7th International Conference on Learning Representations, 2019

Learning Fixed Points in Generative Adversarial Networks: From Image-to-Image Translation to Disease Detection and Localization.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

A Highly Adaptive Acoustic Model for Accurate Multi-dialect Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

How Transferable Are Features in Convolutional Neural Network Acoustic Models across Languages?
Proceedings of the IEEE International Conference on Acoustics, 2019

The Pytorch-kaldi Speech Recognition Toolkit.
Proceedings of the IEEE International Conference on Acoustics, 2019

Representation Mixing for TTS Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019

Interactive Language Learning by Question Answering.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Interpolated Adversarial Training: Achieving Robust Neural Networks Without Sacrificing Too Much Accuracy.
Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, 2019

Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Combined Reinforcement Learning via Abstract Representations.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Towards Non-Saturating Recurrent Units for Modelling Long-Term Dependencies.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Light Gated Recurrent Units for Speech Recognition.
IEEE Trans. Emerg. Top. Comput. Intell., 2018

Drawing and Recognizing Chinese Characters with Recurrent Neural Network.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Dynamic Neural Turing Machine with Continuous and Discrete Addressing Schemes.
Neural Computation, 2018

Learning normalized inputs for iterative estimation in medical image segmentation.
Medical Image Anal., 2018

Fine-grained attention mechanism for neural machine translation.
Neurocomputing, 2018

Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks.
CoRR, 2018

Speech and Speaker Recognition from Raw Waveform with SincNet.
CoRR, 2018

The effects of negative adaptation in Model-Agnostic Meta-Learning.
CoRR, 2018

Keep Drawing It: Iterative language-based image generation and editing.
CoRR, 2018

DEFactor: Differentiable Edge Factorization-based Probabilistic Graph Generation.
CoRR, 2018

Interpretable Convolutional Filters with SincNet.
CoRR, 2018

On Training Recurrent Neural Networks for Lifelong Learning.
CoRR, 2018

Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon.
CoRR, 2018

How can deep learning advance computational modeling of sensory information processing?
CoRR, 2018

BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop.
CoRR, 2018

Towards the Latent Transcriptome.
CoRR, 2018

On the Learning Dynamics of Deep Neural Networks.
CoRR, 2018

Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding.
CoRR, 2018

Learning deep representations by mutual information estimation and maximization.
CoRR, 2018

Generalization of Equilibrium Propagation to Vector Field Dynamics.
CoRR, 2018

Predicting Solution Summaries to Integer Linear Programs under Imperfect Information with Machine Learning.
CoRR, 2018

DNN's Sharpest Directions Along the SGD Trajectory.
CoRR, 2018

The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach.
CoRR, 2018

On the Spectral Bias of Deep Neural Networks.
CoRR, 2018

Towards Gene Expression Convolutions using Gene Interaction Graphs.
CoRR, 2018

Modularity Matters: Learning Invariant Relational Reasoning Tasks.
CoRR, 2018

Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer.
CoRR, 2018

Learning to rank for censored survival data.
CoRR, 2018

Low-memory convolutional neural networks through incremental depth-first processing.
CoRR, 2018

Commonsense mining as knowledge base completion? A study on the impact of novelty.
CoRR, 2018

Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations.
CoRR, 2018

Recall Traces: Backtracking Models for Efficient Reinforcement Learning.
CoRR, 2018

Disentangling the independently controllable factors of variation by interacting with the world.
CoRR, 2018

Learning Anonymized Representations with Adversarial Neural Networks.
CoRR, 2018

A Walk with SGD.
CoRR, 2018

Generalization in Machine Learning via Analytical Learning Theory.
CoRR, 2018

A Deep Reinforcement Learning Chatbot (Short Version).
CoRR, 2018

A3T: Adversarially Augmented Adversarial Training.
CoRR, 2018

ObamaNet: Photo-realistic lip-sync from text.
CoRR, 2018

Dendritic error backpropagation in deep cortical microcircuits.
CoRR, 2018

Deep convolutional networks for quality assessment of protein folds.
Bioinform., 2018

Speaker Recognition from Raw Waveform with SincNet.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences.
Proceedings of The Third Workshop on Representation Learning for NLP, 2018

MetaGAN: An Adversarial Approach to Few-Shot Learning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Bayesian Model-Agnostic Meta-Learning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Dendritic cortical microcircuits approximate the backpropagation algorithm.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Image-to-image translation for cross-domain disentanglement.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Twin Regularization for Online Speech Recognition.
Proceedings of the Interspeech 2018, 2018

Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition.
Proceedings of the Interspeech 2018, 2018

MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Focused Hierarchical RNNs for Conditional Sequence Processing.
Proceedings of the 35th International Conference on Machine Learning, 2018

Mutual Information Neural Estimation.
Proceedings of the 35th International Conference on Machine Learning, 2018

Fraternal Dropout.
Proceedings of the 6th International Conference on Learning Representations, 2018

Graph Attention Networks.
Proceedings of the 6th International Conference on Learning Representations, 2018

Deep Complex Networks.
Proceedings of the 6th International Conference on Learning Representations, 2018

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning.
Proceedings of the 6th International Conference on Learning Representations, 2018

ChatPainter: Improving Text to Image Generation using Dialogue.
Proceedings of the 6th International Conference on Learning Representations, 2018

Twin Networks: Matching the Future for Sequence Generation.
Proceedings of the 6th International Conference on Learning Representations, 2018

Extending the Framework of Equilibrium Propagation to General Dynamics.
Proceedings of the 6th International Conference on Learning Representations, 2018

Universal Successor Representations for Transfer Reinforcement Learning.
Proceedings of the 6th International Conference on Learning Representations, 2018

FigureQA: An Annotated Figure Dataset for Visual Reasoning.
Proceedings of the 6th International Conference on Learning Representations, 2018

Finding Flatter Minima with SGD.
Proceedings of the 6th International Conference on Learning Representations, 2018

Residual Connections Encourage Iterative Inference.
Proceedings of the 6th International Conference on Learning Representations, 2018

Boundary Seeking GANs.
Proceedings of the 6th International Conference on Learning Representations, 2018

Dynamic Frame Skipping for Fast Speech Recognition in Recurrent Neural Network Based Acoustic Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Towards End-to-end Spoken Language Understanding.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2018, 2018

HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

On the Iterative Refinement of Densely Connected Representation Levels for Semantic Segmentation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

Neural Models for Key Phrase Extraction and Question Generation.
Proceedings of the Workshop on Machine Reading for Question Answering@ACL 2018, 2018

Straight to the Tree: Constituency Parsing with Neural Syntactic Distance.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
End-to-End Online Writer Identification With Recurrent Neural Network.
IEEE Trans. Hum. Mach. Syst., 2017

Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark.
Pattern Recognit., 2017

STDP-Compatible Approximation of Backpropagation in an Energy-Based Model.
Neural Computation, 2017

The representational geometry of word meanings acquired by neural machine translation models.
Mach. Transl., 2017

Brain tumor segmentation with Deep Neural Networks.
Medical Image Anal., 2017

Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations.
J. Mach. Learn. Res., 2017

Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation.
Frontiers Comput. Neurosci., 2017

On integrating a language model into neural machine translation.
Comput. Speech Lang., 2017

Multi-way, multilingual neural machine translation.
Comput. Speech Lang., 2017

Context-dependent word representation for neural machine translation.
Comput. Speech Lang., 2017

Measuring the tendency of CNNs to Learn Surface Statistical Regularities.
CoRR, 2017

Variational Bi-LSTMs.
CoRR, 2017

ACtuAL: Actor-Critic Under Adversarial Learning.
CoRR, 2017

Three Factors Influencing Minima in SGD.
CoRR, 2017

Sparse Attentive Backtracking: Long-Range Credit Assignment in Recurrent Networks.
CoRR, 2017

Generalization in Deep Learning.
CoRR, 2017

The Consciousness Prior.
CoRR, 2017

A Deep Reinforcement Learning Chatbot.
CoRR, 2017

Twin Networks: Using the Future as a Regularizer.
CoRR, 2017

Independently Controllable Factors.
CoRR, 2017

Deep Complex Networks.
CoRR, 2017

Image Segmentation by Iterative Inference from Conditional Score Estimation.
CoRR, 2017

Multiscale sequence modeling with a learned dictionary.
CoRR, 2017

Deep Learning for Patient-Specific Kidney Graft Survival Analysis.
CoRR, 2017

Boundary-Seeking Generative Adversarial Networks.
CoRR, 2017

Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder.
CoRR, 2017

Memory Augmented Neural Networks with Wormhole Connections.
CoRR, 2017

Learning Normalized Inputs for Iterative Estimation in Medical Image Segmentation.
CoRR, 2017

Count-ception: Counting by Fully Convolutional Redundant Counting.
CoRR, 2017

Maximum-Likelihood Augmented Discrete Generative Adversarial Networks.
CoRR, 2017

Independently Controllable Features.
CoRR, 2017

Learning to Compute Word Embeddings On the Fly.
CoRR, 2017

Plan, Attend, Generate: Character-Level Neural Machine Translation with Planning.
Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017

GibbsNet: Iterative Adversarial Inference for Deep Graphical Models.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Plan, Attend, Generate: Planning for Sequence-to-Sequence Models.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Z-Forcing: Training Stochastic Recurrent Networks.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Towards more hardware-friendly deep learning.
Proceedings of the Workshop on Trends in Machine-Learning (and impact on computer architecture), 2017

Improving Speech Recognition by Revising Gated Recurrent Units.
Proceedings of the Interspeech 2017, 2017

Dynamic Layer Normalization for Adaptive Neural Acoustic Modeling in Speech Recognition.
Proceedings of the Interspeech 2017, 2017

A robust adaptive stochastic gradient method for deep learning.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Sharp Minima Can Generalize For Deep Nets.
Proceedings of the 34th International Conference on Machine Learning, 2017

A Closer Look at Memorization in Deep Networks.
Proceedings of the 34th International Conference on Machine Learning, 2017

Improving Generative Adversarial Networks with Denoising Feature Matching.
Proceedings of the 5th International Conference on Learning Representations, 2017

Char2Wav: End-to-End Speech Synthesis.
Proceedings of the 5th International Conference on Learning Representations, 2017

Diet Networks: Thin Parameters for Fat Genomics.
Proceedings of the 5th International Conference on Learning Representations, 2017

Generalizable Features From Unsupervised Learning.
Proceedings of the 5th International Conference on Learning Representations, 2017

SampleRNN: An Unconditional End-to-End Neural Audio Generation Model.
Proceedings of the 5th International Conference on Learning Representations, 2017

Towards an automatic Turing test: Learning to evaluate dialogue responses.
Proceedings of the 5th International Conference on Learning Representations, 2017

A Structured Self-Attentive Sentence Embedding.
Proceedings of the 5th International Conference on Learning Representations, 2017

Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations.
Proceedings of the 5th International Conference on Learning Representations, 2017

Mollifying Networks.
Proceedings of the 5th International Conference on Learning Representations, 2017

Hierarchical Multiscale Recurrent Neural Networks.
Proceedings of the 5th International Conference on Learning Representations, 2017

Mode Regularized Generative Adversarial Networks.
Proceedings of the 5th International Conference on Learning Representations, 2017

An Actor-Critic Algorithm for Sequence Prediction.
Proceedings of the 5th International Conference on Learning Representations, 2017

Understanding intermediate layers using linear classifier probes.
Proceedings of the 5th International Conference on Learning Representations, 2017

Count-ception: Counting by Fully Convolutional Redundant Counting.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

A network of deep neural networks for Distant Speech Recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

On random weights for texture generation in one layer CNNs.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Denoising Criterion for Variational Auto-Encoding Framework.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Learning to Understand Phrases by Embedding the Dictionary.
Trans. Assoc. Comput. Linguistics, 2016

Big Data: Theoretical Aspects [Scanning the Issue].
Proceedings of the IEEE, 2016

EmoNets: Multimodal deep learning approaches for emotion recognition in video.
J. Multimodal User Interfaces, 2016

Knowledge Matters: Importance of Prior Information for Optimization.
J. Mach. Learn. Res., 2016

Iterative Alternating Neural Attention for Machine Reading.
CoRR, 2016

Invariant Representations for Noisy Speech Recognition.
CoRR, 2016

Towards a Biologically Plausible Backprop.
CoRR, 2016

Diet Networks: Thin Parameters for Fat Genomics.
CoRR, 2016

Recurrent Neural Networks With Limited Numerical Precision.
CoRR, 2016

On Random Weights for Texture Generation in One Layer Neural Networks.
CoRR, 2016

Neural Networks with Few Multiplications.
Proceedings of the 4th International Conference on Learning Representations, 2016

Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations.
CoRR, 2016

Deep Directed Generative Models with Energy-Based Probability Estimation.
CoRR, 2016

Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes.
CoRR, 2016

BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1.
CoRR, 2016

Hierarchical Memory Networks.
CoRR, 2016

Feedforward Initialization for Fast Inference of Deep Generative Networks is biologically plausible.
CoRR, 2016

Theano: A Python framework for fast computation of mathematical expressions.
CoRR, 2016

A Neural Knowledge Language Model.
CoRR, 2016

NYU-MILA Neural Machine Translation Systems for WMT'16.
Proceedings of the First Conference on Machine Translation, 2016

Batch-normalized joint training for DNN-based distant speech recognition.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Architectural Complexity Measures of Recurrent Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

On Multiplicative Integration with Recurrent Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Binarized Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Professor Forcing: A New Algorithm for Training Recurrent Networks.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism.
Proceedings of the NAACL HLT 2016, 2016

HeMIS: Hetero-Modal Image Segmentation.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2016, 2016

Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks.
Proceedings of the Interspeech 2016, 2016

Deconstructing the Ladder Network Architecture.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Noisy Activation Functions.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Bidirectional Helmholtz Machines.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Unitary Evolution Recurrent Neural Networks.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Batch normalized recurrent neural networks.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

End-to-end attention-based large vocabulary speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016

Oracle Performance for Visual Captioning.
Proceedings of the British Machine Vision Conference 2016, 2016

Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Pointing the Unknown Words.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

A Character-level Decoder without Explicit Segmentation for Neural Machine Translation.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Deep Learning.
Adaptive computation and machine learning, MIT Press, ISBN: 978-0-262-03561-3, 2016

2015
Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks.
IEEE Trans. Multimedia, 2015

Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Challenges in representation learning: A report on three machine learning contests.
Neural Networks, 2015

Editorial introduction to the Neural Networks special issue on Deep Learning of Representations.
Neural Networks, 2015

Deep learning.
Nat., 2015

Trainable performance upper bounds for image and video captioning.
CoRR, 2015

ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks.
CoRR, 2015

ReSeg: A Recurrent Neural Network for Object Segmentation.
CoRR, 2015

Hierarchical Neural Network Generative Models for Movie Dialogues.
CoRR, 2015

FitNets: Hints for Thin Deep Nets.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Blocks and Fuel: Frameworks for deep learning.
CoRR, 2015

Target Propagation.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Embedding Word Similarity with Neural Machine Translation.
Proceedings of the 3rd International Conference on Learning Representations, 2015

On Using Monolingual Corpora in Neural Machine Translation.
CoRR, 2015

NICE: Non-linear Independent Components Estimation.
Proceedings of the 3rd International Conference on Learning Representations, 2015

RMSProp and equilibrated adaptive learning rates for non-convex optimization.
CoRR, 2015

Low precision arithmetic for deep learning.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Training opposing directed models using geometric mean matching.
CoRR, 2015

Reweighted Wake-Sleep.
Proceedings of the 3rd International Conference on Learning Representations, 2015

An objective function for STDP.
CoRR, 2015

Towards Biologically Plausible Deep Learning.
CoRR, 2015

Early Inference in Energy-Based Models Approximates Back-Propagation.
CoRR, 2015

Task Loss Estimation for Sequence Prediction.
CoRR, 2015

Neural Machine Translation by Jointly Learning to Align and Translate.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Variance Reduction in SGD by Distributed Importance Sampling.
CoRR, 2015

GSNs: Generative Stochastic Networks.
CoRR, 2015

Montreal Neural Machine Translation Systems for WMT'15.
Proceedings of the Tenth Workshop on Statistical Machine Translation, 2015

Difference Target Propagation.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2015

Artificial Neural Networks Applied to Taxi Destination Prediction.
Proceedings of the ECML/PKDD 2015 Discovery Challenges co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2015), 2015

Equilibrated adaptive learning rates for non-convex optimization.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

BinaryConnect: Training Deep Neural Networks with binary weights during propagations.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

A Recurrent Latent Variable Model for Sequential Data.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Attention-Based Models for Speech Recognition.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.
Proceedings of the 32nd International Conference on Machine Learning, 2015

BilBOWA: Fast Bilingual Distributed Representations without Word Alignments.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Gated Feedback Recurrent Neural Networks.
Proceedings of the 32nd International Conference on Machine Learning, 2015

A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion.
Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015

IAPR keynote lecture IV: Deep learning.
Proceedings of the 3rd IAPR Asian Conference on Pattern Recognition, 2015

On Using Very Large Target Vocabulary for Neural Machine Translation.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Evolving Culture Versus Local Minima.
Proceedings of the Growing Adaptive Machines, 2014

The Spike-and-Slab RBM and Extensions to Discrete and Sparse Data Distributions.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Learning semantic representations of objects and their parts.
Mach. Learn., 2014

A semantic matching energy function for learning with multi-relational data - Application to word-sense disambiguation.
Mach. Learn., 2014

What regularized auto-encoders learn from the data-generating distribution.
J. Mach. Learn. Res., 2014

Revisiting Natural Gradient for Deep Networks.
Proceedings of the 2nd International Conference on Learning Representations, 2014

An empirical analysis of dropout in piecewise linear networks.
Proceedings of the 2nd International Conference on Learning Representations, 2014

On the number of inference regions of deep feed forward networks with piece-wise linear activations.
Proceedings of the 2nd International Conference on Learning Representations, 2014

How to Construct Deep Recurrent Neural Networks.
Proceedings of the 2nd International Conference on Learning Representations, 2014

On the saddle point problem for non-convex optimization.
CoRR, 2014

Multimodal Transitions for Generative Stochastic Networks.
Proceedings of the 2nd International Conference on Learning Representations, 2014

Deep Directed Generative Autoencoders.
CoRR, 2014

Not All Neural Embeddings are Born Equal.
CoRR, 2014

ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient.
CoRR, 2014

Generative Adversarial Networks.
CoRR, 2014

An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks.
Proceedings of the 2nd International Conference on Learning Representations, 2014

Deep Tempering.
CoRR, 2014

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling.
CoRR, 2014

End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results.
CoRR, 2014

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.
CoRR, 2014

Exponentially Increasing the Capacity-to-Computation Ratio for Conditional Computation in Deep Learning.
CoRR, 2014

Bounding the Test Log-Likelihood of Generative Models.
Proceedings of the 2nd International Conference on Learning Representations, 2014

How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation.
CoRR, 2014

Conditioning and time representation in long short-term memory networks.
Biological Cybernetics, 2014

Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation.
Proceedings of SSST@EMNLP 2014, 2014

On the Properties of Neural Machine Translation: Encoder-Decoder Approaches.
Proceedings of SSST@EMNLP 2014, 2014

On the Equivalence between Deep NADE and Generative Stochastic Networks.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

How transferable are features in deep neural networks?
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Iterative Neural Autoregressive Distribution Estimator NADE-k.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

On the Number of Linear Regions of Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Generative Adversarial Nets.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Scaling up deep learning.
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014

Marginalized Denoising Auto-encoders for Nonlinear Representations.
Proceedings of the 31st International Conference on Machine Learning, 2014

Deep Generative Stochastic Networks Trainable by Backprop.
Proceedings of the 31st International Conference on Machine Learning, 2014

Deep learning and cultural evolution.
Proceedings of the Genetic and Evolutionary Computation Conference, 2014

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Learning Concept Embeddings for Query Expansion by Quantum Entropy Minimization.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

On the Challenges of Physical Implementations of RBMs.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
Deep Learning of Representations.
Proceedings of the Handbook on Neural Information Processing, 2013

Scaling Up Spike-and-Slab Models for Unsupervised Feature Learning.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Representation Learning: A Review and New Perspectives.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Estimating or Propagating Gradients Through Stochastic Neurons.
CoRR, 2013

Natural Gradient Revisited.
Proceedings of the 1st International Conference on Learning Representations, 2013

Big Neural Networks Waste Capacity.
Proceedings of the 1st International Conference on Learning Representations, 2013

Joint Training Deep Boltzmann Machines for Classification.
Proceedings of the 1st International Conference on Learning Representations, 2013

Metric-Free Natural Gradient for Joint-Training of Boltzmann Machines.
Proceedings of the 1st International Conference on Learning Representations, 2013

A Semantic Matching Energy Function for Learning with Multi-relational Data.
Proceedings of the 1st International Conference on Learning Representations, 2013

Regularized Auto-Encoders Estimate Local Statistics.
Proceedings of the 1st International Conference on Learning Representations, 2013

Learned-norm pooling for deep neural networks.
CoRR, 2013

Pylearn2: a machine learning research library.
CoRR, 2013

Deep Generative Stochastic Networks Trainable by Backprop.
CoRR, 2013

Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation.
CoRR, 2013

Learning Deep Physiological Models of Affect.
IEEE Comput. Intell. Mag., 2013

Deep Learning of Representations: Looking Forward.
Proceedings of the Statistical Language and Speech Processing, 2013

Modeling term dependencies with quantum language models for IR.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

Multi-Prediction Deep Boltzmann Machines.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Generalized Denoising Auto-Encoders as Generative Models.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Audio Chord Recognition with Recurrent Neural Networks.
Proceedings of the 14th International Society for Music Information Retrieval Conference, 2013

Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding.
Proceedings of the INTERSPEECH 2013, 2013

Unsupervised Learning of Semantics of Object Detections for Scene Categorization.
Proceedings of the Pattern Recognition Applications and Methods - International Conference, 2013

Unsupervised and Transfer Learning under Uncertainty - From Object Detections to Scene Categorization.
Proceedings of the ICPRAM 2013, 2013


On the difficulty of training recurrent neural networks.
Proceedings of the 30th International Conference on Machine Learning, 2013

Maxout Networks.
Proceedings of the 30th International Conference on Machine Learning, 2013

Better Mixing via Deep Representations.
Proceedings of the 30th International Conference on Machine Learning, 2013


High-dimensional sequence transduction.
Proceedings of the IEEE International Conference on Acoustics, 2013

Advances in optimizing recurrent networks.
Proceedings of the IEEE International Conference on Acoustics, 2013

Stacked calibration of off-policy policy evaluation for video game matchmaking.
Proceedings of the 2013 IEEE Conference on Computational Intelligence in Games (CIG), 2013

Texture Modeling with Convolutional Spike-and-Slab RBMs and Deep Extensions.
Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, 2013

2012
Practical Recommendations for Gradient-Based Training of Deep Architectures.
Proceedings of the Neural Networks: Tricks of the Trade - Second Edition, 2012

Beyond Skill Rating: Advanced Matchmaking in Ghost Recon Online.
IEEE Trans. Comput. Intell. AI Games, 2012

Unsupervised and Transfer Learning Challenge: a Deep Learning Approach.
Proceedings of the Unsupervised and Transfer Learning, 2012

Learning Algorithms for the Classification Restricted Boltzmann Machine.
J. Mach. Learn. Res., 2012

Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

Random Search for Hyper-Parameter Optimization.
J. Mach. Learn. Res., 2012

Deep Learning of Representations for Unsupervised and Transfer Learning.
Proceedings of the Unsupervised and Transfer Learning, 2012

Joint Training of Deep Boltzmann Machines.
CoRR, 2012

Theano: new features and speed improvements.
CoRR, 2012

Understanding the exploding gradient problem.
CoRR, 2012

Disentangling Factors of Variation via Generative Entangling.
CoRR, 2012

Efficient EM Training of Gaussian Mixtures with Missing Data.
CoRR, 2012

Implicit Density Estimation by Local Moment Matching to Sample from Auto-Encoders.
CoRR, 2012

Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives.
CoRR, 2012

Practical recommendations for gradient-based training of deep architectures.
CoRR, 2012

On Training Deep Boltzmann Machines.
CoRR, 2012

Evolving Culture vs Local Minima.
CoRR, 2012

Spike-and-Slab Sparse Coding for Unsupervised Feature Discovery.
CoRR, 2012

Detonation Classification from acoustic Signature with the Restricted Boltzmann Machine.
Comput. Intell., 2012

Building Musically-relevant Audio Features through Multiple Timescale Representations.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

Discriminative Non-negative Matrix Factorization for Multiple Pitch Estimation.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

A Generative Process for Contractive Auto-Encoders.
Proceedings of the 29th International Conference on Machine Learning, 2012

Large-Scale Feature Learning With Spike-and-Slab Sparse Coding.
Proceedings of the 29th International Conference on Machine Learning, 2012

Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription.
Proceedings of the 29th International Conference on Machine Learning, 2012

Disentangling Factors of Variation for Facial Expression Recognition.
Proceedings of the Computer Vision - ECCV 2012, 2012

Deep Learning for NLP (without Magic).
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, 2012

2011
Contextual tag inference.
ACM Trans. Multim. Comput. Commun. Appl., 2011

Quickly Generating Representative Samples from an RBM-Derived Process.
Neural Computation, 2011

Suitability of V1 Energy Models for Object Classification.
Neural Computation, 2011

Deep Sparse Rectifier Neural Networks.
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

A Spike and Slab Restricted Boltzmann Machine.
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

Deep Learners Benefit More from Out-of-Distribution Examples.
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

Discussion of "The Neural Autoregressive Distribution Estimator".
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

The Statistical Inefficiency of Sparse Coding for Images (or, One Gabor to Rule them All).
CoRR, 2011

Towards Open-Text Semantic Parsing via Multi-Task Learning of Structured Embeddings.
CoRR, 2011

Learning invariant features through local space contraction.
CoRR, 2011

Adding noise to the input of a model trained with a regularized objective.
CoRR, 2011

Autotagging music with conditional restricted Boltzmann machines.
CoRR, 2011

Higher Order Contractive Auto-Encoder.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2011

The Manifold Tangent Classifier.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

On Tracking The Partition Function.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Shallow vs. Deep Sum-Product Networks.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Algorithms for Hyper-Parameter Optimization.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

On learning distributed representations of semantics.
Proceedings of the 2011 Symposium on Machine Learning in Speech and Language Processing, 2011

Temporal Pooling and Multiscale Learning for Automatic Annotation and Ranking of Music Audio.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Contractive Auto-Encoders: Explicit Invariance During Feature Extraction.
Proceedings of the 28th International Conference on Machine Learning, 2011

Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach.
Proceedings of the 28th International Conference on Machine Learning, 2011

Large-Scale Learning of Embeddings with Reconstruction Sampling.
Proceedings of the 28th International Conference on Machine Learning, 2011

Unsupervised Models of Images by Spike-and-Slab RBMs.
Proceedings of the 28th International Conference on Machine Learning, 2011

On the Expressive Power of Deep Architectures.
Proceedings of the Discovery Science - 14th International Conference, 2011

Learning Structured Embeddings of Knowledge Bases.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

2010
Deep Belief Networks Are Compact Universal Approximators.
Neural Computation, 2010

Tractable Multivariate Binary Density Estimation and the Restricted Boltzmann Forest.
Neural Computation, 2010

Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion.
J. Mach. Learn. Res., 2010

Understanding the difficulty of training deep feedforward neural networks.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Why Does Unsupervised Pre-training Help Deep Learning?
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Why Does Unsupervised Pre-training Help Deep Learning?
J. Mach. Learn. Res., 2010

Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Alternative time representation in dopamine models.
J. Comput. Neurosci., 2010

Adaptive Parallel Tempering for Stochastic Maximum Likelihood Learning of RBMs.
CoRR, 2010

Deep Self-Taught Learning for Handwritten Character Recognition.
CoRR, 2010

Decision trees do not generalize to new variations.
Comput. Intell., 2010

Learning Tags that Vary Within a Song.
Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010

Word Representations: A Simple and General Method for Semi-Supervised Learning.
Proceedings of the ACL 2010, 2010

2009
A Hybrid Pareto Mixture for Conditional Asymmetric Fat-Tailed Distributions.
IEEE Trans. Neural Networks, 2009

Justifying and Generalizing Contrastive Divergence.
Neural Computation, 2009

Exploring Strategies for Training Deep Neural Networks.
J. Mach. Learn. Res., 2009

The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training.
Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, 2009

Incorporating Functional Knowledge in Neural Networks.
J. Mach. Learn. Res., 2009

Learning Deep Architectures for AI.
Found. Trends Mach. Learn., 2009

An Infinite Factor Model Hierarchy Via a Noisy-Or Mechanism.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Slow, Decorrelated Features for Pretraining Complex Cell-like Networks.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Quadratic Features and Deep Architectures for Chunking.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009

Workshop summary: Workshop on learning feature hierarchies.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Curriculum learning.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

2008
Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model.
IEEE Trans. Neural Networks, 2008

Neural net language models.
Scholarpedia, 2008

Representational Power of Restricted Boltzmann Machines and Deep Belief Networks.
Neural Computation, 2008

Extracting and composing robust features with denoising autoencoders.
Proceedings of the Machine Learning, 2008

Classification using discriminative restricted Boltzmann machines.
Proceedings of the Machine Learning, 2008

Zero-data Learning of New Tasks.
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

2007
Continuous Neural Networks.
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007

A Hybrid Pareto Model for Conditional Density Estimation of Asymmetric Fat-Tail Data.
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007

Noisy K Best-Paths for Approximate Dynamic Programming with Application to Portfolio Optimization.
JCP, 2007

Topmoumoute Online Natural Gradient Algorithm.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Learning the 2-D Topology of Images.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Augmented Functional Time Series Representation and Forecasting with Gaussian Processes.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

An empirical evaluation of deep architectures on problems with many factors of variation.
Proceedings of the Machine Learning, 2007

2006
Nonlocal Estimation of Manifold Structure.
Neural Computation, 2006

Collaborative Filtering on a Family of Biological Targets.
J. Chem. Inf. Model., 2006

Greedy Layer-Wise Training of Deep Networks.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

The K Best-Paths Approach to Approximate Dynamic Programming with Application to Portfolio Optimization.
Proceedings of the Advances in Artificial Intelligence, 2006

Spectral Dimensionality Reduction.
Proceedings of the Feature Extraction - Foundations and Applications, 2006

Entropy Regularization.
Proceedings of the Semi-Supervised Learning, 2006

Large-Scale Algorithms.
Proceedings of the Semi-Supervised Learning, 2006

Label Propagation and Quadratic Criterion.
Proceedings of the Semi-Supervised Learning, 2006

2005
Convex Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Non-Local Manifold Parzen Windows.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

The Curse of Highly Variable Functions for Local Kernel Machines.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Semi-supervised Learning by Entropy Minimization.
Proceedings of the Actes de CAP 05, Conférence francophone sur l'apprentissage automatique, 2005

Greedy Spectral Embedding.
Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, 2005

Hierarchical Probabilistic Neural Network Language Model.
Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, 2005

Efficient Non-Parametric Function Induction in Semi-Supervised Learning.
Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, 2005

2004
Learning Eigenfunctions Links Spectral Embedding and Kernel PCA.
Neural Computation, 2004

No Unbiased Estimator of the Variance of K-Fold Cross-Validation.
J. Mach. Learn. Res., 2004

Locally Linear Embedding for dimensionality reduction in QSAR.
J. Comput. Aided Mol. Des., 2004

Brain Inspired Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Non-Local Manifold Tangent Learning.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Unsupervised Sense Disambiguation Using Bilingual Probabilistic Models.
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, 2004

2003
Bias learning, knowledge sharing.
IEEE Trans. Neural Networks, 2003

Inference for the Generalization Error.
Mach. Learn., 2003

A Neural Probabilistic Language Model.
J. Mach. Learn. Res., 2003

Extensions to Metric-Based Model Selection.
J. Mach. Learn. Res., 2003

Scaling Large Learning Problems with Hard Parallel Mixtures.
Int. J. Pattern Recognit. Artif. Intell., 2003

Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

Quick Training of Probabilistic Neural Nets by Importance Sampling.
Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, 2003

2002
Robust Regression with Asymmetric Heavy-Tail Noise Distributions.
Neural Computation, 2002

A Parallel Mixture of SVMs for Very Large Scale Problems.
Neural Computation, 2002

Kernel Matching Pursuit.
Mach. Learn., 2002

Model Selection for Small Sample Regression.
Mach. Learn., 2002

Guest Introduction: Special Issue on New Methods for Model Selection and Model Combination.
Mach. Learn., 2002

Metric-based model selection for time-series forecasting.
Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, 2002

Manifold Parzen Windows.
Proceedings of the Advances in Neural Information Processing Systems 15, 2002

2001
Cost functions and model combination for VaR-based asset allocation using neural networks.
IEEE Trans. Neural Networks, 2001

Experiments on the application of IOHMMs to model financial returns series.
IEEE Trans. Neural Networks, 2001

Topic Segmentation: A First Stage to Dialog-Based Information Extraction.
Proceedings of the Sixth Natural Language Processing Pacific Rim Symposium, 2001

K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 14, 2001

Estimating Car Insurance Premia: a Case Study in High-Dimensional Data Inference.
Proceedings of the Advances in Neural Information Processing Systems 14, 2001

2000
Taking on the curse of dimensionality in joint distributions using neural networks.
IEEE Trans. Neural Networks, 2000

Boosting Neural Networks.
Neural Computation, 2000

Gradient-Based Optimization of Hyperparameters.
Neural Computation, 2000

Incorporating Second-Order Functional Knowledge for Better Option Pricing.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

A Neural Probabilistic Language Model.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

A Neural Support Vector Network Architecture with Adaptive Kernels.
Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, 2000

Probabilistic Neural Network Models for Sequential Data.
Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, 2000

Continuous Optimization of Hyper-Parameters.
Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, 2000

1999
Stochastic Learning of Strategic Equilibria for Auctions.
Neural Computation, 1999

Object Recognition with Gradient-Based Learning.
Proceedings of the Shape, Contour and Grouping in Computer Vision, 1999

Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 12, 1999

Binary Pseudowavelets and Applications to Bilevel Image Processing.
Proceedings of the Data Compression Conference, 1999

1998
High quality document image compression with "DjVu".
J. Electronic Imaging, 1998

Gaussian Mixture Densities for Classification of Nuclear Power Plant Data.
Comput. Artif. Intell., 1998

Support vector machines for improving the classification of brain PET images.
Proceedings of the Medical Imaging 1998: Image Processing, 1998

A Memory-Efficient Adaptive Huffman Coding Algorithm for Very Large Sets of Symbols.
Proceedings of the Data Compression Conference, 1998

The Z-Coder Adaptive Binary Coder.
Proceedings of the Data Compression Conference, 1998

Browsing through High Quality Document Images with DjVu.
Proceedings of the IEEE Forum on Research and Technology Advances in Digital Libraries, 1998

1997
Using a Financial Training Criterion Rather than a Prediction Criterion.
Int. J. Neural Syst., 1997

Training Methods for Adaptive Boosting of Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

Shared Context Probabilistic Transducers.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

Discriminative feature and model design for automatic speech recognition.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Reading checks with multilayer graph transformer networks.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

AdaBoosting Neural Networks: Application to on-line Character Recognition.
Proceedings of the Artificial Neural Networks, 1997

Global Training of Document Processing Systems Using Graph Transformer Networks.
Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97), 1997

1996
Input-output HMMs for sequence processing.
IEEE Trans. Neural Networks, 1996

Multi-Task Learning for Stock Selection.
Proceedings of the Advances in Neural Information Processing Systems 9, 1996

1995
On the search for new learning rules for ANNs.
Neural Process. Lett., 1995

LeRec: a NN/HMM hybrid for on-line handwriting recognition.
Neural Computation, 1995

Diffusion of Context and Credit Information in Markovian Models.
J. Artif. Intell. Res., 1995

Hierarchical Recurrent Neural Networks for Long-Term Dependencies.
Proceedings of the Advances in Neural Information Processing Systems 8, 1995

Recurrent Neural Networks for Missing or Asynchronous Data.
Proceedings of the Advances in Neural Information Processing Systems 8, 1995

1994
Learning long-term dependencies with gradient descent is difficult.
IEEE Trans. Neural Networks, 1994

Convergence Properties of the K-Means Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 7, 1994

Diffusion of Credit in Markovian Models.
Proceedings of the Advances in Neural Information Processing Systems 7, 1994

An Input Output HMM Architecture.
Proceedings of the Advances in Neural Information Processing Systems 7, 1994

Word-level training of a handwritten word recognizer based on convolutional neural networks.
Proceedings of the 12th IAPR International Conference on Pattern Recognition, 1994

An EM approach to grammatical inference: input/output HMMs.
Proceedings of the 12th IAPR International Conference on Pattern Recognition, 1994

Word normalization for online handwritten word recognition.
Proceedings of the 12th IAPR International Conference on Pattern Recognition, 1994

Use of Genetic Programming for the Search of a New Learning Rule for Neural Networks.
Proceedings of the First IEEE Conference on Evolutionary Computation, 1994

1993
A Connectionist Approach to Speech Recognition.
Int. J. Pattern Recognit. Artif. Intell., 1993

Credit Assignment through Time: Alternatives to Backpropagation.
Proceedings of the Advances in Neural Information Processing Systems 6, 1993

Globally Trained Handwritten Word Recognizer Using Spatial Representation, Convolutional Neural Networks, and Hidden Markov Models.
Proceedings of the Advances in Neural Information Processing Systems 6, 1993

The problem of learning long-term dependencies in recurrent networks.
Proceedings of International Conference on Neural Networks (ICNN'93), San Francisco, CA, USA, March 28, 1993

1992
Global optimization of a neural network-hidden Markov model hybrid.
IEEE Trans. Neural Networks, 1992

Phonetically motivated acoustic parameters for continuous speech recognition using artificial neural networks.
Speech Commun., 1992

Learning the dynamic nature of speech with back-propagation for sequences.
Pattern Recognit. Lett., 1992

1991
Neural Network - Gaussian Mixture Hybrid for Speech Recognition or Density Estimation.
Proceedings of the Advances in Neural Information Processing Systems 4, 1991

A comparative study on hybrid acoustic phonetic decoders based on artificial neural networks.
Proceedings of the Second European Conference on Speech Communication and Technology, 1991

1990
Phonetically-based multi-layered neural networks for vowel classification.
Speech Commun., 1990

Efficient recognition of immunoglobulin domains from amino acid sequences using a neural network.
Comput. Appl. Biosci., 1990

A hybrid coder for hidden Markov models using a recurrent neural network.
Proceedings of the 1990 International Conference on Acoustics, 1990

1989
Programmable Execution of Multi-Layered Networks for Automatic Speech Recognition.
Commun. ACM, 1989

Speaker Independent Speech Recognition with Neural Networks and Speech Knowledge.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

A Neural Network to Detect Homologies in Proteins.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

Speech coding with multilayer networks.
Proceedings of the Neurocomputing - Algorithms, Architectures and Applications, Proceedings of the NATO Advanced Research Workshop on Neurocomputing Algorithms, Architectures and Applications, Les Arcs, France, February 27, 1989

On the Generalization Capability of Multi-Layered Networks in the Extraction of Speech Properties.
Proceedings of the 11th International Joint Conference on Artificial Intelligence. Detroit, 1989

1988
Use of Multi-Layered Networks for Coding Speech with Phonetic Features.
Proceedings of the Advances in Neural Information Processing Systems 1, 1988

Data-Driven Execution of Multi-Layered Networks for Automatic Speech Recognition.
Proceedings of the 7th National Conference on Artificial Intelligence, 1988

