Yoshua Bengio

According to our database, Yoshua Bengio authored at least 545 papers between 1988 and 2018.

Bibliography

2018
Light Gated Recurrent Units for Speech Recognition.
IEEE Trans. Emerging Topics in Comput. Intellig., 2018

Drawing and Recognizing Chinese Characters with Recurrent Neural Network.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Dynamic Neural Turing Machine with Continuous and Discrete Addressing Schemes.
Neural Computation, 2018

Learning normalized inputs for iterative estimation in medical image segmentation.
Medical Image Analysis, 2018

Fine-grained attention mechanism for neural machine translation.
Neurocomputing, 2018

Dendritic cortical microcircuits approximate the backpropagation algorithm.
CoRR, 2018

Depth with Nonlinearity Creates No Bad Local Minima in ResNets.
CoRR, 2018

How can deep learning advance computational modeling of sensory information processing?
CoRR, 2018

BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop.
CoRR, 2018

A Data-Efficient Framework for Training and Sim-to-Real Transfer of Navigation Policies.
CoRR, 2018

Towards the Latent Transcriptome.
CoRR, 2018

h-detach: Modifying the LSTM Gradient Towards Better Optimization.
CoRR, 2018

Adversarial Domain Adaptation for Stable Brain-Machine Interfaces.
CoRR, 2018

Deep Graph Infomax.
CoRR, 2018

HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering.
CoRR, 2018

On the Learning Dynamics of Deep Neural Networks.
CoRR, 2018

Combined Reinforcement Learning via Abstract Representations.
CoRR, 2018

Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding.
CoRR, 2018

Learning deep representations by mutual information estimation and maximization.
CoRR, 2018

Generalization of Equilibrium Propagation to Vector Field Dynamics.
CoRR, 2018

Speaker Recognition from raw waveform with SincNet.
CoRR, 2018

Predicting Solution Summaries to Integer Linear Programs under Imperfect Information with Machine Learning.
CoRR, 2018

DNN's Sharpest Directions Along the SGD Trajectory.
CoRR, 2018

The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach.
CoRR, 2018

On the Spectral Bias of Deep Neural Networks.
CoRR, 2018

Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition.
CoRR, 2018

Towards Gene Expression Convolutions using Gene Interaction Graphs.
CoRR, 2018

Modularity Matters: Learning Invariant Relational Reasoning Tasks.
CoRR, 2018

Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer.
CoRR, 2018

Quaternion Recurrent Neural Networks.
CoRR, 2018

Focused Hierarchical RNNs for Conditional Sequence Processing.
CoRR, 2018

Straight to the Tree: Constituency Parsing with Neural Syntactic Distance.
CoRR, 2018

Bayesian Model-Agnostic Meta-Learning.
CoRR, 2018

Learning to rank for censored survival data.
CoRR, 2018

Image-to-image translation for cross-domain disentanglement.
CoRR, 2018

On the iterative refinement of densely connected representation levels for semantic segmentation.
CoRR, 2018

Low-memory convolutional neural networks through incremental depth-first processing.
CoRR, 2018

Commonsense mining as knowledge base completion? A study on the impact of novelty.
CoRR, 2018

Twin Regularization for online speech recognition.
CoRR, 2018

Universal Successor Representations for Transfer Reinforcement Learning.
CoRR, 2018

Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations.
CoRR, 2018

Recall Traces: Backtracking Models for Efficient Reinforcement Learning.
CoRR, 2018

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning.
CoRR, 2018

Fine-Grained Attention Mechanism for Neural Machine Translation.
CoRR, 2018

Light Gated Recurrent Units for Speech Recognition.
CoRR, 2018

Disentangling the independently controllable factors of variation by interacting with the world.
CoRR, 2018

Learning Anonymized Representations with Adversarial Neural Networks.
CoRR, 2018

A Walk with SGD.
CoRR, 2018

Towards end-to-end spoken language understanding.
CoRR, 2018

ChatPainter: Improving Text to Image Generation using Dialogue.
CoRR, 2018

Generalization in Machine Learning via Analytical Learning Theory.
CoRR, 2018

MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation.
CoRR, 2018

A Deep Reinforcement Learning Chatbot (Short Version).
CoRR, 2018

A3T: Adversarially Augmented Adversarial Training.
CoRR, 2018

ObamaNet: Photo-realistic lip-sync from text.
CoRR, 2018

Dendritic error backpropagation in deep cortical microcircuits.
CoRR, 2018

Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences.
Proceedings of The Third Workshop on Representation Learning for NLP, 2018

Twin Regularization for Online Speech Recognition.
Proceedings of the Interspeech 2018, 2018

Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition.
Proceedings of the Interspeech 2018, 2018

MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Focused Hierarchical RNNs for Conditional Sequence Processing.
Proceedings of the 35th International Conference on Machine Learning, 2018

Mutual Information Neural Estimation.
Proceedings of the 35th International Conference on Machine Learning, 2018

Dynamic Frame Skipping for Fast Speech Recognition in Recurrent Neural Network Based Acoustic Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Towards End-to-end Spoken Language Understanding.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2018, 2018

HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Neural Models for Key Phrase Extraction and Question Generation.
Proceedings of the Workshop on Machine Reading for Question Answering@ACL 2018, 2018

Straight to the Tree: Constituency Parsing with Neural Syntactic Distance.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Gated Orthogonal Recurrent Units: On Learning to Forget.
Proceedings of the Workshops of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
End-to-End Online Writer Identification With Recurrent Neural Network.
IEEE Trans. Human-Machine Systems, 2017

Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark.
Pattern Recognition, 2017

STDP-Compatible Approximation of Backpropagation in an Energy-Based Model.
Neural Computation, 2017

The representational geometry of word meanings acquired by neural machine translation models.
Machine Translation, 2017

Brain tumor segmentation with Deep Neural Networks.
Medical Image Analysis, 2017

Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations.
Journal of Machine Learning Research, 2017

Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation.
Front. Comput. Neurosci., 2017

On integrating a language model into neural machine translation.
Computer Speech & Language, 2017

Multi-way, multilingual neural machine translation.
Computer Speech & Language, 2017

Context-dependent word representation for neural machine translation.
Computer Speech & Language, 2017

GibbsNet: Iterative Adversarial Inference for Deep Graphical Models.
CoRR, 2017

Measuring the tendency of CNNs to Learn Surface Statistical Regularities.
CoRR, 2017

Plan, Attend, Generate: Planning for Sequence-to-Sequence Models.
CoRR, 2017

Equivalence of Equilibrium Propagation and Recurrent Backpropagation.
CoRR, 2017

Variational Bi-LSTMs.
CoRR, 2017

Z-Forcing: Training Stochastic Recurrent Networks.
CoRR, 2017

ACtuAL: Actor-Critic Under Adversarial Learning.
CoRR, 2017

Three Factors Influencing Minima in SGD.
CoRR, 2017

Sparse Attentive Backtracking: Long-Range Credit Assignment in Recurrent Networks.
CoRR, 2017

Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net.
CoRR, 2017

Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask.
CoRR, 2017

Fraternal Dropout.
CoRR, 2017

Graph Attention Networks.
CoRR, 2017

FigureQA: An Annotated Figure Dataset for Visual Reasoning.
CoRR, 2017

Generalization in Deep Learning.
CoRR, 2017

Residual Connections Encourage Iterative Inference.
CoRR, 2017

Improving speech recognition by revising gated recurrent units.
CoRR, 2017

The Consciousness Prior.
CoRR, 2017

A Deep Reinforcement Learning Chatbot.
CoRR, 2017

Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses.
CoRR, 2017

Twin Networks: Using the Future as a Regularizer.
CoRR, 2017

Independently Controllable Factors.
CoRR, 2017

Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks.
CoRR, 2017

Deep Complex Networks.
CoRR, 2017

Image Segmentation by Iterative Inference from Conditional Score Estimation.
CoRR, 2017

Batch-normalized joint training for DNN-based distant speech recognition.
CoRR, 2017

A network of deep neural networks for distant speech recognition.
CoRR, 2017

Multiscale sequence modeling with a learned dictionary.
CoRR, 2017

Deep Learning for Patient-Specific Kidney Graft Survival Analysis.
CoRR, 2017

A Structured Self-attentive Sentence Embedding.
CoRR, 2017

Dynamic Layer Normalization for Adaptive Neural Acoustic Modeling in Speech Recognition.
CoRR, 2017

Gated Orthogonal Recurrent Units: On Learning to Forget.
CoRR, 2017

Boundary-Seeking Generative Adversarial Networks.
CoRR, 2017

A Robust Adaptive Stochastic Gradient Method for Deep Learning.
CoRR, 2017

Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder.
CoRR, 2017

Memory Augmented Neural Networks with Wormhole Connections.
CoRR, 2017

Learning Normalized Inputs for Iterative Estimation in Medical Image Segmentation.
CoRR, 2017

Sharp Minima Can Generalize For Deep Nets.
CoRR, 2017

Count-ception: Counting by Fully Convolutional Redundant Counting.
CoRR, 2017

Maximum-Likelihood Augmented Discrete Generative Adversarial Networks.
CoRR, 2017

Independently Controllable Features.
CoRR, 2017

Learning to Compute Word Embeddings On the Fly.
CoRR, 2017

A Closer Look at Memorization in Deep Networks.
CoRR, 2017

Plan, Attend, Generate: Character-Level Neural Machine Translation with Planning.
Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017

GibbsNet: Iterative Adversarial Inference for Deep Graphical Models.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Plan, Attend, Generate: Planning for Sequence-to-Sequence Models.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Z-Forcing: Training Stochastic Recurrent Networks.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Towards more hardware-friendly deep learning.
Proceedings of the Workshop on Trends in Machine-Learning (and impact on computer architecture), 2017

Improving Speech Recognition by Revising Gated Recurrent Units.
Proceedings of the Interspeech 2017, 2017

Dynamic Layer Normalization for Adaptive Neural Acoustic Modeling in Speech Recognition.
Proceedings of the Interspeech 2017, 2017

A robust adaptive stochastic gradient method for deep learning.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Sharp Minima Can Generalize For Deep Nets.
Proceedings of the 34th International Conference on Machine Learning, 2017

A Closer Look at Memorization in Deep Networks.
Proceedings of the 34th International Conference on Machine Learning, 2017

Count-ception: Counting by Fully Convolutional Redundant Counting.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

A network of deep neural networks for Distant Speech Recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

On random weights for texture generation in one layer CNNs.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Denoising Criterion for Variational Auto-Encoding Framework.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Learning to Understand Phrases by Embedding the Dictionary.
TACL, 2016

Big Data: Theoretical Aspects [Scanning the Issue].
Proceedings of the IEEE, 2016

EmoNets: Multimodal deep learning approaches for emotion recognition in video.
J. Multimodal User Interfaces, 2016

Knowledge Matters: Importance of Prior Information for Optimization.
Journal of Machine Learning Research, 2016

Drawing and Recognizing Chinese Characters with Recurrent Neural Network.
CoRR, 2016

Architectural Complexity Measures of Recurrent Neural Networks.
CoRR, 2016

Online and Offline Handwritten Chinese Character Recognition: A Comprehensive Study and New Benchmark.
CoRR, 2016

On Multiplicative Integration with Recurrent Neural Networks.
CoRR, 2016

Iterative Alternating Neural Attention for Machine Reading.
CoRR, 2016

Invariant Representations for Noisy Speech Recognition.
CoRR, 2016

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues.
CoRR, 2016

Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation.
CoRR, 2016

Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus.
CoRR, 2016

Towards a Biologically Plausible Backprop.
CoRR, 2016

Diet Networks: Thin Parameters for Fat Genomics.
CoRR, 2016

Recurrent Neural Networks With Limited Numerical Precision.
CoRR, 2016

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space.
CoRR, 2016

On Random Weights for Texture Generation in One Layer Neural Networks.
CoRR, 2016

Generalizable Features From Unsupervised Learning.
CoRR, 2016

SampleRNN: An Unconditional End-to-End Neural Audio Generation Model.
CoRR, 2016

Professor Forcing: A New Algorithm for Training Recurrent Networks.
CoRR, 2016

Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations.
CoRR, 2016

Deep Directed Generative Models with Energy-Based Probability Estimation.
CoRR, 2016

The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation.
CoRR, 2016

Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations.
CoRR, 2016

HeMIS: Hetero-Modal Image Segmentation.
CoRR, 2016

Mollifying Networks.
CoRR, 2016

Noisy Activation Functions.
CoRR, 2016

Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes.
CoRR, 2016

Pointing the Unknown Words.
CoRR, 2016

Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism.
CoRR, 2016

BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1.
CoRR, 2016

A Character-level Decoder without Explicit Segmentation for Neural Machine Translation.
CoRR, 2016

Hierarchical Multiscale Recurrent Neural Networks.
CoRR, 2016

Context-Dependent Word Representation for Neural Machine Translation.
CoRR, 2016

Mode Regularized Generative Adversarial Networks.
CoRR, 2016

Hierarchical Memory Networks.
CoRR, 2016

Feedforward Initialization for Fast Inference of Deep Generative Networks is biologically plausible.
CoRR, 2016

An Actor-Critic Algorithm for Sequence Prediction.
CoRR, 2016

Understanding intermediate layers using linear classifier probes.
CoRR, 2016

Theano: A Python framework for fast computation of mathematical expressions.
CoRR, 2016

A Neural Knowledge Language Model.
CoRR, 2016

NYU-MILA Neural Machine Translation Systems for WMT'16.
Proceedings of the First Conference on Machine Translation, 2016

Batch-normalized joint training for DNN-based distant speech recognition.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Architectural Complexity Measures of Recurrent Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

On Multiplicative Integration with Recurrent Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Binarized Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Professor Forcing: A New Algorithm for Training Recurrent Networks.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism.
Proceedings of the NAACL HLT 2016, 2016

HeMIS: Hetero-Modal Image Segmentation.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2016, 2016

Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks.
Proceedings of the Interspeech 2016, 2016

Deconstructing the Ladder Network Architecture.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Noisy Activation Functions.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Bidirectional Helmholtz Machines.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Unitary Evolution Recurrent Neural Networks.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Batch normalized recurrent neural networks.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

End-to-end attention-based large vocabulary speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016

Oracle Performance for Visual Captioning.
Proceedings of the British Machine Vision Conference 2016, 2016

Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Pointing the Unknown Words.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

A Character-level Decoder without Explicit Segmentation for Neural Machine Translation.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Deep Learning.
Adaptive computation and machine learning, MIT Press, ISBN: 978-0-262-03561-3, 2016

2015
Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks.
IEEE Trans. Multimedia, 2015

Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2015

Challenges in representation learning: A report on three machine learning contests.
Neural Networks, 2015

Editorial introduction to the Neural Networks special issue on Deep Learning of Representations.
Neural Networks, 2015

Deep learning.
Nature, 2015

Trainable performance upper bounds for image and video captioning.
CoRR, 2015

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.
CoRR, 2015

ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks.
CoRR, 2015

ReSeg: A Recurrent Neural Network for Object Segmentation.
CoRR, 2015

A Hierarchical Recurrent Encoder-Decoder For Generative Context-Aware Query Suggestion.
CoRR, 2015

Hierarchical Neural Network Generative Models for Movie Dialogues.
CoRR, 2015

Deconstructing the Ladder Network Architecture.
CoRR, 2015

Blocks and Fuel: Frameworks for deep learning.
CoRR, 2015

Neural Networks with Few Multiplications.
CoRR, 2015

Batch Normalized Recurrent Neural Networks.
CoRR, 2015

EmoNets: Multimodal deep learning approaches for emotion recognition in video.
CoRR, 2015

Denoising Criterion for Variational Auto-Encoding Framework.
CoRR, 2015

Learning to Understand Phrases by Embedding the Dictionary.
CoRR, 2015

Brain Tumor Segmentation with Deep Neural Networks.
CoRR, 2015

On Using Monolingual Corpora in Neural Machine Translation.
CoRR, 2015

RMSProp and equilibrated adaptive learning rates for non-convex optimization.
CoRR, 2015

BinaryConnect: Training Deep Neural Networks with binary weights during propagations.
CoRR, 2015

A Recurrent Latent Variable Model for Sequential Data.
CoRR, 2015

Gated Feedback Recurrent Neural Networks.
CoRR, 2015

Attention-Based Models for Speech Recognition.
CoRR, 2015

Describing Multimedia Content using Attention-based Encoder-Decoder Networks.
CoRR, 2015

Artificial Neural Networks Applied to Taxi Destination Prediction.
CoRR, 2015

Training opposing directed models using geometric mean matching.
CoRR, 2015

An objective function for STDP.
CoRR, 2015

Towards Biologically Plausible Deep Learning.
CoRR, 2015

Early Inference in Energy-Based Models Approximates Back-Propagation.
CoRR, 2015

Task Loss Estimation for Sequence Prediction.
CoRR, 2015

End-to-End Attention-based Large Vocabulary Speech Recognition.
CoRR, 2015

Unitary Evolution Recurrent Neural Networks.
CoRR, 2015

Variance Reduction in SGD by Distributed Importance Sampling.
CoRR, 2015

GSNs: Generative Stochastic Networks.
CoRR, 2015

Montreal Neural Machine Translation Systems for WMT'15.
Proceedings of the Tenth Workshop on Statistical Machine Translation, 2015

Difference Target Propagation.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2015

Artificial Neural Networks Applied to Taxi Destination Prediction.
Proceedings of the ECML/PKDD 2015 Discovery Challenges co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2015), 2015

Equilibrated adaptive learning rates for non-convex optimization.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

BinaryConnect: Training Deep Neural Networks with binary weights during propagations.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

A Recurrent Latent Variable Model for Sequential Data.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Attention-Based Models for Speech Recognition.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.
Proceedings of the 32nd International Conference on Machine Learning, 2015

BilBOWA: Fast Bilingual Distributed Representations without Word Alignments.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Gated Feedback Recurrent Neural Networks.
Proceedings of the 32nd International Conference on Machine Learning, 2015

A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion.
Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015

IAPR keynote lecture IV: Deep learning.
Proceedings of the 3rd IAPR Asian Conference on Pattern Recognition, 2015

On Using Very Large Target Vocabulary for Neural Machine Translation.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Evolving Culture Versus Local Minima.
Proceedings of the Growing Adaptive Machines, 2014

The Spike-and-Slab RBM and Extensions to Discrete and Sparse Data Distributions.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Learning semantic representations of objects and their parts.
Machine Learning, 2014

A semantic matching energy function for learning with multi-relational data - Application to word-sense disambiguation.
Machine Learning, 2014

What regularized auto-encoders learn from the data-generating distribution.
Journal of Machine Learning Research, 2014

How transferable are features in deep neural networks?
CoRR, 2014

On the Equivalence Between Deep NADE and Generative Stochastic Networks.
CoRR, 2014

FitNets: Hints for Thin Deep Nets.
CoRR, 2014

Iterative Neural Autoregressive Distribution Estimator (NADE-k).
CoRR, 2014

Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation.
CoRR, 2014

On the saddle point problem for non-convex optimization.
CoRR, 2014

Deep Directed Generative Autoencoders.
CoRR, 2014

On the Number of Linear Regions of Deep Neural Networks.
CoRR, 2014

Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews.
CoRR, 2014

Target Propagation.
CoRR, 2014

On Using Very Large Target Vocabulary for Neural Machine Translation.
CoRR, 2014

Embedding Word Similarity with Neural Machine Translation.
CoRR, 2014

Not All Neural Embeddings are Born Equal.
CoRR, 2014

ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient.
CoRR, 2014

BilBOWA: Fast Bilingual Distributed Representations without Word Alignments.
CoRR, 2014

Generative Adversarial Networks.
CoRR, 2014

NICE: Non-linear Independent Components Estimation.
CoRR, 2014

Deep Tempering.
CoRR, 2014

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization.
CoRR, 2014

Low precision arithmetic for deep learning.
CoRR, 2014

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling.
CoRR, 2014

End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results.
CoRR, 2014

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.
CoRR, 2014

On the Properties of Neural Machine Translation: Encoder-Decoder Approaches.
CoRR, 2014

Exponentially Increasing the Capacity-to-Computation Ratio for Conditional Computation in Deep Learning.
CoRR, 2014

Reweighted Wake-Sleep.
CoRR, 2014

How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation.
CoRR, 2014

Neural Machine Translation by Jointly Learning to Align and Translate.
CoRR, 2014

Conditioning and time representation in long short-term memory networks.
Biological Cybernetics, 2014

Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation.
Proceedings of SSST@EMNLP 2014, 2014

On the Properties of Neural Machine Translation: Encoder-Decoder Approaches.
Proceedings of SSST@EMNLP 2014, 2014

On the Equivalence between Deep NADE and Generative Stochastic Networks.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

How transferable are features in deep neural networks?
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Iterative Neural Autoregressive Distribution Estimator NADE-k.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

On the Number of Linear Regions of Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Generative Adversarial Nets.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Scaling up deep learning.
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014

Marginalized Denoising Auto-encoders for Nonlinear Representations.
Proceedings of the 31st International Conference on Machine Learning, 2014

Deep Generative Stochastic Networks Trainable by Backprop.
Proceedings of the 31st International Conference on Machine Learning, 2014

Deep learning and cultural evolution.
Proceedings of the Genetic and Evolutionary Computation Conference, 2014

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Learning Concept Embeddings for Query Expansion by Quantum Entropy Minimization.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

On the Challenges of Physical Implementations of RBMs.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
Deep Learning of Representations.
Proceedings of the Handbook on Neural Information Processing, 2013

Scaling Up Spike-and-Slab Models for Unsupervised Feature Learning.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Representation Learning: A Review and New Perspectives.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Generalized Denoising Auto-Encoders as Generative Models.
CoRR, 2013

Estimating or Propagating Gradients Through Stochastic Neurons.
CoRR, 2013

Deep Learning of Representations: Looking Forward.
CoRR, 2013

Maxout Networks.
CoRR, 2013

Knowledge Matters: Importance of Prior Information for Optimization.
CoRR, 2013

Natural Gradient Revisited.
CoRR, 2013

Big Neural Networks Waste Capacity.
CoRR, 2013

Joint Training Deep Boltzmann Machines for Classification.
CoRR, 2013

Metric-Free Natural Gradient for Joint-Training of Boltzmann Machines.
CoRR, 2013

A Semantic Matching Energy Function for Learning with Multi-relational Data.
CoRR, 2013

An empirical analysis of dropout in piecewise linear networks.
CoRR, 2013

On the number of inference regions of deep feed forward networks with piece-wise linear activations.
CoRR, 2013

How to Construct Deep Recurrent Neural Networks.
CoRR, 2013

Multimodal Transitions for Generative Stochastic Networks.
CoRR, 2013

Learned-norm pooling for deep neural networks.
CoRR, 2013

Pylearn2: a machine learning research library.
CoRR, 2013

An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks.
CoRR, 2013

Challenges in Representation Learning: A report on three machine learning contests.
CoRR, 2013

On the Challenges of Physical Implementations of RBMs.
CoRR, 2013

Bounding the Test Log-Likelihood of Generative Models.
CoRR, 2013

Deep Generative Stochastic Networks Trainable by Backprop.
CoRR, 2013

Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation.
CoRR, 2013

Learning Deep Physiological Models of Affect.
IEEE Comp. Int. Mag., 2013

Deep Learning of Representations: Looking Forward.
Proceedings of the Statistical Language and Speech Processing, 2013

Modeling term dependencies with quantum language models for IR.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

Multi-Prediction Deep Boltzmann Machines.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Generalized Denoising Auto-Encoders as Generative Models.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Audio Chord Recognition with Recurrent Neural Networks.
Proceedings of the 14th International Society for Music Information Retrieval Conference, 2013

Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding.
Proceedings of the INTERSPEECH 2013, 2013

Unsupervised Learning of Semantics of Object Detections for Scene Categorization.
Proceedings of the Pattern Recognition Applications and Methods - International Conference, 2013

Unsupervised and Transfer Learning under Uncertainty - From Object Detections to Scene Categorization.
Proceedings of the ICPRAM 2013, 2013


On the difficulty of training recurrent neural networks.
Proceedings of the 30th International Conference on Machine Learning, 2013

Maxout Networks.
Proceedings of the 30th International Conference on Machine Learning, 2013

Better Mixing via Deep Representations.
Proceedings of the 30th International Conference on Machine Learning, 2013


High-dimensional sequence transduction.
Proceedings of the IEEE International Conference on Acoustics, 2013

Advances in optimizing recurrent networks.
Proceedings of the IEEE International Conference on Acoustics, 2013

Stacked calibration of off-policy policy evaluation for video game matchmaking.
Proceedings of the 2013 IEEE Conference on Computational Intelligence in Games (CIG), 2013

Texture Modeling with Convolutional Spike-and-Slab RBMs and Deep Extensions.
Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, 2013

2012
Practical Recommendations for Gradient-Based Training of Deep Architectures.
Neural Networks: Tricks of the Trade - Second Edition, 2012

Beyond Skill Rating: Advanced Matchmaking in Ghost Recon Online.
IEEE Trans. Comput. Intellig. and AI in Games, 2012

Unsupervised and Transfer Learning Challenge: a Deep Learning Approach.
Proceedings of the Unsupervised and Transfer Learning, 2012

Learning Algorithms for the Classification Restricted Boltzmann Machine.
Journal of Machine Learning Research, 2012

Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

Random Search for Hyper-Parameter Optimization.
Journal of Machine Learning Research, 2012

Deep Learning of Representations for Unsupervised and Transfer Learning.
Proceedings of the Unsupervised and Transfer Learning, 2012

Joint Training of Deep Boltzmann Machines.
CoRR, 2012

High-dimensional sequence transduction.
CoRR, 2012

Advances in Optimizing Recurrent Networks.
CoRR, 2012

Texture Modeling with Convolutional Spike-and-Slab RBMs and Deep Extensions.
CoRR, 2012

Theano: new features and speed improvements.
CoRR, 2012

Understanding the exploding gradient problem.
CoRR, 2012

Regularized Auto-Encoders Estimate Local Statistics.
CoRR, 2012

Disentangling Factors of Variation via Generative Entangling.
CoRR, 2012

Efficient EM Training of Gaussian Mixtures with Missing Data.
CoRR, 2012

Better Mixing via Deep Representations.
CoRR, 2012

Implicit Density Estimation by Local Moment Matching to Sample from Auto-Encoders.
CoRR, 2012

Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives.
CoRR, 2012

Practical recommendations for gradient-based training of deep architectures.
CoRR, 2012

On Training Deep Boltzmann Machines.
CoRR, 2012

Evolving Culture vs Local Minima.
CoRR, 2012

Spike-and-Slab Sparse Coding for Unsupervised Feature Discovery.
CoRR, 2012

Detonation Classification from Acoustic Signature with the Restricted Boltzmann Machine.
Computational Intelligence, 2012

Building Musically-relevant Audio Features through Multiple Timescale Representations.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

Discriminative Non-negative Matrix Factorization for Multiple Pitch Estimation.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

A Generative Process for Contractive Auto-Encoders.
Proceedings of the 29th International Conference on Machine Learning, 2012

Large-Scale Feature Learning With Spike-and-Slab Sparse Coding.
Proceedings of the 29th International Conference on Machine Learning, 2012

Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription.
Proceedings of the 29th International Conference on Machine Learning, 2012

Disentangling Factors of Variation for Facial Expression Recognition.
Proceedings of the Computer Vision - ECCV 2012, 2012

Deep Learning for NLP (without Magic).
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, 2012

2011
Contextual tag inference.
TOMCCAP, 2011

Quickly Generating Representative Samples from an RBM-Derived Process.
Neural Computation, 2011

Suitability of V1 Energy Models for Object Classification.
Neural Computation, 2011

Deep Sparse Rectifier Neural Networks.
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

A Spike and Slab Restricted Boltzmann Machine.
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

Deep Learners Benefit More from Out-of-Distribution Examples.
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

Discussion of "The Neural Autoregressive Distribution Estimator".
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

The Statistical Inefficiency of Sparse Coding for Images (or, One Gabor to Rule them All).
CoRR, 2011

Towards Open-Text Semantic Parsing via Multi-Task Learning of Structured Embeddings.
CoRR, 2011

Learning invariant features through local space contraction.
CoRR, 2011

Adding noise to the input of a model trained with a regularized objective.
CoRR, 2011

Autotagging music with conditional restricted Boltzmann machines.
CoRR, 2011

Higher Order Contractive Auto-Encoder.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2011

The Manifold Tangent Classifier.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

On Tracking The Partition Function.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Shallow vs. Deep Sum-Product Networks.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Algorithms for Hyper-Parameter Optimization.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

On learning distributed representations of semantics.
Proceedings of the 2011 Symposium on Machine Learning in Speech and Language Processing, 2011

Temporal Pooling and Multiscale Learning for Automatic Annotation and Ranking of Music Audio.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Contractive Auto-Encoders: Explicit Invariance During Feature Extraction.
Proceedings of the 28th International Conference on Machine Learning, 2011

Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach.
Proceedings of the 28th International Conference on Machine Learning, 2011

Large-Scale Learning of Embeddings with Reconstruction Sampling.
Proceedings of the 28th International Conference on Machine Learning, 2011

Unsupervised Models of Images by Spike-and-Slab RBMs.
Proceedings of the 28th International Conference on Machine Learning, 2011

On the Expressive Power of Deep Architectures.
Proceedings of the Discovery Science - 14th International Conference, 2011

On the Expressive Power of Deep Architectures.
Proceedings of the Algorithmic Learning Theory - 22nd International Conference, 2011

Learning Structured Embeddings of Knowledge Bases.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

2010
Deep Belief Networks Are Compact Universal Approximators.
Neural Computation, 2010

Tractable Multivariate Binary Density Estimation and the Restricted Boltzmann Forest.
Neural Computation, 2010

Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion.
Journal of Machine Learning Research, 2010

Understanding the difficulty of training deep feedforward neural networks.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Why Does Unsupervised Pre-training Help Deep Learning?
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Why Does Unsupervised Pre-training Help Deep Learning?
Journal of Machine Learning Research, 2010

Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Alternative time representation in dopamine models.
Journal of Computational Neuroscience, 2010

Adaptive Parallel Tempering for Stochastic Maximum Likelihood Learning of RBMs.
CoRR, 2010

Deep Self-Taught Learning for Handwritten Character Recognition.
CoRR, 2010

Decision trees do not generalize to new variations.
Computational Intelligence, 2010

Learning Tags that Vary Within a Song.
Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010

Word Representations: A Simple and General Method for Semi-Supervised Learning.
Proceedings of the ACL 2010, 2010

2009
A Hybrid Pareto Mixture for Conditional Asymmetric Fat-Tailed Distributions.
IEEE Trans. Neural Networks, 2009

Justifying and Generalizing Contrastive Divergence.
Neural Computation, 2009

Exploring Strategies for Training Deep Neural Networks.
Journal of Machine Learning Research, 2009

The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training.
Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, 2009

Incorporating Functional Knowledge in Neural Networks.
Journal of Machine Learning Research, 2009

Learning Deep Architectures for AI.
Foundations and Trends in Machine Learning, 2009

An Infinite Factor Model Hierarchy Via a Noisy-Or Mechanism.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Slow, Decorrelated Features for Pretraining Complex Cell-like Networks.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Quadratic Features and Deep Architectures for Chunking.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009

Workshop summary: Workshop on learning feature hierarchies.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Curriculum learning.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

2008
Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model.
IEEE Trans. Neural Networks, 2008

Neural net language models.
Scholarpedia, 2008

Representational Power of Restricted Boltzmann Machines and Deep Belief Networks.
Neural Computation, 2008

Extracting and composing robust features with denoising autoencoders.
Proceedings of the Machine Learning, 2008

Classification using discriminative restricted Boltzmann machines.
Proceedings of the Machine Learning, 2008

Zero-data Learning of New Tasks.
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

2007
Continuous Neural Networks.
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007

A Hybrid Pareto Model for Conditional Density Estimation of Asymmetric Fat-Tail Data.
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007

Noisy K Best-Paths for Approximate Dynamic Programming with Application to Portfolio Optimization.
JCP, 2007

Topmoumoute Online Natural Gradient Algorithm.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Learning the 2-D Topology of Images.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Augmented Functional Time Series Representation and Forecasting with Gaussian Processes.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

An empirical evaluation of deep architectures on problems with many factors of variation.
Proceedings of the Machine Learning, 2007

2006
Nonlocal Estimation of Manifold Structure.
Neural Computation, 2006

Collaborative Filtering on a Family of Biological Targets.
Journal of Chemical Information and Modeling, 2006

Greedy Layer-Wise Training of Deep Networks.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

The K Best-Paths Approach to Approximate Dynamic Programming with Application to Portfolio Optimization.
Proceedings of the Advances in Artificial Intelligence, 2006

2005
Convex Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Non-Local Manifold Parzen Windows.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

The Curse of Highly Variable Functions for Local Kernel Machines.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Semi-supervised Learning by Entropy Minimization.
Proceedings of the Actes de CAP 05, Conférence francophone sur l'apprentissage automatique, 2005

Greedy Spectral Embedding.
Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, 2005

Hierarchical Probabilistic Neural Network Language Model.
Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, 2005

Efficient Non-Parametric Function Induction in Semi-Supervised Learning.
Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, 2005

2004
Learning Eigenfunctions Links Spectral Embedding and Kernel PCA.
Neural Computation, 2004

No Unbiased Estimator of the Variance of K-Fold Cross-Validation.
Journal of Machine Learning Research, 2004

Locally Linear Embedding for dimensionality reduction in QSAR.
Journal of Computer-Aided Molecular Design, 2004

Brain Inspired Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Semi-supervised Learning by Entropy Minimization.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Non-Local Manifold Tangent Learning.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Unsupervised Sense Disambiguation Using Bilingual Probabilistic Models.
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, 2004

2003
Bias learning, knowledge sharing.
IEEE Trans. Neural Networks, 2003

Inference for the Generalization Error.
Machine Learning, 2003

A Neural Probabilistic Language Model.
Journal of Machine Learning Research, 2003

Extensions to Metric-Based Model Selection.
Journal of Machine Learning Research, 2003

Scaling Large Learning Problems with Hard Parallel Mixtures.
IJPRAI, 2003

Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

No Unbiased Estimator of the Variance of K-Fold Cross-Validation.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

Quick Training of Probabilistic Neural Nets by Importance Sampling.
Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, 2003

2002
Robust Regression with Asymmetric Heavy-Tail Noise Distributions.
Neural Computation, 2002

A Parallel Mixture of SVMs for Very Large Scale Problems.
Neural Computation, 2002

Kernel Matching Pursuit.
Machine Learning, 2002

Model Selection for Small Sample Regression.
Machine Learning, 2002

Guest Introduction: Special Issue on New Methods for Model Selection and Model Combination.
Machine Learning, 2002

Scaling Large Learning Problems with Hard Parallel Mixtures.
Proceedings of the Pattern Recognition with Support Vector Machines, 2002

Metric-based model selection for time-series forecasting.
Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, 2002

Manifold Parzen Windows.
Proceedings of the Advances in Neural Information Processing Systems 15, 2002

2001
Cost functions and model combination for VaR-based asset allocation using neural networks.
IEEE Trans. Neural Networks, 2001

Experiments on the application of IOHMMs to model financial returns series.
IEEE Trans. Neural Networks, 2001

Topic Segmentation: A First Stage to Dialog-Based Information Extraction.
Proceedings of the Sixth Natural Language Processing Pacific Rim Symposium, 2001

K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic, 2001

A Parallel Mixture of SVMs for Very Large Scale Problems.
Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic, 2001

Estimating Car Insurance Premia: a Case Study in High-Dimensional Data Inference.
Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic, 2001

2000
Taking on the curse of dimensionality in joint distributions using neural networks.
IEEE Trans. Neural Netw. Learning Syst., 2000

Boosting Neural Networks.
Neural Computation, 2000

Gradient-Based Optimization of Hyperparameters.
Neural Computation, 2000

Incorporating Second-Order Functional Knowledge for Better Option Pricing.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

A Neural Probabilistic Language Model.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

A Neural Support Vector Network Architecture with Adaptive Kernels.
IJCNN (5), 2000

Bias Learning, Knowledge Sharing.
IJCNN (1), 2000

Probabilistic Neural Network Models for Sequential Data.
IJCNN (5), 2000

Continuous Optimization of Hyper-Parameters.
IJCNN (1), 2000

1999
Stochastic Learning of Strategic Equilibria for Auctions.
Neural Computation, 1999

Object Recognition with Gradient-Based Learning.
Proceedings of the Shape, Contour and Grouping in Computer Vision, 1999

Inference for the Generalization Error.
Proceedings of the Advances in Neural Information Processing Systems 12, NIPS Conference, Denver, Colorado, USA, November 29, 1999

Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 12, NIPS Conference, Denver, Colorado, USA, November 29, 1999

Binary Pseudowavelets and Applications to Bilevel Image Processing.
Proceedings of the Data Compression Conference, 1999

1998
High quality document image compression with "DjVu".
J. Electronic Imaging, 1998

Gaussian Mixture Densities for Classification of Nuclear Power Plant Data.
Computers and Artificial Intelligence, 1998

Support vector machines for improving the classification of brain PET images.
Proceedings of the Medical Imaging 1998: Image Processing, 1998

A Memory-Efficient Adaptive Huffman Coding Algorithm for Very Large Sets of Symbols.
Proceedings of the Data Compression Conference, 1998

The Z-Coder Adaptive Binary Coder.
Proceedings of the Data Compression Conference, 1998

Browsing through High Quality Document Images with DjVu.
Proceedings of the IEEE Forum on Research and Technology Advances in Digital Libraries, 1998

1997
Using a Financial Training Criterion Rather than a Prediction Criterion.
Int. J. Neural Syst., 1997

Training Methods for Adaptive Boosting of Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

Shared Context Probabilistic Transducers.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

Discriminative feature and model design for automatic speech recognition.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Reading checks with multilayer graph transformer networks.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

AdaBoosting Neural Networks: Application to on-line Character Recognition.
Proceedings of the Artificial Neural Networks, 1997

Global Training of Document Processing Systems Using Graph Transformer Networks.
Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97), 1997

1996
Input-output HMMs for sequence processing.
IEEE Trans. Neural Networks, 1996

Multi-Task Learning for Stock Selection.
Proceedings of the Advances in Neural Information Processing Systems 9, 1996

1995
On the search for new learning rules for ANNs.
Neural Processing Letters, 1995

LeRec: a NN/HMM hybrid for on-line handwriting recognition.
Neural Computation, 1995

Diffusion of Context and Credit Information in Markovian Models.
J. Artif. Intell. Res., 1995

Diffusion of Context and Credit Information in Markovian Models.
CoRR, 1995

Hierarchical Recurrent Neural Networks for Long-Term Dependencies.
Proceedings of the Advances in Neural Information Processing Systems 8, 1995

Recurrent Neural Networks for Missing or Asynchronous Data.
Proceedings of the Advances in Neural Information Processing Systems 8, 1995

1994
Learning long-term dependencies with gradient descent is difficult.
IEEE Trans. Neural Networks, 1994

Convergence Properties of the K-Means Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 7, 1994

Diffusion of Credit in Markovian Models.
Proceedings of the Advances in Neural Information Processing Systems 7, 1994

An Input Output HMM Architecture.
Proceedings of the Advances in Neural Information Processing Systems 7, 1994

Word-level training of a handwritten word recognizer based on convolutional neural networks.
Proceedings of the 12th IAPR International Conference on Pattern Recognition, 1994

An EM approach to grammatical inference: input/output HMMs.
Proceedings of the 12th IAPR International Conference on Pattern Recognition, 1994

Word normalization for online handwritten word recognition.
Proceedings of the 12th IAPR International Conference on Pattern Recognition, 1994

Use of Genetic Programming for the Search of a New Learning Rule for Neural Networks.
Proceedings of the First IEEE Conference on Evolutionary Computation, 1994

1993
A Connectionist Approach to Speech Recognition.
IJPRAI, 1993

Credit Assignment through Time: Alternatives to Backpropagation.
Proceedings of the Advances in Neural Information Processing Systems 6, 1993

Globally Trained Handwritten Word Recognizer Using Spatial Representation, Convolutional Neural Networks, and Hidden Markov Models.
Proceedings of the Advances in Neural Information Processing Systems 6, 1993

1992
Global optimization of a neural network-hidden Markov model hybrid.
IEEE Trans. Neural Networks, 1992

Phonetically motivated acoustic parameters for continuous speech recognition using artificial neural networks.
Speech Communication, 1992

Learning the dynamic nature of speech with back-propagation for sequences.
Pattern Recognition Letters, 1992

1991
Neural Network - Gaussian Mixture Hybrid for Speech Recognition or Density Estimation.
Proceedings of the Advances in Neural Information Processing Systems 4, 1991

A comparative study on hybrid acoustic phonetic decoders based on artificial neural networks.
Proceedings of the Second European Conference on Speech Communication and Technology, 1991

Phonetically motivated acoustic parameters for continuous speech recognition using artificial neural networks.
Proceedings of the Second European Conference on Speech Communication and Technology, 1991

1990
Phonetically-based multi-layered neural networks for vowel classification.
Speech Communication, 1990

Efficient recognition of immunoglobulin domains from amino acid sequences using a neural network.
Computer Applications in the Biosciences, 1990

1989
Programmable Execution of Multi-Layered Networks for Automatic Speech Recognition.
Commun. ACM, 1989

Speaker Independent Speech Recognition with Neural Networks and Speech Knowledge.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

A Neural Network to Detect Homologies in Proteins.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

On the Generalization Capability of Multi-Layered Networks in the Extraction of Speech Properties.
Proceedings of the 11th International Joint Conference on Artificial Intelligence. Detroit, 1989

1988
Use of Multi-Layered Networks for Coding Speech with Phonetic Features.
Proceedings of the Advances in Neural Information Processing Systems 1, 1988

Data-Driven Execution of Multi-Layered Networks for Automatic Speech Recognition.
Proceedings of the 7th National Conference on Artificial Intelligence, 1988

