Olivier Delalleau

According to our database, Olivier Delalleau authored at least 35 papers between 2003 and 2025.

Bibliography

2025
HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages.
CoRR, May, 2025

Llama-Nemotron: Efficient Reasoning Models.
CoRR, May, 2025

Adversarial Training of Reward Models.
CoRR, April, 2025

Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks.
CoRR, March, 2025

Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment.
CoRR, February, 2025

HelpSteer2-Preference: Complementing Ratings with Preferences.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

HelpSteer3: Human-Annotated Feedback and Edit Data to Empower Inference-Time Scaling in Open-Ended General-Domain Tasks.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Diverging Preferences: When do Annotators Disagree and do Models Know?
CoRR, 2024

Nemotron-4 340B Technical Report.
CoRR, 2024

HelpSteer2: Open-source dataset for training top-performing reward models.
CoRR, 2024

NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment.
CoRR, 2024

HelpSteer 2: Open-source dataset for training top-performing reward models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

2020
A Closer Look at Codistillation for Distributed Training.
CoRR, 2020

2019
Discrete and Continuous Action Representation for Practical RL in Video Games.
CoRR, 2019

2016
Theano: A Python framework for fast computation of mathematical expressions.
CoRR, 2016

2013
Stacked calibration of off-policy policy evaluation for video game matchmaking.
Proceedings of the 2013 IEEE Conference on Computational Intelligence in Games (CIG), 2013

2012
Beyond Skill Rating: Advanced Matchmaking in Ghost Recon Online.
IEEE Trans. Comput. Intell. AI Games, 2012

Efficient EM Training of Gaussian Mixtures with Missing Data.
CoRR, 2012

Detonation Classification from Acoustic Signature with the Restricted Boltzmann Machine.
Comput. Intell., 2012

2011
Shallow vs. Deep Sum-Product Networks.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

On the Expressive Power of Deep Architectures.
Proceedings of the Algorithmic Learning Theory - 22nd International Conference, 2011

2010
Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Decision trees do not generalize to new variations.
Comput. Intell., 2010

2009
Justifying and Generalizing Contrastive Divergence.
Neural Comput., 2009

2006
Spectral Dimensionality Reduction.
Feature Extraction - Foundations and Applications, 2006

Large-Scale Algorithms.
Semi-Supervised Learning, 2006

Label Propagation and Quadratic Criterion.
Semi-Supervised Learning, 2006

2005
Convex Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

The Curse of Highly Variable Functions for Local Kernel Machines.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Efficient Non-Parametric Function Induction in Semi-Supervised Learning.
Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, 2005

2004
Learning Eigenfunctions Links Spectral Embedding and Kernel PCA.
Neural Comput., 2004

Locally Linear Embedding for dimensionality reduction in QSAR.
J. Comput. Aided Mol. Des., 2004

2003
Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

