Anne-Laure Boulesteix

Orcid: 0000-0002-2729-0947

Affiliations:
  • LMU Munich, Institute for Medical Information Processing, Biometry and Epidemiology, Germany


According to our database, Anne-Laure Boulesteix authored at least 55 papers between 2003 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
Understanding random forests and overfitting: a visualization and simulation study.
CoRR, 2024

2023
A white paper on good research practices in benchmarking: The case of cluster analysis.
WIREs Data Mining Knowl. Discov., November, 2023

Over-optimistic evaluation and reporting of novel cluster algorithms: an illustrative study.
Adv. Data Anal. Classif., March, 2023

Over-optimism in unsupervised microbiome analysis: Insights from network learning and clustering.
PLoS Comput. Biol., January, 2023

Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges.
WIREs Data Mining Knowl. Discov., 2023

Evaluating machine learning models in non-standard settings: An overview and new findings.
CoRR, 2023

Prediction approaches for partly missing multi-omics covariate data: A literature review and an empirical comparison study.
CoRR, 2023

2022
Validation of cluster analysis results on validation data: A systematic framework.
WIREs Data Mining Knowl. Discov., 2022

Over-optimism in benchmark studies and the multiplicity of design and analysis options when interpreting their results.
WIREs Data Mining Knowl. Discov., 2022

Interaction forests: Identifying and exploiting interpretable quantitative and qualitative interaction effects.
Comput. Stat. Data Anal., 2022

2021
Improved Outcome Prediction Across Data Sources Through Robust Parameter Tuning.
J. Classif., 2021

NetCoMi: network construction and comparison for microbiome data in R.
Briefings Bioinform., 2021

Large-scale benchmark study of survival prediction methods using multi-omics data.
Briefings Bioinform., 2021

2020
Combining clinical and molecular data in regression prediction models: insights from a simulation study.
Briefings Bioinform., 2020

2019
Hyperparameters and tuning strategies for random forest.
WIREs Data Mining Knowl. Discov., 2019

Tunability: Importance of Hyperparameters of Machine Learning Algorithms.
J. Mach. Learn. Res., 2019

2018
On the choice and influence of the number of boosting steps for high-dimensional linear Cox-models.
Comput. Stat., 2018

Priority-Lasso: a simple hierarchical approach to the prediction of clinical outcome using multi-omics data.
BMC Bioinform., 2018

Random forest versus logistic regression: a large-scale benchmark experiment.
BMC Bioinform., 2018

A computationally fast variable importance test for random forests for high-dimensional data.
Adv. Data Anal. Classif., 2018

2017
To Tune or Not to Tune the Number of Trees in Random Forest.
J. Mach. Learn. Res., 2017

Detection of influential points as a byproduct of resampling-based variable selection procedures.
Comput. Stat. Data Anal., 2017

IPF-LASSO: Integrative L1-Penalized Regression with Penalty Factors for Prediction Based on Multi-Omics Data.
Comput. Math. Methods Medicine, 2017

Improving cross-study prediction through addon batch effect adjustment or addon normalization.
Bioinform., 2017

2016
Random forest for ordinal responses: Prediction and variable selection.
Comput. Stat. Data Anal., 2016

Combining location-and-scale batch effect adjustment with data cleaning by latent factor adjustment.
BMC Bioinform., 2016

2015
Ten Simple Rules for Reducing Overoptimistic Reporting in Methodological Computational Research.
PLoS Comput. Biol., 2015

Letter to the Editor: On the term 'interaction' and related phrases in the literature on Random Forests.
Briefings Bioinform., 2015

Letter to the Editor: On Reviews and Papers on New Methods.
Briefings Bioinform., 2015

2014
Cross-study validation for the assessment of prediction algorithms.
Bioinform., 2014

2013
On the Simultaneous Analysis of Clinical and Omics Data: A Comparison of Globalboosttest and Pre-validation Techniques.
Proceedings of the Statistical Models for Data Analysis, 2013

Complexity Selection with Cross-validation for Lasso and Sparse Partial Least Squares Using High-Dimensional Data.
Proceedings of the Algorithms from and for Nature and Life, 2013

An AUC-based permutation variable importance measure for random forests.
BMC Bioinform., 2013

On representative and illustrative comparisons with real data in bioinformatics: response to the letter to the editor by Smith et al.
Bioinform., 2013

2012
Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics.
WIREs Data Mining Knowl. Discov., 2012

A Plea for Neutral Comparison Studies in Computational Sciences.
CoRR, 2012

Random forest Gini importance favours SNPs with large minor allele frequency: impact, sources and recommendations.
Briefings Bioinform., 2012

2011
Added predictive value of high-throughput molecular data to clinical data and its validation.
Briefings Bioinform., 2011

Editorial.
Briefings Bioinform., 2011

2010
Testing the additional predictive value of high-dimensional molecular data.
BMC Bioinform., 2010

Over-optimism in bioinformatics: an illustration.
Bioinform., 2010

Over-optimism in bioinformatics research.
Bioinform., 2010

2009
Survival prediction using gene expression data: A review and comparison.
Comput. Stat. Data Anal., 2009

Regularized estimation of large-scale gene association networks using graphical Gaussian models.
BMC Bioinform., 2009

Stability and aggregation of ranked gene lists.
Briefings Bioinform., 2009

2008
Conditional variable importance for random forests.
BMC Bioinform., 2008

CMA - a comprehensive Bioconductor package for supervised classification with high dimensional data.
BMC Bioinform., 2008

Microarray-based classification and clinical predictors: on combined classifiers and additional predictive value.
Bioinform., 2008

2007
Unbiased split selection for classification trees based on the Gini Index.
Comput. Stat. Data Anal., 2007

Maximally selected Chi-squared statistics and non-monotonic associations: An exact approach based on two cutpoints.
Comput. Stat. Data Anal., 2007

Bias in random forest variable importance measures: Illustrations, sources and a solution.
BMC Bioinform., 2007

WilcoxCV: an R package for fast variable selection in cross-validation.
Bioinform., 2007

Partial least squares: a versatile tool for the analysis of high-dimensional genomic data.
Briefings Bioinform., 2007

2006
Identification of interaction patterns and classification with applications to microarray data.
Comput. Stat. Data Anal., 2006

2003
A CART-based approach to discover emerging patterns in microarray data.
Bioinform., 2003

