Daniel P. Berrar

Orcid: 0000-0002-7038-2601

  • Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering

According to our database, Daniel P. Berrar authored at least 38 papers between 2003 and 2022.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







Using p-values for the comparison of classifiers: pitfalls and alternatives.
Data Min. Knowl. Discov., 2022

A self-organizing incremental neural network for continual supervised learning.
Expert Syst. Appl., 2021

Deep learning in bioinformatics and biomedicine.
Briefings Bioinform., 2021

SOINN+, a Self-Organizing Incremental Neural Network for Unsupervised Learning from Noisy Data Streams.
Expert Syst. Appl., 2020

Introduction to the Non-Parametric Bootstrap.
Proceedings of the Encyclopedia of Bioinformatics and Computational Biology - Volume 1, 2019

Performance Measures for Binary Classification.
Proceedings of the Encyclopedia of Bioinformatics and Computational Biology - Volume 1, 2019

Proceedings of the Encyclopedia of Bioinformatics and Computational Biology - Volume 1, 2019

Bayes' Theorem and Naive Bayes Classifier.
Proceedings of the Encyclopedia of Bioinformatics and Computational Biology - Volume 1, 2019

The Open International Soccer Database for machine learning.
Mach. Learn., 2019

Guest editorial: special issue on machine learning for soccer.
Mach. Learn., 2019

Incorporating domain knowledge in machine learning for soccer outcome prediction.
Mach. Learn., 2019

Should significance testing be abandoned in machine learning?
Int. J. Data Sci. Anal., 2019

Self-Organizing Incremental Neural Networks for Continual Learning.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Confidence curves: an alternative to null hypothesis significance testing for the comparison of classifiers.
Mach. Learn., 2017

Caveats and pitfalls in crowdsourcing research: the case of soccer referee bias.
Int. J. Data Sci. Anal., 2017

On the Jeffreys-Lindley Paradox and the Looming Reproducibility Crisis in Machine Learning.
Proceedings of the 2017 IEEE International Conference on Data Science and Advanced Analytics, 2017

Learning from automatically labeled data: case study on click fraud prediction.
Knowl. Inf. Syst., 2016

On the Noise Resilience of Ranking Measures.
Proceedings of the Neural Information Processing - 23rd International Conference, 2016

Detecting click fraud in online advertising: a data mining approach.
J. Mach. Learn. Res., 2014

An Empirical Evaluation of Ranking Measures With Respect to Robustness to Noise.
J. Artif. Intell. Res., 2014

Turing Test Considered Mostly Harmless.
New Gener. Comput., 2013

Computing machinery and creativity: lessons learned from the Turing test.
Kybernetes, 2013

Significance tests or confidence intervals: which are preferable for the comparison of classifiers?
J. Exp. Theor. Artif. Intell., 2013

Caveats and pitfalls of ROC analysis in clinical microarray research (and how to avoid them).
Briefings Bioinform., 2012

Null QQ plots: A simple graphical alternative to significance testing for the comparison of classifiers.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

Multidimensional scaling with discrimination coefficients for supervised visualization of high-dimensional data.
Neural Comput. Appl., 2011

The Omnipresent Computing Menace to Information Society.
J. Adv. Comput. Intell. Intell. Informatics, 2011

Artificial Intelligence in Neuroscience and Systems Biology: Lessons Learnt, Open Problems, and the Road Ahead.
Adv. Artif. Intell., 2010

Quo Vadis, Artificial Intelligence?
Adv. Artif. Intell., 2010

Text mining of full-text journal articles combined with gene expression analysis reveals a relationship between sphingosine-1-phosphate and invasiveness of a glioblastoma cell line.
BMC Bioinform., 2006

Instance-based concept learning from multiclass DNA microarray data.
BMC Bioinform., 2006

Avoiding model selection bias in small-sample genomic datasets.
Bioinform., 2006

Neural Plasma.
Proceedings of the Artificial Intelligence in Theory and Practice, 2006

P-found: The Protein Folding and Unfolding Simulation Repository.
Proceedings of the 2006 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, 2006

Survival Trees for Analyzing Clinical Outcome in Lung Adenocarcinomas Based on Gene Expression Profiles: Identification of Neogenin and Diacylglycerol Kinase Expression as Critical Factors.
J. Comput. Biol., 2005

Grid warehousing of molecular dynamics protein unfolding data.
Proceedings of the 5th International Symposium on Cluster Computing and the Grid (CCGrid 2005), 2005

Multiclass Cancer Classification Using Gene Expression Profiling and Probabilistic Neural Networks.
Proceedings of the 8th Pacific Symposium on Biocomputing, 2003

A Probabilistic Neural Network for Gene Selection and Classification of Microarray Data.
Proceedings of the International Conference on Artificial Intelligence, 2003