Veronika Laippala
ORCID: 0000-0002-7635-429X
According to our database, Veronika Laippala authored at least 40 papers between 2007 and 2025.
  
  
Bibliography
  2025
Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation.
    
  
    CoRR, April, 2025
    
  
    CoRR, March, 2025
    
  
    Proceedings of the 31st International Conference on Computational Linguistics, 2025
    
  
An Expanded Massive Multilingual Dataset for High-Performance Language Technologies (HPLT).
    
  
    Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
    
  
  2024
    CoRR, 2024
    
  
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order.
    
  
    CoRR, 2024
    
  
    Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
    
  
  2023
In search of founding era registers: automatic modeling of registers from the corpus of Founding Era American English.
    
  
    Digit. Scholarsh. Humanit., November, 2023
    
  
Register identification from the unrestricted open Web using the Corpus of Online Registers of English.
    
  
    Lang. Resour. Evaluation, September, 2023
    
  
    Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023
    
  
    Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
    
  
  2022
    Frontiers Artif. Intell., 2022
    
  
    Proceedings of the Eighth Workshop on Noisy User-generated Text, 2022
    
  
    Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022
    
  
  2021
Exploring the role of lexis and grammar for the stable identification of register in an unrestricted corpus of web documents.
    
  
    Lang. Resour. Evaluation, 2021
    
  
    Proceedings of the 23rd Nordic Conference on Computational Linguistics, 2021
    
  
Beyond the English Web: Zero-Shot Cross-Lingual and Lightweight Monolingual Classification of Registers.
    
  
    Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, 2021
    
  
  2020
    Proceedings of The 12th Language Resources and Evaluation Conference, 2020
    
  
    Proceedings of the 12th Web as Corpus Workshop, 2020
    
  
  2019
    Proceedings of the 22nd Nordic Conference on Computational Linguistics, NoDaLiDa 2019, Turku, Finland, September 30, 2019
    
  
  2018
    Lang. Resour. Evaluation, 2018
    
  
  2017
    Proceedings of the 21st Nordic Conference on Computational Linguistics, 2017
    
  
  2015
    Proceedings of the 20th Nordic Conference of Computational Linguistics, 2015
    
  
Towards the Classification of the Finnish Internet Parsebank: Detecting Translations and Informality.
    
  
    Proceedings of the 20th Nordic Conference of Computational Linguistics, 2015
    
  
    Proceedings of the Third International Conference on Dependency Linguistics, 2015
    
  
  2014
    Lang. Resour. Evaluation, 2014
    
  
    Proceedings of the Human Language Technologies - The Baltic Perspective, 2014
    
  
  2013
Using cluster analysis to identify weak signals of lethal trends in aviation and healthcare documentation.
    
  
    Int. J. Netw. Virtual Organisations, 2013
    
  
    Proceedings of the 19th Nordic Conference of Computational Linguistics, 2013
    
  
    Proceedings of the 19th Nordic Conference of Computational Linguistics, 2013
    
  
  2012
    Proceedings of the Exploring the Abyss of Inequalities, 2012
    
  
    Proceedings of the CLEF 2012 Evaluation Labs and Workshop, 2012
    
  
  2011
    Proceedings of the Computational Dependency Theory [papers from the International Conference on Dependency Linguistics, 2011
    
  
  2010
    Proceedings of the Fourth Linguistic Annotation Workshop, 2010
    
  
  2009
Towards automated processing of clinical Finnish: Sublanguage analysis and a rule-based parser.
    
  
    Int. J. Medical Informatics, 2009
    
  
Parsing Clinical Finnish: Experiments with Rule-Based and Statistical Dependency Parsers.
    
  
    Proceedings of the 17th Nordic Conference of Computational Linguistics, 2009
    
  
  2007
On the unification of syntactic annotations under the Stanford dependency scheme: A case study on BioInfer and GENIA.
    
  
    Proceedings of the Biological, translational, and clinical language processing, 2007