Michael Dinzinger

Orcid: 0009-0003-1747-5643

According to our database, Michael Dinzinger authored at least 21 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Form Without Function: Agent Social Behavior in the Moltbook Network.
CoRR, April, 2026

WebFAQ 2.0: A Multilingual QA Dataset with Mined Hard Negatives for Dense Retrieval.
CoRR, February, 2026

OwlerLite: Scope- and Freshness-Aware Web Retrieval for LLM Assistants.
CoRR, January, 2026

Creating Specialized RAG-Based Search Engines Using the Open Web Index.
Proceedings of the Advances in Information Retrieval, 2026

CoRECT: A Framework for Evaluating Embedding Compression Techniques at Scale.
Proceedings of the Advances in Information Retrieval, 2026

2025
Investigating the Robustness of Embedding Models on Noisy Input Texts.
Proceedings of the 2nd International Workshop on Open Web Search co-located with the 47th European Conference on Information Retrieval (ECIR 2025), 2025

Combining Embedding Models for RAG: A Similarity-Based Approach.
Proceedings of the 2nd International Workshop on Open Web Search co-located with the 47th European Conference on Information Retrieval (ECIR 2025), 2025

WebFAQ: A Multilingual Collection of Natural Q&A Datasets for Dense Retrieval.
Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025

Compressed Concatenation of Small Embedding Models.
Proceedings of the 34th ACM International Conference on Information and Knowledge Management, 2025

2024
OWLer URLFrontier.
Dataset, September, 2024

OWLer URLFrontier.
Dataset, September, 2024

OWLer StormCrawler.
Dataset, September, 2024

Extracted external domains from Wikipedia dump - 20/03/2024.
Dataset, April, 2024

A Comprehensive Dataset for Webpage Classification (Part 3: Benign 2).
Dataset, March, 2024

A Comprehensive Dataset for Webpage Classification (Part 2: Benign 1).
Dataset, March, 2024

A Comprehensive Dataset for Webpage Classification (Part 1: Adult & Malicious).
Dataset, March, 2024

A Longitudinal Study of Content Control Mechanisms.
Companion Proceedings of the ACM on Web Conference 2024, 2024

A Survey of Web Content Control for Generative AI.
Proceedings of the 1st International Workshop on Open Web Search co-located with the 46th European Conference on Information Retrieval (ECIR 2024), 2024

The Open Web Index - Crawling and Indexing the Web for Public Use.
Proceedings of the Advances in Information Retrieval, 2024

2023
The Open Web Search Crawler (OWLer).
Dataset, August, 2023

2022
Data Science Meets High-Tech Manufacturing - The BTW 2021 Data Science Challenge.
Datenbank-Spektrum, 2022

