Avrilia Floratou

Orcid: 0009-0007-5760-8657

Affiliations:
  • Microsoft, USA
  • IBM Almaden Research Center, San Jose, CA, USA (former)


According to our database, Avrilia Floratou authored at least 43 papers between 2010 and 2024.

Bibliography

2024
LST-Bench: Benchmarking Log-Structured Tables in the Cloud.
Proc. ACM Manag. Data, February, 2024

PyFroid: Scaling Data Analysis on a Commodity Workstation.
Proceedings of the 27th International Conference on Extending Database Technology, 2024


2023
Diversity, Equity and Inclusion Activities in Database Conferences: A 2022 Report.
SIGMOD Rec., June, 2023

OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance From Database Query Event Logs.
Proc. VLDB Endow., 2023

Will LLMs reshape, supercharge, or kill data science?
Proc. VLDB Endow., 2023

Exploiting Structure in Regular Expression Queries.
Proc. ACM Manag. Data, 2023

PACMMOD V1 N2 Editorial.
Proc. ACM Manag. Data, 2023

ReAcTable: Enhancing ReAct for Table Question Answering.
CoRR, 2023

Rapidash: Efficient Constraint Discovery via Rapid Verification.
CoRR, 2023

From Words to Code: Harnessing Data for Program Synthesis from Natural Language.
CoRR, 2023

PolySem: Efficient Polyglot Analytics on Semantic Data.
Joint Proceedings of Workshops at the 49th International Conference on Very Large Data Bases (VLDB 2023), Vancouver, Canada, August 28, 2023

Demonstration of Geyser: Provenance Extraction and Applications over Data Science Scripts.
Proceedings of the Companion of the 2023 International Conference on Management of Data, 2023

Schema Matching using Pre-Trained Language Models.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

2022
Data Science Through the Looking Glass: Analysis of Millions of GitHub Notebooks and ML.NET Pipelines.
SIGMOD Rec., 2022

Diversity and Inclusion Activities in Database Conferences: A 2021 Report.
SIGMOD Rec., 2022

OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance from Database Logs.
CoRR, 2022

2020
Vamsa: Tracking Provenance in Data Science Scripts.
CoRR, 2020

Spur: Mitigating Slow Instances in Large-Scale Streaming Pipelines.
Proceedings of the 2020 International Conference on Management of Data, 2020

Vamsa: Automated Provenance Tracking in Data Science Scripts.
Proceedings of KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020


2019
Columnar Storage Formats.
Encyclopedia of Big Data Technologies, 2019

Data Science through the looking glass and what we found there.
CoRR, 2019

General-Purpose vs. Specialized Data Analytics Systems: A Game of ML & SQL Thrones.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

2018
Distributed File Systems.
Encyclopedia of Database Systems, Second Edition, 2018

Dhalion in Action: Automatic Management of Streaming Applications.
Proc. VLDB Endow., 2018

Netco: Cache and I/O Management for Analytics over Disaggregated Stores.
Proceedings of the ACM Symposium on Cloud Computing, 2018

2017
Dhalion: Self-Regulating Stream Processing in Heron.
Proc. VLDB Endow., 2017

Twitter Heron: Towards Extensible Streaming Engines.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

No data left behind: real-time insights from a complex data ecosystem.
Proceedings of the 2017 Symposium on Cloud Computing, SoCC 2017, Santa Clara, CA, USA, 2017

Self-Regulating Streaming Systems: Challenges and Opportunities.
Proceedings of the International Workshop on Real-Time Business Intelligence and Analytics, 2017

2016
ATHENA: An Ontology-Driven System for Natural Language Querying over Relational Data Stores.
Proc. VLDB Endow., 2016

Adaptive Caching in Big SQL using the HDFS Cache.
Proceedings of the Seventh ACM Symposium on Cloud Computing, 2016

2015
Towards Systematic Data Center Design.
Proceedings of the Seventh Biennial Conference on Innovative Data Systems Research, 2015

Replica Placement in Multi-tenant Database Environments.
Proceedings of the 2015 IEEE International Congress on Big Data, New York City, NY, USA, June 27, 2015

2014
SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures.
Proc. VLDB Endow., 2014

Towards Building Wind Tunnels for Data Center Design.
Proc. VLDB Endow., 2014

Benchmarking SQL-on-Hadoop Systems: TPC or Not TPC?
Big Data Benchmarking - 5th International Workshop, 2014

2012
Can the Elephants Handle the NoSQL Onslaught?
Proc. VLDB Endow., 2012

2011
Efficient and Accurate Discovery of Patterns in Sequence Data Sets.
IEEE Trans. Knowl. Data Eng., 2011

Column-Oriented Storage Techniques for MapReduce.
Proc. VLDB Endow., 2011

When Free Is Not Really Free: What Does It Cost to Run a Database Workload in the Cloud?
Topics in Performance Evaluation, Measurement and Characterization, 2011

2010
Efficient and accurate discovery of patterns in sequence datasets.
Proceedings of the 26th International Conference on Data Engineering, 2010

