Matei Zaharia

Orcid: 0000-0002-7547-7204

  • Stanford University, CA, USA

According to our database, Matei Zaharia authored at least 194 papers between 2006 and 2024.

Adaptive and Robust Query Execution for Lakehouses At Scale.
Proc. VLDB Endow., August, 2024

ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data.
Proc. ACM Manag. Data, 2024

Compass: Encrypted Semantic Search with High Accuracy.
IACR Cryptol. ePrint Arch., 2024

Networks of Networks: Complexity Class Principles Applied to Compound AI Systems Design.
CoRR, 2024

LOTUS: Enabling Semantic Queries with LLMs Over Tables of Unstructured and Structured Data.
CoRR, 2024

Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs.
CoRR, 2024

Generating Probabilistic Scenario Programs from Natural Language.
CoRR, 2024

RAFT: Adapting Language Model to Domain Specific RAG.
CoRR, 2024

Optimizing LLM Queries in Relational Workloads.
CoRR, 2024

Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems.
CoRR, 2024

World Model on Million-Length Video And Language With Blockwise RingAttention.
CoRR, 2024

Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks.
Proceedings of the IEEE Security and Privacy, 2024

ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

DSPy: Compiling Declarative Language Model Calls into State-of-the-Art Pipelines.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

RingAttention with Blockwise Transformers for Near-Infinite Context.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

ALTO: An Efficient Network Orchestrator for Compound AI Systems.
Proceedings of the 4th Workshop on Machine Learning and Systems, 2024

Accelerating Aggregation Queries on Unstructured Streams of Data.
Proc. VLDB Endow., 2023

Epoxy: ACID Transactions Across Diverse Data Stores.
Proc. VLDB Endow., 2023

R³: Record-Replay-Retroaction for Database-Backed Applications.
Proc. VLDB Endow., 2023

DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines.
CoRR, 2023

Image and Data Mining in Reticular Chemistry Using GPT-4V.
CoRR, 2023

Data Acquisition: A New Frontier in Data-centric AI.
CoRR, 2023

Exploration with Principles for Diverse AI Supervision.
CoRR, 2023

DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines.
CoRR, 2023

Ring Attention with Blockwise Transformers for Near-Infinite Context.
CoRR, 2023

How is ChatGPT's behavior changing over time?
CoRR, 2023

FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance.
CoRR, 2023

Zelda: Video Analytics using Vision-Language Models.
CoRR, 2023

Cornflakes: Zero-Copy Serialization for Microsecond-Scale Networking.
Proceedings of the 29th Symposium on Operating Systems Principles, 2023

MegaBlocks: Efficient Sparse Training with Mixture-of-Experts.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

Congestion Control Safety via Comparative Statics.
Proceedings of the IEEE INFOCOM 2023, 2023

Transactions Make Debugging Easy.
Proceedings of the 13th Conference on Innovative Data Systems Research, 2023

Analyzing and Comparing Lakehouse Storage Systems.
Proceedings of the 13th Conference on Innovative Data Systems Research, 2023

Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

HAPI Explorer: Comprehension, Discovery, and Explanation on History of ML APIs.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Author Correction: Advances, challenges and opportunities in creating data for trustworthy AI.
Nat. Mach. Intell., October, 2022

Optimizing Video Analytics with Declarative Model Relationships.
Proc. VLDB Endow., 2022

Parallelism-Optimizing Data Placement for Faster Data-Parallel Computations.
Proc. VLDB Endow., 2022

Cloud Data Systems: What are the Opportunities for the Database Research Community?
Proc. VLDB Endow., 2022

Advances, challenges and opportunities in creating data for trustworthy AI.
Nat. Mach. Intell., 2022

Allocation of fungible resources via a fast, scalable price discovery method.
Math. Program. Comput., 2022

Overlook: Differentially Private Exploratory Visualization for Big Data.
J. Priv. Confidentiality, 2022

Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP.
CoRR, 2022

Apiary: A DBMS-Backed Transactional Function-as-a-Service Framework.
CoRR, 2022

Extricating IoT Devices from Vendor Infrastructure with Karl.
CoRR, 2022

TASTI: Semantic Indexes for Machine Learning-based Queries over Unstructured Data.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Finding Label and Model Errors in Perception Data With Learned Observation Assertions.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Data-Parallel Actors: A Programming Model for Scalable Query Serving Systems.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022

Estimating and Explaining Model Performance When Both Covariates and Labels Shift.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Efficient Online ML API Selection for Multi-Label Classification Tasks.
Proceedings of the International Conference on Machine Learning, 2022

Hindsight: Posterior-guided training of retrievers for improved open-ended generation.
Proceedings of the Tenth International Conference on Learning Representations, 2022

How Did the Model Change? Efficiently Assessing Machine Learning API Shifts.
Proceedings of the Tenth International Conference on Learning Representations, 2022

PLAID: An Efficient Engine for Late Interaction Retrieval.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

VIVA: An End-to-End System for Interactive Video Analytics.
Proceedings of the 12th Conference on Innovative Data Systems Research, 2022

A Progress Report on DBOS: A Database-oriented Operating System.
Proceedings of the 12th Conference on Innovative Data Systems Research, 2022

Similarity Search for Efficient Active Learning and Search of Rare Concepts.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

DIFF: a relational interface for large-scale data explanation.
VLDB J., 2021

Relevance-guided Supervision for OpenQA with ColBERT.
Trans. Assoc. Comput. Linguistics, 2021

Designing Production-Friendly Machine Learning.
Proc. VLDB Endow., 2021

DBOS: A DBMS-oriented Operating System.
Proc. VLDB Endow., 2021

Accelerating Approximate Aggregation Queries with Expensive Predicates.
Proc. VLDB Endow., 2021

What can Data-Centric AI Learn from Data and ML Engineering?
CoRR, 2021

Toward Compact Parameter Representations for Architecture-Agnostic Neural Network Compression.
CoRR, 2021

DistIR: An Intermediate Representation and Simulator for Efficient Neural Network Distribution.
CoRR, 2021

Did the Model Change? Efficiently Assessing Machine Learning API Shifts.
CoRR, 2021

Proof: Accelerating Approximate Aggregation Queries with Expensive Predicates.
CoRR, 2021

Don't Give Up on Large Optimization Problems; POP Them!
CoRR, 2021

Efficient Large-Scale Language Model Training on GPU Clusters.
CoRR, 2021

FrugalMCT: Efficient Online ML API Selection for Multi-Label Classification Tasks.
CoRR, 2021

Express: Lowering the Cost of Metadata-hiding Communication with Cryptographic Privacy.
Proceedings of the 30th USENIX Security Symposium, 2021

Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP.
Proceedings of the SOSP '21: ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021

Efficient large-scale language model training on GPU clusters using megatron-LM.
Proceedings of the International Conference for High Performance Computing, 2021

Contracting Wide-area Network Topologies to Solve Flow Problems Quickly.
Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation, 2021

Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Memory-Efficient Pipeline-Parallel DNN Training.
Proceedings of the 38th International Conference on Machine Learning, 2021

Breakfast of champions: towards zero-copy serialization with NIC scatter-gather.
Proceedings of the HotOS '21: Workshop on Hot Topics in Operating Systems, 2021

Don't Hate the Player, Hate the Game: Safety and Utility in Multi-Agent Congestion Control.
Proceedings of the HotNets '21: The 20th ACM Workshop on Hot Topics in Networks, 2021

Clamor: Extending Functional Cluster Computing Frameworks with Fine-Grained Remote Memory Access.
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021

Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics.
Proceedings of the 11th Conference on Innovative Data Systems Research, 2021

Challenges and Opportunities for Autonomous Vehicle Query Systems.
Proceedings of the 11th Conference on Innovative Data Systems Research, 2021

Posh: A Data-Aware Shell.
login Usenix Mag., 2020

A Demonstration of Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference.
Proc. VLDB Endow., 2020

Jointly Optimizing Preprocessing and Inference for DNN-based Visual Analytics.
Proc. VLDB Endow., 2020

Approximate Selection with Guarantees using Proxies.
Proc. VLDB Endow., 2020

Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores.
Proc. VLDB Endow., 2020

Task-agnostic Indexes for Deep Learning-based Queries over Unstructured Data.
CoRR, 2020

DBOS: A Proposal for a Data-Centric Operating System.
CoRR, 2020

Similarity Search for Efficient Active Learning and Search of Rare Concepts.
CoRR, 2020

Offload Annotations: Bringing Heterogeneous Computing to Existing Libraries and Workloads.
Proceedings of the 2020 USENIX Annual Technical Conference, 2020

Spectral Lower Bounds on the I/O Complexity of Computation Graphs.
Proceedings of the SPAA '20: 32nd ACM Symposium on Parallelism in Algorithms and Architectures, 2020

Developments in MLflow: A System to Accelerate the Machine Learning Lifecycle.
Proceedings of the Fourth Workshop on Data Management for End-To-End Machine Learning, 2020

ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Sparse GPU kernels for deep learning.
Proceedings of the International Conference for High Performance Computing, 2020

Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

FrugalML: How to use ML Prediction APIs more accurately and cheaply.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference.
Proceedings of the Third Conference on Machine Learning and Systems, 2020

Model Assertions for Monitoring and Improving ML Models.
Proceedings of the Third Conference on Machine Learning and Systems, 2020

Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc.
Proceedings of the Third Conference on Machine Learning and Systems, 2020

Selection via Proxy: Efficient Data Selection for Deep Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

A Polystore Based Database Operating System (DBOS).
Proceedings of the Heterogeneous Data Management, Polystores, and Analytics for Healthcare, 2020

PPMLP 2020: Workshop on Privacy-Preserving Machine Learning In Practice.
Proceedings of the CCS '20: 2020 ACM SIGSAC Conference on Computer and Communications Security, 2020

Fleet: A Framework for Massively Parallel Streaming on FPGAs.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

Outsourcing Everyday Jobs to Thousands of Cloud Functions with gg.
login Usenix Mag., 2019

Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark.
ACM SIGOPS Oper. Syst. Rev., 2019

BlazeIt: Optimizing Declarative Aggregation and Limit Queries for Neural Network-Based Video Analytics.
Proc. VLDB Endow., 2019

ObliDB: Oblivious Query Processing for Secure Databases.
Proc. VLDB Endow., 2019

MLPerf Training Benchmark.
CoRR, 2019

Automated Lower Bounds on the I/O Complexity of Computation Graphs.
CoRR, 2019

SysML: The New Frontier of Machine Learning Systems.
CoRR, 2019

From Laptop to Lambda: Outsourcing Everyday Jobs to Thousands of Transient Functional Containers.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

Optimizing data-intensive computations in existing libraries with split annotations.
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019

PipeDream: generalized pipeline parallelism for DNN training.
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019

TASO: optimizing deep learning computation with automatic generation of graph substitutions.
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019

Beyond Data and Model Parallelism for Deep Neural Networks.
Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019

Optimizing DNN Computation with Relaxed Graph Substitutions.
Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019

LIT: Learned Intermediate Representation Training for Model Compression.
Proceedings of the 36th International Conference on Machine Learning, 2019

To Index or Not to Index: Optimizing Exact Maximum Inner Product Search.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

Lessons from Large-Scale Software as a Service at Databricks.
Proceedings of the ACM Symposium on Cloud Computing, SoCC 2019, 2019

Challenges and Opportunities in DNN-Based Video Analytics: A Demonstration of the BlazeIt Video Query Engine.
Proceedings of the 9th Biennial Conference on Innovative Data Systems Research, 2019

Big Data Platforms for Data Analytics.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Evaluating End-to-End Optimization for Data Analytics Applications in Weld.
Proc. VLDB Endow., 2018

Filter Before You Parse: Faster Analytics on Raw Data with Sparser.
Proc. VLDB Endow., 2018

DIFF: A Relational Interface for Large-Scale Data Explanation.
Proc. VLDB Endow., 2018

Accelerating the Machine Learning Lifecycle with MLflow.
IEEE Data Eng. Bull., 2018

Splitability Annotations: Optimizing Black-Box Function Composition in Existing Libraries.
CoRR, 2018

LIT: Block-wise Intermediate Representation Training for Model Compression.
CoRR, 2018

BlazeIt: Fast Exploratory Video Queries using Neural Networks.
CoRR, 2018

MISTIQUE: A System to Store and Query Model Intermediates for Model Diagnosis.
Proceedings of the 2018 International Conference on Management of Data, 2018

Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark.
Proceedings of the 2018 International Conference on Management of Data, 2018

NoScope: Optimizing Deep CNN-Based Queries over Video Streams at Scale.
Proc. VLDB Endow., 2017

An Oblivious General-Purpose SQL Database for the Cloud.
CoRR, 2017

Weld: Rethinking the Interface Between Data-Intensive Applications.
CoRR, 2017

Optimizing Deep CNN-Based Queries over Video Streams at Scale.
CoRR, 2017

Infrastructure for Usable Machine Learning: The Stanford DAWN Project.
CoRR, 2017

SimDex: Exploiting Model Similarity in Exact Matrix Factorization Recommendations.
CoRR, 2017

Stadium: A Distributed Metadata-Private Messaging System.
Proceedings of the 26th Symposium on Operating Systems Principles, 2017

DIY Hosting for Online Privacy.
Proceedings of the 16th ACM Workshop on Hot Topics in Networks, Palo Alto, CA, USA, 2017

A Common Runtime for High Performance Data Analysis.
Proceedings of the 8th Biennial Conference on Innovative Data Systems Research, 2017

Making caches work for graph analytics.
Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017

Voodoo - A Vector Algebra for Portable Database Performance on Modern Hardware.
Proc. VLDB Endow., 2016

MLlib: Machine Learning in Apache Spark.
J. Mach. Learn. Res., 2016

Splinter: Practical Private Queries on Public Data.
IACR Cryptol. ePrint Arch., 2016

Stadium: A Distributed Metadata-Private Messaging System.
IACR Cryptol. ePrint Arch., 2016

Optimizing Cache Performance for Graph Analytics.
CoRR, 2016

Apache Spark: a unified engine for big data processing.
Commun. ACM, 2016

SparkR: Scaling R Programs with Spark.
Proceedings of the 2016 International Conference on Management of Data, 2016

ModelDB: a system for machine learning model management.
Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 2016

Introduction to Spark 2.0 for Database Researchers.
Proceedings of the 2016 International Conference on Management of Data, 2016

FairRide: Near-Optimal, Fair Cache Sharing.
Proceedings of the 13th USENIX Symposium on Networked Systems Design and Implementation, 2016

Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Matrix Computations and Optimization in Apache Spark.
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

GraphFrames: an integrated API for mixing graph and relational queries.
Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems, Redwood Shores, CA, USA, June 24, 2016

Scaling Spark in the Real World: Performance and Usability.
Proc. VLDB Endow., 2015

linalg: Matrix Computations in Apache Spark.
CoRR, 2015

Vuvuzela: scalable private messaging resistant to traffic analysis.
Proceedings of the 25th Symposium on Operating Systems Principles, 2015

Spark SQL: Relational Data Processing in Spark.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks.
Proceedings of the ACM Symposium on Cloud Computing, 2014

An Architecture for Fast and General Data Processing on Large Clusters.
PhD thesis, 2013

Large-Scale Estimation in Cyberphysical Systems Using Streaming Data: A Case Study With Arterial Traffic Estimation.
IEEE Trans. Autom. Sci. Eng., 2013

Discretized streams: fault-tolerant streaming computation at scale.
Proceedings of the ACM SIGOPS 24th Symposium on Operating Systems Principles, 2013

Sparrow: distributed, low latency scheduling.
Proceedings of the ACM SIGOPS 24th Symposium on Operating Systems Principles, 2013

Shark: SQL and rich analytics at scale.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

Choosy: max-min fair sharing for datacenter jobs with constraints.
Proceedings of the Eighth Eurosys Conference 2013, 2013

Fast and Interactive Analytics over Hadoop Data with Spark.
login Usenix Mag., 2012

Large Scale Estimation in Cyberphysical Systems using Streaming Data: a Case Study with Smartphone Traces.
CoRR, 2012

Cloud Terminal: Secure Access to Sensitive Applications from Untrusted Systems.
Proceedings of the 2012 USENIX Annual Technical Conference, 2012

Shark: fast data analysis using coarse-grained distributed memory.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2012

Multi-resource fair queueing for packet processing.
Proceedings of the ACM SIGCOMM 2012 Conference, 2012

Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing.
Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation, 2012

Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters.
Proceedings of the 4th USENIX Workshop on Hot Topics in Cloud Computing, 2012

Optimally Designing Games for Cognitive Science Research.
Proceedings of the 34th Annual Meeting of the Cognitive Science Society, 2012

Mesos: Flexible Resource Sharing for the Cloud.
login Usenix Mag., 2011

Faster and More Accurate Sequence Alignment with SNAP.
CoRR, 2011

Design and implementation of the KioskNet system.
Comput. Networks, 2011

Managing data transfers in computer clusters with orchestra.
Proceedings of the ACM SIGCOMM 2011 Conference on Applications, 2011

Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center.
Proceedings of the 8th USENIX Symposium on Networked Systems Design and Implementation, 2011

Dominant Resource Fairness: Fair Allocation of Multiple Resource Types.
Proceedings of the 8th USENIX Symposium on Networked Systems Design and Implementation, 2011

The Datacenter Needs an Operating System.
Proceedings of the 3rd USENIX Workshop on Hot Topics in Cloud Computing, 2011

Scaling the mobile millennium system in the cloud.
Proceedings of the ACM Symposium on Cloud Computing in conjunction with SOSP 2011, 2011

A view of cloud computing.
Commun. ACM, 2010

Spark: Cluster Computing with Working Sets.
Proceedings of the 2nd USENIX Workshop on Hot Topics in Cloud Computing, 2010

Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling.
Proceedings of the European Conference on Computer Systems, 2010

ICTD for healthcare in Ghana: Two parallel case studies.
Proceedings of the 2009 International Conference on Information and Communication Technologies and Development, 2009

A Common Substrate for Cluster Computing.
Proceedings of the Workshop on Hot Topics in Cloud Computing, 2009

Gossip-based search selection in hybrid peer-to-peer networks.
Concurr. Comput. Pract. Exp., 2008

Improving MapReduce Performance in Heterogeneous Environments.
Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation, 2008

Very low-cost internet access using KioskNet.
Comput. Commun. Rev., 2007

Finding Content in File-Sharing Networks When You Can't Even Spell.
Proceedings of the 6th International workshop on Peer-To-Peer Systems, 2007

Design and implementation of the KioskNet system.
Proceedings of the 2007 International Conference on Information and Communication Technologies and Development, 2007

Low-cost communication for rural internet kiosks using mechanical backhaul.
Proceedings of the 12th Annual International Conference on Mobile Computing and Networking, 2006
