Charith Mendis

ORCID: 0000-0002-8140-2321

Affiliations:
  • University of Illinois at Urbana-Champaign, IL, USA


According to our database, Charith Mendis authored at least 35 papers between 2015 and 2024.

Bibliography

2024
Dias: Dynamic Rewriting of Pandas Code.
Proc. ACM Manag. Data, February, 2024

SPLAT: A framework for optimised GPU code-generation for SParse reguLar ATtention.
CoRR, 2024

ConstraintFlow: A DSL for Specification and Verification of Neural Network Analyses.
CoRR, 2024

TGOnline: Enhancing Temporal Graph Learning with Adaptive Online Meta-Learning.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

COMET: Neural Cost Model Explanation Framework.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

TGLite: A Lightweight Programming Framework for Continuous-Time Temporal Graph Neural Networks.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

Hydride: A Retargetable and Extensible Synthesis-based Compiler for Modern Hardware Architectures.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

Two-Face: Combining Collective and One-Sided Communication for Efficient Distributed SpMM.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
TpuGraphs: A Performance Prediction Dataset on Large Tensor Computational Graphs.
CoRR, 2023

FLuRKA: Fast fused Low-Rank & Kernel Attention.
CoRR, 2023

Input-sensitive dense-sparse primitive compositions for GNN acceleration.
CoRR, 2023

Dias: Dynamic Rewriting of Pandas Code.
CoRR, 2023

CoMEt: x86 Cost Model Explanation Framework.
CoRR, 2023

TGOpt: Redundancy-Aware Optimizations for Temporal Graph Attention Networks.
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

TpuGraphs: A Performance Prediction Dataset on Large Tensor Computational Graphs.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning Large Graph Property Prediction via Graph Segment Training.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Unified Convolution Framework: A compiler-based approach to support sparse convolutions.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

Challenges in Metaverse Research: An Internet of Things Perspective.
Proceedings of the IEEE International Conference on Metaverse Computing, 2023

SPADE: A Flexible and Scalable Accelerator for SpMM and SDDMM.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

WACO: Learning Workload-Aware Co-optimization of the Format and Schedule of a Sparse Tensor Program.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
All you need is superword-level parallelism: systematic control-flow vectorization with SLP.
Proceedings of the PLDI '22: 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, San Diego, CA, USA, June 13, 2022

GRANITE: A Graph Neural Network Model for Basic Block Throughput Estimation.
Proceedings of the IEEE International Symposium on Workload Characterization, 2022

2021
A Learned Performance Model for Tensor Processing Units.
Proceedings of the Fourth Conference on Machine Learning and Systems, 2021

VeGen: a vectorizer generator for SIMD and beyond.
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

2020
DiffTune: Optimizing CPU Simulator Parameters with Learned Differentiable Surrogates.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

2019
Compiler Auto-Vectorization with Imitation Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

BHive: A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models.
Proceedings of the IEEE International Symposium on Workload Characterization, 2019

Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks.
Proceedings of the 36th International Conference on Machine Learning, 2019

Revec: program rejuvenation through revectorization.
Proceedings of the 28th International Conference on Compiler Construction, 2019

2018
goSLP: globally optimized superword level parallelism framework.
Proc. ACM Program. Lang., 2018

Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks.
CoRR, 2018

2017
Making caches work for graph analytics.
Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017

2016
Optimizing Cache Performance for Graph Analytics.
CoRR, 2016

Parallelizing WFST speech decoders.
Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

2015
Helium: lifting high-performance stencil kernels from stripped x86 binaries to halide DSL code.
Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2015

