Evangelos Georganas

According to our database, Evangelos Georganas authored at least 34 papers between 2012 and 2023.

Bibliography

2023
Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures.
CoRR, 2023

2022
Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning and HPC Workloads.
Frontiers Appl. Math. Stat., 2022

FPGA-based AI Smart NICs for Scalable Distributed AI Training Systems.
CoRR, 2022

FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems.
IEEE Comput. Archit. Lett., 2022

Accelerating Deep Learning based Identification of Chromatin Accessibility from noisy ATAC-seq Data.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

2021
Efficient and Generic 1D Dilated Convolution Layer for Deep Learning.
CoRR, 2021

Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning Workloads.
CoRR, 2021

DistGNN: scalable distributed training for large-scale graph neural networks.
Proceedings of the International Conference for High Performance Computing, 2021

Tensor processing primitives: a programming abstraction for efficiency and portability in deep learning workloads.
Proceedings of the International Conference for High Performance Computing, 2021

Towards Flexible and Compiler-Friendly Layer Fusion for CNNs on Multicore CPUs.
Proceedings of the Euro-Par 2021: Parallel Processing, 2021

2020
The Parallelism Motifs of Genomic Data Analysis.
CoRR, 2020

Optimizing deep learning recommender systems training on CPU cluster architectures.
Proceedings of the International Conference for High Performance Computing, 2020

Harnessing Deep Learning via a Single Building Block.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

2019
Optimizing Deep Learning RNN Topologies on Intel Architecture.
Supercomput. Front. Innov., 2019

High-Performance Deep Learning via a Single Building Block.
CoRR, 2019

A Study of BFLOAT16 for Deep Learning Training.
CoRR, 2019

Training Google Neural Machine Translation on an Intel CPU Cluster.
Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

ISA mapper: a compute and hardware agnostic deep learning compiler.
Proceedings of the 16th ACM International Conference on Computing Frontiers, 2019

2018
Extreme scale de novo metagenome assembly.
Proceedings of the International Conference for High Performance Computing, 2018

Anatomy of high-performance deep learning convolutions on SIMD architectures.
Proceedings of the International Conference for High Performance Computing, 2018

Mixed Precision Training of Convolutional Neural Networks using Integer Operations.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Extreme-Scale De Novo Genome Assembly.
CoRR, 2017

A New Parallel Research Kernel to Expand Research on Dynamic Load-Balancing Capabilities.
Proceedings of the High Performance Computing - 32nd International Conference, 2017

MerBench: PGAS Benchmarks for High Performance Genome Assembly.
Proceedings of PAW@SC 2017: Second Annual PGAS Applications Workshop, 2017

Performance Characterization of De Novo Genome Assembly on Leading Parallel Systems.
Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

2016
Scalable Parallel Algorithms for Genome Analysis.
PhD thesis, 2016

Design and Implementation of a Parallel Research Kernel for Assessing Dynamic Load-Balancing Capabilities.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

2015
HipMer: an extreme-scale de novo genome assembler.
Proceedings of the International Conference for High Performance Computing, 2015

merAligner: A Fully Parallel Sequence Aligner.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

2014
Scalable multimedia content analysis on parallel platforms using python.
ACM Trans. Multim. Comput. Commun. Appl., 2014

Constructing Performance Models for Dense Linear Algebra Algorithms on Cray XE Systems.
CoRR, 2014

Parallel De Bruijn Graph Construction and Traversal for De Novo Genome Assembly.
Proceedings of the International Conference for High Performance Computing, 2014

2013
A Communication-Optimal N-Body Algorithm for Direct Interactions.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

2012
Communication avoiding and overlapping for numerical linear algebra.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

