Bharat Kaul

According to our database, Bharat Kaul authored at least 30 papers between 2012 and 2023.

Bibliography

2023
AUTOSPARSE: Towards Automated Sparse Training of Deep Neural Networks.
CoRR, 2023

2022
Accelerating Deep Learning based Identification of Chromatin Accessibility from noisy ATAC-seq Data.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

2021
PolyDL: Polyhedral Optimizations for Creation of High-performance DL Primitives.
ACM Trans. Archit. Code Optim., 2021

MADRaS: Multi Agent Driving Simulator.
J. Artif. Intell. Res., 2021

Efficient and Generic 1D Dilated Convolution Layer for Deep Learning.
CoRR, 2021

AI Powered Compiler Techniques for DL Code Optimization.
CoRR, 2021

GNNerator: A Hardware/Software Framework for Accelerating Graph Neural Networks.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

SEERL: Sample Efficient Ensemble Reinforcement Learning.
Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021

2020
PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives.
CoRR, 2020

SIGMA: A Sparse and Irregular GEMM Accelerator with Flexible Interconnects for DNN Training.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

ERLP: Ensembles of Reinforcement Learning Policies (Student Abstract).
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
K-TanH: Hardware Efficient Activations For Deep Learning.
CoRR, 2019

High Performance Scalable FPGA Accelerator for Deep Neural Networks.
CoRR, 2019

Automatic Model Parallelism for Deep Neural Networks with Compiler and Hardware Support.
CoRR, 2019

Mixed Precision Training With 8-bit Floating Point.
CoRR, 2019

A Study of BFLOAT16 for Deep Learning Training.
CoRR, 2019

Manna: An Accelerator for Memory-Augmented Neural Networks.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

X-MANN: A Crossbar based Architecture for Memory Augmented Neural Networks.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

2018
On Scale-out Deep Learning Training for Cloud and HPC.
CoRR, 2018

Mixed Precision Training of Convolutional Neural Networks using Integer Operations.
Proceedings of the 6th International Conference on Learning Representations, 2018

Out-of-Distribution Detection Using an Ensemble of Self Supervised Leave-Out Classifiers.
Proceedings of the Computer Vision - ECCV 2018, 2018

RAIL: Risk-Averse Imitation Learning.
Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

2017
Ternary Neural Networks with Fine-Grained Quantization.
CoRR, 2017

Mixed Low-precision Deep Learning Inference using Dynamic Fixed Point.
CoRR, 2017

Ternary Residual Networks.
CoRR, 2017

ScaleDeep: A Scalable Compute Architecture for Learning and Evaluating Deep Networks.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

2016
Distributed Deep Learning Using Synchronous Stochastic Gradient Descent.
CoRR, 2016

2015
Exploring Shared-Memory Optimizations for an Unstructured Mesh CFD Application on Modern Parallel Systems.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

2014
Improving Communication Performance and Scalability of Native Applications on Intel Xeon Phi Coprocessor Clusters.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

2012
High Performance Non-uniform FFT on Modern X86-based Multi-core Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

