Abhinav Vishnu

Orcid: 0000-0002-0593-4780

According to our database, Abhinav Vishnu authored at least 97 papers between 2004 and 2023.

Bibliography

2023
ADARNet: Deep Learning Predicts Adaptive Mesh Refinement.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

2022
NUNet: Deep Learning for Non-Uniform Super-Resolution of Turbulent Flows.
CoRR, 2022

2021
Semantic-Aware Lossless Data Compression for Deep Learning Recommendation Model (DLRM).
Proceedings of the IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, 2021

SURFNet: Super-Resolution of Turbulent Flows with Transfer Learning using Small Datasets.
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

2020
Scaling Deep Learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing.
Future Gener. Comput. Syst., 2020

CFDNet: a deep learning-based accelerator for fluid simulations.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

2019
Guest Editor's Introduction: P2S2: SI 2016.
Parallel Comput., 2019

Parallel programming models and systems software for high-end computing (P2S2 2018).
Parallel Comput., 2019

Foreword to the special issue for the Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2 2017).
Parallel Comput., 2019

Kleio: A Hybrid Memory Page Scheduler with Machine Intelligence.
Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing, 2019

2018
NUMA-Caffe: NUMA-Aware Deep Learning Neural Networks.
ACM Trans. Archit. Code Optim., 2018

ColdRoute: effective routing of cold questions in stack exchange sites.
Data Min. Knowl. Discov., 2018

Multimodal Deep Neural Networks using Both Engineered and Learned Representations for Biodegradability Prediction.
CoRR, 2018

GossipGraD: Scalable Deep Learning using Gossip Communication based Asynchronous Gradient Descent.
CoRR, 2018

How Much Chemistry Does a Deep Neural Network Need to Know to Make Accurate Predictions?
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Using Rule-Based Labels for Weak Supervised Learning: A ChemNet for Transferable Chemical Property Prediction.
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

ParLearning 2018 Invited Talk 1.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Introduction to GraML 2018.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Effective Machine Learning Based Format Selection and Performance Modeling for SpMV on GPUs.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Desh: deep learning for system health prediction of lead times to failure in HPC.
Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, 2018

2017
Deep learning for computational chemistry.
J. Comput. Chem., 2017

ChemNet: A Transferable and Generalizable Deep Neural Network for Small-Molecule Property Prediction.
CoRR, 2017

SMILES2Vec: An Interpretable General-Purpose Deep Neural Network for Predicting Chemical Properties.
CoRR, 2017

User-transparent Distributed TensorFlow.
CoRR, 2017

Chemception: A Deep Neural Network with Minimal Chemistry Knowledge Matches the Performance of Expert-developed QSAR/QSPR Models.
CoRR, 2017

Evaluating On-Node GPU Interconnects for Deep Learning Workloads.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2017

What does fault tolerant deep learning need from MPI?
Proceedings of the 24th European MPI Users' Group Meeting, 2017

Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Generating Performance Models for Irregular Applications.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Enabling scalability-sensitive speculative parallelization for FSM computations.
Proceedings of the International Conference on Supercomputing, 2017

A Learning Framework for Control-Oriented Modeling of Buildings.
Proceedings of the 16th IEEE International Conference on Machine Learning and Applications, 2017

2016
Editorial of the Special issue: SI: E2SC.
Parallel Comput., 2016

Special Issue on Parallel Programming Models and Systems Software for High-End Computing.
Parallel Comput., 2016

Performance analysis of data intensive cloud systems based on data management and replication: a survey.
Distributed Parallel Databases, 2016

Distributed TensorFlow with MPI.
CoRR, 2016

A Data-Driven Approach for Semantic Role Labeling from Induced Grammar Structures in Language.
CoRR, 2016

Performance and power for highly parallel systems.
Concurr. Comput. Pract. Exp., 2016

A survey and taxonomy on energy efficient resource allocation techniques for cloud computing systems.
Computing, 2016

Fault Modeling of Extreme Scale Applications Using Machine Learning.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Fault Tolerant Support Vector Machines.
Proceedings of the 45th International Conference on Parallel Processing, 2016

Accelerating Deep Learning with Shrinkage and Recall.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

Fault Tolerant Frequent Pattern Mining.
Proceedings of the 23rd IEEE International Conference on High Performance Computing, 2016

Adaptive neuron apoptosis for accelerating deep learning on large scale systems.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

2015
A work stealing based approach for enabling scalable optimal sequence homology detection.
J. Parallel Distributed Comput., 2015

Predicting the top and bottom ranks of billboard songs using Machine Learning.
CoRR, 2015

A case for application-oblivious energy-efficient MPI runtime.
Proceedings of the International Conference for High Performance Computing, 2015

Diagnosing the causes and severity of one-sided message contention.
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

Accelerating k-NN Algorithm with Hybrid MPI and OpenSHMEM.
Proceedings of the OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies, 2015

On the Impact of Execution Models: A Case Study in Computational Chemistry.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Fast and Accurate Support Vector Machines on Large Scale Systems.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Large Scale Frequent Pattern Mining Using MPI One-Sided Model.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

2014
A performance comparison of current HPC systems: Blue Gene/Q, Cray XE6 and InfiniBand systems.
Future Gener. Comput. Syst., 2014

Fast Support Vector Machines Using Parallel Adaptive Shrinking on Distributed Systems.
CoRR, 2014

ParLearning Introduction and Committees.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

On the suitability of MPI as a PGAS runtime.
Proceedings of the 21st International Conference on High Performance Computing, 2014

2013
Designing energy efficient communication runtime systems: a view from PGAS models.
J. Supercomput., 2013

Guest Editors' introduction.
J. Supercomput., 2013

A survey on resource allocation in high performance distributed computing systems.
Parallel Comput., 2013

Special issue on programming models, systems software, and tools for High-End Computing.
Parallel Comput., 2013

An overview of energy efficiency techniques in cluster computing systems.
Clust. Comput., 2013

Building Scalable PGAS Communication Subsystem on Blue Gene/Q.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

2012
Comparing the Performance of Blue Gene/Q with Leading Cray XE6 and InfiniBand Systems.
Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

Designing scalable PGAS communication subsystems on cray gemini interconnect.
Proceedings of the 19th International Conference on High Performance Computing, 2012

Global Futures: A Multithreaded Execution Model for Global Arrays-based Applications.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

2011
Special Issue on Programming Models, Software and Tools for High-End Computing.
Int. J. High Perform. Comput. Appl., 2011

Special Issue on Programming Models and Systems Software Support for High-End Computing Applications.
Int. J. High Perform. Comput. Appl., 2011

Mapping communication layouts to network hardware characteristics on massive-scale blue gene systems.
Comput. Sci. Res. Dev., 2011

Codesign Challenges for Exascale Systems: Performance, Power, and Reliability.
Computer, 2011

Noncollective Communicator Creation in MPI.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Dynamic Time-Variant Connection Management for PGAS Models on InfiniBand.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Iso-Energy-Efficiency: An Approach to Power-Constrained Parallel Computation.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Tutorial Statement.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Evaluating the Potential of Cray Gemini Interconnect for PGAS Communication Runtime Systems.
Proceedings of the IEEE 19th Annual Symposium on High Performance Interconnects, 2011

Energy Templates: Exploiting Application Information to Save Energy.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

2010
Designing topology-aware collective communication algorithms for large scale InfiniBand clusters: Case studies with Scatter and Gather.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Scaling Linear Algebra Kernels Using Remote Memory Access.
Proceedings of the 39th International Conference on Parallel Processing, 2010

Fault-tolerant communication runtime support for data-centric programming models.
Proceedings of the 2010 International Conference on High Performance Computing, 2010

Designing Energy Efficient Communication Runtime Systems for Data Centric Programming Models.
Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

Efficient On-Demand Connection Management Mechanisms with PGAS Models over InfiniBand.
Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010

2009
Topology agnostic hot-spot avoidance with InfiniBand.
Concurr. Comput. Pract. Exp., 2009

An efficient hardware-software approach to network fault tolerance with InfiniBand.
Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

2007
On using connection-oriented vs. connection-less transport for performance and scalability of collective and one-sided operations: trade-offs and impact.
Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

Automatic Path Migration over InfiniBand: Early Experiences.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

High Performance MPI on IBM 12x InfiniBand Architecture.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

High Performance MPI over iWARP: Early Experiences.
Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Hot-Spot Avoidance With Multi-Pathing Over InfiniBand: An MPI Perspective.
Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), 2007

High Performance Distributed Lock Management Services using Network-based Remote Atomic Operations.
Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), 2007

2006
Scalable systems software - A software based approach for providing network fault tolerance in clusters with uDAPL interface: MPI level design and performance evaluation.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Efficient Shared Memory and RDMA Based Design for MPI_Allgather over InfiniBand.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Memory Scalability Evaluation of the Next-Generation Intel Bensley Platform with InfiniBand.
Proceedings of the 14th IEEE Symposium on High-Performance Interconnects, 2006

2005
Evaluating InfiniBand Performance with PCI Express.
IEEE Micro, 2005

Performance Modeling of Subnet Management on Fat Tree InfiniBand Networks using OpenSM.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Can Memory-Less Network Adapters Benefit Next-Generation InfiniBand Systems?
Proceedings of the 13th Annual IEEE Symposium on High Performance Interconnects (HOTIC 2005), 2005

Supporting MPI-2 One Sided Communication on Multi-rail InfiniBand Clusters: Design Challenges and Performance Benefits.
Proceedings of the High Performance Computing, 2005

2004
Building Multirail InfiniBand Clusters: MPI-Level Design and Performance Evaluation.
Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004

Performance evaluation of InfiniBand with PCI Express.
Proceedings of the 12th Annual IEEE Symposium on High Performance Interconnects, 2004

