Pradeep Dubey

According to our database, Pradeep Dubey authored at least 112 papers between 1979 and 2019.

Awards

IEEE Fellow

IEEE Fellow 2001, "For contributions to computer architecture supporting multimedia processing."

Bibliography

2019
SysML: The New Frontier of Machine Learning Systems.
CoRR, 2019

2018
Graphical exchange mechanisms.
Games and Economic Behavior, 2018

Money as minimal complexity.
Games and Economic Behavior, 2018

Mixed Precision Training of Convolutional Neural Networks using Integer Operations.
CoRR, 2018

On Scale-out Deep Learning Training for Cloud and HPC.
CoRR, 2018

Mixed Precision Training of Convolutional Neural Networks using Integer Operations.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Galactos: Computing the Anisotropic 3-Point Correlation Function for 2 Billion Galaxies.
CoRR, 2017

Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data.
CoRR, 2017

Ternary Neural Networks with Fine-Grained Quantization.
CoRR, 2017

Ternary Residual Networks.
CoRR, 2017

Deep learning at 15PF: supervised and semi-supervised classification for scientific data.
Proceedings of the International Conference for High Performance Computing, 2017

Galactos: computing the anisotropic 3-point correlation function for 2 billion galaxies.
Proceedings of the International Conference for High Performance Computing, 2017

The Quest for The Ultimate Learning Machine.
Proceedings of the 2017 ACM on International Symposium on Physical Design, 2017

ScaleDeep: A Scalable Compute Architecture for Learning and Evaluating Deep Networks.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Faster CNNs with Direct Sparse Convolutions and Guided Pruning.
Proceedings of the 5th International Conference on Learning Representations, 2017

2016
Full-Stack Architecting to Achieve a Billion-Requests-Per-Second Throughput on a Single Key-Value Store Server Platform.
ACM Trans. Comput. Syst., 2016

Efficient Approximation Algorithms for Weighted b-Matching.
SIAM J. Scientific Computing, 2016

Achieving One Billion Key-Value Requests per Second on a Single Server.
IEEE Micro, 2016

Optimizations in a high-performance conjugate gradient benchmark for IA-based multi- and many-core processors.
IJHPCA, 2016

Scaling up Hartree-Fock calculations on Tianhe-2.
IJHPCA, 2016

Eliciting performance: deterministic versus proportional prizes.
Int. J. Game Theory, 2016

PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures.
CoRR, 2016

Holistic SparseCNN: Forging the Trident of Accuracy, Speed, and Size.
CoRR, 2016

BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies.
Proceedings of the 4th International Conference on Learning Representations, 2016

Parallelizing Word2Vec in Multi-Core and Many-Core Architectures.
CoRR, 2016

Parallelizing Word2Vec in Shared and Distributed Memory.
CoRR, 2016

Distributed Deep Learning Using Synchronous Stochastic Gradient Descent.
CoRR, 2016

High Order Seismic Simulations on the Intel Xeon Phi Processor (Knights Landing).
Proceedings of the High Performance Computing - 31st International Conference, 2016

Designing scalable b-Matching algorithms on distributed memory multiprocessors by approximation.
Proceedings of the International Conference for High Performance Computing, 2016

PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

GraphPad: Optimized Graph Primitives for Parallel and Distributed Platforms.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

2015
GraphMat: High performance graph analytics made productive.
PVLDB, 2015

Beacon: Deployment and Application of Intel Xeon Phi Coprocessors for Scientific Computing.
Computing in Science and Engineering, 2015

GraphMat: High performance graph analytics made productive.
CoRR, 2015

Graphical Exchange Mechanisms.
CoRR, 2015

Money as Minimal Complexity.
CoRR, 2015

Decentralization of a Machine: Some Definitions.
CoRR, 2015

Can traditional programming bridge the ninja performance gap for parallel computing applications?
Commun. ACM, 2015

Parallel Efficient Sparse Matrix-Matrix Multiplication on Multicore Platforms.
Proceedings of the High Performance Computing - 30th International Conference, 2015

BD-CATS: big data clustering at trillion particle scale.
Proceedings of the International Conference for High Performance Computing, 2015

High-performance algebraic multigrid solver optimized for multi-core based distributed parallel systems.
Proceedings of the International Conference for High Performance Computing, 2015

Improving graph partitioning for modern graphs and architectures.
Proceedings of the 5th Workshop on Irregular Applications - Architectures and Algorithms, 2015

Architecting to achieve a billion requests per second throughput on a single key-value store server platform.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Exploring Shared-Memory Optimizations for an Unstructured Mesh CFD Application on Modern Parallel Systems.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

2014
Sparsifying Synchronization for High-Performance Shared-Memory Sparse Triangular Solver.
Proceedings of the Supercomputing - 29th International Conference, 2014

Navigating the maze of graph analytics frameworks using massive graph datasets.
Proceedings of the International Conference on Management of Data, 2014

Pardicle: Parallel Approximate Density-Based Clustering.
Proceedings of the International Conference for High Performance Computing, 2014

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices.
Proceedings of the International Conference for High Performance Computing, 2014

Lattice QCD with Domain Decomposition on Intel® Xeon Phi Co-Processors.
Proceedings of the International Conference for High Performance Computing, 2014

Petascale High Order Dynamic Rupture Earthquake Simulations on Heterogeneous Supercomputers.
Proceedings of the International Conference for High Performance Computing, 2014

Improving the energy efficiency of Big Cores.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

Improving Communication Performance and Scalability of Native Applications on Intel Xeon Phi Coprocessor Clusters.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

2013
Intel "big data" science and technology center vision and execution plan.
SIGMOD Record, 2013

Streaming Similarity Search over one Billion Tweets using Parallel Locality-Sensitive Hashing.
PVLDB, 2013

Lattice QCD on Intel® Xeon Phi™ Coprocessors.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

Tera-scale 1D FFT with low-communication algorithm and Intel® Xeon Phi™ coprocessors.
Proceedings of the International Conference for High Performance Computing, 2013

Design and Implementation of the Linpack Benchmark for Single and Multi-node Systems Based on Intel® Xeon Phi Coprocessor.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Efficient sparse matrix-vector multiplication on x86-based many-core processors.
Proceedings of the International Conference on Supercomputing, 2013

2012
Large-scale fluid simulation using velocity-vorticity domain decomposition.
ACM Trans. Graph., 2012

CloudRAMSort: fast and efficient large-scale distributed RAM sort on shared-nothing cluster.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2012

Optimization of geometric multigrid for emerging multi- and manycore processors.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Analysis and Optimization of Financial Analytics Benchmark on Modern Multi- and Many-core IA-Based Architectures.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Large-scale energy-efficient graph traversal: a path to efficient data-intensive supercomputing.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Billion-particle SIMD-friendly two-point correlation on large-scale HPC cluster systems.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

GPP-Grep: High-Speed Regular Expression Processing Engine on General Purpose Processors.
Proceedings of the Research in Attacks, Intrusions, and Defenses, 2012

Can traditional programming bridge the Ninja performance gap for parallel computing applications?
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012

High Performance Non-uniform FFT on Modern X86-based Multi-core Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Fast and Efficient Graph Traversal Algorithm for CPUs: Maximizing Single-Node Efficiency.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

2011
Designing fast architecture-sensitive tree search on modern multicore/many-core processors.
ACM Trans. Database Syst., 2011

PALM: Parallel Architecture-Friendly Latch-Free Modifications to B+ Trees on Many-Core Processors.
PVLDB, 2011

Fast Updates on Read-Optimized Databases Using Multi-Core CPUs.
PVLDB, 2011

High-Performance 3D Compressive Sensing MRI Reconstruction Using Many-Core Architectures.
Int. J. Biomedical Imaging, 2011

Designing and dynamically load balancing hybrid LU for multi/many-core.
Computer Science - R&D, 2011

Fast Updates on Read-Optimized Databases Using Multi-Core CPUs.
CoRR, 2011

Interactive hybrid simulation of large-scale traffic.
Proceedings of the International Conference on Computer Graphics and Interactive Techniques, 2011

High-performance lattice QCD for multi-core based parallel systems using a cache-friendly hybrid threaded-MPI approach.
Proceedings of the Conference on High Performance Computing Networking, 2011

2010
Credit cards and inflation.
Games and Economic Behavior, 2010

A celebration of Robert Aumann's achievements on the occasion of his 80th birthday.
Games and Economic Behavior, 2010

Grading exams: 100, 99, 98, ... or A, B, C?
Games and Economic Behavior, 2010

Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2010

FAST: fast architecture sensitive tree search on modern CPUs and GPUs.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2010

PLEdestrians: A Least-Effort Approach to Crowd Simulation.
Proceedings of the 2010 Eurographics/ACM SIGGRAPH Symposium on Computer Animation, 2010

3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs.
Proceedings of the Conference on High Performance Computing Networking, 2010

Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU.
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

2009
Mapping High-Fidelity Volume Rendering for Medical Imaging to CPU, GPU and Many-Core Architectures.
IEEE Trans. Vis. Comput. Graph., 2009

Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs.
PVLDB, 2009

Larrabee: A Many-Core x86 Architecture for Visual Computing.
IEEE Micro, 2009

Perfect competition in an oligopoly (including bilateral monopoly).
Games and Economic Behavior, 2009

ClearPath: highly parallel collision avoidance for multi-agent simulation.
Proceedings of the 2009 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2009

Interactive Modeling, Simulation and Control of Large-Scale Crowds and Traffic.
Proceedings of the Motion in Games, Second International Workshop, 2009

2008
Larrabee: a many-core x86 architecture for visual computing.
ACM Trans. Graph., 2008

Efficient implementation of sorting on multi-core SIMD CPU architecture.
PVLDB, 2008

Convergence of Recognition, Mining, and Synthesis Workloads and Its Implications.
Proceedings of the IEEE, 2008

Second Life and the New Generation of Virtual Worlds.
IEEE Computer, 2008

2007
Cache-conscious frequent pattern mining on modern and emerging processors.
VLDB J., 2007

Scaling performance of interior-point method on large-scale chip multiprocessor system.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

2006
Strategic complements and substitutes, and potential games.
Games and Economic Behavior, 2006

Competing for Customers in a Social Network: The Quasi-linear Case.
Proceedings of the Internet and Network Economics, Second International Workshop, 2006

Games of Connectivity.
Proceedings of the Internet and Network Economics, Second International Workshop, 2006

2005
Compound voting and the Banzhaf index.
Games and Economic Behavior, 2005

Cache-conscious Frequent Pattern Mining on a Modern Processor.
Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30, 2005

A Characterization of Data Mining Workloads on a Modern Processor.
Proceedings of the Workshop on Data Management on New Hardware, 2005

2004
Learning with perfect information.
Games and Economic Behavior, 2004

2003
Optimal scrutiny in multi-period promotion tournaments.
Games and Economic Behavior, 2003

2000
Compression Tolerant Watermarking for Image Verification.
Proceedings of the 2000 International Conference on Image Processing, 2000

1997
Characterizing vulnerability of parallelism to resource constraints.
Proceedings of the Fourth International Conference on High-Performance Computing, 1997

1986
Inefficiency of Nash Equilibria.
Math. Oper. Res., 1986

1984
Totally balanced games arising from controlled programming problems.
Math. Program., 1984

1981
Information Conditions, Communication and General Equilibrium.
Math. Oper. Res., 1981

Value Theory Without Efficiency.
Math. Oper. Res., 1981

1980
Asymptotic Semivalues and a Short Proof of Kannai's Theorem.
Math. Oper. Res., 1980

1979
Mathematical Properties of the Banzhaf Power Index.
Math. Oper. Res., 1979

