Christopher J. Hughes

According to our database, Christopher J. Hughes authored at least 42 papers between 2001 and 2023.

Bibliography

2023
VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

2022
Graphite: optimizing graph neural networks on CPUs through cooperative software-hardware techniques.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

2021
SumMerge: an efficient algorithm and implementation for weight repetition-aware DNN inference.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

2020
SAVE: Sparsity-Aware Vector Engine for Accelerating DNN Training and Inference on CPUs.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

SuSy: A Programming Model for Productive Construction of High-Performance Systolic Arrays on FPGAs.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

SparseTrain: Leveraging Dynamic Sparsity in Software for Training DNNs on General-Purpose SIMD Processors.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019
SparseTrain: Leveraging Dynamic Sparsity in Training DNNs on General-Purpose SIMD Processors.
CoRR, 2019

T2S-Tensor: Productively Generating High-Performance Spatial Hardware for Dense Tensor Computations.
Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019

Forgive-TM: Supporting Lazy Conflict Detection In Eager Hardware Transactional Memory.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
Dynamic fine-grained sparse memory accesses.
Proceedings of the International Symposium on Memory Systems, 2018

Transactional pre-abort handlers in hardware transactional memory.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2017
Banshee: bandwidth-efficient DRAM caching via software/hardware cooperation.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

2016
PleaseTM: Enabling transaction conflict management in requester-wins hardware transactional memory.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

2015
Single-Instruction Multiple-Data Execution
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01746-9, 2015

IMP: indirect memory prefetcher.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

2013
Locality-aware task management for unstructured parallelism: a quantitative limit study.
Proceedings of the 25th ACM Symposium on Parallelism in Algorithms and Architectures, 2013

Performance evaluation of Intel® transactional synchronization extensions for high-performance computing.
Proceedings of the International Conference for High Performance Computing, 2013

Location-aware cache management for many-core processors with deep cache hierarchy.
Proceedings of the International Conference for High Performance Computing, 2013

Exploring SIMD for Molecular Dynamics, Using Intel® Xeon® Processors and Intel® Xeon Phi Coprocessors.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

2011
DeFT: Design space exploration for on-the-fly detection of coherence misses.
ACM Trans. Archit. Code Optim., 2011

Moguls: a model to explore the memory hierarchy for bandwidth improvements.
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

LIME: a framework for debugging load imbalance in multi-threaded execution.
Proceedings of the 33rd International Conference on Software Engineering, 2011

2010
Performance and Energy Implications of Many-Core Caches for Throughput Computing.
IEEE Micro, 2010

2009
Parallel scalability in speech recognition.
IEEE Signal Process. Mag., 2009

Scalable HMM based inference engine in large vocabulary continuous speech recognition.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

2008
Convergence of Recognition, Mining, and Synthesis Workloads and Its Implications.
Proc. IEEE, 2008

Atomic Vector Operations on Chip Multiprocessors.
Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008

2007
Carbon: architectural support for fine-grained parallelism on chip multiprocessors.
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

Physical simulation for animation and visual effects: parallelization and characterization for chip multiprocessors.
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

Computer Vision on Multi-Core Processors: Articulated Body Tracking.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

2006
Hybrid transactional memory.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2006

Incremental approximate matrix factorization for speeding up support vector machines.
Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006

2005
Memory-side prefetching for linked data structures for processor-in-memory systems.
J. Parallel Distributed Comput., 2005

2004
A Formal Approach to Frequent Energy Adaptations for Multimedia Applications.
Proceedings of the 31st International Symposium on Computer Architecture (ISCA 2004), 2004

2003
General-Purpose Processors for Multimedia Applications: Predictability and Energy Efficiency
PhD thesis, 2003

2002
RSIM: Simulating Shared-Memory Multiprocessors with ILP Processors.
Computer, 2002

Soft Real-Time Scheduling on Simultaneous Multithreaded Processors.
Proceedings of the 23rd IEEE Real-Time Systems Symposium (RTSS'02), 2002

Joint local and global hardware adaptations for energy.
Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-X), 2002

2001
Saving energy with architectural and frequency adaptations for multimedia applications.
Proceedings of the 34th Annual International Symposium on Microarchitecture, 2001

Variability in the execution of multimedia applications and implications for architecture.
Proceedings of the 28th Annual International Symposium on Computer Architecture, 2001

Speculative precomputation: long-range prefetching of delinquent loads.
Proceedings of the 28th Annual International Symposium on Computer Architecture, 2001

