Tong Chen

Affiliations:
  • IBM T.J. Watson Research Center, Yorktown Heights, NY, USA


According to our database, Tong Chen authored at least 30 papers between 2005 and 2022.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2022
HeapCheck: Low-cost Hardware Support for Memory Safety.
ACM Trans. Archit. Code Optim., 2022

On the Scalability of HeapCheck.
Proceedings of the 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2022

2021
Hardware Support for Low-Cost Memory Safety.
Proceedings of the 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2021

2020
Compiling ONNX Neural Network Models Using MLIR.
CoRR, 2020

2019
Using Structured Input and Modularity for Improved Learning.
CoRR, 2019


POSTER: CogR: Exploiting Program Structures for Machine-Learning Based Runtime Solutions.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2017
Implementing implicit OpenMP data sharing on GPUs.
Proceedings of the Fourth Workshop on the LLVM Compiler Infrastructure in HPC, 2017

Leveraging OpenMP 4.5 Support in CLANG for Fortran.
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

Efficient Fork-Join on GPUs Through Warp Specialization.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

2016
Performance Analysis and Optimization of Clang's OpenMP 4.5 GPU Support.
Proceedings of the 7th International Workshop on Performance Modeling, 2016

Offloading Support for OpenMP in Clang and LLVM.
Proceedings of the Third Workshop on the LLVM Compiler Infrastructure in HPC, 2016

Automatic Copying of Pointer-Based Data Structures.
Proceedings of the Languages and Compilers for Parallel Computing, 2016

2015
Active Memory Cube: A processing-in-memory architecture for exascale systems.
IBM J. Res. Dev., 2015

Integrating GPU support for OpenMP offloading directives into Clang.
Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, 2015

Performance analysis of OpenMP on a GPU using a CORAL proxy application.
Proceedings of the 6th International Workshop on Performance Modeling, 2015

Progressive Codesign of an Architecture and Compiler Using a Proxy Application.
Proceedings of the 27th International Symposium on Computer Architecture and High Performance Computing, 2015

Exploiting Fine- and Coarse-Grained Parallelism Using a Directive Based Approach.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Data access optimization in a processing-in-memory system.
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

2014
Coordinating GPU threads for OpenMP 4.0 in LLVM.
Proceedings of the 2014 LLVM Compiler Infrastructure in HPC, 2014

2011
Automatic Loop Tiling for Direct Memory Access.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

2010
DMATiler: revisiting loop tiling for direct memory access.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009
DBDB: optimizing DMATransfer for the cell be architecture.
Proceedings of the 23rd international conference on Supercomputing, 2009

2008
Supporting OpenMP on Cell.
Int. J. Parallel Program., 2008

Prefetching irregular references for software cache on cell.
Proceedings of the Sixth International Symposium on Code Generation and Optimization (CGO 2008), 2008

Hybrid access-specific software cache techniques for the cell BE architecture.
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008

2007
A Novel Asynchronous Software Cache Implementation for the Cell-BE Processor.
Proceedings of the Languages and Compilers for Parallel Computing, 2007

2006
Using advanced compiler technology to exploit the performance of the Cell Broadband Engine™ architecture.
IBM Syst. J., 2006

Optimizing the Use of Static Buffers for DMA on a CELL Chip.
Proceedings of the Languages and Compilers for Parallel Computing, 2006

2005
Optimizing Compiler for the CELL Processor.
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT 2005), 2005

