Xinmin Tian

Orcid: 0000-0001-6228-924X

According to our database, Xinmin Tian authored at least 56 papers between 1993 and 2024.

Bibliography

2024
An Efficient Approach to Resolving Stack Overflow of SYCL Kernel on Intel® CPUs.
Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

2023
Streamline Ahead-of-Time SYCL CPU Device Implementation through Bypassing SPIR-V.
Proceedings of the 2023 International Workshop on OpenCL, 2023

2021
A Holistic Systems Approach to Leveraging Heterogeneity.
Proceedings of the IEEE/ACM Programming Environments for Heterogeneous Computing, 2021

2019
Developments in memory management in OpenMP.
Int. J. High Perform. Comput. Netw., 2019

2018
Performance Tuning to Close Ninja Gap for Accelerator Physics Emulation System (APES) on Intel® Xeon Phi™ Processors.
Proceedings of the Evolving OpenMP for Evolving Architectures, 2018

2017
Deep Learning in Genomic and Medical Image Data Analysis: Challenges and Approaches.
J. Inf. Process. Syst., 2017

LLVM Compiler Implementation for Explicit Parallelization and SIMD Vectorization.
Proceedings of the Fourth Workshop on the LLVM Compiler Infrastructure in HPC, 2017

2016
LLVM Framework and IR Extensions for Parallelization, SIMD Vectorization and Offloading.
Proceedings of the Third Workshop on the LLVM Compiler Infrastructure in HPC, 2016

A Modern Memory Management System for OpenMP.
Proceedings of the Third Workshop on Accelerator Programming Using Directives, 2016

Reducing the Functionality Gap Between Auto-Vectorization and Explicit Vectorization - Compress/Expand and Histogram.
Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

2015
Effective SIMD Vectorization for Intel Xeon Phi Coprocessors.
Sci. Program., 2015

Programming Models, Languages, and Compilers for Manycore and Heterogeneous Architectures.
Sci. Program., 2015

User-Guided Dynamic Data Race Detection.
Int. J. Parallel Program., 2015

2013
Practical SIMD Vectorization Techniques for Intel® Xeon Phi Coprocessors.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Mis-speculation-Driven Compiler Framework for Aggressive Loop Automatic Parallelization.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

2012
Extending OpenMP* with Vector Constructs for Modern Multicore SIMD Architectures.
Proceedings of the OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, 2012

Compiling C/C++ SIMD Extensions for Function and Loop Vectorizaion on Multicore-SIMD Processors.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Performance Study of SIMD Programming Models on Intel Multicore Processors.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

2010
On the efficacy of call graph-level thread-level speculation.
Proceedings of the first joint WOSP/SIPEW International Conference on Performance Engineering, 2010

Exploitation of nested thread-level speculative parallelism on multi-core systems.
Proceedings of the 7th Conference on Computing Frontiers, 2010

2009
On the exploitation of loop-level parallelism in embedded applications.
ACM Trans. Embed. Comput. Syst., 2009

NePaLTM: Design and Implementation of Nested Parallelism for Transactional Memory Systems.
Proceedings of the ECOOP 2009, 2009

2008
A Case Study on Compiler Optimizations for the Intel® Core™ 2 Duo Processor.
Int. J. Parallel Program., 2008

Comparative architectural characterization of SPEC CPU2000 and CPU2006 benchmarks on the Intel® Core™ 2 Duo processor.
Proceedings of the 2008 International Conference on Embedded Computer Systems: Architectures, 2008

Exploring the Emerging Applications for Transactional Memory.
Proceedings of the Ninth International Conference on Parallel and Distributed Computing, 2008

Design and implementation of transactional constructs for C/C++.
Proceedings of the 23rd Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2008

2007
Parallel protein secondary structure prediction schemes using Pthread and OpenMP over hyper-threading technology.
J. Supercomput., 2007

Tight analysis of the performance potential of thread speculation using spec CPU 2006.
Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

EXOCHI: architecture and programming environment for a heterogeneous multi-core multithreaded system.
Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, 2007

2006
Multimedia vectorization of floating-point MIN/MAX reductions.
Concurr. Comput. Pract. Exp., 2006

A general approach for partitioning N-dimensional parallel nested loops with conditionals.
Proceedings of the SPAA 2006: Proceedings of the 18th Annual ACM Symposium on Parallelism in Algorithms and Architectures, Cambridge, Massachusetts, USA, July 30, 2006

On the performance potential of different types of speculative thread-level parallelism: The DL version of this paper includes corrections that were not made available in the printed proceedings.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Lightweight lock-free synchronization methods for multithreading.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Probablistic Self-Scheduling.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

Challenges in exploitation of loop parallelism in embedded applications.
Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis, 2006

2005
A compiler for exploiting nested parallelism in OpenMP programs.
Parallel Comput., 2005

Practical Compiler Techniques on Efficient Multithreaded Code Generation for OpenMP Programs.
Comput. J., 2005

Impact of Compiler-based Data-Prefetching Techniques on SPEC OMP Application Performance.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

2004
Towards Efficient Multi-Level Threading of H.264 Encoder on Intel Hyper-Threading Architectures.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Effect of Optimizations on Performance of OpenMP Programs.
Proceedings of the High Performance Computing, 2004

Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors.
Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004

2003
Performance Study of a Whole Genome Comparison Tool on a Hyper-Threading Multiprocessor.
Proceedings of the High Performance Computing, 5th International Symposium, 2003

Exploring the Use of Hyper-Threading Technology for Multimedia Applications with Intel® OpenMP* Compiler.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Compiler and Runtime Support for Running OpenMP Programs on Pentium- and Itanium-Architectures.
Proceedings of the Eighth International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS'03), 2003

2002
Automatic Intra-Register Vectorization for the Intel® Architecture.
Int. J. Parallel Program., 2002

Automatic Detection of Saturation and Clipping Idioms.
Proceedings of the Languages and Compilers for Parallel Computing, 15th Workshop, 2002

1997
Experiences with Non-numeric Applications on Multithreaded Architectures.
Proceedings of the Sixth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), 1997

1996
A Study of the EARTH-MANNA Multithreaded System.
Int. J. Parallel Program., 1996

Polling Watchdog: Combining Polling and Interrupts for Efficient Message Handling.
Proceedings of the 23rd Annual International Symposium on Computer Architecture, 1996

Quantitive studies of data-locality sensitivity on the EARTH multithreaded architecture: preliminary results.
Proceedings of the 3rd International Conference on High Performance Computing, 1996

Multithreading implementation of a distributed shortest path algorithm on EARTH multiprocessor.
Proceedings of the 3rd International Conference on High Performance Computing, 1996

Data locality sensitivity of multithreaded computations on a distributed-memory multiprocessor.
Proceedings of the 1996 conference of the Centre for Advanced Studies on Collaborative Research, 1996

1995
A design study of the EARTH multiprocessor.
Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques, 1995

1994
Compiling CIL rewriting language for multiprocessors.
J. Comput. Sci. Technol., 1994

Granularity analysis for exploiting adaptive parallelism of declarative programs on multiprocessors.
J. Comput. Sci. Technol., 1994

1993
Optimized parallel execution of declarative programs on distributed memory multiprocessors.
J. Comput. Sci. Technol., 1993

