Milind Girkar

According to our database, Milind Girkar authored at least 33 papers between 1988 and 2021.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2021
Extending LLVM IR for DPC++ Matrix Support: A Case Study with Intel® Advanced Matrix Extensions (Intel® AMX).
Proceedings of the 7th IEEE/ACM Workshop on the LLVM Compiler Infrastructure in HPC, 2021

2015
Can traditional programming bridge the ninja performance gap for parallel computing applications?
Commun. ACM, 2015

2012
Compiling C/C++ SIMD Extensions for Function and Loop Vectorizaion on Multicore-SIMD Processors.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

2010
On the efficacy of call graph-level thread-level speculation.
Proceedings of the first joint WOSP/SIPEW International Conference on Performance Engineering, 2010

Exploitation of nested thread-level speculative parallelism on multi-core systems.
Proceedings of the 7th Conference on Computing Frontiers, 2010

2009
On the exploitation of loop-level parallelism in embedded applications.
ACM Trans. Embed. Comput. Syst., 2009

2008
Comparative architectural characterization of SPEC CPU2000 and CPU2006 benchmarks on the intel® Core™ 2 Duo processor.
Proceedings of the 2008 International Conference on Embedded Computer Systems: Architectures, 2008

2007
Tight analysis of the performance potential of thread speculation using spec CPU 2006.
Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

EXOCHI: architecture and programming environment for a heterogeneous multi-core multithreaded system.
Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, 2007

2006
Multimedia vectorization of floating-point MIN/MAX reductions.
Concurr. Comput. Pract. Exp., 2006

A general approach for partitioning N-dimensional parallel nested loops with conditionals.
SPAA 2006: Proceedings of the 18th Annual ACM Symposium on Parallelism in Algorithms and Architectures, Cambridge, Massachusetts, USA, July 30, 2006

On the performance potential of different types of speculative thread-level parallelism: The DL version of this paper includes corrections that were not made available in the printed proceedings.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Lightweight lock-free synchronization methods for multithreading.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Probablistic Self-Scheduling.
Proceedings of Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

Challenges in exploitation of loop parallelism in embedded applications.
Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis, 2006

2005
A compiler for exploiting nested parallelism in OpenMP programs.
Parallel Comput., 2005

Practical Compiler Techniques on Efficient Multithreaded Code Generation for OpenMP Programs.
Comput. J., 2005

Impact of Compiler-based Data-Prefetching Techniques on SPEC OMP Application Performance.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

2004
Towards Efficient Multi-Level Threading of H.264 Encoder on Intel Hyper-Threading Architectures.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Effect of Optimizations on Performance of OpenMP Programs.
Proceedings of the High Performance Computing, 2004

Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors.
Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004

2003
Performance Study of a Whole Genome Comparison Tool on a Hyper-Threading Multiprocessor.
Proceedings of the High Performance Computing, 5th International Symposium, 2003

Exploring the Use of Hyper-Threading Technology for Multimedia Applications with Intel® OpenMP* Compiler.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Compiler and Runtime Support for Running OpenMP Programs on Pentium-and Itanium-Architectures.
Proceedings of the Eighth International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS'03), 2003

2002
Automatic Intra-Register Vectorization for the Intel® Architecture.
Int. J. Parallel Program., 2002

Automatic Detection of Saturation and Clipping Idioms.
Proceedings of the Languages and Compilers for Parallel Computing, 15th Workshop, 2002

1999
Incorporating Intel MMX technology into a Java JIT compiler.
Sci. Program., 1999

1995
Extracting Task-Level Parallelism.
ACM Trans. Program. Lang. Syst., 1995

1994
The hierarchical task graph as a universal intermediate representation.
Int. J. Parallel Program., 1994

1992
Automatic Extraction of Functional Parallelism from Ordinary Programs.
IEEE Trans. Parallel Distributed Syst., 1992

1991
Optimization of Data/Control Conditions in Task Graphs.
Proceedings of the Languages and Compilers for Parallel Computing, 1991

1989
Parafrase-2: an Environment for Parallelizing, Partitioning, Synchronizing, and Scheduling Programs on Multiprocessors.
Int. J. High Speed Comput., 1989

1988
Partitioning programs for parallel execution.
Proceedings of the 2nd international conference on Supercomputing, 1988

