Jingling Xue

According to our database, Jingling Xue authored at least 199 papers between 1988 and 2019.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2019
Understanding and Analyzing Java Reflection.
ACM Trans. Softw. Eng. Methodol., 2019

SCP: Shared Cache Partitioning for High-Performance GEMM.
TACO, 2019

LCCFS: a lightweight distributed file system for cloud computing without journaling and metadata services.
SCIENCE CHINA Information Sciences, 2019

Event trace reduction for effective bug replay of Android apps via differential GUI state analysis.
Proceedings of the ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019

Incremental precision-preserving symbolic inference for probabilistic programs.
Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2019

WCET-aware hyper-block construction for clustered VLIW processors.
Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, 2019

Detecting memory errors at runtime with source-level instrumentation.
Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2019

Precise Static Happens-Before Analysis for Detecting UAF Order Violations in Android.
Proceedings of the 12th IEEE Conference on Software Testing, Validation and Verification, 2019

VFix: value-flow-guided precise program repair for null pointer dereferences.
Proceedings of the 41st International Conference on Software Engineering, 2019

A Feature-Oriented Corpus for Understanding, Evaluating and Improving Fuzz Testing.
Proceedings of the 2019 ACM Asia Conference on Computer and Communications Security, 2019

PPOpenCL: a performance-portable OpenCL compiler with host and kernel thread code fusion.
Proceedings of the 28th International Conference on Compiler Construction, 2019

2018
Loop-Oriented Pointer Analysis for Automatic SIMD Vectorization.
ACM Trans. Embedded Comput. Syst., 2018

Ripple: Reflection analysis for Android apps in incomplete information environments.
Softw., Pract. Exper., 2018

Parallel construction of interprocedural memory SSA form.
Journal of Systems and Software, 2018

TDroid: exposing app switching attacks in Android with control flow specialization.
Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018

Understanding and detecting evolution-induced compatibility issues in Android apps.
Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018

Launch-mode-aware context-sensitive activity transition analysis.
Proceedings of the 40th International Conference on Software Engineering, 2018

Spatio-temporal context reduction: a pointer-analysis-based static approach for detecting use-after-free vulnerabilities.
Proceedings of the 40th International Conference on Software Engineering, 2018

Live path control flow integrity.
Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings, 2018

Revisiting Loop Tiling for Datacenters: Live and Let Live.
Proceedings of the 32nd International Conference on Supercomputing, 2018

May-happen-in-parallel analysis with static vector clocks.
Proceedings of the 2018 International Symposium on Code Generation and Optimization, 2018

Poker: permutation-based SIMD execution of intensive tree search by path encoding.
Proceedings of the 2018 International Symposium on Code Generation and Optimization, 2018

Live Path CFI Against Control Flow Hijacking Attacks.
Proceedings of the Information Security and Privacy - 23rd Australasian Conference, 2018

Towards concurrency race debugging: an integrated approach for constraint solving and dynamic slicing.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2017
Durable Address Translation in PCM-Based Flash Storage Systems.
IEEE Trans. Parallel Distrib. Syst., 2017

An Efficient WCET-Aware Instruction Scheduling and Register Allocation Approach for Clustered VLIW Processors.
ACM Trans. Embedded Comput. Syst., 2017

Fine grained, direct access file system support for storage class memory.
Journal of Systems Architecture - Embedded Systems Design, 2017

Incremental Analysis for Probabilistic Programs.
Proceedings of the Static Analysis - 24th International Symposium, 2017

Efficient and precise points-to analysis: modeling the heap by merging equivalent automata.
Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2017

Boosting the precision of virtual call integrity protection with partial pointer analysis for C++.
Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis, Santa Barbara, CA, USA, July 10, 2017

Reflection Analysis for Java: Uncovering More Reflective Targets Precisely.
Proceedings of the 28th IEEE International Symposium on Software Reliability Engineering, 2017

Ripple: Reflection Analysis for Android Apps in Incomplete Information Environments.
Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy, 2017

Automatic generation of fast BLAS3-GEMM: a portable compiler approach.
Proceedings of the 2017 International Symposium on Code Generation and Optimization, 2017

Dynamic symbolic execution for polymorphism.
Proceedings of the 26th International Conference on Compiler Construction, 2017

Machine-Learning-Guided Typestate Analysis for Static Use-After-Free Detection.
Proceedings of the 33rd Annual Computer Security Applications Conference, 2017

2016
Eliminating Redundant Bounds Checks in Dynamic Buffer Overflow Detection Using Weakest Preconditions.
IEEE Trans. Reliability, 2016

Predicting Cross-Core Performance Interference on Multicore Processors with Regression Analysis.
IEEE Trans. Parallel Distrib. Syst., 2016

An Efficient GPU Implementation of Inclusion-Based Pointer Analysis.
IEEE Trans. Parallel Distrib. Syst., 2016

Reducing Static Energy in Supercomputer Interconnection Networks Using Topology-Aware Partitioning.
IEEE Trans. Computers, 2016

A Compiler Approach for Exploiting Partial SIMD Parallelism.
TACO, 2016

Program Tailoring: Slicing by Sequential Criteria (Artifact).
DARTS, 2016

Energy Wall for Exascale Supercomputing.
Computing and Informatics, 2016

On-demand strong update analysis via value-flow refinement.
Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2016

Making k-Object-Sensitive Pointer Analysis More Precise with Still k-Limiting.
Proceedings of the Static Analysis - 23rd International Symposium, 2016

Automated memory leak fixing on value-flow slices for C programs.
Proceedings of the 31st Annual ACM Symposium on Applied Computing, 2016

Loop-oriented array- and field-sensitive pointer analysis for automatic SIMD vectorization.
Proceedings of the 17th ACM SIGPLAN/SIGBED Conference on Languages, 2016

RegTT: Accelerating Tree Traversals on GPUs by Exploiting Regularities.
Proceedings of the 45th International Conference on Parallel Processing, 2016

An Energy-Efficient Implementation of LU Factorization on Heterogeneous Systems.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

Program Tailoring: Slicing by Sequential Criteria.
Proceedings of the 30th European Conference on Object-Oriented Programming, 2016

Exploiting mixed SIMD parallelism by reducing data reorganization overhead.
Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016

Sparse flow-sensitive pointer analysis for multithreaded programs.
Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016

SVF: interprocedural static value-flow analysis in LLVM.
Proceedings of the 25th International Conference on Compiler Construction, 2016

Masking Soft Errors with Static Bitwise Analysis.
Proceedings of the 23rd Asia-Pacific Software Engineering Conference, 2016

2015
Enhancement of cooperation between file systems and applications - on VFS extensions for optimized performance.
SCIENCE CHINA Information Sciences, 2015

Effective Soundness-Guided Reflection Analysis.
Proceedings of the Static Analysis - 22nd International Symposium, 2015

File system-independent block device support for storage class memory.
Proceedings of the 2015 IEEE Conference on Computer Communications Workshops, 2015

Hadoop+: Modeling and Evaluating the Heterogeneity for MapReduce Applications in Heterogeneous Clusters.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Design and Implementation of a Highly Efficient DGEMM for 64-Bit ARMv8 Multi-core Processors.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Region-Based May-Happen-in-Parallel Analysis for C Programs.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Contention-Aware Scheduling for Asymmetric Multicore Processors.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

Performance Modeling of Multithreaded Programs for Mobile Asymmetric Chip Multiprocessors.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

2014
Detecting Memory Leaks Statically with Full-Sparse Value-Flow Analysis.
IEEE Trans. Software Eng., 2014

Making context-sensitive inclusion-based pointer analysis practical for compilers using parameterised summarisation.
Softw., Pract. Exper., 2014

OpenMC: Towards Simplifying Programming for TianHe Supercomputers.
J. Comput. Sci. Technol., 2014

Acyclic orientation graph coloring for software-managed memory allocation.
SCIENCE CHINA Information Sciences, 2014

Region-Based Selective Flow-Sensitive Pointer Analysis.
Proceedings of the Static Analysis - 21st International Symposium, 2014

WPBOUND: Enforcing Spatial Memory Safety Efficiently at Runtime with Weakest Preconditions.
Proceedings of the 25th IEEE International Symposium on Software Reliability Engineering, 2014

Parallel Pointer Analysis with CFL-Reachability.
Proceedings of the 43rd International Conference on Parallel Processing, 2014

Self-inferencing Reflection Resolution for Java.
Proceedings of the ECOOP 2014 - Object-Oriented Programming - 28th European Conference, Uppsala, Sweden, July 28, 2014

Lifetime holes aware register allocation for clustered VLIW processors.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Accelerating Dynamic Detection of Uses of Undefined Values with Static Value-Flow Analysis.
Proceedings of the 12th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2014

A collaborative divide-and-conquer K-means clustering algorithm for processing large data.
Proceedings of the Computing Frontiers Conference, CF'14, 2014

2013
SEED: A Statically Greedy and Dynamically Adaptive Approach for Speculative Loop Execution.
IEEE Trans. Computers, 2013

Layout-oblivious compiler optimization for matrix computations.
TACO, 2013

Acculock: accurate and efficient detection of data races.
Softw., Pract. Exper., 2013

Epipe: A low-cost fault-tolerance technique considering WCET constraints.
Journal of Systems Architecture - Embedded Systems Design, 2013

Instruction scheduling with k-successor tree for clustered VLIW processors.
Design Autom. for Emb. Sys., 2013

Accelerating inclusion-based pointer analysis on heterogeneous CPU-GPU systems.
Proceedings of the 20th Annual International Conference on High Performance Computing, 2013

Structural Lock Correlation with Ownership Types.
Proceedings of the Programming Languages and Systems, 2013

Query-directed adaptive heap cloning for optimizing compilers.
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013

An Incremental Points-to Analysis with CFL-Reachability.
Proceedings of the Compiler Construction - 22nd International Conference, 2013

Scratchpad Memory aware task scheduling with minimum number of preemptions on a single processor.
Proceedings of the 18th Asia and South Pacific Design Automation Conference, 2013

An empirical model for predicting cross-core performance interference on multicore processors.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
Optimally Maximizing Iteration-Level Loop Parallelism.
IEEE Trans. Parallel Distrib. Syst., 2012

Optimizing modulo scheduling to achieve reuse and concurrency for stream processors.
The Journal of Supercomputing, 2012

The Reliability Wall for Exascale Supercomputing.
IEEE Trans. Computers, 2012

Comparability Graph Coloring for Optimizing Utilization of Software-Managed Stream Register Files for Stream Processors.
TACO, 2012

Parallelizing SOR for GPGPUs using alternate loop tiling.
Parallel Computing, 2012

A Hybrid Circular Queue Method for Iterative Stencil Computations on GPUs.
J. Comput. Sci. Technol., 2012

PartialRC: A Partial Recomputing Method for Efficient Fault Recovery on GPGPUs.
J. Comput. Sci. Technol., 2012

WCET-aware data selection and allocation for scratchpad memory.
Proceedings of the SIGPLAN/SIGBED Conference on Languages, 2012

Fast and precise points-to analysis with incremental CFL-reachability summarisation: preliminary experience.
Proceedings of the IEEE/ACM International Conference on Automated Software Engineering, 2012

Static memory leak detection using full-sparse value-flow analysis.
Proceedings of the International Symposium on Software Testing and Analysis, 2012

What Is System Hang and How to Handle It.
Proceedings of the 23rd IEEE International Symposium on Software Reliability Engineering, 2012

A Fast Parallel Implementation of Molecular Dynamics with the Morse Potential on a Heterogeneous Petascale Supercomputer.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

A Highly Parallel Reuse Distance Analysis Algorithm on GPUs.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Automatic Parallelization of Tiled Loop Nests with Enhanced Fine-Grained Parallelism on GPUs.
Proceedings of the 41st International Conference on Parallel Processing, 2012

A Type and Effect System for Determinism in Multithreaded Programs.
Proceedings of the Programming Languages and Systems, 2012

On-demand dynamic summary-based points-to analysis.
Proceedings of the 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2012

Ownership Types for Object Synchronisation.
Proceedings of the Programming Languages and Systems - 10th Asian Symposium, 2012

Layout-oblivious optimization for matrix computations.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011
Leakage-Aware Modulo Scheduling for Embedded VLIW Processors.
J. Comput. Sci. Technol., 2011

Automatic Library Generation for BLAS3 on GPUs.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Efficient Energy Balancing Aware Multiple Base Station Deployment for WSNs.
Proceedings of the Wireless Sensor Networks - 8th European Conference, 2011

Model-Driven Tile Size Selection for DOACROSS Loops on GPUs.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

Acculock: Accurate and efficient detection of data races.
Proceedings of the CGO 2011, 2011

Extendable pattern-oriented optimization directives.
Proceedings of the CGO 2011, 2011

An efficient heuristic for instruction scheduling on clustered VLIW processors.
Proceedings of the 14th International Conference on Compilers, 2011

SPAS: Scalable Path-Sensitive Pointer Analysis on Full-Sparse SSA.
Proceedings of the Programming Languages and Systems - 9th Asian Symposium, 2011

2010
Scratchpad memory allocation for data aggregates via interval coloring in superperfect graphs.
ACM Trans. Embedded Comput. Syst., 2010

Exploiting the reuse supplied by loop-dependent stream references for stream processors.
TACO, 2010

Loop recreation for thread-level speculation on multicore processors.
Softw., Pract. Exper., 2010

Gather/scatter hardware support for accelerating Fast Fourier Transform.
Journal of Systems Architecture - Embedded Systems Design, 2010

Software-Hardware Cooperative DRAM Bank Partitioning for Chip Multiprocessors.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2010

Toward Harnessing DOACROSS Parallelism for Multi-GPGPUs.
Proceedings of the 39th International Conference on Parallel Processing, 2010

Optimal WCET-aware code selection for scratchpad memory.
Proceedings of the 10th International conference on Embedded software, 2010

Reuse-aware modulo scheduling for stream processors.
Proceedings of the Design, Automation and Test in Europe, 2010

Level by level: making flow- and context-sensitive pointer analysis scalable for millions of lines of code.
Proceedings of the CGO 2010, 2010

Improving scratchpad allocation with demand-driven data tiling.
Proceedings of the 2010 International Conference on Compilers, 2010

2009
Compiler-directed scratchpad memory management via graph coloring.
TACO, 2009

PARBLO: Page-Allocation-Based DRAM Row Buffer Locality Optimization.
J. Comput. Sci. Technol., 2009

Comparability graph coloring for optimizing utilization of stream register files in stream processors.
Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2009

A Cache-Efficient Parallel Gauss-Seidel Solver with Alternating Tiling.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

Exploiting Speculative TLP in Recursive Programs by Dynamic Thread Prediction.
Proceedings of the Compiler Construction, 18th International Conference, 2009

Optimal loop parallelization for maximizing iteration-level parallelism.
Proceedings of the 2009 International Conference on Compilers, 2009

Ownership Downgrading for Ownership Types.
Proceedings of the Programming Languages and Systems, 7th Asian Symposium, 2009

2008
Improving the parallelism of iterative methods by aggressive loop fusion.
The Journal of Supercomputing, 2008

Advances in high performance computing.
The Journal of Supercomputing, 2008

Minimal placement of bank selection instructions for partitioned memory architectures.
ACM Trans. Embedded Comput. Syst., 2008

Optimizing scientific application loops on stream processors.
Proceedings of the 2008 ACM SIGPLAN/SIGBED Conference on Languages, 2008

Thread-Sensitive Modulo Scheduling for Multicore Processors.
Proceedings of the 2008 International Conference on Parallel Processing, 2008

ACS: An Addressless Configuration Support for efficient partial reconfigurations.
Proceedings of the 2008 International Conference on Field-Programmable Technology, 2008

Hardware Support for Efficient Sparse Matrix Vector Multiplication.
Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2008), 2008

A gather/scatter hardware support for efficient Fast Fourier Transform.
Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, 2008

Exploiting loop-dependent stream reuse for stream processors.
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008

2007
Data cache locking for tight timing calculations.
ACM Trans. Embedded Comput. Syst., 2007

Interprocedural side-effect analysis for incomplete object-oriented software modules.
Journal of Systems and Software, 2007

Trace-based leakage energy optimisations at link time.
Journal of Systems Architecture, 2007

Scratchpad allocation for data aggregates in superperfect graphs.
Proceedings of the 2007 ACM SIGPLAN/SIGBED Conference on Languages, 2007

Toward Automatic Data Distribution for Migrating Computations.
Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Loop recreation for thread-level speculation.
Proceedings of the 13th International Conference on Parallel and Distributed Systems, 2007

Validity Invariants and Effects.
Proceedings of the ECOOP 2007 - Object-Oriented Programming, 21st European Conference, Berlin, Germany, July 30, 2007

Towards Data Tiling for Whole Programs in Scratchpad Memory Allocation.
Proceedings of the Advances in Computer Systems Architecture, 2007

2006
A lifetime optimal algorithm for speculative PRE.
TACO, 2006

A Fresh Look at Partial Redundancy Elimination as a Maximum Flow Problem.
Softwaretechnik-Trends, 2006

Partial dead code elimination on predicated code regions.
Softw., Pract. Exper., 2006

Instruction Scheduling with Release Times and Deadlines on ILP Processors.
Proceedings of the 12th IEEE Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA 2006), 2006

CoopStream: A Cooperative Cache Based Streaming Schedule Scheme for On-demand Media Services on Overlay Networks.
Proceedings of the 2006 International Conference on Parallel Processing (ICPP 2006), 2006

A Fresh Look at PRE as a Maximum Flow Problem.
Proceedings of the Compiler Construction, 15th International Conference, 2006

Minimizing bank selection instructions for partitioned memory architecture.
Proceedings of the 2006 International Conference on Compilers, 2006

Trace-Based Data Cache Leakage Reduction at Link Time.
Proceedings of the Advances in Computer Systems Architecture, 11th Asia-Pacific Conference, 2006

2005
Cache exploitation in embedded systems.
J. Embedded Computing, 2005

Foreword.
J. Comput. Sci. Technol., 2005

Aggressive Loop Fusion for Improving Locality and Parallelism.
Proceedings of the Parallel and Distributed Processing and Applications, 2005

Enabling Loop Fusion and Tiling for Cache Performance by Fixing Fusion-Preventing Data Dependences.
Proceedings of the 34th International Conference on Parallel Processing (ICPP 2005), 2005

Fast Parallel DNA-Based Algorithms for Molecular Computation: Determining a Prime Number.
Proceedings of the Third International Conference on Information Technology and Applications (ICITA 2005), 2005

Compiler-Directed Scratchpad Memory Management.
Proceedings of the Embedded Software and Systems, Second International Conference, 2005

Completeness Analysis for Incomplete Object-Oriented Programs.
Proceedings of the Compiler Construction, 14th International Conference, 2005

Interprocedural Side-Effect Analysis and Optimisation in the Presence of Dynamic Class Loading.
Proceedings of the Computer Science 2005, 2005

Improving the Performance of GCC by Exploiting IA-64 Architectural Features.
Proceedings of the Advances in Computer Systems Architecture, 10th Asia-Pacific Conference, 2005

Memory Coloring: A Compiler Approach for Scratchpad Memory Management.
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT 2005), 2005

2004
Efficient and Accurate Analytical Modeling of Whole-Program Data Cache Behavior.
IEEE Trans. Computers, 2004

A trace-based binary compilation framework for energy-aware computing.
Proceedings of the 2004 ACM SIGPLAN/SIGBED Conference on Languages, 2004

Region-Based Partial Dead Code Elimination on Predicated Code.
Proceedings of the Compiler Construction, 13th International Conference, 2004

A Comparative Study of Web Application Design Models Using the Java Technologies.
Proceedings of the Advanced Web Technologies and Applications, 2004

Strength Reduction for Loop-Invariant Types.
Proceedings of the Computer Science 2004, 2004

2003
Data cache locking for higher program predictability.
Proceedings of the International Conference on Measurements and Modeling of Computer Systems, 2003

Data Caches in Multitasking Hard Real-Time Systems.
Proceedings of the 24th IEEE Real-Time Systems Symposium (RTSS 2003), 2003

Code Tiling for Improving the Cache Performance of PDE Solvers.
Proceedings of the 32nd International Conference on Parallel Processing (ICPP 2003), 2003

Optimal and Efficient Speculation-Based Partial Redundancy Elimination.
Proceedings of the 1st IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2003), 2003

2002
Time-minimal tiling when rise is larger than zero.
Parallel Computing, 2002

Eigenvectors-based parallelisation of nested loops with affine dependences.
Parallel Algorithms Appl., 2002

Space-Time Equations for Non-Unimodular Mappings.
Int. J. Comput. Math., 2002

Let's Study Whole-Program Cache Behaviour Analytically.
Proceedings of the Eighth International Symposium on High-Performance Computer Architecture (HPCA'02), 2002

2001
Communication Overhead on Distributed Memory Machines.
Scalable Computing: Practice and Experience, 2001

2000
Generating efficient tiled code for distributed memory machines.
Parallel Computing, 2000

Loop Tiling for Parallelism
Kluwer International Series in Engineering and Computer Science 575, Kluwer, ISBN: 0-7923-7933-0, 2000

1999
Partitioning and scheduling loops on NOWs.
Computer Communications, 1999

1998
Reuse-Driven Tiling for Improving Data Locality.
International Journal of Parallel Programming, 1998

1997
On Tiling as a Loop Transformation.
Parallel Processing Letters, 1997

Unimodular Transformations of Non-Perfectly Nested Loops.
Parallel Computing, 1997

Reuse-Driven Tiling for Data Locality.
Proceedings of the Languages and Compilers for Parallel Computing, 1997

1996
Generalising the Unimodular Approach to Restructure Imperfectly Nested Loops.
Parallel Processing Letters, 1996

Transformations of Nested Loops with Non-Convex Iteration Spaces.
Parallel Computing, 1996

Communication-Minimal Tiling of Uniform Dependence Loops.
Proceedings of the Languages and Compilers for Parallel Computing, 1996

Affine-by-Statement Transformations of Imperfectly Nested Loops.
Proceedings of IPPS '96, 1996

1995
Closed-form mapping conditions for the synthesis of linear processor arrays.
VLSI Signal Processing, 1995

Constructing DO loops for non-convex iteration spaces in compiling for parallel machines.
Proceedings of IPPS '95, 1995

1994
Automating Non-Unimodular Loop Transformations for Massive Parallelism.
Parallel Computing, 1994

Avoiding Data Link and Computational Conflicts in Mapping Nested Loop Algorithms to Lower-Dimensional Processor Arrays.
Proceedings of the 1994 International Conference on Parallel and Distributed Systems, 1994

1993
An Algorithm to Automate Non-Unimodular Transformations of Loop Nests.
Proceedings of the Fifth IEEE Symposium on Parallel and Distributed Processing, 1993

1992
Formal synthesis of control signals for systolic arrays.
PhD thesis, 1992

A systolic array for pyramidal algorithms.
VLSI Signal Processing, 1992

The synthesis of control signals for one-dimensional systolic arrays.
Integration, 1992

On the Loading, Recovery and Access of Stationary Data in Systolic Arrays.
Proceedings of the Parallel Processing: CONPAR 92, 1992

1991
A systolic array for pyramidal algorithms.
VLSI Signal Processing, 1991

Specifying control signals for Systolic Arrays by Uniform Recurrence Equations.
Parallel Processing Letters, 1991

Specifying control signals for one-dimensional systolic arrays by uniform recurrence equations.
Proceedings of the Algorithms and Parallel VLSI Architectures II, 1991

1988
A new data structure for representing cell hierarchy in layout design.
Computers & Graphics, 1988

