2019

Supernode transformation on GPGPUs.

IJPEDS, 2019

2014

A Case Study of Implementing Supernode Transformations.

International Journal of Parallel Programming, 2014

On optimal media/video distribution in closed P2P-based IPTV networks.

Computer Networks, 2014

Parallelized feature extraction and acoustic model training.

Proceedings of the 19th International Conference on Digital Signal Processing, 2014

2013

ReShape: Towards a High-Level Approach to Design and Operation of Modular Reconfigurable Systems.

TRETS, 2013

Optimized MFCC feature extraction on GPU.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

On Optimizing the Longest Common Subsequence Problem by Loop Unrolling Along Wavefronts.

Proceedings of the 20th Euromicro International Conference on Parallel, 2012

2010

Context Adaptive Lagrange Multiplier (CALM) for Rate-Distortion Optimal Motion Estimation in Video Coding.

IEEE Trans. Circuits Syst. Video Techn., 2010

On minimizing register usage of linearly scheduled algorithms with uniform dependencies.

Computer Languages, Systems & Structures, 2010

Flexible and Modular Support for Timing Functions in High Performance Networking Acceleration.

Proceedings of the International Conference on Field Programmable Logic and Applications, 2010

ShapeUp: A High-Level Design Approach to Simplify Module Interconnection on FPGAs.

Proceedings of the 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2010

2009

Compiler Optimization Pass Visualization: The Procedural Abstraction Case.

TOCE, 2009

Visualization of Procedural Abstraction.

Electr. Notes Theor. Comput. Sci., 2009

Optimizing the stack size of recursive functions.

Computer Languages, Systems & Structures, 2009

Optimal dissemination of layered videos in P2P-Based IPTV networks.

Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Procedural Abstraction with Reverse Prefix Trees.

Proceedings of the CGO 2009, 2009

2007

Coefficient Conversion for Transform Domain VC-1 to H.264 Transcoding.

Proceedings of the IEEE Workshop on Signal Processing Systems, 2007

Chroma Coding Efficiency Improvement with Context Adaptive Lagrange Multiplier (CALM).

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2007), 2007

Stack size reduction of recursive programs.

Proceedings of the 2007 International Conference on Compilers, 2007

2006

Bit rate distribution for motion estimation in H.264 coding.

IEEE Trans. Consumer Electronics, 2006

2004

Performance Trade-offs of DCT with Variable Length Carry Chains in FPGAs.

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2004

2003

On Data Locality in Supernode Transformation.

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2003

2002

Bit-level two's complement matrix multiplication.

Integration, 2002

A faster distributed arithmetic architecture for FPGAs.

Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2002

2000

A Comparison of FPGA Implementations of Bit-Level and Word-Level Matrix Multipliers.

Proceedings of the Field-Programmable Logic and Applications, 2000

1999

On Time Optimal Supernode Shape.

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1999

1996

On Uniformization of Affine Dependence Algorithms.

IEEE Trans. Computers, 1996

On Optimal Size and Shape of Supernode Transformations.

Proceedings of the 1996 International Conference on Parallel Processing, 1996

On Supernode Transformation with Minimized Total Running Time.

Proceedings of the 1996 International Conference on Application-Specific Systems, 1996

1994

Algorithm-Specific Parallel Processing with Linear Processing Arrays.

Advances in Computers, 1994

Queueing performance analysis of co-scheduling in a pool of processors environment.

Proceedings of the 8th International Conference on Supercomputing, 1994

Data alignment of loop nests without nonlocal communications.

Proceedings of the International Conference on Application Specific Array Processors, 1994

1993

Mapping of Uniform Dependence Algorithm onto Fixed Size Processor Arrays.

Proceedings of the Seventh International Parallel Processing Symposium, 1993

Dependence Analysis and Architecture Design for Bit-Level Algorithms.

Proceedings of the 1993 International Conference on Parallel Processing, 1993

An algorithm for accurate data dependence test.

Proceedings of the International Conference on Application-Specific Array Processors, 1993

1992

On Time Mapping of Uniform Dependence Algorithms into Lower Dimensional Processor Arrays.

IEEE Trans. Parallel Distrib. Syst., 1992

Independent Partitioning of Algorithms with Uniform Dependencies.

IEEE Trans. Computers, 1992

On Uniformization of Affine Dependence Algorithms.

Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing, 1992

Conflict-Free Scheduling of Nested Loop Algorithms on Lower Dimensional Processor Arrays.

Proceedings of the 6th International Parallel Processing Symposium, 1992

1991

Time Optimal Linear Schedules for Algorithms with Uniform Dependencies.

IEEE Trans. Computers, 1991

On Loop Transformations for Generalized Cycle Shrinking.

Proceedings of the International Conference on Parallel Processing, 1991

Generalized cycle shrinking.

Proceedings of the Algorithms and Parallel VLSI Architectures II, 1991

1990

Time-Optimal and Conflict-Free Mappings of Uniform Dependence Algorithms into Lower Dimensional Processor Arrays.

Proceedings of the 1990 International Conference on Parallel Processing, 1990

1989

On the optimality of linear schedules.

VLSI Signal Processing, 1989

1988

Systematic Designs of Buffers in Macropipelines of Systolic Arrays.

J. Parallel Distrib. Comput., 1988

Independent Partitioning of Algorithms With Uniform Data Dependencies.

Proceedings of the International Conference on Parallel Processing, 1988