Yong Dou

According to our database, Yong Dou authored at least 176 papers between 2003 and 2019.

Bibliography

2019
Exploring frame segmentation networks for temporal action localization.
J. Visual Communication and Image Representation, 2019

A Novel Memory-Scheduling Strategy for Large Convolutional Neural Network on Memory-Limited Devices.
Comp. Int. and Neurosc., 2019

GBCNN: A Full GPU-Based Batch Multi-Task Cascaded Convolutional Networks.
IEEE Access, 2019

Accelerated Inference Framework of Sparse Neural Network Based on Nested Bitmask Structure.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Heavy-ball Algorithms Always Escape Saddle Points.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Spatial Attention Network for Few-Shot Learning.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2019: Deep Learning, 2019

2018
CaFPGA: An automatic generation model for CNN accelerator.
Microprocessors and Microsystems - Embedded Hardware Design, 2018

Distributed sparse bundle adjustment algorithm based on three-dimensional point partition and asynchronous communication.
Frontiers of IT & EE, 2018

Local kernel alignment based multi-view clustering using extreme learning machine.
Neurocomputing, 2018

An efficient CPU-GPU hybrid parallel implementation for DVB-RCS2 receiver.
Concurrency and Computation: Practice and Experience, 2018

High performance robust audio event recognition system based on FPGA platform.
Cognitive Systems Research, 2018

A Community Detection Approach to Cleaning Extremely Large Face Database.
Comp. Int. and Neurosc., 2018

SpinMag: A New Fingerprinting Method for Robot Indoor Localization with Geomagnetic Field.
Ad Hoc & Sensor Wireless Networks, 2018

Frame Segmentation Networks for Temporal Action Localization.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Image Translation Between High-Resolution Remote Sensing Optical and SAR Data Using Conditional GAN.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Spatial Attention Network for Head Detection.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Visual Tree Convolutional Neural Network in Image Classification.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Visual Confusion Label Tree for Image Classification.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Temporal Pyramid Relation Network for Video-Based Gesture Recognition.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Deep Image Clustering Using Convolutional Autoencoder Embedding with Inception-Like Block.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

mmCNN: A Novel Method for Large Convolutional Neural Network on Memory-Limited Devices.
Proceedings of the 2018 IEEE 42nd Annual Computer Software and Applications Conference, 2018

Learning Generic Diffusion Processes for Image Restoration.
Proceedings of the British Machine Vision Conference 2018, 2018

paraSNF: An Parallel Approach for Large-Scale Similarity Network Fusion.
Proceedings of the Advanced Computer Architecture - 12th Conference, 2018

Research on Acceleration Method of Speech Recognition Training.
Proceedings of the Advanced Computer Architecture - 12th Conference, 2018

Exploring Temporal Preservation Networks for Precise Temporal Action Localization.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Throughput-Optimized FPGA Accelerator for Deep Convolutional Neural Networks.
TRETS, 2017

Qualitative Action Recognition by Wireless Radio Signals in Human-Machine Systems.
IEEE Trans. Human-Machine Systems, 2017

Variational single image interpolation with time-varying regularization.
Sig. Proc.: Image Comm., 2017

Multiple kernel learning with hybrid kernel alignment maximization.
Pattern Recognition, 2017

Airport Detection on Optical Satellite Images Using Deep Convolutional Neural Networks.
IEEE Geosci. Remote Sensing Lett., 2017

An optimized design of CAN FD for automotive cyber-physical systems.
Journal of Systems Architecture - Embedded Systems Design, 2017

Heterogeneous blocked CPU-GPU accelerate scheme for large scale extreme learning machine.
Neurocomputing, 2017

A fast and memory saved GPU acceleration algorithm of convolutional neural networks for target detection.
Neurocomputing, 2017

Multiple kernel clustering with corrupted kernels.
Neurocomputing, 2017

Robust regularized extreme learning machine for regression using iteratively reweighted least squares.
Neurocomputing, 2017

Ranking Support Vector Machine with Kernel Approximation.
Comp. Int. and Neurosc., 2017

Adaptive Energy-Aware Computation Offloading for Cloud of Things Systems.
IEEE Access, 2017

Learning Non-local Image Diffusion for Image Denoising.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Approximate Large-scale Multiple Kernel k-means Using Deep Neural Network.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Multiple Kernel Clustering Framework with Improved Kernels.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Confusion Graph: Detecting Confusion Communities in Large Scale Image Classification.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

An FPGA-based processor for training convolutional neural networks.
Proceedings of the International Conference on Field Programmable Technology, 2017

Accuracy Evaluation of Long Short Term Memory Network Based Language Model with Fixed-Point Arithmetic.
Proceedings of the Applied Reconfigurable Computing - 13th International Symposium, 2017

Platform-Adaptive High-Throughput Surveillance Video Condensation on Heterogeneous Processor Clusters.
Proceedings of the Advanced Parallel Processing Technologies, 2017

Optimal Neighborhood Kernel Clustering with Multiple Kernels.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Multiple Kernel k-Means with Incomplete Kernels.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Coarse-Grained Architecture for Fingerprint Matching.
TRETS, 2016

An FPGA Implementation for Solving the Large Single-Source-Shortest-Path Problem.
IEEE Trans. on Circuits and Systems, 2016

Affine-Transformation Parameters Regression for Face Alignment.
IEEE Signal Process. Lett., 2016

Classification of Hyperspectral Remote Sensing Image Using Hierarchical Local-Receptive-Field-Based Extreme Learning Machine.
IEEE Geosci. Remote Sensing Lett., 2016

Relative distance features for gait recognition with Kinect.
J. Visual Communication and Image Representation, 2016

Leveraging local receptive fields based random weights networks for hyperspectral image classification.
Journal of Intelligent and Fuzzy Systems, 2016

A novel multi-view clustering method via low-rank and matrix-induced regularization.
Neurocomputing, 2016

An efficient and effective convolutional auto-encoder extreme learning machine network for 3d feature learning.
Neurocomputing, 2016

Multi-view clustering with extreme learning machine.
Neurocomputing, 2016

PR-ELM: Parallel regularized extreme learning machine based on cluster.
Neurocomputing, 2016

Joint diversity regularization and graph regularization for multiple kernel k-means clustering via latent variables.
Neurocomputing, 2016

Face Verification Algorithm with Exploiting Feature Distribution.
Proceedings of the PRICAI 2016: Trends in Artificial Intelligence, 2016

ELM based multiple kernel k-means with diversity-induced regularization.
Proceedings of the 2016 International Joint Conference on Neural Networks, 2016

Airport detection from remote sensing images using transferable convolutional neural networks.
Proceedings of the 2016 International Joint Conference on Neural Networks, 2016

Multiple Kernel Clustering with Local Kernel Alignment Maximization.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Region-based convolutional neural networks for object detection in very high resolution remote sensing images.
Proceedings of the 12th International Conference on Natural Computation, 2016

Hyperspectral image classification via kernel extreme learning machine using local receptive fields.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Localized region context and object feature fusion for people head detection.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Optimized GPU Acceleration Algorithm of Convolutional Neural Networks for Target Detection.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

Automatic code generation of convolutional neural networks in FPGA implementation.
Proceedings of the 2016 International Conference on Field-Programmable Technology, 2016

Multiple Kernel k-Means Clustering with Matrix-Induced Regularization.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Urban Land Use and Land Cover Classification Using Remotely Sensed SAR Data through Deep Belief Networks.
J. Sensors, 2015

A deeply-pipelined FPGA-based SpMV accelerator with a hardware-friendly storage scheme.
IEICE Electronic Express, 2015

An efficient multi-standard QC-LDPC decoder based on the row-layered decoding algorithm.
IEICE Electronic Express, 2015

Efficient graphics processing unit based layered decoders for quasicyclic low-density parity-check codes.
Concurrency and Computation: Practice and Experience, 2015

An Efficient Robust Eye Localization by Learning the Convolution Distribution Using Eye Template.
Comp. Int. and Neurosc., 2015

Accelerating Molecular Dynamics Simulations on Heterogeneous Architecture.
Proceedings of the Computer Engineering and Technology - 19th CCF Conference, 2015

Designing Parallel Sparse Matrix Transposition Algorithm Using ELLPACK-R for GPUs.
Proceedings of the Computer Engineering and Technology - 19th CCF Conference, 2015

Optimized deep belief networks on CUDA GPUs.
Proceedings of the 2015 International Joint Conference on Neural Networks, 2015

Exploring Relative Motion Features for Gait Recognition with Kinect.
Proceedings of the Neural Information Processing - 22nd International Conference, 2015

Hyperspectral image classification via local receptive fields based random weights networks.
Proceedings of the 11th International Conference on Natural Computation, 2015

Classification of Tiangong-1 hyperspectral remote sensing image via contextual sparse coding.
Proceedings of the 2015 International Conference on Machine Learning and Cybernetics, 2015

Depth enhancement via non-local means filter.
Proceedings of the Seventh International Conference on Advanced Computational Intelligence, 2015

Absent Multiple Kernel Learning.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
FPGA Implementation of a Special-Purpose VLIW Structure for Double-Precision Elementary Function.
TRETS, 2014

CuSora: Real-time software radio using multi-core graphics processing unit.
Journal of Systems Architecture - Embedded Systems Design, 2014

Design and Implement of High Performance Crypto Coprocessor.
IEICE Transactions, 2014

Efficient Parallel Interference Cancellation MIMO Detector for Software Defined Radio on GPUs.
IEICE Transactions, 2014

An Efficient Parallel SOVA-Based Turbo Decoder for Software Defined Radio on GPU.
IEICE Transactions, 2014

Parallel graph traversal for FPGA.
IEICE Electronic Express, 2014

Transpose-free variable-size FFT accelerator based on-chip SRAM.
IEICE Electronic Express, 2014

Supernodal sparse Cholesky factorization on graphics processing units.
Concurrency and Computation: Practice and Experience, 2014

CPU-GPU hybrid parallel strategy for cosmological simulations.
Concurrency and Computation: Practice and Experience, 2014

Efficient parallel implementation of three-point viterbi decoding algorithm on CPU, GPU, and FPGA.
Concurrency and Computation: Practice and Experience, 2014

A piecewise-based contrast enhancement framework for low lighting video.
Proceedings of the IEEE International Conference on Security, 2014

Efficient parallel implementation of morphological operation on GPU and FPGA.
Proceedings of the IEEE International Conference on Security, 2014

3D pipeline contention: Asymmetric full duplex in wireless networks.
Proceedings of the 2014 IEEE Conference on Computer Communications, 2014

Classification of land cover based on deep belief networks using polarimetric RADARSAT-2 data.
Proceedings of the 2014 IEEE Geoscience and Remote Sensing Symposium, 2014

Vehicle Recognition for Surveillance Video Using Sparse Coding.
Proceedings of the Pattern Recognition - 6th Chinese Conference, 2014

A Study on Layer Connection Strategies in Stacked Convolutional Deep Belief Networks.
Proceedings of the Pattern Recognition - 6th Chinese Conference, 2014

A high throughput K-best detector on FPGA.
Proceedings of the IEEE International Black Sea Conference on Communications and Networking, 2014

A Novel Design of Flexible Crypto Coprocessor and Its Application.
Proceedings of the Advanced Computer Architecture - 10th Annual Conference, 2014

2013
FPGA implementation of an exact dot product and its application in variable-precision floating-point arithmetic.
The Journal of Supercomputing, 2013

High-Performance Architecture for the Conjugate Gradient Solver on FPGAs.
IEEE Trans. on Circuits and Systems, 2013

VLIW coprocessor for IEEE-754 quadruple-precision elementary functions.
TACO, 2013

From WiFi to WiMAX: Efficient GPU-based Parameterized Transceiver across Different OFDM Protocols.
TIIS, 2013

Parallel Sparse Cholesky Factorization on a Heterogeneous Platform.
IEICE Transactions, 2013

Window Memory Layout Scheme for Alternate Row-Wise/Column-Wise Matrix Access.
IEICE Transactions, 2013

High performance sparse matrix-vector multiplication on FPGA.
IEICE Electronic Express, 2013

A fully parallel truncated Viterbi decoder for Software Defined Radio on GPUs.
Proceedings of the 2013 IEEE Wireless Communications and Networking Conference (WCNC), 2013

A multi-standard efficient column-layered LDPC decoder for Software Defined Radio on GPUs.
Proceedings of the 14th IEEE Workshop on Signal Processing Advances in Wireless Communications, 2013

Design and Implementation of Novel Flexible Crypto Coprocessor and Its Application in Security Protocol.
Proceedings of the Computer Engineering and Technology - 17th CCF Conference, 2013

Direction-Optimizing Breadth-First Search on CPU-GPU Heterogeneous Platforms.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

Empirical Evaluation of Fixed-Point Arithmetic for Deep Belief Networks.
Proceedings of the Reconfigurable Computing: Architectures, Tools and Applications, 2013

2012
The unified accelerator architecture for RNA secondary structure prediction on FPGA.
The Journal of Supercomputing, 2012

A High Performance and Memory Efficient LU Decomposer on FPGAs.
IEEE Trans. Computers, 2012

Design and Implementation of the Parameterized Multi-Standard High-Throughput Radix-4 Viterbi Decoder on FPGA.
IEICE Transactions, 2012

Optimization schemes and performance evaluation of Smith-Waterman algorithm on CPU, GPU and FPGA.
Concurrency and Computation: Practice and Experience, 2012

CPU-GPU hybrid accelerating the Zuker algorithm for RNA secondary structure prediction applications.
BMC Genomics, 2012

Parallelizing sparse LU decomposition on FPGAs.
Proceedings of the 2012 International Conference on Field-Programmable Technology, 2012

A self-organizing and self-adaptive French flag Organism based on lateral activation model.
Proceedings of the IEEE Congress on Evolutionary Computation, 2012

A bio-inspired self-organizing approach for multicellular embryonic architecture.
Proceedings of the 2012 NASA/ESA Conference on Adaptive Hardware and Systems, 2012

2011
FPGA-Specific Custom VLIW Architecture for Arbitrary Precision Floating-Point Arithmetic.
IEICE Transactions, 2011

FPGA accelerator for protein secondary structure prediction based on the GOR algorithm.
BMC Bioinformatics, 2011

A high-throughput reconfigurable Viterbi decoder.
Proceedings of the 2011 International Conference on Wireless Communications & Signal Processing, 2011

Special-purposed VLIW architecture for IEEE-754 quadruple precision elementary functions on FPGA.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

VPFPAP: A Special-Purpose VLIW Processor for Variable-Precision Floating-Point Arithmetic.
Proceedings of the International Conference on Field Programmable Logic and Applications, 2011

FPGA Implementation of Variable-Precision Floating-Point Arithmetic.
Proceedings of the Advanced Parallel Processing Technologies - 9th International Symposium, 2011

Etissue: A bio-inspired match-based reconfigurable hardware architecture supporting hierarchical self-healing and self-evolution.
Proceedings of the 2011 NASA/ESA Conference on Adaptive Hardware and Systems, 2011

2010
Fine-grained parallel RNA secondary structure prediction using SCFGs on FPGA.
Parallel Computing, 2010

A Unified Co-Processor Architecture for Matrix Decomposition.
J. Comput. Sci. Technol., 2010

Fpqrna: Hardware-Accelerated Qrna Package for noncoding RNA Gene Detecting on FPGA.
J. Bioinformatics and Computational Biology, 2010

FPGA accelerating double/quad-double high precision floating-point applications for ExaScale computing.
Proceedings of the 24th International Conference on Supercomputing, 2010

Automatic synthesis of processor arrays with local memories on FPGAs.
Proceedings of the International Conference on Field-Programmable Technology, 2010

High performance and memory efficient implementation of matrix multiplication on FPGAs.
Proceedings of the International Conference on Field-Programmable Technology, 2010

Blocking LU Decomposition for FPGAs.
Proceedings of the 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2010

2009
Loop Kernel Pipelining Mapping onto Coarse-Grained Reconfigurable Architecture for Data-Intensive Applications.
JSW, 2009

FPGA Accelerator for Wavelet-Based Automated Global Image Registration.
EURASIP J. Emb. Sys., 2009

A Reconfigurable Architecture for Rotation Invariant Multi-View Face Detection Based on a Novel Two-Stage Boosting Method.
EURASIP J. Adv. Sig. Proc., 2009

A coarse-grained reconfigurable computing architecture with loop self-pipelining.
Science in China Series F: Information Sciences, 2009

Fine-grained parallel RNAalifold algorithm for RNA secondary structure prediction on FPGA.
BMC Bioinformatics, 2009

FPGA-based Memory-efficient Parallel RNA Secondary Structure Prediction Accelerator Using SCFGs.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2009

Exploiting Fine-Grained Pipeline Parallelism for Wavefront Computations on Multicore Platforms.
Proceedings of the ICPPW 2009, 2009

FPGA accelerating three QR decomposition algorithms in the unified pipelined framework.
Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

A Fine-grained Pipelined Implementation of the LINPACK Benchmark on FPGAs.
Proceedings of the FCCM 2009, 2009

Fine-grained parallel application specific computing for RNA secondary structure prediction using SCFGS on FPGA.
Proceedings of the 2009 International Conference on Compilers, 2009

A Fine-Grained Pipelined Implementation for Large-Scale Matrix Inversion on FPGA.
Proceedings of the Advanced Parallel Processing Technologies, 8th International Symposium, 2009

Implementation of Rotation Invariant Multi-View Face Detection on FPGA.
Proceedings of the Advanced Parallel Processing Technologies, 8th International Symposium, 2009

2008
Rectangularly Multi-Module Memory System with Table-Based Dynamic Addressing Scheme.
Proceedings of The 2008 IEEE International Conference on Networking, 2008

DMA Performance Analysis and Multi-core Memory Optimization for SWIM Benchmark on the Cell Processor.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2008

Subblock-Based BPE Scheme to Conquer Mismatch in Memory Access Pattern.
Proceedings of the 4th International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2008), 2008

Dynamic Configurable Floating-Point FFT Pipelines and Hybrid-Mode CORDIC on FPGA.
Proceedings of the International Conference on Embedded Software and Systems, 2008

Fine-grained parallel application specific computing for RNA secondary structure prediction on FPGA.
Proceedings of the 26th International Conference on Computer Design, 2008

Double Precision Hybrid-Mode Floating-Point FPGA CORDIC Co-processor.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008

Dimensional Bubble Flow Control and Fully Adaptive Routing in the 2-D Mesh Network on Chip.
Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2008), 2008

Families of FPGA-Based Accelerators for BLAST Algorithm with Multi-seeds Detection and Parallel Extension.
Proceedings of the Bioinformatics Research and Development, 2008

Fine-Grained Parallel Zuker Algorithm Accelerator with Storage Optimization on FPGA.
Proceedings of the International Conference on Bioinformatics & Computational Biology, 2008

Collaborative hardware/software partition of coarse-grained reconfigurable system using evolutionary ant colony optimization.
Proceedings of the 13th Asia South Pacific Design Automation Conference, 2008

Hybrid-Mode Floating-Point FPGA CORDIC Co-processor.
Proceedings of the Reconfigurable Computing: Architectures, 2008

Hardware BLAST Algorithms with Multi-seeds Detection and Parallel Extension.
Proceedings of the Reconfigurable Computing: Architectures, 2008

Multi-access memory architecture for image applications with multiple interested regions.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2008

Area and throughput trade-offs in design of arithmetic encoder for JPEG2000.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2008

Computation rotating for data reuse.
Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, 2008

2007
FIDP: A Novel Architecture for Lifting-Based 2D DWT in JPEG2000.
Proceedings of the Advances in Multimedia Modeling, 2007

FPGA Accelerating Algorithms of Active Shape Model in People Tracking Applications.
Proceedings of the Tenth Euromicro Conference on Digital System Design: Architectures, 2007

Distributed Collaborative Partition Method of Reconfigurable SoC Using Ant Colony Optimization.
Proceedings of the 11th International Conference on Computer Supported Cooperative Work in Design, 2007

A Parameterized Architecture Model in High Level Synthesis for Image Processing Applications.
Proceedings of the 12th Conference on Asia South Pacific Design Automation, 2007

FPGA SAR Processor with Window Memory Accesses.
Proceedings of the IEEE International Conference on Application-Specific Systems, 2007

FPGA-Accelerated Molecular Dynamics Simulations: An Overview.
Proceedings of the Reconfigurable Computing: Architectures, 2007

The Implementation of a Coarse-Grained Reconfigurable Architecture with Loop Self-pipelining.
Proceedings of the Reconfigurable Computing: Architectures, 2007

Optimized Generation of Memory Structure in Compiling Window Operations onto Reconfigurable Hardware.
Proceedings of the Reconfigurable Computing: Architectures, 2007

Reducing Storage Requirements in Accelerating Algorithm of Global BioSequence Alignment on FPGA.
Proceedings of the Advanced Parallel Processing Technologies, 7th International Symposium, 2007

FPGA-Accelerated Active Shape Model for Real-Time People Tracking.
Proceedings of the Advances in Computer Systems Architecture, 2007

2006
Progress and Challenges in High Performance Computer Technology.
J. Comput. Sci. Technol., 2006

Clustering Multicast on Hypercube Network.
Proceedings of the High Performance Computing and Communications, 2006

Robust and real-time automatic target recognition using partial hausdorff distance measure on reconfigurable hardware.
Proceedings of the 2006 IEEE International Conference on Field Programmable Technology, 2006

Designing a Coarse-Grained Reconfigurable Architecture Using Loop Self-Pipelining.
Proceedings of the Advances in Computer Systems Architecture, 11th Asia-Pacific Conference, 2006

2005
64-bit floating-point FPGA matrix multiplication.
Proceedings of the ACM/SIGDA 13th International Symposium on Field Programmable Gate Arrays, 2005

RIMP: Runtime Implicit Predication.
Proceedings of the Advanced Parallel Processing Technologies, 6th International Workshop, 2005

2003
LEAP: A Data Driven Loop Engine on Array Processor.
Proceedings of the Advanced Parallel Programming Technologies, 5th International Workshop, 2003

