Chao Wang

Lei Gong

Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

WiderFrame: An Automatic Customization Framework for Building CNN Accelerators on FPGAs: Work-in-Progress.

Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2020

OctCNN: An Energy-Efficient FPGA Accelerator for CNNs using Octave Convolution Algorithm.

Proceedings of the IEEE International Conference on Cluster Computing, 2020

2019

DCW: A Reactive and Predictable Programming Framework for LET-Based Distributed Real-Time Systems.

ACM Trans. Design Autom. Electr. Syst., 2019

WGAN-Based Synthetic Minority Over-Sampling Technique: Improving Semantic Fine-Grained Classification for Lung Nodules in CT Images.

IEEE Access, 2019

FPNet: Customized Convolutional Neural Network for FPGA Platforms.

Proceedings of the International Conference on Field-Programmable Technology, 2019

An Overview of FPGA Based Deep Learning Accelerators: Challenges and Opportunities.

Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

GRU-ES: Resource Usage Prediction of Cloud Workloads Using a Novel Hybrid Method.

Drama: A high efficient neural network accelerator on FPGA using dynamic reconfiguration: work-in-progress.

Yang Yang

Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis Companion, 2019

Design Exploration of Multi-FPGAs for Accelerating Deep Learning.

Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

Higher-order Transfer Learning for Pulmonary Nodule Attribute Prediction in Chest CT Images.

Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine, 2019

RV-CNN: Flexible and Efficient Instruction Set for CNNs Based on RISC-V Processors.

Proceedings of the Advanced Parallel Processing Technologies, 2019

2018

MALOC: A Fully Pipelined FPGA Accelerator for Convolutional Neural Networks With All Layers Mapped on Chip.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

UniCNN: A Pipelined Accelerator Towards Uniformed Computing for CNNs.

Int. J. Parallel Program., 2018

SparseNN: A Performance-Efficient Accelerator for Large-Scale Sparse Neural Networks.

Int. J. Parallel Program., 2018

Chinese Language Processing Based on Stroke Representation and Multidimensional Representation.

IEEE Access, 2018

Cambricon-S: Addressing Irregularity in Sparse Neural Networks through A Cooperative Software/Hardware Approach.

Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Domino: Graph Processing Services on Energy-Efficient Hardware Accelerator.

Proceedings of the 2018 IEEE International Conference on Web Services, 2018

Low-Shot Multi-label Incremental Learning for Thoracic Diseases Diagnosis.

Proceedings of the Neural Information Processing - 25th International Conference, 2018

MuDBN: An Energy-Efficient and High-Performance Multi-FPGA Accelerator for Deep Belief Networks.

Proceedings of the 2018 on Great Lakes Symposium on VLSI, 2018

Domino: An Asynchronous and Energy-efficient Accelerator for Graph Processing: (Abstract Only).

Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

RTMUS^RT: a real-time testbed for empirically comparing real-time multicore schedulers: work-in-progress.

Proceedings of the International Conference on Embedded Software, 2018

WinoNN: optimising FPGA-based neural network accelerators using fast winograd algorithm (work-in-progress).

Xuan Wang

Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2018

Furion: alleviating overheads for deep learning framework on single machine (work-in-progress).

Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2018

Multi-order Transfer Learning for Pathologic Diagnosis of Pulmonary Nodule Malignancy.

Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2018

2017

A Classroom Scheduling Service for Smart Classes.

IEEE Trans. Serv. Comput., 2017

Service-Oriented Architecture on FPGA-Based MPSoC.

IEEE Trans. Parallel Distributed Syst., 2017

SuperMIC: Analyzing Large Biological Datasets in Bioinformatics with Maximal Information Coefficient.

IEEE ACM Trans. Comput. Biol. Bioinform., 2017

DLAU: A Scalable Deep Learning Accelerator Unit on FPGA.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

Hot spots profiling and dataflow analysis in custom dataflow computing SoftProcessors.

J. Syst. Softw., 2017

New trends for pattern recognition: Theory and applications.

Fernando Buarque

Neurocomputing, 2017

Reconfigurable Hardware Accelerators: Opportunities, Trends, and Challenges.

CoRR, 2017

Editorial Soft Computing Applied to Swarm Robotics.

Plamen Angelov

Oscar Castillo

Appl. Soft Comput., 2017

Work-in-Progress: TTI: A Timing ISA for LET Model in Safety-Critical Systems.

Proceedings of the 2017 IEEE Real-Time Systems Symposium, 2017

Implementation and Optimization of the Accelerator Based on FPGA Hardware for LSTM Network.

Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Rethinking Energy-Efficiency of Heterogeneous Computing for CNN-Based Mobile Applications.

Building a Game Benchmark for Cooperative CPU-GPU with Pseudo User-Interaction.

A Predictable Servant-Based Execution Model for Safety-Critical Systems.

Exploiting Aperiodic Server to Improve Aperiodic Responsiveness for LET-Based Real-Time Systems.

A High-Performance Accelerator for Large-Scale Convolutional Neural Networks.

Tickwerk: Design of a LET-Based SoC for Temporal Programming.

Natural Language Processing Service Based on Stroke-Level Convolutional Networks for Chinese Text Classification.

Proceedings of the 2017 IEEE International Conference on Web Services, 2017

Evaluation and Trade-offs of Graph Processing for Cloud Services.

Proceedings of the 2017 IEEE International Conference on Web Services, 2017

xFilter: A Temporal Locality Accelerator for Intrusion Detection System Services.

Proceedings of the 2017 IEEE International Conference on Web Services, 2017

GenServ: Genome Sequencing Services on Scalable Energy Efficient Accelerators.

Proceedings of the 2017 IEEE International Conference on Web Services, 2017

Light Weight Key-Value Store for Efficient Services on Local Distributed Mobile Devices.

Proceedings of the 2017 IEEE International Conference on Web Services, 2017

A Time-Aware Programming Framework for Constructing Predictable Real-Time Systems.

Proceedings of the 19th IEEE International Conference on High Performance Computing and Communications; 15th IEEE International Conference on Smart City; 3rd IEEE International Conference on Data Science and Systems, 2017

Performance evaluation and optimization of HBM-Enabled GPU for data-intensive applications.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

FPGA Based Big Data Accelerator Design in Teaching Computer Architecture and Organization.

Proceedings of the Cyber Physical Systems. Design, Modeling, and Evaluation, 2017

A power-efficient and high performance FPGA accelerator for convolutional neural networks: work-in-progress.

Proceedings of the Twelfth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis Companion, 2017

A Power-Efficient Accelerator Based on FPGAs for LSTM Network.

Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

OmniGraph: A Scalable Hardware Accelerator for Graph Processing.

Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

A Power-Efficient Accelerator for Convolutional Neural Networks.

Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

Mermaid: Integrating Vertex-Centric with Edge-Centric for Real-World Graph Processing.

Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

TuNao: A High-Performance and Energy-Efficient Reconfigurable Accelerator for Graph Processing.

Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

A high-performance FPGA accelerator for sparse neural networks: work-in-progress.

Proceedings of the 2017 International Conference on Compilers, 2017

Distributed gene clinical decision support system based on cloud computing.

Proceedings of the 2017 IEEE International Conference on Bioinformatics and Biomedicine, 2017

Clockwerk: A Predictable and Efficient Extension of Logical Execution Time Model.

Proceedings of the 24th Asia-Pacific Software Engineering Conference, 2017

2016

Evaluation and Tradeoffs for Out-of-Order Execution on Reconfigurable Heterogeneous MPSoC.

IEEE Trans. Very Large Scale Integr. Syst., 2016

Hardware Implementation on FPGA for Task-Level Parallel Dataflow Execution Engine.

IEEE Trans. Parallel Distributed Syst., 2016

Guest Editorial for Special Section on Big Data Computing and Processing in Computational Biology and Bioinformatics.

IEEE ACM Trans. Comput. Biol. Bioinform., 2016

Definitions of predictability for Cyber Physical Systems.

J. Syst. Archit., 2016

Preface to the Special Issue on Sequential Code Parallelization.

Aili Wang

Int. J. Parallel Program., 2016

A Parallel Yet Pipelined Architecture for Efficient Implementation of the Advanced Encryption Standard Algorithm on Reconfigurable Hardware.

Int. J. Parallel Program., 2016

Parallel Implementations of the Cooperative Particle Swarm Optimization on Many-core and Multi-core Architectures.

Rogério De Moraes Calazan

Int. J. Parallel Program., 2016

KUMMS: optimising DRAM locality with Kernel-user behaviours.

Int. J. High Perform. Syst. Archit., 2016

CNNLab: a Novel Parallel Framework for Neural Networks using GPU and FPGA-a Practical Study with Trade-off Analysis.

CoRR, 2016

Soft computing in big data intelligent transportation systems.

Appl. Soft Comput., 2016

SCADIS: A Scalable Accelerator for Data-Intensive String Set Matching on FPGAs.

Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, 2016

Brief Announcement: MIC++: Accelerating Maximal Information Coefficient Calculation with GPUs and FPGAs.

Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, 2016

Behavior-Aware Integrated CPU-GPU Power Management for Mobile Games.

Proceedings of the 24th IEEE International Symposium on Modeling, 2016

FairPlay: Services Migration with Lock-Free Mechanisms for Load Balancing in Cloud Architectures.

Proceedings of the IEEE International Conference on Web Services, 2016

SOLAR: Services-Oriented Learning Architectures.

Proceedings of the IEEE International Conference on Web Services, 2016

PIE: A Pipeline Energy-Efficient Accelerator for Inference Process in Deep Neural Networks.

Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

FCM: Towards Fine-Grained GPU Power Management for Closed Source Mobile Games.

Proceedings of the 26th edition on Great Lakes Symposium on VLSI, 2016

Display power reduction for mobile closed-source games.

Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016

2015

FreeRider: Non-Local Adaptive Network-on-Chip Routing with Packet-Carried Propagation of Congestion Information.

IEEE Trans. Parallel Distributed Syst., 2015

Heterogeneous Cloud Framework for Big Data Genome Sequencing.

IEEE ACM Trans. Comput. Biol. Bioinform., 2015

Architecture Support for Task Out-of-Order Execution in MPSoCs.

IEEE Trans. Computers, 2015

A case study of parallel JPEG encoding on an FPGA.

J. Parallel Distributed Comput., 2015

CRAIS: A Crossbar-Based Interconnection Scheme on FPGA for Big Data.

Xi Li

J. Comput. Sci. Technol., 2015

XEMU: a cross-ISA full-system emulator on multiple processor architectures.

Huang Wang

Huaping Chen

Int. J. High Perform. Syst. Archit., 2015

Fast approximate hash table using extended counting Bloom filter.

Int. J. Comput. Sci. Eng., 2015

SAKMA: Specialized FPGA-Based Accelerator Architecture for Data-Intensive K-Means Algorithms.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

RapidPath: Accelerating Constrained Shortest Path Finding in Graphs on FPGA (Abstract Only).

Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

SODA: software defined FPGA based accelerators for big data.

Xi Li

Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

An FPGA-Based Accelerator for Neighborhood-Based Collaborative Filtering Recommendation Algorithms.

Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

A Deep Learning Prediction Process Accelerator Based FPGA.

Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

2014

Accelerating the Next Generation Long Read Mapping with the FPGA-Based System.

IEEE ACM Trans. Comput. Biol. Bioinform., 2014

Colored Petri Net model with automatic parallelization on real-time multicore architectures.

J. Syst. Archit., 2014

Amdahl's and Hill-Marty laws revisited for FPGA-based MPSoCs: from theory to practice.

Int. J. High Perform. Syst. Archit., 2014

Memory power optimisation on low-bit multi-access cross memory address mapping schema.

Int. J. Embed. Syst., 2014

Memory power optimization on different memory address mapping schemas.

Proceedings of the 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, 2014

Multi-objective aware design flow for coarse-grained systems on chip.

Proceedings of the 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, 2014

Kernel-User Space Separation in DRAM Memory.

Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2014

Trade-offs between the sensitivity and the speed of the FPGA-based sequence aligner.

Proceedings of the 2014 International Symposium on Integrated Circuits (ISIC), 2014

Big data genome sequencing on Zynq based clusters (abstract only).

Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

Co-processing with dynamic reconfiguration on heterogeneous MPSoC: practices and design tradeoffs (abstract only).

Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

Instruction Extension and Generation for Adaptive Processors.

Proceedings of the Reconfigurable Computing: Architectures, Tools, and Applications, 2014

2013

MP-Tomasulo: A Dependency-Aware Automatic Parallel Execution Engine for Sequential Programs.

ACM Trans. Archit. Code Optim., 2013

Heterothread: hybrid thread level parallelism on heterogeneous multicore architectures.

Xi Li