Bo-Cheng Lai

Jhih-Yong Mai

IEEE ACM Trans. Comput. Biol. Bioinform., 2023

GRONA: A Framework for Gather-and-Reduce On Near-Memory Accelerators.

Proceedings of the 16th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2023

REGAL: Reprogrammable Engines for Genome Analysis on LPDDR4x-based Stacked DRAM.

Yuhao Fang

Proceedings of the IEEE International Symposium on Circuits and Systems, 2023

Low Latency Edge Classification GNN for Particle Trajectory Tracking on FPGAs.

Proceedings of the 33rd International Conference on Field-Programmable Logic and Applications, 2023

2022

Graph Neural Networks for Charged Particle Tracking on FPGAs.

Frontiers Big Data, 2022

DSIM: Distributed Sequence Matching on Near-DRAM Accelerator for Genome Assembly.

IEEE J. Emerg. Sel. Topics Circuits Syst., 2022

Distributed Sorting Architecture on Multiple FPGA.

Yi-Da Hsin

Yen-Shi Kuo

Proceedings of the 2022 International Symposium on VLSI Design, Automation and Test, 2022

MSIM: A Highly Parallel Near-Memory Accelerator for MinHash Sketch.

Jhih-Yong Mai

Proceedings of the 35th IEEE International System-on-Chip Conference, 2022

DLPrPPG: Development and Design of Deep Learning Platform for Remote Photoplethysmography.

Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

Anime Character Recognition using Intermediate Features Aggregation.

Edwin Arkel Rios

Min-Chun Hu

Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

A Highly Parallel Fine-Grained Sort-Merge Join on Near Memory Computing.

Po-Yen Lin

Yen-Shi Kuo

Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

DASC: A DRAM Data Mapping Methodology for Sparse Convolutional Neural Networks.

Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

2021

Graph Neural Networks for Charged Particle Tracking on FPGAs.

CoRR, 2021

DAF:re: A Challenging, Crowd-Sourced, Large-Scale, Long-Tailed Dataset For Anime Character Recognition.

Edwin Arkel Rios

Wen-Huang Cheng

CoRR, 2021

Reconfigurable Database Processor for Query Acceleration on FPGA.

Bo-En Chen

Bo-Yen Lin

Proceedings of the International Symposium on VLSI Design, Automation and Test, 2021

On Reconfiguring Memory-Centric AI Edge Devices for CIM.

Proceedings of the 18th International SoC Design Conference, 2021

Parametric Study of Performance of Remote Photopletysmography System.

Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

2020

REMAP+: An Efficient Banking Architecture for Multiple Writes of Algorithmic Memory.

IEEE Trans. Very Large Scale Integr. Syst., 2020

A Novel Smart Assistance System for Blood Vessel Approaching: A Technical Report Based on Oximetry.

Sensors, 2020

Selective bypassing and mapping for heterogeneous applications on GPGPUs.

Moustafa Emara

J. Parallel Distributed Comput., 2020

Dataflow and microarchitecture co-optimisation for sparse CNN on distributed processing element accelerator.

Duc-An Pham

IET Circuits Devices Syst., 2020

A Two-Directional BigData Sorting Architecture on FPGAs.

IEEE Comput. Archit. Lett., 2020

On EDA Solutions for Reconfigurable Memory-Centric AI Edge Applications.

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

2019

Enhancing Utilization of SIMD-Like Accelerator for Sparse Convolutional Neural Networks.

Jyun-Wei Pan

Chien-Yu Lin

IEEE Trans. Very Large Scale Integr. Syst., 2019

Efficient Write Scheme for Algorithm-Based Multi-Ported Memory.

Bo-Ya Chen

Bo-En Chen

Proceedings of the International Symposium on VLSI Design, Automation and Test, 2019

DP2: A Highly Parallel Range Join for Genome Analysis on Distributed Computing Platform.

Proceedings of the 17th International Conference on High Performance Computing & Simulation, 2019

2018

Towards high performance data analytic on heterogeneous many-core systems: A study on Bayesian Sequential Partitioning.

J. Parallel Distributed Comput., 2018

Supporting compressed-sparse activations and weights on SIMD-like accelerator for sparse convolutional neural networks.

Chien-Yu Lin

Proceedings of the 23rd Asia and South Pacific Design Automation Conference, 2018

2017

Efficient Designs of Multiported Memory on FPGA.

Jiun-Liang Lin

IEEE Trans. Very Large Scale Integr. Syst., 2017

An Efficient Hierarchical Banking Structure for Algorithmic Multiported Memory on FPGA.

Kun-Hua Huang

IEEE Trans. Very Large Scale Integr. Syst., 2017

A Hadoop-based Principle Component Analysis on embedded heterogeneous platform.

Proceedings of the 2017 International Symposium on VLSI Design, Automation and Test, 2017

2016

Unified Designs for High Performance LDPC Decoding on GPGPU.

IEEE Trans. Computers, 2016

A Quantitative Method to Data Reuse Patterns of SIMT Applications.

Luis Garrido Platero

IEEE Comput. Archit. Lett., 2016

Enhancing Data Reuse in Cache Contention Aware Thread Scheduling on GPGPU.

Chin-Fu Lu

Proceedings of the 10th International Conference on Complex, 2016

2015

A High-Performance Double-Layer Counting Bloom Filter for Multicore Systems.

Kuan-Ting Chen

Ping-Ru Wu

IEEE Trans. Very Large Scale Integr. Syst., 2015

Scalable Global Power Management Policy Based on Combinatorial Optimization for Multiprocessors.

ACM Trans. Embed. Comput. Syst., 2015

A Cache Hierarchy Aware Thread Mapping Methodology for GPGPUs.

IEEE Trans. Computers, 2015

Self adaptable multithreaded object detection on embedded multicore systems.

J. Parallel Distributed Comput., 2015

Power-Efficient Instancy Aware DRAM Scheduling.

IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2015

BRAM efficient multi-ported memory on FPGA.

Jiun-Liang Lin

Proceedings of the VLSI Design, Automation and Test, 2015

Design of Application Specific Throughput Processor for Matrix Operations.

Ping-Ju Wu

Chien-Yu Lin

Proceedings of the 18th International Conference on Network-Based Information Systems, 2015

Computation and Communication Aware task graph Scheduling on multi-GPU systems.

Yun-Ting Wang

Jia-Ying Lee

Proceedings of the 2015 IEEE International Conference on Digital Signal Processing, 2015

2014

Scalable Power Management Using Multilevel Reinforcement Learning for Multiprocessors.

Gung-Yu Pan

ACM Trans. Design Autom. Electr. Syst., 2014

Reducing Contention in Shared Last-Level Cache for Throughput Processors.

ACM Trans. Design Autom. Electr. Syst., 2014

Automatic Data Layout Transformation for Heterogeneous Many-Core Systems.

Proceedings of the Network and Parallel Computing, 2014

A Cache Aware Multithreading Decision Scheme on GPGPUs.

Ta Kang Yen

Bo Yao Yu

Proceedings of the IEEE 8th International Symposium on Embedded Multicore/Manycore SoCs, 2014

A learning-on-cloud power management policy for smart devices.

Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2014

2013

A distributed thread scheduler for dynamic multithreading on throughput processors.

Ta-Kan Yen

Proceedings of the 2013 International Symposium on VLSI Design, Automation, and Test, 2013

Memory capacity aware non-blocking data transfer on GPGPU.

Proceedings of the IEEE Workshop on Signal Processing Systems, 2013

A Locality-Aware Dynamic Thread Scheduler for GPGPUs.

Proceedings of the International Conference on Parallel and Distributed Computing, 2013

Cache Capacity Aware Thread Scheduling for Irregular Memory Access on many-core GPGPUs.

Proceedings of the 18th Asia and South Pacific Design Automation Conference, 2013

2012

A highly parallel design of image surface layout recovering on GPGPU.

Guan-Ru Li

Proceedings of Technical Program of 2012 VLSI Design, Automation and Test, 2012

Reduce Data Coherence Cost with an Area Efficient Double Layer Counting Bloom Filter.

Kuan-Ting Chen

Ping-Ru Wu

Proceedings of the Fifth International Symposium on Parallel Architectures, 2012

Thread affinity mapping for irregular data access on shared Cache GPGPU.

Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012

A highly parallel design for irregular LDPC decoding on GPGPUs.

Tsou-Han Chiu

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011

Classifier Grouping to Enhance Data Locality for a Multi-threaded Object Detection Algorithm.

Chih-Hsuan Chiang

Guan-Ru Li

Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

2010

Unleash the parallelism of 3DIC partitioning on GPGPU.

Proceedings of the Annual IEEE International SoC Conference, SoCC 2010, 2010

2008

A Cost-Effective Latency-Aware Memory Bus for Symmetric Multiprocessor Systems.

Jongsun Kim

Mau-Chung Frank Chang

Ingrid Verbauwhede

IEEE Trans. Computers, 2008

2006

AES-Based Security Coprocessor IC in 0.18-µm CMOS With Resistance to Differential Power Analysis Side-Channel Attacks.

IEEE J. Solid State Circuits, 2006

Cross Layer Design to Multi-thread a Data-Pipelining Application on a Multi-processor on Chip.

Proceedings of the 2006 IEEE International Conference on Application-Specific Systems, 2006

2005

Energy and Performance Analysis of Mapping Parallel Multithreaded Tasks for An On-Chip Multi-Processor System.

Proceedings of the 23rd International Conference on Computer Design (ICCD 2005), 2005

A 3.84 gbits/s AES crypto coprocessor with modes of operation in a 0.18-µm CMOS technology.

Proceedings of the 15th ACM Great Lakes Symposium on VLSI 2005, 2005

A side-channel leakage free coprocessor IC in 0.18µm CMOS for embedded AES-based cryptographic and biometric processing.

Proceedings of the 42nd Design Automation Conference, 2005

Cooperative multithreading on embedded multiprocessor architectures enables energy-scalable design.

Proceedings of the 42nd Design Automation Conference, 2005

Prototype IC with WDDL and Differential Routing - DPA Resistance Assessment.

Proceedings of the Cryptographic Hardware and Embedded Systems - CHES 2005, 7th International Workshop, Edinburgh, UK, August 29, 2005

Security for Ambient Intelligent Systems.

Proceedings of the Ambient Intelligence, 2005

2004

Reducing radio energy consumption of key management protocols for wireless sensor networks.

Proceedings of the 2004 International Symposium on Low Power Electronics and Design, 2004

Energy-Memory-Security Tradeoffs in Distributed Sensor Networks.

David Hwang

Ingrid Verbauwhede

Proceedings of the Ad-Hoc, Mobile, and Wireless Networks: Third International Conference, 2004

2003

Testing ThumbPod: Softcore bugs are hard to find.

Proceedings of the Eighth IEEE International High-Level Design Validation and Test Workshop 2003, 2003

Design flow for HW / SW acceleration transparency in the thumbpod secure embedded system.

Proceedings of the 40th Design Automation Conference, 2003

Leakage power analysis of a 90nm FPGA.

Tim Tuan