Weng-Fai Wong

Orcid: 0000-0002-4281-2053

According to our database¹, Weng-Fai Wong authored at least 181 papers between 1989 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Bibliography

2024

SparrowSNN: A Hardware/software Co-design for Energy Efficient ECG Classification.

,

,

,

CoRR, 2024

Integrating Deep Learning and Synthetic Biology: A Co-Design Approach for Enhancing Gene Expression via N-terminal Coding Sequences.

,

,

,

,

,

,

CoRR, 2024

OneSpike: Ultra-low latency spiking neural networks.

,

,

Proceedings of the International Joint Conference on Neural Networks, 2024

IMI: In-memory Multi-job Inference Acceleration for Large Language Models.

,

,

,

,

,

Proceedings of the 53rd International Conference on Parallel Processing, 2024

Table-Lookup MAC: Scalable Processing of Quantised Neural Networks in FPGA Soft Logic.

Daniel Gerlinghoff

,

Benjamin Chen Ming Choong

,

Rick Siow Mong Goh

,

,

Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

NOVA: NoC-based Vector Unit for Mapping Attention Layers on a CNN Accelerator.

,

,

,

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

2023

HongTu: Scalable Full-Graph GNN Training on Multiple GPUs.

,

,

,

Proc. ACM Manag. Data, December, 2023

Desire backpropagation: A lightweight training algorithm for multi-layer spiking neural networks based on spike-timing-dependent plasticity.

Daniel Gerlinghoff

,

,

Rick Siow Mong Goh

,

Neurocomputing, December, 2023

Simeuro: A Hybrid CPU-GPU Parallel Simulator for Neuromorphic Computing Chips.

,

,

Dogukan Yigit Polat

,

,

,

Truong Thao Nguyen

,

,

Rick Siow Mong Goh

,

Satoshi Matsuoka

,

,

IEEE Trans. Parallel Distributed Syst., October, 2023

DeepFire2: A Convolutional Spiking Neural Network Accelerator on FPGAs.

Myat Thu Linn Aung

,

Daniel Gerlinghoff

,

,

,

,

Rick Siow Mong Goh

,

,

IEEE Trans. Computers, October, 2023

CQ$^{+}$ Training: Minimizing Accuracy Loss in Conversion From Convolutional Neural Networks to Spiking Neural Networks.

,

,

IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Achieving Green AI with Energy-Efficient Deep Learning Using Neuromorphic Computing.

,

,

Rick Siow Mong Goh

,

,

,

,

,

Commun. ACM, July, 2023

Benchmarking Quantum(-Inspired) Annealing Hardware on Practical Use Cases.

,

,

,

,

Rick Siow Mong Goh

,

IEEE Trans. Computers, June, 2023

LightRW: FPGA Accelerated Graph Dynamic Random Walks.

,

,

,

,

Proc. ACM Manag. Data, 2023

HongTu: Scalable Full-Graph GNN Training on Multiple GPUs (via communication-optimized CPU data offloading).

,

,

,

CoRR, 2023

HyperSNN: A new efficient and robust deep learning model for resource constrained control applications.

,

,

,

CoRR, 2023

Efficient Hyperdimensional Computing.

,

,

,

Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

1.7pJ/SOP Neuromorphic Processor with Integrated Partial Sum Routers for In-Network Computing.

,

,

,

,

,

,

,

,

,

Ananta Narayanan Balaji

,

Proceedings of the IEEE International Symposium on Circuits and Systems, 2023

OpenEmbedding: A Distributed Parameter Server for Deep Learning Recommendation Models using Persistent Memory.

,

,

,

,

,

,

,

,

,

,

,

,

Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Towards a Better 16-Bit Number Representation for Training Neural Networks.

Himeshi De Silva

,

,

,

John L. Gustafson

,

Proceedings of the Next Generation Arithmetic - 4th International Conference, 2023

Bedot: Bit Efficient Dot Product for Deep Generative Models.

,

Duy Thanh Nguyen

,

John L. Gustafson

,

Proceedings of the Next Generation Arithmetic - 4th International Conference, 2023

2022

ThunderGP: Resource-Efficient Graph Processing Framework on FPGAs with HLS.

,

,

,

,

,

,

ACM Trans. Reconfigurable Technol. Syst., 2022

Tensorox: Accelerating GPU Applications via Neural Approximation on Unused Tensor Cores.

,

IEEE Trans. Parallel Distributed Syst., 2022

NC-Net: Efficient Neuromorphic Computing Using Aggregated Subnets on a Crossbar-Based Architecture With Nonvolatile Memory.

,

,

,

,

,

,

,

Rick Siow Mong Goh

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Corrigendum to "Coreset: Hierarchical neuromorphic computing supporting large-scale neural networks with improved resource efficiency" [Neurocomputing (2022) 128-140].

,

,

,

,

Myat Thu Linn Aung

,

,

,

,

,

,

Rick Siow Mong Goh

,

Neurocomputing, 2022

Coreset: Hierarchical neuromorphic computing supporting large-scale neural networks with improved resource efficiency.

,

,

,

,

Myat Thu Linn Aung

,

,

,

,

,

,

Rick Siow Mong Goh

,

Neurocomputing, 2022

Low Latency Conversion of Artificial Neural Network Models to Rate-encoded Spiking Neural Networks.

,

,

CoRR, 2022

ReGraph: Scaling Graph Processing on HBM-enabled FPGAs with Heterogeneous Pipelines.

,

,

,

,

,

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Network-on-Chip-Centric Accelerator Architectures for Edge AI Computing.

,

,

Nurul Akhira Binte Zakaria

,

,

,

Proceedings of the 19th International SoC Design Conference, 2022

REACT: a heterogeneous reconfigurable neural network accelerator with software-configurable NoCs for training and inference on wearables.

,

,

,

,

,

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Qtorch+: Next Generation Arithmetic for Pytorch Machine Learning.

,

Himeshi De Silva

,

John L. Gustafson

,

Proceedings of the Next Generation Arithmetic - Third International Conference, 2022

2021

Synthesis of the Dynamical Properties of Feedback Loops in Bio-Pathways.

,

,

IEEE ACM Trans. Comput. Biol. Bioinform., 2021

GRAM: A Framework for Dynamically Mixing Precisions in GPU Applications.

,

Himeshi De Silva

,

ACM Trans. Archit. Code Optim., 2021

OBET: On-the-Fly Byte-Level Error Tracking for Correcting and Detecting Faults in Unreliable DRAM Systems.

Duy Thanh Nguyen

,

,

,

Sensors, 2021

Optimizing An In-memory Database System For AI-powered On-line Decision Augmentation Using Persistent Memory.

,

,

,

,

,

,

,

,

,

,

,

Proc. VLDB Endow., 2021

Optimizing for In-memory Deep Learning with Emerging Memory Technology.

,

,

Rick Siow Mong Goh

,

,

CoRR, 2021

DTNN: Energy-efficient Inference with Dendrite Tree Inspired Neural Networks for Edge Vision Applications.

,

,

Matthew Kay Fei Lee

,

,

,

Rick Siow Mong Goh

CoRR, 2021

Energy efficient ECG classification with spiking neural network.

,

,

Biomed. Signal Process. Control., 2021

ZEM: Zero-Cycle Bit-Masking Module for Deep Learning Refresh-Less DRAM.

Duy Thanh Nguyen

,

,

,

,

IEEE Access, 2021

ThundeRiNG: generating multiple independent random number sequences on FPGAs.

,

,

,

,

Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

DeepFire: Acceleration of Convolutional Spiking Neural Network on Modern Field Programmable Gate Arrays.

Myat Thu Linn Aung

,

,

,

,

Rick Siow Mong Goh

,

Proceedings of the 31st International Conference on Field-Programmable Logic and Applications, 2021

ThunderGP: HLS-based Graph Processing Framework on FPGAs.

,

,

,

,

,

Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

Posit Arithmetic for the Training and Deployment of Generative Adversarial Networks.

,

Duy Thanh Nguyen

,

Himeshi De Silva

,

John L. Gustafson

,

,

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

Skew-Oblivious Data Routing for Data Intensive Applications on FPGAs with HLS.

,

,

,

,

,

Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

Near Lossless Transfer Learning for Spiking Neural Networks.

,

,

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

An FPGA-Based Hardware Emulator for Neuromorphic Chip With RRAM.

,

,

,

Matthew Kay Fei Lee

,

,

,

Rick Siow Mong Goh

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

NV-Journaling: Locality-Aware Journaling Using Byte-Addressable Non-Volatile Memory.

,

,

,

IEEE Trans. Computers, 2020

A future intelligent traffic system with mixed autonomous vehicles and human-driven vehicles.

,

,

,

,

Inf. Sci., 2020

NCPower: Power Modelling for NVM-based Neuromorphic Chip.

,

,

,

,

,

Paramasivam Vishnu

,

,

Rick Siow Mong Goh

Proceedings of the International Conference on Neuromorphic Systems, 2020

Shenjing: A low power reconfigurable neuromorphic accelerator with partial-sum and spike networks-on-chip.

,

,

,

Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

Is FPGA Useful for Hash Joins?

,

,

,

,

,

,

Proceedings of the 10th Conference on Innovative Data Systems Research, 2020

2019

Fault Tolerant Stencil Computation on Cloud-Based GPU Spot Instances.

,

,

IEEE Trans. Cloud Comput., 2019

MemepiC: Towards a Unified In-Memory Big Data Management System.

,

,

,

,

,

,

IEEE Trans. Big Data, 2019

A System-Level Simulator for RRAM-Based Neuromorphic Computing Chips.

Matthew Kay Fei Lee

,

,

Thannirmalai Somu

,

,

,

,

,

Rick Siow Mong Goh

ACM Trans. Archit. Code Optim., 2019

ApproxSymate: path sensitive program approximation using symbolic execution.

Himeshi De Silva

,

Andrew E. Santosa

,

,

Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, 2019

On-The-Fly Parallel Data Shuffling for Graph Processing on OpenCL-Based FPGAs.

,

,

,

,

,

,

Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

Multi-objective Precision Optimization of Deep Neural Networks for Edge Devices.

,

,

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Resource Efficient Personalized ECG Beat Classification via Temporal Logic Synthesis.

,

Proceedings of the 19th IEEE International Conference on Bioinformatics and Bioengineering, 2019

Compilation and Other Software Techniques Enabling Approximate Computing.

,

,

,

Proceedings of the Approximate Circuits, Methodologies and CAD, 2019

2018

Making Strassen Matrix Multiplication Safe.

Himeshi De Silva

,

John L. Gustafson

,

Proceedings of the 25th IEEE International Conference on High Performance Computing, 2018

Gloss: Seamless Live Reconfiguration and Reoptimization of Stream Programs.

Sumanaruban Rajadurai

,

Jeffrey Bosboom

,

,

Saman P. Amarasinghe

Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017

Parallelizing Skip Lists for In-Memory Multi-Core Database Systems.

,

,

,

,

Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

Exploiting half precision arithmetic in Nvidia GPUs.

,

Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

Automated Property Synthesis of ODEs Based Bio-pathways Models.

,

,

,

P. S. Thiagarajan

Proceedings of the Computational Methods in Systems Biology, 2017

Efficient floating point precision tuning for approximate computing.

,

Elavarasi Manogaran

,

,

Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

2016

Exploiting Single-Threaded Model in Multi-Core In-Memory Systems.

,

Divyakant Agrawal

,

,

,

,

,

IEEE Trans. Knowl. Data Eng., 2016

TreeFTL: An Efficient Workload-Adaptive Algorithm for RAM Buffer Management of NAND Flash-Based Devices.

,

IEEE Trans. Computers, 2016

PI: a Parallel in-memory skip list based Index.

,

,

,

,

CoRR, 2016

2015

A Family of Bit-Representation-Optimized Formats for Fast Sparse Matrix-Vector Multiplication on the GPU.

,

,

Rick Siow Mong Goh

,

Stephen John Turner

,

IEEE Trans. Parallel Distributed Syst., 2015

A Code Generation Framework for Targeting Optimized Library Calls for Multiple Platforms.

,

,

Rick Siow Mong Goh

,

Stephen John Turner

,

IEEE Trans. Parallel Distributed Syst., 2015

Multi-agent simulation on multiple GPUs.

,

,

Simul. Model. Pract. Theory, 2015

In-memory Databases: Challenges and Opportunities From Software and Hardware Perspectives.

,

,

,

,

,

SIGMOD Rec., 2015

3DFTL: a three-level demand-based translation strategy for flash device.

Peera Thontirawong

,

,

,

Mongkol Ekpanyapong

,

Prabhas Chongstitvatana

IEICE Electron. Express, 2015

DGCC: A New Dependency Graph based Concurrency Control Protocol for Multicore Database Systems.

,

Divyakant Agrawal

,

,

,

,

,

CoRR, 2015

"Anti-Caching"-based elastic memory management for Big Data.

,

,

,

,

,

Proceedings of the 31st IEEE International Conference on Data Engineering, 2015

Parallelized Parameter Estimation of Biological Pathway Models.

,

,

,

Benjamin M. Gyori

,

,

P. S. Thiagarajan

Proceedings of the Hybrid Systems Biology - Fourth International Workshop, 2015

PAC: Program Analysis for Approximation-aware Compilation.

,

,

Proceedings of the 2015 International Conference on Compilers, 2015

2014

STT-RAM Cache Hierarchy With Multiretention MTJ Designs.

,

,

,

,

IEEE Trans. Very Large Scale Integr. Syst., 2014

Mapping Streaming Applications onto GPU Systems.

Huynh Phung Huynh

,

Andrei Hagiescu

,

Zhong-Liang Ong

,

,

Rick Siow Mong Goh

IEEE Trans. Parallel Distributed Syst., 2014

StreamJIT: a commensal compiler for high-performance stream programming.

Jeffrey Bosboom

,

Sumanaruban Rajadurai

,

,

Saman P. Amarasinghe

Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014

ASAC: automatic sensitivity analysis for approximate computing.

,

,

,

Proceedings of the SIGPLAN/SIGBED Conference on Languages, 2014

Optimizing MLC-based STT-RAM caches by dynamic block size reconfiguration.

,

,

,

,

Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

EnVM: Virtual memory design for new memory architectures.

,

Manmohan Manoharan

,

Proceedings of the 2014 International Conference on Compilers, 2014

A coherent hybrid SRAM and STT-RAM L1 cache architecture for shared memory multicores.

,

,

,

Zhong-Liang Ong

,

,

Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

2013

GPU code generation for ODE-based applications with phased shared-data access patterns.

Andrei Hagiescu

,

,

,

Sucheendra K. Palaniappan

,

,

Bipasa Chattopadhyay

,

P. S. Thiagarajan

,

ACM Trans. Archit. Code Optim., 2013

On-chip caches built on multilevel spin-transfer torque RAM cells and its optimizations.

,

,

,

,

,

ACM J. Emerg. Technol. Comput. Syst., 2013

Accelerating sparse matrix-vector multiplication on GPUs using bit-representation-optimized schemes.

,

,

,

,

,

,

Rick Siow Mong Goh

,

Stephen John Turner

,

Proceedings of the International Conference for High Performance Computing, 2013

A practical low-power memristor-based analog neural branch predictor.

,

,

,

Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

Optimizing and Auto-Tuning Iterative Stencil Loops for GPUs with the In-Plane Method.

,

,

Ratna Krishnamoorthy

,

,

,

Rick Siow Mong Goh

,

Stephen John Turner

,

Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

TreeFTL: efficient RAM management for high performance of NAND flash-based storage systems.

,

Proceedings of the Design, Automation and Test in Europe, 2013

SAW: system-assisted wear leveling on the write endurance of NAND flash devices.

,

Proceedings of the 50th Annual Design Automation Conference 2013, 2013

2012

Approximate probabilistic analysis of biopathway dynamics.

,

Andrei Hagiescu

,

Sucheendra K. Palaniappan

,

Bipasa Chattopadhyay

,

,

,

P. S. Thiagarajan

Bioinform., 2012

Poster: Automated Mapping Streaming Applications onto GPUs.

Huynh Phung Huynh

,

Andrei Hagiescu

,

,

Rick Siow Mong Goh

,

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Mapping Streaming Applications onto GPU Systems.

Huynh Phung Huynh

,

Andrei Hagiescu

,

,

Rick Siow Mong Goh

,

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Scalable framework for mapping streaming applications onto multi-GPU systems.

Huynh Phung Huynh

,

Andrei Hagiescu

,

,

Rick Siow Mong Goh

Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

ADAPT: Efficient workload-sensitive flash management based on adaptation, prediction and aggregation.

,

Proceedings of the IEEE 28th Symposium on Mass Storage Systems and Technologies, 2012

Automatic Refactoring of Legacy Fortran Code to the Array Slicing Notation.

Chandrasehar Rajaseharan

,

,

,

Stephen John Turner

,

,

Rick Siow Mong Goh

,

Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

Guppy: A GPU-like soft-core processor.

Abdullah Al-Dujaili

,

Florian Deragisch

,

Andrei Hagiescu

,

Proceedings of the 2012 International Conference on Field-Programmable Technology, 2012

Tulipse: A Visualization Framework for User-Guided Parallelization.

,

Tomasz Dubrownik

,

,

,

,

Rick Siow Mong Goh

,

,

Stephen John Turner

,

Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

Extending the lifetime of NAND flash memory by salvaging bad blocks.

,

Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

Observational wear leveling: an efficient algorithm for flash memory management.

,

Proceedings of the 49th Annual Design Automation Conference 2012, 2012

2011

Guest Editorial - BSN2010 Special Issue.

,

,

IEEE Trans. Biomed. Circuits Syst., 2011

Internet-based hardware/software co-design framework for embedded 3D graphics applications.

,

,

,

EURASIP J. Adv. Signal Process., 2011

Dynamic cache contention detection in multi-threaded applications.

,

,

,

,

,

Saman P. Amarasinghe

Proceedings of the 7th International Conference on Virtual Execution Environments, 2011

Multi retention level STT-RAM cache designs with a dynamic refresh scheme.

,

,

,

,

Zhong-Liang Ong

,

,

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011

Processor caches with multi-level spin-transfer torque ram cells.

,

,

,

Proceedings of the 2011 International Symposium on Low Power Electronics and Design, 2011

Automated Architecture-Aware Mapping of Streaming Applications Onto GPUs.

Andrei Hagiescu

,

Huynh Phung Huynh

,

,

Rick Siow Mong Goh

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Co-synthesis of FPGA-based application-specific floating point simd accelerators.

Andrei Hagiescu

,

Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, 2011

A UML 2-based hardware-software co-design framework for body sensor network applications.

,

,

Proceedings of the Design, Automation and Test in Europe, 2011

2010

PiPA: Pipelined profiling and analysis on multicore systems.

,

Ioana Cutcutache

,

ACM Trans. Archit. Code Optim., 2010

Interprocedural Placement-Aware Configuration Prefetching for FPGA-Based Systems.

Joon Edward Sim

,

,

,

Tobias Ziermann

,

Proceedings of the 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2010

2009

Tolerating process variations in large, set-associative caches: The buddy cache.

,

,

,

ACM Trans. Archit. Code Optim., 2009

Automatically patching errors in deployed software.

Jeff H. Perkins

,

,

,

Saman P. Amarasinghe

,

Jonathan Bachrach

,

,

,

,

Stelios Sidiroglou

,

Gregory T. Sullivan

,

,

,

Michael D. Ernst

,

Martin C. Rinard

Proceedings of the 22nd ACM Symposium on Operating Systems Principles 2009, 2009

The salvage cache: A fault-tolerant cache architecture for next-generation memory technologies.

,

,

,

Proceedings of the 27th International Conference on Computer Design, 2009

Optimal Placement-aware Trace-Based Scheduling of Hardware Reconfigurations for FPGA Accelerators.

Joon Edward Sim

,

,

Proceedings of the FCCM 2009, 2009

A computing origami: folding streams in FPGAs.

Andrei Hagiescu

,

,

,

Rodric M. Rabbah

Proceedings of the 46th Design Automation Conference, 2009

A DVS-based pipelined reconfigurable instruction memory.

,

,

Proceedings of the 46th Design Automation Conference, 2009

BSN Simulator: Optimizing Application Using System Level Simulation.

Ioana Cutcutache

,

Thi Thanh Nga Dang

,

,

,

Kathy Dang Nguyen

,

Linh Thi Xuan Phan

,

Joon Edward Sim

,

,

,

,

Francis Eng Hock Tay

,

Proceedings of the Sixth International Workshop on Wearable and Implantable Body Sensor Networks, 2009

A UML-based approach for heterogeneous IP integration.

,

Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009

2008

Fast, frequency-based, integrated register allocation and instruction scheduling.

Ioana Cutcutache

,

Softw. Pract. Exp., 2008

Defining neighborhood relations for fast spatial-temporal partitioning of applications on reconfigurable architectures.

Joon Edward Sim

,

,

Proceedings of the 2008 International Conference on Field-Programmable Technology, 2008

Pipa: pipelined profiling and analysis on multi-core systems.

,

Ioana Cutcutache

,

Proceedings of the Sixth International Symposium on Code Generation and Optimization (CGO 2008), 2008

How to Do a Million Watchpoints: Efficient Debugging Using Dynamic Instrumentation.

,

Rodric M. Rabbah

,

Saman P. Amarasinghe

,

,

Proceedings of the Compiler Construction, 17th International Conference, 2008

2007

Editorial for the Special Issue on Field Programmable Technology.

Gordon J. Brebner

,

Samarjit Chakraborty

,

J. VLSI Signal Process., 2007

A UML-Based Design Framework for Time-Triggered Applications.

Kathy Dang Nguyen

,

P. S. Thiagarajan

,

Proceedings of the 28th IEEE Real-Time Systems Symposium (RTSS 2007), 2007

VOSCH: Voltage scaled cache hierarchies.

,

,

,

Proceedings of the 25th International Conference on Computer Design, 2007

DRIM: a low power dynamically reconfigurable instruction memory hierarchy for embedded systems.

,

,

Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

Ubiquitous Memory Introspection.

,

Rodric M. Rabbah

,

Saman P. Amarasinghe

,

,

Proceedings of the Fifth International Symposium on Code Generation and Optimization (CGO 2007), 2007

An Inter-Core Communication Enabled Multi-Core Simulator Based on SimpleScalar.

,

,

,

,

Proceedings of the 21st International Conference on Advanced Information Networking and Applications (AINA 2007), 2007

2006

Generating hardware from OpenMP programs.

,

,

Proceedings of the 2006 IEEE International Conference on Field Programmable Technology, 2006

Co-optimization of Performance and Power in a Superscalar Processor Design.

,

,

Proceedings of the Emerging Directions in Embedded and Ubiquitous Computing, 2006

DEP: detailed execution profile.

,

Joon Edward Sim

,

,

Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT 2006), 2006

2005

Dynamic memory optimization using pool allocation and prefetching.

,

Rodric M. Rabbah

,

SIGARCH Comput. Archit. News, 2005

Using UML 2.0 for System Level Design of Real Time SoC Platforms for Stream Processing.

,

,

Alexander Maxiaguine

,

Proceedings of the 11th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA 2005), 2005

Sensor Grid: Integration of Wireless Sensor Networks and the Grid.

,

,

Protik Mukherjee

,

,

,

Proceedings of the 30th Annual IEEE Conference on Local Computer Networks (LCN 2005), 2005

Cooperative Instruction Scheduling with Linear Scan Register Allocation.

Khaing Khaing Kyi Win

,

Proceedings of the High Performance Computing, 2005

A Reconfigurable Instruction Memory Hierarchy for Embedded Systems.

,

,

Proceedings of the 2005 International Conference on Field Programmable Logic and Applications (FPL), 2005

An integrated performance and power model for superscalar processor designs.

,

,

Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005

Design of clocked circuits using UML.

,

,

,

Santhosh Kumar Pilakkat

Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005

Targeted Data Prefetching.

Proceedings of the Advances in Computer Systems Architecture, 10th Asia-Pacific Conference, 2005

A Performance and Power Co-optimization Approach for Modern Processors.

,

,

Proceedings of the Fifth International Conference on Computer and Information Technology (CIT 2005), 2005

2004

Data Integrity Framework and Language Support for Active Web Intermediaries.

,

,

,

Chen (Cherie) Ding

,

Proceedings of the Web Content Caching and Distribution: 9th International Workshop, 2004

Model-Driven SoC Design via Executable UML to SystemC.

Kathy Dang Nguyen

,

,

P. S. Thiagarajan

,

Proceedings of the 25th IEEE Real-Time Systems Symposium (RTSS 2004), 2004

Adaptive Compiler Directed Prefetching for EPIC Processors.

,

Rodric M. Rabbah

,

Krishna V. Palem

,

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2004

Configuration bitstream compression for dynamically reconfigurable FPGAs.

,

,

Proceedings of the 2004 International Conference on Computer-Aided Design, 2004

Windows CE for a reconfigurable system-on-a-chip processor.

Mariam Reeny George

,

Proceedings of the 2004 IEEE International Conference on Field-Programmable Technology, 2004

Tuning SoC platforms for multimedia processing: identifying limits and tradeoffs.

Alexander Maxiaguine

,

,

Samarjit Chakraborty

,

Proceedings of the 2nd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2004

Static Identification of Delinquent Loads.

Vlad-Mihai Panait

,

,

Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004

Compiler orchestrated prefetching via speculation and predication.

Rodric M. Rabbah

,

Hariharan Sandanagobalane

,

Mongkol Ekpanyapong

,

Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004

2003

SilkRoad II: mixed paradigm cluster computing with RC_dag consistency.

,

,

Chung-Kwong Yuen

Parallel Comput., 2003

Compiling to FPGAs via an EPIC compiler's intermediate representation.

,

,

Proceedings of the 2003 IEEE International Conference on Field-Programmable Technology, 2003

A Model for Hardware Realization of Kernel Loops.

,

,

Proceedings of the Field Programmable Logic and Application, 13th International Conference, 2003

The Performance Model of SilkRoad - A Multithreaded DSM System for Clusters.

,

,

Chung-Kwong Yuen

Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003

2002

A Framework for Data Prefetching Using Off-Line Training of Markovian Predictors.

,

Krishna V. Palem

,

Proceedings of the 20th International Conference on Computer Design (ICCD 2002), 2002

PD-XML: extensible markup language for processor description.

,

Krishna V. Palem

,

Rodric M. Rabbah

,

,

,

Peter Y. K. Cheung

Proceedings of the 2002 IEEE International Conference on Field-Programmable Technology, 2002

A co-simulation study of adaptive EPIC computing.

Stefan Valentin Gheorghita

,

,

,

Surendranath Talla

Proceedings of the 2002 IEEE International Conference on Field-Programmable Technology, 2002

Shell over a Cluster (SHOC): Towards Achieving Single System Image via the Shell.

,

,

Proceedings of the 2002 IEEE International Conference on Cluster Computing (CLUSTER 2002), 2002

SilkRoad II: A Multi-Paradigm Runtime System for Cluster Computing.

,

,

Chung-Kwong Yuen

Proceedings of the 2002 IEEE International Conference on Cluster Computing (CLUSTER 2002), 2002

2001

Compiler Optimizations for Adaptive EPIC Processors.

Krishna V. Palem

,

Surendranath Talla

,

Proceedings of the Embedded Software, First International Workshop, 2001

The emerging power crisis in embedded processors: what can a poor compiler do?

Lakshmi N. Chakrapani

,

,

Vincent John Mooney III

,

Krishna V. Palem

,

Kiran Puttaswamy

,

Proceedings of the 2001 International Conference on Compilers, 2001

2000

Multiple context multithreaded superscalar processor architecture.

,

J. Syst. Archit., 2000

ORION: An Adaptive Home-Based Software Distributed Shared Memory System.

,

Proceedings of the Seventh International Conference on Parallel and Distributed Systems, 2000

SilkRoad: A Multithreaded Runtime System with Software Distributed Shared Memory for SMP Clusters.

,

,

,

Chung-Kwong Yuen

Proceedings of the 2000 IEEE International Conference on Cluster Computing (CLUSTER 2000), November 28th, 2000

1999

Optimizing floating point operations in Scheme.

Comput. Lang., 1999

Source Level Static Branch Prediction.

Comput. J., 1999

tmPVM - Task Migratable PVM.

,

,

Chung-Kwong Yuen

Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

1996

BaLinda Lisp: Design and Implementation.

,

,

Chung-Kwong Yuen

Comput. Lang., 1996

1995

Fast Evaluation of the Elementary Functions in Single Precision.

,

IEEE Trans. Computers, 1995

Evaluation of the Hitachi S-3800 Supercomputer Using Six Benchmarks.

,

,

Int. J. High Perform. Comput. Appl., 1995

Compiling Parallel Lisp for a Shared Memory Multiprocessor.

,

,

Chung-Kwong Yuen

Proceedings of the Seventh IASTED/ISMM International Conference on Parallel and Distributed Computing and Systems, 1995

Highly Efficient Parallel Lisp Implementation on Distributed Systems.

,

,

Chung-Kwong Yuen

Proceedings of the Parallel Computing: State-of-the-Art and Perspectives, 1995

Design and Implementation of Abstract Machine for Parallel Lisp Compilation.

,

,

Chung-Kwong Yuen

Proceedings of the 1995 International Conference on Parallel Processing, 1995

1994

Fast Hardware-Based Algorithms for Elementary Function Computations Using Rectangular Multipliers.

,

IEEE Trans. Computers, 1994

A Simulation Study on the Interactions between Multithreaded Architectures and the Cache.

,

Int. J. High Speed Comput., 1994

Fast Evaluation of the Elementary Functions in Double Precision.

,

Proceedings of the 27th Annual Hawaii International Conference on System Sciences (HICSS-27), 1994

1992

A Model of Speculative Parallelism.

,

Chung-Kwong Yuen

Parallel Process. Lett., 1992

Evaluation of the continuation bit in the Cyclic Pipeline Computer.

,

,

,

Parallel Comput., 1992

1991

Effects of Multiple Instruction Stream Execution on Cache Performance.

,

,

Int. J. High Speed Comput., 1991

1990

A self interpreter for BaLinda Lisp.

Chung-Kwong Yuen

,

ACM SIGPLAN Notices, 1990

A preliminary evaluation of a massively parallel processor: GAPP.

,

Microprocessing and Microprogramming, 1990

1989

BIDDLE: a bidirectional data driven Lisp engine.

,

Chung-Kwong Yuen

Proceedings of the IEEE International Workshop on Tools for Artificial Intelligence: Architectures, 1989
