Weiwu Hu

According to our database, Weiwu Hu authored at least 79 papers between 1994 and 2016.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2016
An introduction to CPU and DSP design in China.
Sci. China Inf. Sci., 2016

2015
HERMES: a fast cross-ISA binary translator with post-optimization.
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

2014
Pre-Silicon Bug Forecast.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2014

An 8-Core MIPS-Compatible Processor in 32/28 nm Bulk CMOS.
IEEE J. Solid State Circuits, 2014

Auxiliary stream for optimizing memory access of video decoders.
Sci. China Inf. Sci., 2014

A 0.8V, 560fJ/bit, 14Gb/s injection-locked receiver with input duty-cycle distortion tolerable edge-rotating 5/4X sub-rate CDR in 65nm CMOS.
Proceedings of the Symposium on VLSI Circuits, 2014

2013
LDet: Determinizing Asynchronous Transfer for Postsilicon Debugging.
IEEE Trans. Computers, 2013

Deterministic Replay Using Global Clock.
ACM Trans. Archit. Code Optim., 2013

Microarchitectural design space exploration made fast.
Microprocess. Microsystems, 2013

Godson-3B1500: A 32nm 1.35GHz 40W 172.8GFLOPS 8-core processor.
Proceedings of the 2013 IEEE International Solid-State Circuits Conference, 2013

A 1.2 pJ/b 6.4 Gb/s 8+1-lane forwarded-clock receiver with PVT-variation-tolerant all-digital clock and data recovery in 28nm CMOS.
Proceedings of the IEEE 2013 Custom Integrated Circuits Conference, 2013

2012
Program Regularization in Memory Consistency Verification.
IEEE Trans. Parallel Distributed Syst., 2012

Linear Time Memory Consistency Verification.
IEEE Trans. Computers, 2012

Statistical performance comparisons of computers.
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

2011
Sinusoidal Clock Sampling for Multigigahertz ADCs.
IEEE Trans. Circuits Syst. I Regul. Pap., 2011

Physical Implementation of the Eight-Core Godson-3B Microprocessor.
J. Comput. Sci. Technol., 2011

Design for Testability Features of Godson-3 Multicore Microprocessor.
J. Comput. Sci. Technol., 2011

An FFT Performance Model for Optimizing General-Purpose Processor Architecture.
J. Comput. Sci. Technol., 2011

The Godson Processors: Its Research, Development, and Contributions.
J. Comput. Sci. Technol., 2011

Brief announcement: program regularization in verifying memory consistency.
Proceedings of the SPAA 2011: Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2011

Godson-3B: A 1GHz 40W 8-core 128GFLOPS processor in 65nm CMOS.
Proceedings of the IEEE International Solid-State Circuits Conference, 2011

Effective and Efficient Microprocessor Design Space Exploration Using Unlabeled Design Configurations.
Proceedings of the IJCAI 2011, 2011

Alpha Compression with Variable Data Formats.
Proceedings of the 32nd Annual Conference of the European Association for Computer Graphics, 2011

Empirical design bugs prediction for verification.
Proceedings of the Design, Automation and Test in Europe, 2011

2010
System Architecture of Godson-3 Multi-Core Processors.
J. Comput. Sci. Technol., 2010

Design of Low-Cost High-Performance Floating-Point Fused Multiply-Add with Reduced Power.
Proceedings of the VLSI Design 2010: 23rd International Conference on VLSI Design, 2010

Optimizing power and throughput for m-out-of-n encoded asynchronous circuits.
Proceedings of the 11th International Symposium on Quality of Electronic Design (ISQED 2010), 2010

LReplay: a pending period based deterministic replay scheme.
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

DMA cache: Using on-chip storage to architecturally separate I/O data from CPU data for improving I/O performance.
Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010

A multi-FPGA based platform for emulating a 100m-transistor-scale processor with high-speed peripherals (abstract only).
Proceedings of the ACM/SIGDA 18th International Symposium on Field Programmable Gate Arrays, 2010

On-the-Fly Reduction of Stimuli for Functional Verification.
Proceedings of the 19th IEEE Asian Test Symposium, 2010

2009
Godson-3: A Scalable Multicore RISC Processor with x86 Emulation.
IEEE Micro, 2009

Global Clock, Physical Time Order and Pending Period Analysis in Multiprocessor Systems.
CoRR, 2009

Measuring and Compensating for Process Mismatch-induced, Reference Spurs in Phase-locked Loops using a Sub-sampled DSP.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

A case study of improving at-speed testing coverage of a gigahertz microprocessor.
Proceedings of the 16th IEEE International Conference on Electronics, 2009

Efficient binary translation system with low hardware cost.
Proceedings of the 27th International Conference on Computer Design, 2009

Fast complete memory consistency verification.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

A Scalable Scan Architecture for Godson-3 Multicore Microprocessor.
Proceedings of the Eighteenth Asian Test Symposium, 2009

2008
Chip Multithreaded Consistency Model.
J. Comput. Sci. Technol., 2008

Making Effective Decisions in Computer Architects' Real-World: Lessons and Experiences with Godson-2 Processor Designs.
J. Comput. Sci. Technol., 2008

Fetching Primary and Redundant Instructions in Turn for a Fault-Tolerant Embedded Microprocessor.
Proceedings of the 14th IEEE Pacific Rim International Symposium on Dependable Computing, 2008

A synchronized variable frequency clock scheme in chip multiprocessors.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2008), 2008

An interconnect-aware power efficient cache coherence protocol for CMPs.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Testing content addressable memories using instructions and march-like algorithms.
Proceedings of the 15th IEEE International Conference on Electronics, Circuits and Systems, 2008

A High Speed CMOS Transmitter and Rail-to-Rail Receiver.
Proceedings of the 4th IEEE International Symposium on Electronic Design, 2008

2007
Accelerating sequential programs on Chip Multiprocessors via Dynamic Prefetching Thread.
Microprocess. Microsystems, 2007

Implementing a 1GHz Four-Issue Out-of-Order Execution Microprocessor in a Standard Cell ASIC Methodology.
J. Comput. Sci. Technol., 2007

An Efficient Error Control Scheme for Chip-to-Chip Optical Interconnects.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2007), 2007

CREA: A Checkpoint Based Reliable Micro-architecture for Superscalar Processors.
Proceedings of the 16th Asian Test Symposium, 2007

Concerning with On-Chip Network Features to Improve Cache Coherence Protocols for CMPs.
Proceedings of the Advances in Computer Systems Architecture, 2007

2006
Parallel Error Detection for Leading Zero Anticipation.
J. Comput. Sci. Technol., 2006

High Performance General-Purpose Microprocessors: Past and Future.
J. Comput. Sci. Technol., 2006

Microarchitecture and Performance Analysis of Godson-2 SMT Processor.
Proceedings of the 24th International Conference on Computer Design (ICCD 2006), 2006

A Hybrid Hardware/Software Generated Prefetching Thread Mechanism on Chip Multiprocessors.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

Processor Directed Dynamic Page Policy.
Proceedings of the Advances in Computer Systems Architecture, 11th Asia-Pacific Conference, 2006

2005
Microarchitecture of the Godson-2 Processor.
J. Comput. Sci. Technol., 2005

A novel design of leading zero anticipation circuit with parallel error detection.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

A Memory Bandwidth Effective Cache Store Miss Policy.
Proceedings of the Advances in Computer Systems Architecture, 10th Asia-Pacific Conference, 2005

2004
A shared virtual memory network with fast remote direct memory access and message passing.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004

2001
Dynamic Data Prefetching in Home-Based Software DSMs.
J. Comput. Sci. Technol., 2001

Optimizing Home-Based Software DSM Protocols.
Clust. Comput., 2001

A Comparison of Two Strategies of Dynamic Data Prefetching in Software DSM.
Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

Communication with Threads in Software DSM.
Proceedings of the 2001 IEEE International Conference on Cluster Computing (CLUSTER 2001), 2001

2000
A New Home-Based Software DSM Protocol for SMP Clusters.
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

1999
Where does the time go in software DSMs? - Experiences with JIAJIA.
J. Comput. Sci. Technol., 1999

Reducing System Overheads in Home-based Software DSMs.
Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

Dynamic Task Migration in Home-based Software DSM Systems.
Proceedings of the Eighth IEEE International Symposium on High Performance Distributed Computing, 1999

Adaptive Write Detection in Home-based Software DSMs.
Proceedings of the Eighth IEEE International Symposium on High Performance Distributed Computing, 1999

JIAJIA: A Software DSM System Based on a New Cache Coherence Protocol.
Proceedings of the High-Performance Computing and Networking, 7th International Conference, 1999

Evaluation of the JIAJIA Software DSM System on High Performance Computer Architectures.
Proceedings of the 32nd Annual Hawaii International Conference on System Sciences (HICSS-32), 1999

Write Detection in Home-Based Software DSMs.
Proceedings of the Euro-Par '99 Parallel Processing, 5th International Euro-Par Conference, Toulouse, France, August 31, 1999

1998
Out-of-order execution in sequentially consistent shared-memory systems: Theory and experiments.
J. Comput. Sci. Technol., 1998

A lock-based cache coherence protocol for scope consistency.
J. Comput. Sci. Technol., 1998

A framework of memory consistency models.
J. Comput. Sci. Technol., 1998

1997
An Interaction of Coherence Protocols and Memory Consistency Models in DSM Systems.
ACM SIGOPS Oper. Syst. Rev., 1997

An innovative implementation for directory-based cache coherence in shared memory multiprocessors.
SIGARCH Comput. Archit. News, 1997

Out-of-order execution in sequentially consistent shared-memory systems.
SIGARCH Comput. Archit. News, 1997

1996
Event Ordering Condition for Correct Executions in Shared-Memory Systems.
Proceedings of the 1996 International Symposium on Parallel Architectures, 1996

1994
A Graph Model for Investigating Memory Consistency.
Proceedings of the 1994 International Conference on Parallel and Distributed Systems, 1994

