Youfeng Wu

According to our database, Youfeng Wu authored at least 66 papers between 1989 and 2019.

Bibliography

2019
Multi-objective Exploration for Practical Optimization Decisions in Binary Translation.
ACM Trans. Embed. Comput. Syst., 2019

2017
Enabling Cross-ISA Offloading for COTS Binaries.
Proceedings of the 15th Annual International Conference on Mobile Systems, 2017

Efficient support of position independence on non-volatile memory.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

Mozart: Efficient Composition of Library Functions for Heterogeneous Execution.
Proceedings of the Languages and Compilers for Parallel Computing, 2017

Fault-Tolerant Execution on COTS Multi-core Processors with Hardware Transactional Memory Support.
Proceedings of the Architecture of Computing Systems - ARCS 2017, 2017

2016
FlexVec: auto-vectorization for irregular loops.
Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2016

POSTER: Fault-tolerant Execution on COTS Multi-core Processors with Hardware Transactional Memory Support.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2014
Microcode Compression Using Structured-Constrained Clustering.
Int. J. Parallel Program., 2014

Call sequence prediction through probabilistic calling automata.
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014

Just-In-Time Software Pipelining.
Proceedings of the 12th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2014

2013
Allocating rotating registers by scheduling.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

HW/SW co-designed acceleration of dynamic languages.
Proceedings of the SIGPLAN/SIGBED Conference on Languages, 2013

Acceldroid: Co-designed acceleration of Android bytecode.
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013

TSO_ATOMICITY: efficient hardware primitive for TSO-preserving region optimizations.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

Concurrent predicates: A debugging technique for every parallel programmer.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
From Locks to Correct and Efficient Transactional Memory.
J. Circuits Syst. Comput., 2012

SMARQ: Software-Managed Alias Register Queue for Dynamic Optimizations.
Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012

HiRe: using hint & release to improve synchronization of speculative threads.
Proceedings of the International Conference on Supercomputing, 2012

2011
Structure-Constrained Microcode Compression.
Proceedings of the 23rd International Symposium on Computer Architecture and High Performance Computing, 2011

CoreRacer: a practical memory race recorder for multicore x86 TSO processors.
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011

A HW/SW co-designed heterogeneous multi-core virtual machine for energy-efficient general purpose computing.
Proceedings of the CGO 2011, 2011

LAR-CC: Large atomic regions with conditional commits.
Proceedings of the CGO 2011, 2011

Modeling and Performance Evaluation of TSO-Preserving Binary Optimization.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010
Trace Execution Automata in Dynamic Binary Translation.
Proceedings of the Computer Architecture, 2010

TAO: two-level atomicity for dynamic binary optimizations.
Proceedings of the CGO 2010, 2010

2009
Characterization of DBT overhead.
Proceedings of the 2009 IEEE International Symposium on Workload Characterization, 2009

Dynamic parallelization of single-threaded binary programs using speculative slicing.
Proceedings of the 23rd international conference on Supercomputing, 2009

2008
A Segmented Bloom Filter Algorithm for Efficient Predictors.
Proceedings of the 20th International Symposium on Computer Architecture and High Performance Computing, 2008

Supporting Legacy Binary Code in a Software Transaction Compiler with Dynamic Binary Translation and Optimization.
Proceedings of the Compiler Construction, 17th International Conference, 2008

2007
Impacts of Multiprocessor Configurations on Workloads in Bioinformatics.
Proceedings of the 19th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2007), 2007

Compiler-Managed Software-based Redundant Multi-Threading for Transient Fault Detection.
Proceedings of the Fifth International Symposium on Code Generation and Optimization (CGO 2007), 2007

Code Generation and Optimization for Transactional Memory Constructs in an Unmanaged Language.
Proceedings of the Fifth International Symposium on Code Generation and Optimization (CGO 2007), 2007

StarDBT: An Efficient Multi-platform Dynamic Binary Translation System.
Proceedings of the Advances in Computer Systems Architecture, 2007

2006
Dynamic-Compiler-Driven Control for Microprocessor Energy and Performance.
IEEE Micro, 2006

A hierarchical model of data locality.
Proceedings of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2006

LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks.
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006

Clustering-Based Microcode Compression.
Proceedings of the 24th International Conference on Computer Design (ICCD 2006), 2006

Software-Based Transparent and Comprehensive Control-Flow Error Detection.
Proceedings of the Fourth IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2006), 2006

Performance Characterization of the 64-bit x86 Architecture from Compiler Optimizations' Perspective.
Proceedings of the Compiler Construction, 15th International Conference, 2006

Selective Runtime Memory Disambiguation in a Dynamic Binary Translator.
Proceedings of the Compiler Construction, 15th International Conference, 2006

2005
Dynamic binary control-flow errors detection.
SIGARCH Comput. Archit. News, 2005

Hardware-Software Collaborative Techniques for Runtime Profiling and Phase Transition Detection.
J. Comput. Sci. Technol., 2005

A Dynamic Compilation Framework for Controlling Microprocessor Energy and Performance.
Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005

Enhanced code density of embedded CISC processors with echo technology.
Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2005

2004
The Accuracy of Initial Prediction in Two-Phase Dynamic Binary Translators.
Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004

Exploiting Free Execution Slots on EPIC Processors for Efficient and Accurate Runtime Profiling.
Proceedings of the Advances in Computer Systems Architecture, 9th Asia-Pacific Conference, 2004

Continuous Trip Count Profiling for Loop Optimizations in Two-Phase Dynamic Binary Translators.
Proceedings of the 8th Annual Workshop on Interaction between Compilers and Computer Architecture (INTERACT-8 2004), 2004

2003
Performance potentials of compiler-directed data speculation.
Proceedings of the 2003 IEEE International Symposium on Performance Analysis of Systems and Software, 2003

Aggressive Compiler Optimization and Parallelization with Thread-Level Speculation.
Proceedings of the 32nd International Conference on Parallel Processing (ICPP 2003), 2003

2002
Efficient Discovery of Regular Stride Patterns in Irregular Programs.
Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2002

Compiler managed micro-cache bypassing for high performance EPIC processors.
Proceedings of the 35th Annual International Symposium on Microarchitecture, 2002

Value-Profile Guided Stride Prefetching for Irregular Code.
Proceedings of the Compiler Construction, 11th International Conference, 2002

Accuracy of Profile Maintenance in Optimizing Compilers.
Proceedings of the 6th Annual Workshop on Interaction between Compilers and Computer Architecture (INTERACT-6 2002), 2002

2001
Better exploration of region-level value locality with integrated computation reuse and value prediction.
Proceedings of the 28th Annual International Symposium on Computer Architecture, 2001

Calculation of Load Invalidation Rates for Data Speculation.
Proceedings of the ISCA 14th International Conference on Parallel and Distributed Computing Systems, 2001

2000
Quantifying instruction-level parallelism limits on an EPIC architecture.
Proceedings of the 2000 IEEE International Symposium on Performance Analysis of Systems and Software, 2000

1999
Comprehensive Redundant Load Elimination for the IA-64 Architecture.
Proceedings of the Languages and Compilers for Parallel Computing, 1999

1996
Bidirectional Scheduling: A New Global Code Scheduling Approach.
Proceedings of the Languages and Compilers for Parallel Computing, 1996

1995
Strength Reduction of Multiplications by Integer Constants.
ACM SIGPLAN Notices, 1995

1994
Static branch frequency and program profile analysis.
Proceedings of the 27th Annual International Symposium on Microarchitecture, San Jose, California, USA, November 30, 1994

A study of pointer aliasing for software pipelining using run-time disambiguation.
Proceedings of the 27th Annual International Symposium on Microarchitecture, San Jose, California, USA, November 30, 1994

1992
Ordering functions for improving memory reference locality in a shared memory multiprocessor system.
Proceedings of the 25th Annual International Symposium on Microarchitecture, 1992

1990
Parallel Algorithms for Decomposable Linear Programs.
Proceedings of the 1990 International Conference on Parallel Processing, 1990

Parallelism Encapsulation in C++.
Proceedings of the 1990 International Conference on Parallel Processing, 1990

Parallelizing WHILE Loops.
Proceedings of the 1990 International Conference on Parallel Processing, 1990

1989
Parallel processor balance through loop spreading.
Proceedings of Supercomputing '89, Reno, NV, USA, November 12-17, 1989

