Xiaobing Feng

According to our database, Xiaobing Feng authored at least 62 papers between 2004 and 2019.

Bibliography

2019
Cacheap: Portable and Collaborative I/O Optimization for Graph Processing.
J. Comput. Sci. Technol., 2019

ElasticActor: An Actor System with Automatic Granularity Adjustment.
International Journal of Parallel Programming, 2019

Understanding Node Change Bugs for Distributed Systems.
Proceedings of the 26th IEEE International Conference on Software Analysis, 2019

Exploiting the input sparsity to accelerate deep neural networks: poster.
Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019

Panthera: holistic memory management for big data processing over hybrid memories.
Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2019

Accelerating GPU Computing at Runtime with Binary Optimization.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

PPOpenCL: a performance-portable OpenCL compiler with host and kernel thread code fusion.
Proceedings of the 28th International Conference on Compiler Construction, 2019

2018
Using Local Clocks to Reproduce Concurrency Bugs.
IEEE Trans. Software Eng., 2018

NVM Streaker: a fast and reconfigurable performance simulator for non-volatile memory-based memory architecture.
The Journal of Supercomputing, 2018

RARE: An Efficient Static Fault Detection Framework for Definition-Use Faults in Large Programs.
IEEE Access, 2018

CloudRaid: hunting concurrency bugs in the cloud via log-mining.
Proceedings of the 2018 ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018

Lazygraph: lazy data coherency for replicas in distributed graph-parallel computation.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

On Retargeting the AI Programming Framework to New Hardwares.
Proceedings of the Network and Parallel Computing, 2018

Background Subtraction on Depth Videos with Convolutional Neural Networks.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Characterizing DNN Models for Edge-Cloud Computing.
Proceedings of the 2018 IEEE International Symposium on Workload Characterization, 2018

Revisiting Loop Tiling for Datacenters: Live and Let Live.
Proceedings of the 32nd International Conference on Supercomputing, 2018

Auto-tuning Neural Network Quantization Framework for Collaborative Inference Between the Cloud and Edge.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2018, 2018

Fast CNN Pruning via Redundancy-Aware Training.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2018, 2018

May-happen-in-parallel analysis with static vector clocks.
Proceedings of the 2018 International Symposium on Code Generation and Optimization, 2018

2017
Locating Software Faults Based on Minimum Debugging Frontier Set.
IEEE Trans. Software Eng., 2017

An Accelerator for High Efficient Vision Processing.
IEEE Trans. on CAD of Integrated Circuits and Systems, 2017

Parallel Incremental Frequent Itemset Mining for Large Data.
J. Comput. Sci. Technol., 2017

Two-Level Task Scheduling for Irregular Applications on GPU Platform.
International Journal of Parallel Programming, 2017

2016
Predicting Cross-Core Performance Interference on Multicore Processors with Regression Analysis.
IEEE Trans. Parallel Distrib. Syst., 2016

Pragma Directed Shared Memory Centric Optimizations on GPUs.
J. Comput. Sci. Technol., 2016

Articulation points guided redundancy elimination for betweenness centrality.
Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016

Efficient Management for Hybrid Memory in Managed Language Runtime.
Proceedings of the Network and Parallel Computing, 2016

2015
WiseThrottling: a new asynchronous task scheduler for mitigating I/O bottleneck in large-scale datacenter servers.
The Journal of Supercomputing, 2015

Practical Iterative Optimization for the Data Center.
TACO, 2015

ShiDianNao: shifting vision processing closer to the sensor.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

ReCBuLC: Reproducing Concurrency Bugs Using Local Clocks.
Proceedings of the 37th IEEE/ACM International Conference on Software Engineering, 2015

Hadoop+: Modeling and Evaluating the Heterogeneity for MapReduce Applications in Heterogeneous Clusters.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

PuDianNao: A Polyvalent Machine Learning Accelerator.
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

2014
Dynamic I/O-Aware Scheduling for Batch-Mode Applications on Chip Multiprocessor Systems of Cluster Platforms.
J. Comput. Sci. Technol., 2014

Concurrency bug localization using shared memory access pairs.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

Localization of concurrency bugs using shared memory access pairs.
Proceedings of the ACM/IEEE International Conference on Automated Software Engineering, 2014

A collaborative divide-and-conquer K-means clustering algorithm for processing large data.
Proceedings of the Computing Frontiers Conference, CF'14, 2014

2013
Layout-oblivious compiler optimization for matrix computations.
TACO, 2013

Effective fault localization based on minimum debugging frontier set.
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013

An empirical model for predicting cross-core performance interference on multicore processors.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
A Hybrid Circular Queue Method for Iterative Stencil Computations on GPUs.
J. Comput. Sci. Technol., 2012

Can We Make It Faster? Efficient May-Happen-in-Parallel Analysis Revisited.
Proceedings of the 13th International Conference on Parallel and Distributed Computing, 2012

A Highly Parallel Reuse Distance Analysis Algorithm on GPUs.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Layout-oblivious optimization for matrix computations.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

Making it practical and effective: fast and precise may-happen-in-parallel analysis.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011
Dependence-based multi-level tracing and replay for wireless sensor networks debugging.
Proceedings of the ACM SIGPLAN/SIGBED 2011 conference on Languages, 2011

Automatic Library Generation for BLAS3 on GPUs.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Parallelizing a machine translation decoder for multicore computer.
Proceedings of the Seventh International Conference on Natural Computation, 2011

Extendable pattern-oriented optimization directives.
Proceedings of the CGO 2011, 2011

2010
Landing Stencil Code on Godson-T.
J. Comput. Sci. Technol., 2010

Continuous speculative program parallelization in software.
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

Software-Hardware Cooperative DRAM Bank Partitioning for Chip Multiprocessors.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2010

Level by level: making flow- and context-sensitive pointer analysis scalable for millions of lines of code.
Proceedings of the CGO 2010, 2010

An adaptive task creation strategy for work-stealing scheduling.
Proceedings of the CGO 2010, 2010

2009
PARBLO: Page-Allocation-Based DRAM Row Buffer Locality Optimization.
J. Comput. Sci. Technol., 2009

Detecting and Eliminating Potential Violations of Sequential Consistency for Concurrent C/C++ Programs.
Proceedings of the CGO 2009, 2009

2008
Exploiting idle register classes for fast spill destination.
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

Global Tiling for Communication Minimal Parallelization on Distributed Memory Systems.
Proceedings of the Euro-Par 2008, 2008

2006
Library Function Disposing Approach in Binary Translation.
Journal of Computer Research and Development, 2006

Global Partial Replicate Computation Partitioning.
Journal of Computer Research and Development, 2006

2005
Integrating Parallelizing Compilation Technologies for SMP Clusters.
J. Comput. Sci. Technol., 2005

2004
An Overview of the Open Research Compiler.
Proceedings of the Languages and Compilers for High Performance Computing, 2004

