Weixia Xu

According to our database, Weixia Xu authored at least 30 papers between 1997 and 2019.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







SketchDLC: A Sketch on Distributed Deep Learning Communication via Trace Capturing.
TACO, 2019

Statistical learning with group invariance: problem, method and consistency.
Int. J. Machine Learning & Cybernetics, 2019

A Power Efficient Hardware Implementation of the IF Neuron Model.
Proceedings of the Advanced Computer Architecture - 12th Conference, 2018

Delay Compensated Asynchronous Adam Algorithm for Deep Neural Networks.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Monaural Speech Separation on Many Integrated Core Architecture.
Proceedings of the Computer Engineering and Technology - 20th CCF Conference, 2016

Accelerating Nyström Kernel Independent Component Analysis with Many Integrated Core Architecture.
Proceedings of the Computer Engineering and Technology - 20th CCF Conference, 2016

Graphein: A Novel Optical High-Radix Switch Architecture for 3D Integration.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2016

A low-latency fine-grained dynamic shared cache management scheme for chip multi-processor.
Proceedings of the 34th IEEE International Performance Computing and Communications Conference, 2015

An incentive compatible reputation mechanism for P2P systems.
The Journal of Supercomputing, 2014

Hybrid hierarchy storage system in MilkyWay-2 supercomputer.
Frontiers Comput. Sci., 2014

Low-latency last-level cache structure based on grouped cores in Chip Multi-Processor.
Proceedings of the IEEE 33rd International Performance Computing and Communications Conference, 2014

Fast NIC based RDMA implementation for adaptive unreliable networks.
Proceedings of the 11th IEEE/ACS International Conference on Computer Systems and Applications, 2014

Scalable NIC Architecture to Support Offloading of Large Scale MPI Barrier.
Proceedings of the Advanced Parallel Processing Technologies, 2013

State space reduction in modeling checking parameterized cache coherence protocol by two-dimensional abstraction.
The Journal of Supercomputing, 2012

Distributed Coverage in Wireless Ad Hoc and Sensor Networks by Topological Graph Approaches.
IEEE Trans. Computers, 2012

Frame Error Rate Testing for High Speed Optical Interconnect.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2012

Optimizing Private Memory Performance by Dynamically Deactivating Cache Coherence.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

PIM: A Policy-Based Incentive Mechanism for Promoting Honest Recommendations in Reputation Systems.
Proceedings of the 12th IEEE International Conference on Computer and Information Technology, 2012

Optimizing Linpack Benchmark on GPU-Accelerated Petascale Supercomputer.
J. Comput. Sci. Technol., 2011

Extracting minimal unsatisfiable subformulas in satisfiability modulo theories.
Comput. Sci. Inf. Syst., 2011

A Formalization of an Emulation Based Co-designed Virtual Machine.
Proceedings of the Fifth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, 2011

Finding First-Order Minimal Unsatisfiable Cores with a Heuristic Depth-First-Search Algorithm.
Proceedings of the Intelligent Data Engineering and Automated Learning - IDEAL 2011, 2011

A novel shared-buffer router for network-on-chip based on Hierarchical Bit-line Buffer.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

A Parallel Processing Scheme for Large-Size Sliding-Window Applications.
Proceedings of the 13th IEEE International Conference on High Performance Computing & Communication, 2011

Exploiting Loop-Carried Stream Reuse for Scientific Computing Applications on the Stream Processor.
IJCNS, 2010

TH-1: China's first petaflop supercomputer.
Frontiers Comput. Sci. China, 2010

A Novel Chaining Approach for Direct Control Transfer Instructions.
Proceedings of the 16th IEEE International Conference on Parallel and Distributed Systems, 2010

An efficient stream memory architecture for heterogeneous multicore processor.
Proceedings of the 14th IEEE Symposium on Computers and Communications (ISCC 2009), 2009

DTM: Decoupled Hardware Transactional Memory to Support Unbounded Transaction and Operating System.
Proceedings of the ICPP 2009, 2009

A Dual-Processors Multithreaded Architecture and Its Driven Execution Model.
Proceedings of the 1997 Advances in Parallel and Distributed Computing Conference (APDC '97), 1997