Xiaochun Ye

ORCID: 0000-0003-4598-1685

According to our database, Xiaochun Ye authored at least 121 papers between 2008 and 2024.

Bibliography

2024
MoDSE: A High-Accurate Multiobjective Design Space Exploration Framework for CPU Microarchitectures.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., May, 2024

Improving Utilization of Dataflow Unit for Multi-Batch Processing.
ACM Trans. Archit. Code Optim., March, 2024

Revisiting Edge Perturbation for Graph Neural Network in Graph Data Augmentation and Attack.
CoRR, 2024

2023
Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation.
IEEE Trans. Parallel Distributed Syst., December, 2023

A Comprehensive Survey on Distributed Training of Graph Neural Networks.
Proc. IEEE, December, 2023

Design of a Compact Superconducting RSFQ Register File.
IEEE Trans. Circuits Syst. I Regul. Pap., November, 2023

FSGraph: fast and scalable implementation of graph traversal on GPUs.
CCF Trans. High Perform. Comput., September, 2023

Carbon Emissions Reduction of Neural Network by Discrete Rank Pruning.
CCF Trans. High Perform. Comput., September, 2023

Domain adaptive person re-identification with memory-based circular ranking.
Appl. Intell., March, 2023

A Survey of Graph Pre-processing Methods: From Algorithmic to Hardware Perspectives.
CoRR, 2023

HiHGNN: Accelerating HGNNs through Parallelism and Data Reusability Exploitation.
CoRR, 2023

Characterizing and Understanding Defense Methods for GNNs on GPUs.
IEEE Comput. Archit. Lett., 2023

Hardware-in-the-Loop Framework for Testing Wireless V2X Communication.
Proceedings of the IEEE Wireless Communications and Networking Conference, 2023

A Bucket-aware Asynchronous Single-Source Shortest Path Algorithm on GPU.
Proceedings of the 52nd International Conference on Parallel Processing Workshops, 2023

Alleviating Transfer Latency in DataFlow Accelerator for DSP Applications.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

A Transfer Learning Framework for High-Accurate Cross-Workload Design Space Exploration of CPU.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

Adaptive Sparse Deep Neural Network Inference on Resource-Constrained Cost-Efficient GPUs.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2023

ROMA: A Reconfigurable On-chip Memory Architecture for Multi-core Accelerators.
Proceedings of the IEEE International Conference on High Performance Computing & Communications, 2023

A High-accurate Multi-objective Ensemble Exploration Framework for Design Space of CPU Microarchitecture.
Proceedings of the Great Lakes Symposium on VLSI 2023, 2023

JRouter: A Multi-Terminal Hierarchical Length-Matching Router under Planar Manhattan Routing Model for RSFQ Circuits.
Proceedings of the Great Lakes Symposium on VLSI 2023, 2023

Improving Utilization of Dataflow Architectures Through Software and Hardware Co-Design.
Proceedings of the Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28, 2023

A High-accurate Multi-objective Exploration Framework for Design Space of CPU.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Simple and Efficient Heterogeneous Graph Neural Network.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
General spiking neural network framework for the learning trajectory from a noisy mmWave radar.
Neuromorph. Comput. Eng., June, 2022

Multi-Node Acceleration for Large-Scale GCNs.
IEEE Trans. Computers, 2022

JBNN: A Hardware Design for Binarized Neural Networks Using Single-Flux-Quantum Circuits.
IEEE Trans. Computers, 2022

Accelerating Data Transfer in Dataflow Architectures Through a Look-Ahead Acknowledgment Mechanism.
J. Comput. Sci. Technol., 2022

Sampling Methods for Efficient Training of Graph Convolutional Networks: A Survey.
IEEE CAA J. Autom. Sinica, 2022

Rethinking Efficiency and Redundancy in Training Large-scale Graphs.
CoRR, 2022

A synergistic reinforcement learning-based framework design in driving automation.
Comput. Electr. Eng., 2022

A survey on superconducting computing technology: circuits, architectures and design tools.
CCF Trans. High Perform. Comput., 2022

Accelerating Graph Processing With Lightweight Learning-Based Data Reordering.
IEEE Comput. Archit. Lett., 2022

Characterizing and Understanding HGNNs on GPUs.
IEEE Comput. Archit. Lett., 2022

Characterization and Implementation of Radar System Applications on a Reconfigurable Dataflow Architecture.
IEEE Comput. Archit. Lett., 2022

Characterizing and Understanding Distributed GNN Training on GPUs.
IEEE Comput. Archit. Lett., 2022

GNNSampler: Bridging the Gap Between Sampling Algorithms of GNN and Hardware.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2022

A Routing-Aware Mapping Method for Dataflow Architectures.
Proceedings of the Network and Parallel Computing, 2022

Survey on Graph Neural Network Acceleration: An Algorithmic Perspective.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Heterogeneous Collaborative Refining for Real-Time End-to-End Image-Text Retrieval System.
Proceedings of the ICIAI 2022: The 6th International Conference on Innovation in Artificial Intelligence, Guangzhou China, March 4, 2022

GEM: Execution-Aware Cache Management for Graph Analytics.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2022

MatGraph: An Energy-Efficient and Flexible CGRA Engine for Matrix-Based Graph Analytics.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2022

Parallel-Friendly and Work-Efficient Single Source Shortest Path Algorithm on Single-Node System.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

A Loop Optimization Method for Dataflow Architecture.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

HetGraph: A High Performance CPU-CGRA Architecture for Matrix-based Graph Analytics.
Proceedings of the GLSVLSI '22: Great Lakes Symposium on VLSI 2022, Irvine CA USA, June 6, 2022

WiLi - Vehicular Wireless Channel Dataset enriched with LiDAR and Radar Data.
Proceedings of the IEEE Global Communications Conference, 2022

LRP: Predictive output activation based on SVD approach for CNNs acceleration.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

Alleviating datapath conflicts and design centralization in graph analytics acceleration.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
An efficient scheduling algorithm for dataflow architecture using loop-pipelining.
Inf. Sci., 2021

BSR-TC: Adaptively Sampling for Accurate Triangle Counting over Evolving Graph Streams.
Int. J. Softw. Eng. Knowl. Eng., 2021

Tackling Variabilities in Autonomous Driving.
CoRR, 2021

RISC-NN: Use RISC, NOT CISC as Neural Network Hardware Infrastructure.
CoRR, 2021

Scalable and efficient graph traversal on high-throughput cluster.
CCF Trans. High Perform. Comput., 2021

Hardware Acceleration for GCNs via Bidirectional Fusion.
IEEE Comput. Archit. Lett., 2021

Triangle Counting by Adaptively Resampling over Evolving Graph Streams.
Proceedings of the 33rd International Conference on Software Engineering and Knowledge Engineering, 2021

Scalable, resource and locality-aware selection of active scatterers in Geometry-based stochastic channel models.
Proceedings of the 32nd IEEE Annual International Symposium on Personal, 2021

Alleviating Imbalance in Synchronous Distributed Training of Deep Neural Networks.
Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021

Streamline Ring ORAM Accesses through Spatial and Temporal Optimization.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

2020
3DACN: 3D Augmented convolutional network for time series data.
Inf. Sci., 2020

An efficient dataflow accelerator for scientific applications.
Future Gener. Comput. Syst., 2020

Video Face Recognition System: RetinaFace-mnet-faster and Secondary Search.
CoRR, 2020

Top-Related Meta-Learning Method for Few-Shot Detection.
CoRR, 2020

Pixel-Semantic Revise of Position Learning A One-Stage Object Detector with A Shared Encoder-Decoder.
CoRR, 2020

Characterizing and Understanding GCNs on GPU.
IEEE Comput. Archit. Lett., 2020

A Reliability-Aware Joint Design Method of Application Mapping and Wavelength Assignment for WDM-Based Silicon Photonic Interconnects on Chip.
IEEE Access, 2020

An Efficient Multicast Router using Shared-Buffer with Packet Merging for Dataflow Architecture.
Proceedings of the 14th IEEE/ACM International Symposium on Networks-on-Chip, 2020

Highly Efficient and GPU-Friendly Implementation of BFS on Single-node System.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2020

Pixel-Semantic Revising of Position: One-Stage Object Detector with Shared Encoder-Decoder.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

CTA: A Critical Task Aware Scheduling Mechanism for Dataflow Architecture.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2020

HyGCN: A GCN Accelerator with Hybrid Architecture.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Design Automation Methodology from RTL to Gate-level Netlist and Schematic for RSFQ Logic Circuits.
Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

2019
PIM-WEAVER: A High Energy-efficient, General-purpose Acceleration Architecture for String Operations in Big Data Processing.
Sustain. Comput. Informatics Syst., 2019

Wavelength assignment method based on ACO to reduce crosstalk for ring-based optical Network-on-Chip.
Microprocess. Microsystems, 2019

Applying CNN on a scientific application accelerator based on dataflow architecture.
CCF Trans. High Perform. Comput., 2019

Alleviating Irregularity in Graph Analytics Acceleration: a Hardware/Software Co-Design Approach.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Instruction Vulnerability Test and Code Optimization Against DVFS Attack.
Proceedings of the IEEE International Test Conference in Asia, 2019

Balancing Memory Accesses for Energy-Efficient Graph Analytics Accelerators.
Proceedings of the 2019 IEEE/ACM International Symposium on Low Power Electronics and Design, 2019

iATPG: Instruction-level Automatic Test Program Generation for Vulnerabilities under DVFS attack.
Proceedings of the 25th IEEE International Symposium on On-Line Testing and Robust System Design, 2019

C-MIDN: Coupled Multiple Instance Detection Network With Segmentation Guidance for Weakly Supervised Object Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Highly Efficient Breadth-First Search on CPU-Based Single-Node System.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

A Sharing Path Awareness Scheduling Algorithm for Dataflow Architecture.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

C-MAP: Improving the Effectiveness of Mapping Method for CGRA by Reducing NoC Congestion.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

Magma: A Monolithic 3D Vertical Heterogeneous ReRAM-based Main Memory Architecture.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Utilizing the Instability in Weakly Supervised Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Crosstalk-aware GA-based wavelength allocation method for ring-based optical network-on-chip.
Proceedings of the ACM Turing Celebration Conference - China, 2019

2018
A Pipelining Loop Optimization Method for Dataflow Architecture.
J. Comput. Sci. Technol., 2018

A Non-Stop Double Buffering Mechanism for Dataflow Architecture.
J. Comput. Sci. Technol., 2018

High-Performance and Energy-Efficient Fault Tolerance Scheduling Algorithm Based on Improved TMR for Heterogeneous System.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2018

WEAVER: An Energy Efficient, General-Purpose Acceleration Architecture for String Operations in Big Data Applications.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2018

Accelerating CNN Algorithm with Fine-Grained Dataflow Architectures.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

Optimizing the Efficiency of Data Transfer in Dataflow Architectures.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

SmarCo: An Efficient Many-Core Processor for High-Throughput Applications in Datacenters.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

Optimizing network efficiency of dataflow architectures through dynamic packet merging.
Proceedings of the Ninth International Green and Sustainable Computing Conference, 2018

2017
An Efficient Network-on-Chip Router for Dataflow Architecture.
J. Comput. Sci. Technol., 2017

2016
ACCC: An Acceleration Mechanism for Character Operation Based on Cache Computing in Big Data Applications.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

An energy-efficient bandwidth allocation method for single-chip heterogeneous processor.
Proceedings of the Seventh International Green and Sustainable Computing Conference, 2016

A framework for energy-efficient optimization on multi-cores.
Proceedings of the Seventh International Green and Sustainable Computing Conference, 2016

Memory partition for SIMD in streaming dataflow architectures.
Proceedings of the Seventh International Green and Sustainable Computing Conference, 2016

On the properties of data migration based on topology pattern keeping on cache hierarchy.
Proceedings of the Seventh International Green and Sustainable Computing Conference, 2016

A Percolation Data Migration Schema in a hybrid Cache Hierarchy.
Proceedings of the Seventh International Green and Sustainable Computing Conference, 2016

POSTER: An Optimization of Dataflow Architectures for Scientific Applications.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
Corrigendum to "Fast and scalable lock methods for video coding on many-core architecture" [J. Visual Communication and Image Representation 25 (7) (2014) 1758-1762].
J. Vis. Commun. Image Represent., 2015

A high-density data path implementation fitting for HTC applications.
Proceedings of the Sixth International Green and Sustainable Computing Conference, 2015

Thread ID based power reduction mechanism for multi-thread shared set-associative caches.
Proceedings of the Sixth International Green and Sustainable Computing Conference, 2015

2014
Fast and scalable lock methods for video coding on many-core architecture.
J. Vis. Commun. Image Represent., 2014

Optimizing mapreduce with low memory requirements for shared-memory systems.
Proceedings of the 15th IEEE/ACIS International Conference on Software Engineering, 2014

Efficiently and Completely Verifying Synchronized Consistency Models.
Proceedings of the Automated Technology for Verification and Analysis, 2014

2013
A Path-Adaptive Opto-electronic Hybrid NoC for Chip Multi-processor.
Proceedings of the 12th IEEE International Conference on Trust, 2013

SimICT: A fast and flexible framework for performance and power evaluation of large-scale architecture.
Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

Low Execution Efficiency: When General Multi-core Processor Meets Wireless Communication Protocol.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

An Efficient Parallel Mechanism for Highly-Debuggable Multicore Simulator.
Proceedings of the Advanced Parallel Processing Technologies, 2013

2012
Godson-T: An Efficient Many-Core Processor Exploring Thread-Level Parallelism.
IEEE Micro, 2012

Auto-Tuning GEMV on Many-Core GPU.
Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

ALWP: A Workload Partition Method for the Efficient Parallel Simulation of Manycores.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

PartitionSim: A Parallel Simulator for Many-cores.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

CRAW/P: A Workload Partition Method for the Efficient Parallel Simulation of Manycores.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

2011
High-efficient architecture of Godson-T many-core processor.
Proceedings of the 2011 IEEE Hot Chips 23 Symposium (HCS), 2011

2010
High performance comparison-based sorting algorithm on many-core GPUs.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

2009
Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions.
J. Comput. Sci. Technol., 2009

A Fast Linear-Space Sequence Alignment Algorithm with Dynamic Parallelization Framework.
Proceedings of the Ninth IEEE International Conference on Computer and Information Technology, 2009

A Low-Complexity Synchronization Based Cache Coherence Solution for Many Cores.
Proceedings of the Ninth IEEE International Conference on Computer and Information Technology, 2009

2008
Efficient Parallelization of a Protein Sequence Comparison Algorithm on Manycore Architecture.
Proceedings of the Ninth International Conference on Parallel and Distributed Computing, 2008

