Yunji Chen

According to our database, Yunji Chen authored at least 141 papers between 2004 and 2024.

Bibliography

2024
Assessing and Understanding Creativity in Large Language Models.
CoRR, 2024

2023
Chip design with machine learning: a survey from algorithm perspective.
Sci. China Inf. Sci., November, 2023

Learning controllable elements oriented representations for reinforcement learning.
Neurocomputing, September, 2023

Rescue to the Curse of universality.
Sci. China Inf. Sci., September, 2023

Emergent Communication for Rules Reasoning.
CoRR, 2023

Context Shift Reduction for Offline Meta-Reinforcement Learning.
CoRR, 2023

Efficient Symbolic Policy Learning with Differentiable Symbolic Expression.
CoRR, 2023

Self-driven Grounding: Large Language Model Agents with Automatical Language-aligned Skill Learning.
CoRR, 2023

Pushing the Limits of Machine Design: Automated CPU Design with AI.
CoRR, 2023

BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models.
CoRR, 2023

Learning Domain-Aware Detection Head with Prompt Tuning.
CoRR, 2023

Flew Over Learning Trap: Learn Unlearnable Samples by Progressive Staged Training.
CoRR, 2023

Unlearnable Examples for Diffusion Models: Protect Data from Unauthorized Exploitation.
CoRR, 2023

ANPL: Compiling Natural Programs with Interactive Decomposition.
CoRR, 2023

Online Symbolic Regression with Informative Query.
CoRR, 2023

Decompose a Task into Generalizable Subtasks in Multi-Agent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning Domain-Aware Detection Head with Prompt Tuning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Contrastive Modules with Temporal Attention for Multi-Task Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

ANPL: Towards Natural Programming with Interactive Decomposition.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Emergent Communication for Rules Reasoning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Efficient Symbolic Policy Learning with Differentiable Symbolic Expression.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Non-autoregressive Machine Translation with Probabilistic Context-free Grammar.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Context Shift Reduction for Offline Meta-Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Online Prototype Alignment for Few-shot Policy Transfer.
Proceedings of the International Conference on Machine Learning, 2023

BALTO: fast tensor program optimization with diversity-based active learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Conceptual Reinforcement Learning for Language-Conditioned Tasks.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Online Symbolic Regression with Informative Query.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Cambricon-G: A Polyvalent Energy-Efficient Accelerator for Dynamic Graph Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Breaking the Interaction Wall: A DLPU-Centric Deep Learning Computing System.
IEEE Trans. Computers, 2022

Tetris: A Heuristic Static Memory Management Framework for Uniform Memory Multicore Neural Network Accelerators.
J. Comput. Sci. Technol., 2022

Object-Category Aware Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Accelerating Sparse Convolution with Column Vector-Wise Sparsity.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Causality-driven Hierarchical Structure Discovery for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

BabelTower: Learning to Auto-parallelized Program Translation.
Proceedings of the International Conference on Machine Learning, 2022

Neural Program Synthesis with Query.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
A Decomposable Winograd Method for N-D Convolution Acceleration in Video Analysis.
Int. J. Comput. Vis., 2021

Eden: A Unified Environment Framework for Booming Reinforcement Learning Algorithms.
CoRR, 2021

Space-address decoupled scratchpad memory management for neural network accelerators.
Concurr. Comput. Pract. Exp., 2021

ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Distilling Object Detectors with Feature Richness.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

2020
ParaML: A Polyvalent Multicore Accelerator for Machine Learning.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Machine Learning Computers With Fractal von Neumann Architecture.
IEEE Trans. Computers, 2020

Addressing Irregularity in Sparse Neural Networks Through a Cooperative Software/Hardware Approach.
IEEE Trans. Computers, 2020

Self-Aware Neural Network Systems: A Survey and New Perspective.
Proc. IEEE, 2020

Rubik: A Hierarchical Architecture for Efficient Graph Learning.
CoRR, 2020

ALT: Optimizing Tensor Compilation in Deep Learning Compilers with Active Learning.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020

Fixed-Point Back-Propagation Training.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

DWM: A Decomposable Winograd Method for Convolution Acceleration.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Addressing Sparsity in Deep Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

Guest Editors' Introduction: Special Issue on Big Data Systems on Emerging Architectures.
IEEE Trans. Big Data, 2019

BSHIFT: A Low Cost Deep Neural Networks Accelerator.
Int. J. Parallel Program., 2019

CompactNet: Platform-Aware Automatic Optimization for Convolutional Neural Networks.
CoRR, 2019

Cambricon-F: machine learning computers with fractal von neumann architecture.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

TDSNN: From Deep Neural Networks to Deep Spike Neural Networks with Temporal-Coding.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Using Local Clocks to Reproduce Concurrency Bugs.
IEEE Trans. Software Eng., 2018

An Instruction Set Architecture for Machine Learning.
ACM Trans. Comput. Syst., 2018

BenchIP: Benchmarking Intelligence Processors.
J. Comput. Sci. Technol., 2018

Cambricon-S: Addressing Irregularity in Sparse Neural Networks through A Cooperative Software/Hardware Approach.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

2017
Service-Oriented Architecture on FPGA-Based MPSoC.
IEEE Trans. Parallel Distributed Syst., 2017

Secure Outsourcing of Virtual Appliance.
IEEE Trans. Cloud Comput., 2017

An Accelerator for High Efficient Vision Processing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

DaDianNao: A Neural Network Supercomputer.
IEEE Trans. Computers, 2017

DLPlib: A Library for Deep Learning Processor.
J. Comput. Sci. Technol., 2017

BENCHIP: Benchmarking Intelligence Processors.
CoRR, 2017

TuNao: A High-Performance and Energy-Efficient Reconfigurable Accelerator for Graph Processing.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

2016
IMR: High-Performance Low-Cost Multi-Ring NoCs.
IEEE Trans. Parallel Distributed Syst., 2016

Accelerating Architectural Simulation Via Statistical Techniques: A Survey.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

Near-Memory Data Services.
IEEE Micro, 2016

A survey of routing algorithm for mesh Network-on-Chip.
Frontiers Comput. Sci., 2016

DianNao family: energy-efficient hardware accelerators for machine learning.
Commun. ACM, 2016

Cambricon-X: An accelerator for sparse neural networks.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Cambricon: An Instruction Set Architecture for Neural Networks.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

2015
FreeRider: Non-Local Adaptive Network-on-Chip Routing with Packet-Carried Propagation of Congestion Information.
IEEE Trans. Parallel Distributed Syst., 2015

Robust Design Space Modeling.
ACM Trans. Design Autom. Electr. Syst., 2015

A Small-Footprint Accelerator for Large-Scale Neural Networks.
ACM Trans. Comput. Syst., 2015

Leveraging the Error Resilience of Neural Networks for Designing Highly Energy Efficient Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2015

Architecture Support for Task Out-of-Order Execution in MPSoCs.
IEEE Trans. Computers, 2015

Statistical Performance Comparisons of Computers.
IEEE Trans. Computers, 2015

Practical Iterative Optimization for the Data Center.
ACM Trans. Archit. Code Optim., 2015

A High-Throughput Neural Network Accelerator.
IEEE Micro, 2015

Deterministic Replay: A Survey.
ACM Comput. Surv., 2015

Neuromorphic accelerators: a comparison between neuroscience and machine-learning approaches.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

ShiDianNao: shifting vision processing closer to the sensor.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

ReCBuLC: Reproducing Concurrency Bugs Using Local Clocks.
Proceedings of the 37th IEEE/ACM International Conference on Software Engineering, 2015

Retraining-based timing error mitigation for hardware neural networks.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

HERMES: a fast cross-ISA binary translator with post-optimization.
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

PuDianNao: A Polyvalent Machine Learning Accelerator.
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

2014
Pre-Silicon Bug Forecast.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2014

Performance Portability Across Heterogeneous SoCs Using a Generalized Library-Based Approach.
ACM Trans. Archit. Code Optim., 2014

An 8-Core MIPS-Compatible Processor in 32/28 nm Bulk CMOS.
IEEE J. Solid State Circuits, 2014

Prevention from Soft Errors via Architecture Elasticity.
J. Comput. Sci. Technol., 2014

An Elastic Architecture Adaptable to Various Application Scenarios.
J. Comput. Sci. Technol., 2014

A General-Purpose Many-Accelerator Architecture Based on Dataflow Graph Clustering of Applications.
J. Comput. Sci. Technol., 2014

Auxiliary stream for optimizing memory access of video decoders.
Sci. China Inf. Sci., 2014

DaDianNao: A Machine-Learning Supercomputer.
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

ArchRanker: A ranking approach to design space exploration.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

Big data genome sequencing on Zynq based clusters (abstract only).
Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

Co-processing with dynamic reconfiguration on heterogeneous MPSoC: practices and design tradeoffs (abstract only).
Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

A low-cost memory interface for high-throughput accelerators.
Proceedings of the 2014 International Conference on Compilers, 2014

DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014

Leveraging the error resilience of machine-learning applications for designing highly energy efficient accelerators.
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

2013
Effective and efficient microprocessor design space exploration using unlabeled design configurations.
ACM Trans. Intell. Syst. Technol., 2013

Motion Estimation Without Integer-Pel Search.
IEEE Trans. Image Process., 2013

LDet: Determinizing Asynchronous Transfer for Postsilicon Debugging.
IEEE Trans. Computers, 2013

Deterministic Replay Using Global Clock.
ACM Trans. Archit. Code Optim., 2013

Microarchitectural design space exploration made fast.
Microprocess. Microsystems, 2013

Godson-3B1500: A 32nm 1.35GHz 40W 172.8GFLOPS 8-core processor.
Proceedings of the 2013 IEEE International Solid-State Circuits Conference, 2013

Elastic CGRAs.
Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

2012
Program Regularization in Memory Consistency Verification.
IEEE Trans. Parallel Distributed Syst., 2012

Linear Time Memory Consistency Verification.
IEEE Trans. Computers, 2012

Global Adaptive Routing Algorithm Without Additional Congestion Propagation Network.
CoRR, 2012

RepTFD: Replay Based Transient Fault Detection.
CoRR, 2012

An Elastic Architecture Adaptable to Millions of Application Scenarios.
Proceedings of the Network and Parallel Computing, 9th IFIP International Conference, 2012

BenchNN: On the broad potential application scope of hardware neural network accelerators.
Proceedings of the 2012 IEEE International Symposium on Workload Characterization, 2012

Performance Prediction for Reconfigurable Processor.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

Statistical performance comparisons of computers.
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

2011
An FFT Performance Model for Optimizing General-Purpose Processor Architecture.
J. Comput. Sci. Technol., 2011

Efficient Deterministic Replay Using Complete Race Detection.
CoRR, 2011

The Impact of Mutation Rate on the Computation Time of Evolutionary Dynamic Optimization.
CoRR, 2011

Brief announcement: program regularization in verifying memory consistency.
Proceedings of the SPAA 2011: Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2011

Godson-3B: A 1GHz 40W 8-core 128GFLOPS processor in 65nm CMOS.
Proceedings of the IEEE International Solid-State Circuits Conference, 2011

Effective and Efficient Microprocessor Design Space Exploration Using Unlabeled Design Configurations.
Proceedings of the IJCAI 2011, 2011

Video Encoding without Integer-Pel Motion Estimation.
Proceedings of the 2011 Data Compression Conference (DCC 2011), 2011

Empirical design bugs prediction for verification.
Proceedings of the Design, Automation and Test in Europe, 2011

2010
System Architecture of Godson-3 Multi-Core Processors.
J. Comput. Sci. Technol., 2010

LReplay: a pending period based deterministic replay scheme.
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

Estimating design quality of digital systems via machine learning.
Proceedings of the 17th IEEE International Conference on Electronics, 2010

A multi-FPGA based platform for emulating a 100m-transistor-scale processor with high-speed peripherals (abstract only).
Proceedings of the ACM/SIGDA 18th International Symposium on Field Programmable Gate Arrays, 2010

A general method to make multi-clock system deterministic.
Proceedings of the Design, Automation and Test in Europe, 2010

On-the-Fly Reduction of Stimuli for Functional Verification.
Proceedings of the 19th IEEE Asian Test Symposium, 2010

2009
Godson-3: A Scalable Multicore RISC Processor with x86 Emulation.
IEEE Micro, 2009

Global Clock, Physical Time Order and Pending Period Analysis in Multiprocessor Systems.
CoRR, 2009

An Enhanced HyperTransport Controller with Cache Coherence Support for Multiple-CMP.
Proceedings of the International Conference on Networking, Architecture, and Storage, 2009

Efficiency-Aware QoS DRAM Scheduler.
Proceedings of the International Conference on Networking, Architecture, and Storage, 2009

Designing an Effective Constraint Solver in Coverage Directed Test Generation.
Proceedings of the International Conference on Embedded Software and Systems, 2009

Fast complete memory consistency verification.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

A stochastic method for controlling the scaling parameters of Cauchy mutation in fast evolutionary programming.
Proceedings of the IEEE Congress on Evolutionary Computation, 2009

2008
Testing content addressable memories using instructions and march-like algorithms.
Proceedings of the 15th IEEE International Conference on Electronics, Circuits and Systems, 2008

Coverage Directed Test Generation: Godson Experience.
Proceedings of the 17th IEEE Asian Test Symposium, 2008

2004
EmGen: An Automatic Test-Program Generation Tool for Embedded IP Cores.
Proceedings of the Embedded Software and Systems, First International Conference, 2004

