Xuechao Wei
Verified email at pku.edu.cn
Title
Cited by
Year
Automated systolic array architecture synthesis for high throughput CNN inference on FPGAs
X Wei, CH Yu, P Zhang, Y Chen, Y Wang, H Hu, Y Liang, J Cong
Proceedings of the 54th Annual Design Automation Conference 2017, 1-6, 2017
Cited by: 454; Year: 2017
Overcoming data transfer bottlenecks in FPGA-based DNN accelerators via layer conscious memory management
X Wei, Y Liang, J Cong
Proceedings of the 56th Annual Design Automation Conference 2019, 1-6, 2019
Cited by: 75; Year: 2019
TGPA: tile-grained pipeline architecture for low latency CNN inference
X Wei, Y Liang, X Li, CH Yu, P Zhang, J Cong
2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 1-8, 2018
Cited by: 74; Year: 2018
Systems and methods for systolic array design from a high-level program
P Zhang, CH Yu, X Wei, P Pan
US Patent 10,838,910, 2020
Cited by: 57; Year: 2020
Frequency improvement of systolic array-based CNNs on FPGAs
J Zhang, W Zhang, G Luo, X Wei, Y Liang, J Cong
2019 IEEE International Symposium on Circuits and Systems (ISCAS), 1-4, 2019
Cited by: 41; Year: 2019
Throughput optimization for streaming applications on CPU-FPGA heterogeneous systems
X Wei, Y Liang, T Wang, S Lu, J Cong
2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC), 488-493, 2017
Cited by: 31; Year: 2017
PetS: A unified framework for Parameter-Efficient transformers serving
Z Zhou, X Wei, J Zhang, G Sun
2022 USENIX Annual Technical Conference (USENIX ATC 22), 489-504, 2022
Cited by: 25; Year: 2022
Generating systolic array accelerators with reusable blocks
L Jia, L Lu, X Wei, Y Liang
IEEE Micro 40 (4), 85-92, 2020
Cited by: 20; Year: 2020
FlexBFS: a parallelism-aware implementation of breadth-first search on GPU
G Liu, H An, W Han, X Li, T Sun, W Zhou, X Wei, X Tang
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of …, 2012
Cited by: 18; Year: 2012
GNNear: Accelerating full-batch training of graph neural networks with near-memory processing
Z Zhou, C Li, X Wei, X Wang, G Sun
Proceedings of the International Conference on Parallel Architectures and …, 2022
Cited by: 15; Year: 2022
FTDL: a tailored FPGA-overlay for deep learning with high scalability
R Shi, Y Ding, X Wei, H Li, H Liu, HKH So, C Ding
2020 57th ACM/IEEE Design Automation Conference (DAC), 1-6, 2020
Cited by: 11; Year: 2020
GCNear: A hybrid architecture for efficient GCN training with near-memory processing
Z Zhou, C Li, X Wei, G Sun
arXiv preprint arXiv:2111.00680, 1-15, 2021
Cited by: 10; Year: 2021
ArchExplorer: Microarchitecture exploration via bottleneck analysis
C Bai, J Huang, X Wei, Y Ma, S Li, H Zheng, B Yu, Y Xie
Proceedings of the 56th Annual IEEE/ACM International Symposium on …, 2023
Cited by: 5; Year: 2023
FTDL: An FPGA-tailored Architecture for Deep Learning Systems.
R Shi, Y Ding, X Wei, H Liu, HKH So, C Ding
FPGA, 320, 2020
Cited by: 5; Year: 2020
Efficient super-resolution system with block-wise hybridization and quantized Winograd on FPGA
B Shi, J Zhang, Z He, X Wei, S Li, G Luo, H Zheng, Y Xie
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2023
Cited by: 3; Year: 2023
2022 ICCAD CAD contest problem C: Microarchitecture design space exploration
S Li, C Bai, X Wei, B Shi, YK Chen, Y Xie
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided …, 2022
Cited by: 3; Year: 2022
Distributed Control Independence for Composable Multi-processors
M Mao, H An, T Sun, Q Li, B Deng, X Wei, J Zhou
2012 IEEE/ACIS 11th International Conference on Computer and Information …, 2012
Cited by: 3; Year: 2012
An Intermediate-Centric Dataflow for Transposed Convolution Acceleration on FPGA
Z Ma, T Dai, X Wei, G Luo
ACM Transactions on Embedded Computing Systems 22 (6), 1-22, 2023
Cited by: 1; Year: 2023
ICCAD CAD Contest 2022
S Li, C Bai, X Wei, B Shi, YK Chen, Y Xie
Cited by: 1; Year: 2022
POSTER: RadiK: Scalable Radix Top-K Selection on GPUs
Y Li, B Zhou, J Zhang, X Wei, Y Li, Y Chen
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and …, 2024
Year: 2024