XIAO Chaoen, YU Furong, WANG Jianxin, SUN Kaibo. Parallel Design of Post-Quantum Cryptographic Algorithm babyKyber on CUDA Platform[J]. Journal of Beijing Electronic Science and Technology Institute, 2025, 33(2): 11-20.

    Parallel Design of Post-Quantum Cryptographic Algorithm babyKyber on CUDA Platform

    • Abstract: To address the new security challenges that IoT (Internet of Things) devices face in the post-quantum era, this paper proposes a CUDA-based parallel optimization scheme for the babyKyber algorithm. The work focuses on the algorithm's core modules, such as polynomial multiplication and the number-theoretic transform (NTT). Fine-grained parallelism decomposes these operations down to the GPU thread level to accelerate computation, while coarse-grained parallelism builds a multi-thread-block architecture to raise algorithm throughput. In addition, experiments with dynamic thread block configurations explore how to improve GPU resource utilization. The experimental data show that the optimized parallel scheme reaches a throughput on the order of tens of millions on the NVIDIA GeForce MX150 platform, a speedup of three orders of magnitude over the CPU platform. The study offers a feasible engineering solution for deploying post-quantum cryptographic algorithms on resource-constrained IoT terminals.

       
