    • 10. Invention application
    • RUNTIME OF CUBLAS MATRIX MULTIPLICATION ON GPU
    • US20170046307A1
    • 2017-02-16
    • US14823889
    • 2015-08-11
    • INTERNATIONAL BUSINESS MACHINES CORPORATION
    • Alexey Y. Lvov; Jinjun Xiong; Vladimir Zolotov
    • G06F17/16; G06T1/20
    • G06F17/16; G06T1/20; G06T2200/28
    • Methods for improving matrix multiplication runtimes are provided. A method includes determining, by a GPU, optimal partitions for matrix-by-matrix multiplication of two factor matrices having sizes known a priori. The determining step includes performing offline a plurality of matrix-by-matrix multiplication executions, each for a respective different combination of two-way partitions across a plurality of partition sizes. The determining step further includes determining offline a respective performance number for each of the executions based on runtime. The determining step also includes recursively repeating offline said performing and determining steps until the respective performance number ceases to improve for best-performing combinations of the two-way partitions and saving the best performing combinations of the two-way partitions as the optimal partitions. The method further includes performing online, by the GPU, the matrix-by-matrix multiplication of the two factor matrices using calls for a given one of the best performing combinations of the two-way partitions.
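The offline search described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not say how the matrices are split, so the sketch assumes the first factor is split by rows and the second by columns, tries a fixed set of split fractions, and uses a naive pure-Python product as a stand-in for a single cuBLAS GEMM call. All function names (`matmul`, `partitioned_matmul`, `two_way_splits`, `time_plan`, `search_partitions`) are hypothetical.

```python
import time


def matmul(a, b):
    """Naive dense product; a stand-in for one cuBLAS GEMM call (assumption)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def partitioned_matmul(a, b, row_sizes, col_sizes):
    """Compute a @ b one block at a time: row blocks of a times column blocks of b."""
    m, n = len(a), len(b[0])
    assert sum(row_sizes) == m and sum(col_sizes) == n
    c = [[0] * n for _ in range(m)]
    r0 = 0
    for rs in row_sizes:
        a_blk = a[r0:r0 + rs]
        c0 = 0
        for cs in col_sizes:
            b_blk = [row[c0:c0 + cs] for row in b]
            blk = matmul(a_blk, b_blk)
            for i in range(rs):
                c[r0 + i][c0:c0 + cs] = blk[i]
            c0 += cs
        r0 += rs
    return c


def two_way_splits(sizes, fractions=(0.25, 0.5, 0.75)):
    """Yield each plan obtained by splitting exactly one block of `sizes` in two."""
    for i, size in enumerate(sizes):
        for f in fractions:
            left = int(size * f)
            if 0 < left < size:
                yield sizes[:i] + (left, size - left) + sizes[i + 1:]


def time_plan(a, b, rows, cols, repeats=3):
    """Performance number for one plan: best wall-clock time over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        partitioned_matmul(a, b, rows, cols)
        best = min(best, time.perf_counter() - t0)
    return best


def search_partitions(m, n, cost):
    """Offline step: refine the plan with two-way splits until cost stops improving.

    `cost(rows, cols)` returns a performance number (lower is better).
    """
    best = ((m,), (n,))
    best_cost = cost(*best)
    while True:
        rows, cols = best
        row_opts = set(two_way_splits(rows))
        col_opts = set(two_way_splits(cols))
        candidates = {(r, cols) for r in row_opts}
        candidates |= {(rows, c) for c in col_opts}
        candidates |= {(r, c) for r in row_opts for c in col_opts}
        if not candidates:
            return best, best_cost
        top_cost, top_plan = min((cost(*p), p) for p in candidates)
        if top_cost >= best_cost:  # performance ceased to improve
            return best, best_cost
        best, best_cost = top_plan, top_cost
```

For a real measurement, the cost callback would wrap `time_plan`, for example `search_partitions(m, n, lambda r, c: time_plan(a, b, r, c))`. The returned plan would then be saved and reused online by calling `partitioned_matmul(a, b, *plan)` whenever matrices of the same known sizes are multiplied. The greedy one-split-per-round refinement is a design choice of this sketch, chosen to keep the search small; the abstract only states that the search is repeated recursively until the performance number stops improving.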