    • 3. Invention application
    • Scalable Matrix Multiplication in a Shared Memory System
    • US20140331014A1
    • 2014-11-06
    • US14041974
    • 2013-09-30
    • Silicon Graphics International Corp.
    • Cheng Liao
    • G06F12/00
    • G06F15/17306; G06F17/16
    • High performance computing systems perform complex or data-intensive calculations using a large number of computing nodes and a shared memory. Disclosed methods and systems provide nodes having a special-purpose coprocessor to perform these calculations, along with a general-purpose processor to direct the calculations. Computational data transfer from the shared memory to the coprocessor incurs a data copying latency. To reduce this latency as experienced by the coprocessor, a complex computation is divided into work units, and one or more threads executing on the processor copy the work units from the shared memory to a local buffer memory of a computing node. By buffering these data for transfer from the local memory to coprocessor memory, and by ensuring that new data are copied while the coprocessor operates on older data, data copying latency is hidden from the coprocessor.
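The buffering idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a plain Python list of rows stands in for the shared memory, a bounded queue stands in for the node's local buffer memory, and an ordinary function stands in for the coprocessor. The names `buffered_matmul`, `matmul_block`, `rows_per_unit`, and `buffer_slots` are hypothetical and chosen only for this example.

```python
import queue
import threading


def matmul_block(a_rows, b):
    """Stand-in for the coprocessor kernel: multiply a block of rows by b."""
    cols = list(zip(*b))  # columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a_rows]


def buffered_matmul(a, b, rows_per_unit=2, buffer_slots=2):
    """Multiply a by b while a helper thread prefetches the next work units."""
    # Hypothetical stand-in for the local buffer memory of a computing node.
    buffer = queue.Queue(maxsize=buffer_slots)

    def copier():
        # Copy one work unit (a block of rows) at a time out of the
        # "shared" matrix and into the bounded buffer.
        for start in range(0, len(a), rows_per_unit):
            unit = [row[:] for row in a[start:start + rows_per_unit]]
            buffer.put(unit)  # blocks while the buffer is full
        buffer.put(None)  # sentinel: no more work units

    threading.Thread(target=copier, daemon=True).start()

    result = []
    while True:
        unit = buffer.get()  # an already-copied unit, when the copier keeps up
        if unit is None:
            break
        result.extend(matmul_block(unit, b))
    return result
```

With `buffer_slots=2` the copier can stay up to two work units ahead, so the consumer finds an already-copied unit waiting whenever copying keeps pace with computation. In CPython the global interpreter lock prevents real parallel speedup for this pure-Python arithmetic, so the sketch shows only the structure of the overlap, not its performance.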