Quick patent search - fast search of global patents, free commercial-use patent database - IPRDB

1. Granted patent

US08667049B2 Massively parallel supercomputer (in force)
Publication No.: US08667049B2
Publication date: 2014-03-04
Application No.: US13566024
Filing date: 2012-08-03
Applicants: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
IPC classification: G06F15/173
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements each of which consists of a central processing unit (CPU) and plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node individually or simultaneously work on any combination of computation or communication as required by the particular algorithm being solved. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximizes packet communications throughput and minimizes latency. The multiple networks include three high-speed networks for parallel algorithm message passing including a Torus, Global Tree, and a Global Asynchronous network that provides global barrier and notification functions.

2. Patent application

US20120311299A1 NOVEL MASSIVELY PARALLEL SUPERCOMPUTER (in force)
Publication No.: US20120311299A1
Publication date: 2012-12-06
Application No.: US13566024
Filing date: 2012-08-03
Applicants: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
IPC classification: G06F15/80
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements each of which consists of a central processing unit (CPU) and plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node individually or simultaneously work on any combination of computation or communication as required by the particular algorithm being solved. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximizes packet communications throughput and minimizes latency. The multiple networks include three high-speed networks for parallel algorithm message passing including a Torus, Global Tree, and a Global Asynchronous network that provides global barrier and notification functions.

3. Granted patent

US08250133B2 Massively parallel supercomputer (in force)
Publication No.: US08250133B2
Publication date: 2012-08-21
Application No.: US12492799
Filing date: 2009-06-26
Applicants: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
IPC classification: G06F15/16
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements each of which consists of a central processing unit (CPU) and plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node individually or simultaneously work on any combination of computation or communication as required by the particular algorithm being solved. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximizes packet communications throughput and minimizes latency. The multiple networks include three high-speed networks for parallel algorithm message passing including a Torus, Global Tree, and a Global Asynchronous network that provides global barrier and notification functions.

4. Granted patent

US07555566B2 Massively parallel supercomputer (in force)
Publication No.: US07555566B2
Publication date: 2009-06-30
Application No.: US10468993
Filing date: 2002-02-25
Applicants: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
IPC classification: G06F15/16
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements each of which consists of a central processing unit (CPU) and plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node may be used individually or simultaneously to work on any combination of computation or communication as required by the particular algorithm being solved or executed at any point in time. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximizes packet communications throughput and minimizes latency. In the preferred embodiment, the multiple networks include three high-speed networks for parallel algorithm message passing including a Torus, Global Tree, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. For particular classes of parallel algorithms, or parts of parallel calculations, this architecture exhibits exceptional computational performance, and may be enabled to perform calculations for new classes of parallel algorithms. Additional networks are provided for external connectivity and used for Input/Output, System Management and Configuration, and Debug and Monitoring functions. Special node packaging techniques implementing midplane and other hardware devices facilitates partitioning of the supercomputer in multiple networks for optimizing supercomputing resources.
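
Illustrative note (not part of the patent record): the abstract names a Torus as one of the message-passing networks but gives no dimensions or routing details. As a minimal sketch, assuming a three-dimensional torus with wraparound links, the hypothetical helpers `torus_neighbors` and `torus_hops` below show how nearest neighbors and minimal hop counts could be computed. The function names and the 3D assumption are illustrative, not taken from the listing.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]  # assumed 3D coordinates; the abstract does not state the dimensionality


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six nearest neighbors of `node` in a torus of size `dims`."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coords = list(node)
            # The modulo provides the wraparound link that makes a mesh a torus.
            coords[axis] = (coords[axis] + step) % dims[axis]
            neighbors.append(tuple(coords))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimal hop count between two nodes, taking the shorter way around each axis."""
    return sum(min((x - y) % d, (y - x) % d) for x, y, d in zip(a, b, dims))
```

For example, in a 4 x 4 x 4 torus the nodes (0, 0, 0) and (3, 0, 0) are one hop apart because of the wraparound link.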

5. Patent application

US20090259713A1 NOVEL MASSIVELY PARALLEL SUPERCOMPUTER (in force)
Publication No.: US20090259713A1
Publication date: 2009-10-15
Application No.: US12492799
Filing date: 2009-06-26
Applicants: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
IPC classification: G06F15/76, G06F15/16, G06F11/28, G06F12/08, G06F9/02, G06F15/177
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements each of which consists of a central processing unit (CPU) and plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node may be used individually or simultaneously to work on any combination of computation or communication as required by the particular algorithm being solved or executed at any point in time. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximizes packet communications throughput and minimizes latency. In the preferred embodiment, the multiple networks include three high-speed networks for parallel algorithm message passing including a Torus, Global Tree, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. For particular classes of parallel algorithms, or parts of parallel calculations, this architecture exhibits exceptional computational performance, and may be enabled to perform calculations for new classes of parallel algorithms. Additional networks are provided for external connectivity and used for Input/Output, System Management and Configuration, and Debug and Monitoring functions. Special node packaging techniques implementing midplane and other hardware devices facilitates partitioning of the supercomputer in multiple networks for optimizing supercomputing resources.

6. Granted patent

US07444385B2 Global interrupt and barrier networks (lapsed)
Publication No.: US07444385B2
Publication date: 2008-10-28
Application No.: US10468997
Filing date: 2002-02-25
Applicants: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Burkhard D. Steinmacher-Burow, Todd E. Takken
Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Burkhard D. Steinmacher-Burow, Todd E. Takken
IPC classification: G06F15/16
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: A system and method for generating global asynchronous signals in a computing structure. Particularly, a global interrupt and barrier network is implemented that implements logic for generating global interrupt and barrier signals for controlling global asynchronous operations performed by processing elements at selected processing nodes of a computing structure in accordance with a processing algorithm; and includes the physical interconnecting of the processing nodes for communicating the global interrupt and barrier signals to the elements via low-latency paths. The global asynchronous signals respectively initiate interrupt and barrier operations at the processing nodes at times selected for optimizing performance of the processing algorithms. In one embodiment, the global interrupt and barrier network is implemented in a scalable, massively parallel supercomputing device structure comprising a plurality of processing nodes interconnected by multiple independent networks, with each node including one or more processing elements for performing computation or communication activity as required when performing parallel algorithm operations. One multiple independent network includes a global tree network for enabling high-speed global tree communications among global tree network nodes or sub-trees thereof. The global interrupt and barrier network may operate in parallel with the global tree network for providing global asynchronous sideband signals.
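
Illustrative note (not part of the patent record): the abstract describes barrier signals that hold processing nodes until all of them reach the same point. The patent implements this in dedicated network hardware; the sketch below is only a software analogy, assuming threads stand in for nodes and Python's standard `threading.Barrier` stands in for the barrier signal. The helper name `run_phases` is hypothetical.

```python
import threading
from typing import List, Tuple


def run_phases(num_nodes: int, num_phases: int) -> List[Tuple[int, int]]:
    """Run `num_nodes` simulated nodes through `num_phases` barrier-separated phases.

    Returns a log of (phase, rank) entries in the order the work was recorded.
    """
    barrier = threading.Barrier(num_nodes)  # software stand-in for the global barrier signal
    log: List[Tuple[int, int]] = []
    lock = threading.Lock()

    def node(rank: int) -> None:
        for phase in range(num_phases):
            with lock:
                log.append((phase, rank))  # the "work" of this phase
            # No node starts the next phase until every node has finished this one.
            barrier.wait()

    threads = [threading.Thread(target=node, args=(r,)) for r in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Because every node waits at the barrier after recording its work, all entries for one phase appear in the log before any entry for the next phase, regardless of thread scheduling.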

7. Patent application

US20110173488A1 NON-VOLATILE MEMORY FOR CHECKPOINT STORAGE (lapsed)
Publication No.: US20110173488A1
Publication date: 2011-07-14
Application No.: US13004005
Filing date: 2011-01-10
Applicants: Matthias A. Blumrich, Dong Chen, Thomas M. Cipolla, Paul W. Coteus, Alan Gara, Philip Heidelberger, Mark J. Jeanson, Gerard V. Kopcsay, Martin Ohmacht, Todd E. Takken
Inventors: Matthias A. Blumrich, Dong Chen, Thomas M. Cipolla, Paul W. Coteus, Alan Gara, Philip Heidelberger, Mark J. Jeanson, Gerard V. Kopcsay, Martin Ohmacht, Todd E. Takken
IPC classification: G06F11/00, G06F11/14
CPC classification: G06F11/1438, G06F2201/82, G06F2201/84
Abstract: A system, method and computer program product for supporting system initiated checkpoints in high performance parallel computing systems and storing of checkpoint data to a non-volatile memory storage device. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity. In one embodiment, the non-volatile memory is a pluggable flash memory card.
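
Illustrative note (not part of the patent record): the abstract covers storing checkpoint data to non-volatile memory. The sketch below illustrates only that storage step, assuming an ordinary file stands in for the flash device and that the state is JSON-serializable. It does not model the control signals or the handling of in-flight messages that the abstract mentions. The helper names `save_checkpoint` and `load_checkpoint` are hypothetical.

```python
import json
import os
import tempfile


def save_checkpoint(state: dict, path: str) -> None:
    """Write `state` to `path` so that a crash never leaves a half-written checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the data to the storage device before renaming
        os.replace(tmp_path, path)  # atomic swap: readers see the old or the new file, never a mix
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path: str) -> dict:
    """Read back the most recently completed checkpoint."""
    with open(path) as f:
        return json.load(f)
```

Writing to a temporary file and renaming it is a common way to keep the previous checkpoint valid until the new one is complete.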

8. Granted patent

US08788879B2 Non-volatile memory for checkpoint storage (lapsed)
Publication No.: US08788879B2
Publication date: 2014-07-22
Application No.: US13004005
Filing date: 2011-01-10
Applicants: Matthias A. Blumrich, Dong Chen, Thomas M. Cipolla, Paul W. Coteus, Alan Gara, Philip Heidelberger, Mark J. Jeanson, Gerard V. Kopcsay, Martin Ohmacht, Todd E. Takken
Inventors: Matthias A. Blumrich, Dong Chen, Thomas M. Cipolla, Paul W. Coteus, Alan Gara, Philip Heidelberger, Mark J. Jeanson, Gerard V. Kopcsay, Martin Ohmacht, Todd E. Takken
IPC classification: G06F11/00
CPC classification: G06F11/1438, G06F2201/82, G06F2201/84
Abstract: A system, method and computer program product for supporting system initiated checkpoints in high performance parallel computing systems and storing of checkpoint data to a non-volatile memory storage device. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity. In one embodiment, the non-volatile memory is a pluggable flash memory card.

9. Granted patent

US07761687B2 Ultrascalable petaflop parallel supercomputer (lapsed)
Publication No.: US07761687B2
Publication date: 2010-07-20
Application No.: US11768905
Filing date: 2007-06-26
Applicants: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken
Inventors: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken
IPC classification: G06F15/173
CPC classification: G06F15/17337
Abstract: A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

10. Patent application

US20090006808A1 ULTRASCALABLE PETAFLOP PARALLEL SUPERCOMPUTER (lapsed)
Publication No.: US20090006808A1
Publication date: 2009-01-01
Application No.: US11768905
Filing date: 2007-06-26
Applicants: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken
Inventors: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken
IPC classification: G06F15/80, G06F9/06
CPC classification: G06F15/17337
Abstract: A novel massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. Novel use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
