Quick Patent Search - Fast search of global patents, free commercial patent database - IPRDB

1. Granted invention patent

US07418068B2 Data capture technique for high speed signaling (Lapsed)
Publication number: US07418068B2
Publication date: 2008-08-26
Application number: US10468992
Filing date: 2002-02-25
Applicants: Wayne Melvin Barrett , Dong Chen , Paul William Coteus , Alan Gene Gara , Rory Jackson , Gerard Vincent Kopcsay , Ben Jesse Nathanson , Pavlos Michael Vranas , Todd E. Takken
Inventors: Wayne Melvin Barrett , Dong Chen , Paul William Coteus , Alan Gene Gara , Rory Jackson , Gerard Vincent Kopcsay , Ben Jesse Nathanson , Pavlos Michael Vranas , Todd E. Takken
IPC classification: H04L7/00
CPC classification: H05K7/20836 , F24F11/77 , G06F9/52 , G06F9/526 , G06F15/17381 , G06F17/142 , G09G5/008 , H04L7/0338
Abstract: A data capture technique for high speed signaling to allow for optimal sampling of an asynchronous data stream. This technique allows for extremely high data rates and does not require that a clock be sent with the data as is done in source synchronous systems. The present invention also provides a hardware mechanism for automatically adjusting transmission delays for optimal two-bit simultaneous bi-directional (SiBiDi) signaling.

2. Granted invention patent

US08788879B2 Non-volatile memory for checkpoint storage (Lapsed)
Publication number: US08788879B2
Publication date: 2014-07-22
Application number: US13004005
Filing date: 2011-01-10
Applicants: Matthias A. Blumrich , Dong Chen , Thomas M. Cipolla , Paul W. Coteus , Alan Gara , Philip Heidelberger , Mark J. Jeanson , Gerard V. Kopcsay , Martin Ohmacht , Todd E. Takken
Inventors: Matthias A. Blumrich , Dong Chen , Thomas M. Cipolla , Paul W. Coteus , Alan Gara , Philip Heidelberger , Mark J. Jeanson , Gerard V. Kopcsay , Martin Ohmacht , Todd E. Takken
IPC classification: G06F11/00
CPC classification: G06F11/1438 , G06F2201/82 , G06F2201/84
Abstract: A system, method and computer program product for supporting system initiated checkpoints in high performance parallel computing systems and storing of checkpoint data to a non-volatile memory storage device. The system and method generate selective control signals to perform checkpointing of system related data in the presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity. In one embodiment, the non-volatile memory is a pluggable flash memory card.

3. Invention patent application

US20110219208A1 MULTI-PETASCALE HIGHLY EFFICIENT PARALLEL SUPERCOMPUTER (In force)
Publication number: US20110219208A1
Publication date: 2011-09-08
Application number: US13004007
Filing date: 2011-01-10
Applicants: Sameh Asaad , Ralph E. Bellofatto , Michael A. Blocksome , Matthias A. Blumrich , Peter Boyle , Jose R. Brunheroto , Dong Chen , Chen-Yong Cher , George L. Chiu , Norman Christ , Paul W. Coteus , Kristan D. Davis , Gabor J. Dozsa , Alexandre E. Eichenberger , Noel A. Eisley , Matthew R. Ellavsky , Kahn C. Evans , Bruce M. Fleischer , Thomas W. Fox , Alan Gara , Mark E. Giampapa , Thomas M. Gooding , Michael K. Gschwind , John A. Gunnels , Shawn A. Hall , Rudolf A. Haring , Philip Heidelberger , Todd A. Inglett , Brant L. Knudson , Gerard V. Kopcsay , Sameer Kumar , Amith R. Mamidala , James A. Marcella , Mark G. Megerian , Douglas R. Miller , Samuel J. Miller , Adam J. Muff , Michael B. Mundy , John K. O'Brien , Kathryn M. O'Brien , Martin Ohmacht , Jeffrey J. Parker , Ruth J. Poole , Joseph D. Ratterman , Valentina Salapura , David L. Satterfield , Robert M. Senger , Brian Smith , Burkhard Steinmacher-Burow , William M. Stockdell , Craig B. Stunkel , Krishnan Sugavanam , Yutaka Sugawara , Todd E. Takken , Barry M. Trager , James L. Van Oosten , Charles D. Wait , Robert E. Walkup , Alfred T. Watson , Robert W. Wisniewski , Peng Wu
Inventors: Sameh Asaad , Ralph E. Bellofatto , Michael A. Blocksome , Matthias A. Blumrich , Peter Boyle , Jose R. Brunheroto , Dong Chen , Chen-Yong Cher , George L. Chiu , Norman Christ , Paul W. Coteus , Kristan D. Davis , Gabor J. Dozsa , Alexandre E. Eichenberger , Noel A. Eisley , Matthew R. Ellavsky , Kahn C. Evans , Bruce M. Fleischer , Thomas W. Fox , Alan Gara , Mark E. Giampapa , Thomas M. Gooding , Michael K. Gschwind , John A. Gunnels , Shawn A. Hall , Rudolf A. Haring , Philip Heidelberger , Todd A. Inglett , Brant L. Knudson , Gerard V. Kopcsay , Sameer Kumar , Amith R. Mamidala , James A. Marcella , Mark G. Megerian , Douglas R. Miller , Samuel J. Miller , Adam J. Muff , Michael B. Mundy , John K. O'Brien , Kathryn M. O'Brien , Martin Ohmacht , Jeffrey J. Parker , Ruth J. Poole , Joseph D. Ratterman , Valentina Salapura , David L. Satterfield , Robert M. Senger , Brian Smith , Burkhard Steinmacher-Burow , William M. Stockdell , Craig B. Stunkel , Krishnan Sugavanam , Yutaka Sugawara , Todd E. Takken , Barry M. Trager , James L. Van Oosten , Charles D. Wait , Robert E. Walkup , Alfred T. Watson , Robert W. Wisniewski , Peng Wu
IPC classification: G06F15/76 , G06F9/06
CPC classification: G06F13/287 , G06F9/06 , G06F9/3004 , G06F9/30047 , G06F9/3885 , G06F12/0811 , G06F12/0831 , G06F12/0862 , G06F12/0864 , G06F12/1027 , G06F15/17381 , G06F15/17387 , G06F15/76 , G06F15/8069 , G06F2212/1016 , G06F2212/602 , G06F2212/6022 , G06F2212/6024 , G06F2212/6032 , Y02D10/13 , Y02D10/14
Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis, and preferably enabling adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that optimally maximizes the throughput of packet communications between nodes and minimizes latency.
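The five-dimensional torus interconnect mentioned in this abstract can be illustrated with a small neighbor calculation. This is only a sketch: the function name `torus_neighbors` and the dimension sizes in `DIMS` are hypothetical, since the abstract gives neither an addressing scheme nor the size of each dimension.

```python
from typing import List, Tuple

Coord = Tuple[int, ...]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the coordinates one hop away from `node` in a torus.

    Every axis is a ring, so stepping past the last index wraps to 0
    and stepping below 0 wraps to the last index.
    """
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % size  # wrap around the ring
            neighbors.append(tuple(coord))
    return neighbors


# Hypothetical 5D shape; the abstract does not state dimension sizes.
DIMS = (4, 4, 4, 4, 4)

if __name__ == "__main__":
    for neighbor in torus_neighbors((0, 0, 0, 0, 0), DIMS):
        print(neighbor)
```

With two links per axis, each node in this sketch has ten neighbors as long as every dimension has more than two positions.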

4. Invention patent application

US20110173413A1 EMBEDDING GLOBAL BARRIER AND COLLECTIVE IN A TORUS NETWORK (In force)
Publication number: US20110173413A1
Publication date: 2011-07-14
Application number: US12723277
Filing date: 2010-03-12
Applicants: Dong Chen , Paul W. Coteus , Noel A. Eisley , Alan Gara , Philip Heidelberger , Robert M. Senger , Valentina Salapura , Burkhard Steinmacher-Burow , Yutaka Sugawara , Todd E. Takken
Inventors: Dong Chen , Paul W. Coteus , Noel A. Eisley , Alan Gara , Philip Heidelberger , Robert M. Senger , Valentina Salapura , Burkhard Steinmacher-Burow , Yutaka Sugawara , Todd E. Takken
IPC classification: G06F15/80 , G06F9/06 , G06F9/46
CPC classification: G06F9/30021 , G06F9/3001 , G06F9/30018 , G06F9/30145 , G06F11/3024 , G06F11/3409 , G06F11/348 , G06F15/17362 , G06F15/17381 , G06F15/17393 , G06F2201/88 , H04L67/10
Abstract: Embodiments of the invention provide a method, system and computer program product for embedding a global barrier and global interrupt network in a parallel computer system organized as a torus network. The computer system includes a multitude of nodes. In one embodiment, the method comprises taking inputs from a set of receivers of the nodes, dividing the inputs from the receivers into a plurality of classes, combining the inputs of each of the classes to obtain a result, and sending said result to a set of senders of the nodes. Embodiments of the invention provide a method, system and computer program product for embedding a collective network in a parallel computer system organized as a torus network. In one embodiment, the method comprises adding to a torus network a central collective logic to route messages among at least a group of nodes in a tree structure.
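The four steps named in this abstract (take receiver inputs, divide them into classes, combine each class, send the result to senders) can be sketched as follows. The abstract does not say how inputs are combined, so the logical AND used here is an assumption chosen because it matches the usual meaning of a barrier; the function names and the data layout are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def combine_by_class(inputs: Iterable[Tuple[int, bool]]) -> Dict[int, bool]:
    """Group (class_id, signal) pairs by class and combine each group.

    Assumption: signals are combined with a logical AND, so a class
    result is True only when every receiver in that class has signalled.
    """
    grouped: Dict[int, List[bool]] = defaultdict(list)
    for class_id, signal in inputs:
        grouped[class_id].append(signal)
    return {class_id: all(signals) for class_id, signals in grouped.items()}


def send_results(results: Dict[int, bool], senders: Dict[str, int]) -> Dict[str, bool]:
    """Give each sender the combined result of the class it belongs to."""
    return {sender: results[class_id] for sender, class_id in senders.items()}


if __name__ == "__main__":
    # Hypothetical receiver inputs: class 0 has fully arrived, class 1 has not.
    receiver_inputs = [(0, True), (0, True), (1, True), (1, False)]
    results = combine_by_class(receiver_inputs)
    print(send_results(results, {"sender_a": 0, "sender_b": 1}))
```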

5. Granted invention patent

US07818514B2 Low latency memory access and synchronization (Lapsed)
Publication number: US07818514B2
Publication date: 2010-10-19
Application number: US12196796
Filing date: 2008-08-22
Applicants: Matthias A. Blumrich , Dong Chen , Paul W. Coteus , Alan G. Gara , Mark E. Giampapa , Philip Heidelberger , Dirk Hoenicke , Martin Ohmacht , Burkhard D. Steinmacher-Burow , Todd E. Takken , Pavlos M. Vranas
Inventors: Matthias A. Blumrich , Dong Chen , Paul W. Coteus , Alan G. Gara , Mark E. Giampapa , Philip Heidelberger , Dirk Hoenicke , Martin Ohmacht , Burkhard D. Steinmacher-Burow , Todd E. Takken , Pavlos M. Vranas
IPC classification: G06F12/06
CPC classification: G06F12/0862 , G06F9/52 , G06F2212/6028
Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch, rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
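The single-load lock acquisition described in this abstract can be modelled in software. The sketch below is not the patented hardware: `LockDevice` is a hypothetical stand-in, its internal `threading.Lock` only imitates the atomicity a hardware device would provide, and the `release` method is an assumption, since the abstract does not describe how a lock is given back.

```python
import threading
from typing import List


class LockDevice:
    """Software stand-in for a hardware locking device (illustrative only)."""

    def __init__(self, num_locks: int) -> None:
        self._owned: List[bool] = [False] * num_locks
        self._guard = threading.Lock()  # imitates the device's own atomicity

    def load(self, lock_id: int) -> bool:
        """Model the processor's single read of a lock.

        Returns True if the caller now owns the lock. The device, not the
        caller, performs the follow-up write that marks the lock as owned.
        """
        with self._guard:
            if self._owned[lock_id]:
                return False
            self._owned[lock_id] = True  # the "subsequent write" done by the device
            return True

    def release(self, lock_id: int) -> None:
        """Assumed release operation; not described in the abstract."""
        with self._guard:
            self._owned[lock_id] = False


if __name__ == "__main__":
    device = LockDevice(num_locks=2)
    print(device.load(0))  # first read acquires the lock
    print(device.load(0))  # second read finds it already owned
```

The point of the design, as the abstract puts it, is that the processor issues only a read; it never has to follow up with its own store to claim the lock.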

6. Invention patent application

US20090259713A1 NOVEL MASSIVELY PARALLEL SUPERCOMPUTER (In force)
Publication number: US20090259713A1
Publication date: 2009-10-15
Application number: US12492799
Filing date: 2009-06-26
Applicants: Matthias A. Blumrich , Dong Chen , George L. Chiu , Thomas M. Cipolla , Paul W. Coteus , Alan G. Gara , Mark E. Giampapa , Philip Heidelberger , Gerard V. Kopcsay , Lawrence S. Mok , Todd E. Takken
Inventors: Matthias A. Blumrich , Dong Chen , George L. Chiu , Thomas M. Cipolla , Paul W. Coteus , Alan G. Gara , Mark E. Giampapa , Philip Heidelberger , Gerard V. Kopcsay , Lawrence S. Mok , Todd E. Takken
IPC classification: G06F15/76 , G06F15/16 , G06F11/28 , G06F12/08 , G06F9/02 , G06F15/177
CPC classification: H05K7/20836 , F24F11/77 , G06F9/52 , G06F9/526 , G06F15/17381 , G06F17/142 , G09G5/008 , H04L7/0338
Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements each of which consists of a central processing unit (CPU) and a plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node may be used individually or simultaneously to work on any combination of computation or communication as required by the particular algorithm being solved or executed at any point in time. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximize packet communications throughput and minimize latency. In the preferred embodiment, the multiple networks include three high-speed networks for parallel algorithm message passing including a Torus, Global Tree, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. For particular classes of parallel algorithms, or parts of parallel calculations, this architecture exhibits exceptional computational performance, and may be enabled to perform calculations for new classes of parallel algorithms. Additional networks are provided for external connectivity and used for Input/Output, System Management and Configuration, and Debug and Monitoring functions. Special node packaging techniques implementing midplane and other hardware devices facilitate partitioning of the supercomputer in multiple networks for optimizing supercomputing resources.

7. Invention patent application

US20080313408A1 LOW LATENCY MEMORY ACCESS AND SYNCHRONIZATION (Lapsed)
Publication number: US20080313408A1
Publication date: 2008-12-18
Application number: US12196796
Filing date: 2008-08-22
Applicants: Matthias A. Blumrich , Dong Chen , Paul W. Coteus , Alan G. Gara , Mark E. Giampapa , Philip Heidelberger , Dirk Hoenicke , Martin Ohmacht , Burkhard D. Steinmacher-Burow , Todd E. Takken , Pavlos M. Vranas
Inventors: Matthias A. Blumrich , Dong Chen , Paul W. Coteus , Alan G. Gara , Mark E. Giampapa , Philip Heidelberger , Dirk Hoenicke , Martin Ohmacht , Burkhard D. Steinmacher-Burow , Todd E. Takken , Pavlos M. Vranas
IPC classification: G06F12/08
CPC classification: G06F12/0862 , G06F9/52 , G06F2212/6028
Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch, rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
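The pointer-based prefetching in this abstract (each memory line carries a pointer that names the line to prefetch) can be sketched as a small simulation. The abstract does not say how the pointers are set, so this sketch assumes the pointer of the previously accessed line is updated to the line accessed next; that update rule, the class names, and the unbounded `prefetched` set are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class MemoryLine:
    data: bytes
    next_line: Optional[int] = None  # pointer to the line to prefetch next


class PointerPrefetcher:
    """Prefetch by following per-line pointers (illustrative only)."""

    def __init__(self, memory: Dict[int, MemoryLine]) -> None:
        self.memory = memory
        self.prefetched: Set[int] = set()  # never evicted in this sketch
        self.previous: Optional[int] = None
        self.hits = 0

    def access(self, line_no: int) -> bytes:
        if line_no in self.prefetched:
            self.hits += 1
        # Assumption: record the observed successor in the previous line's pointer.
        if self.previous is not None:
            self.memory[self.previous].next_line = line_no
        self.previous = line_no
        line = self.memory[line_no]
        # Follow the stored pointer instead of running a predictive algorithm.
        if line.next_line is not None:
            self.prefetched.add(line.next_line)
        return line.data


if __name__ == "__main__":
    memory = {n: MemoryLine(data=bytes([n])) for n in (2, 7, 9)}
    prefetcher = PointerPrefetcher(memory)
    for _ in range(2):              # a non-contiguous but repetitive pattern
        for line_no in (7, 2, 9):
            prefetcher.access(line_no)
    print(prefetcher.hits)
```

The first pass over the pattern only trains the pointers; on later passes the pointers already name the next line, which is why a repetitive pattern benefits even though the addresses are not contiguous.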

8. Invention patent application

US20080104367A1 Collective Network For Computer Structures (In force)
Publication number: US20080104367A1
Publication date: 2008-05-01
Application number: US11572372
Filing date: 2005-07-18
Applicants: Matthias A. Blumrich , Paul W. Coteus , Dong Chen , Alan Gara , Mark E. Giampapa , Philip Heidelberger , Dirk Hoenicke , Todd E. Takken , Burkhard D. Steinmacher-Burow , Pavlos M. Vranas
Inventors: Matthias A. Blumrich , Paul W. Coteus , Dong Chen , Alan Gara , Mark E. Giampapa , Philip Heidelberger , Dirk Hoenicke , Todd E. Takken , Burkhard D. Steinmacher-Burow , Pavlos M. Vranas
IPC classification: G06F15/80 , G06F9/30
CPC classification: G06F15/17381 , H04L1/1845 , H04L12/4641
Abstract: A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.

9. Granted invention patent

US09112953B2 Internet telephony unit and software for enabling internet telephone access from traditional telephone interface (In force)
Publication number: US09112953B2
Publication date: 2015-08-18
Application number: US13605614
Filing date: 2012-09-06
Applicants: Todd E. Takken
Inventors: Todd E. Takken
IPC classification: H04L12/66 , H04M7/00 , H04L29/06 , H04M15/00 , H04L29/08
CPC classification: H04M7/0057 , H04L29/06 , H04L67/14 , H04L67/303 , H04L69/24 , H04L69/329 , H04M15/00 , H04M15/49 , H04M15/55 , H04M15/56 , H04M15/8044 , H04M2215/202 , H04M2215/2046 , H04M2215/42 , H04M2215/46 , H04M2215/745
Abstract: Automatic selection and establishment of a communications connection between a telephone device and a receiver device, including entering an address of a receiver device into the telephone device for initiating the communications connection to the receiver device, and automatically selecting a communications network for establishing the communications connection to the receiver device, and selecting the communications network from an internet-based network, a hybrid telephone/internet network, and a telephone network. Network access capabilities of the receiver device are automatically determined based on the address of the receiver device, and the cost of establishing a communications connection is automatically evaluated for each of the communications networks which the receiver device is capable of accessing. The communications network with the lowest cost is selected.
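The selection logic in this abstract (infer the receiver's reachable networks from its address, cost each one, pick the cheapest) can be sketched in a few lines. Everything concrete here is hypothetical: the cost figures, the address-format rules in `receiver_capabilities`, and the function names are assumptions made for illustration, as the abstract specifies none of them.

```python
from typing import Dict, List

# Hypothetical per-minute costs; the abstract gives no figures.
COST_PER_MINUTE: Dict[str, float] = {
    "internet": 0.00,
    "hybrid": 0.02,
    "telephone": 0.10,
}


def receiver_capabilities(address: str) -> List[str]:
    """Infer which networks can reach the receiver from its address.

    Hypothetical rule set: an address containing "@" is internet-only,
    a "+"-prefixed number is reachable by hybrid or telephone networks,
    and anything else is treated as telephone-only.
    """
    if "@" in address:
        return ["internet"]
    if address.startswith("+"):
        return ["hybrid", "telephone"]
    return ["telephone"]


def select_network(address: str) -> str:
    """Pick the lowest-cost network among those the receiver can access."""
    candidates = receiver_capabilities(address)
    return min(candidates, key=lambda network: COST_PER_MINUTE[network])


if __name__ == "__main__":
    for address in ("user@example.com", "+15550100", "5550100"):
        print(address, "->", select_network(address))
```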

10. Granted invention patent

US08667049B2 Massively parallel supercomputer (In force)
Publication number: US08667049B2
Publication date: 2014-03-04
Application number: US13566024
Filing date: 2012-08-03
Applicants: Matthias A. Blumrich , Dong Chen , George L. Chiu , Thomas M. Cipolla , Paul W. Coteus , Alan G. Gara , Mark E. Giampapa , Philip Heidelberger , Gerard V. Kopcsay , Lawrence S. Mok , Todd E. Takken
Inventors: Matthias A. Blumrich , Dong Chen , George L. Chiu , Thomas M. Cipolla , Paul W. Coteus , Alan G. Gara , Mark E. Giampapa , Philip Heidelberger , Gerard V. Kopcsay , Lawrence S. Mok , Todd E. Takken
IPC classification: G06F15/173
CPC classification: H05K7/20836 , F24F11/77 , G06F9/52 , G06F9/526 , G06F15/17381 , G06F17/142 , G09G5/008 , H04L7/0338
Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements each of which consists of a central processing unit (CPU) and a plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node individually or simultaneously work on any combination of computation or communication as required by the particular algorithm being solved. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximize packet communications throughput and minimize latency. The multiple networks include three high-speed networks for parallel algorithm message passing including a Torus, Global Tree, and a Global Asynchronous network that provides global barrier and notification functions.
