Quick Patent Search - Fast Search of Global Patents, Free Commercial-Use Patent Database - IPRDB

41. Granted invention patent

US07877759B2 System for efficient performance monitoring of a large number of simultaneous events (Status: Active)
Publication (announcement) number: US07877759B2
Publication (announcement) date: 2011-01-25
Application number: US12324254
Application date: 2008-11-26
Applicants: Alan G. Gara, Michael K. Gschwind, Valentina Salapura
Inventors: Alan G. Gara, Michael K. Gschwind, Valentina Salapura
IPC classification: G06F3/00, G06F9/44, G06F9/46, G06F13/00
CPC classification: G06F11/348, Y02D10/34
Abstract: A system for monitoring a large number of simultaneous events implements a hybrid counter array device having a first counter portion comprising counter devices, each counter device for receiving signals representing occurrences of events from an event source and providing a first count value corresponding to the lower-order bits of the hybrid counter array. A second counter portion comprises a memory array device having addressable memory locations in correspondence with the counter devices, each addressable memory location for storing a second count value representing higher-order bits. A control device monitors each of the counter devices and initiates updating of the corresponding second count value stored at the corresponding addressable memory location. The system includes interrupt pre-indication for providing a fast interrupt trigger to a processor device when a count value related to an event equals a threshold value.

42. Granted invention patent

US07817585B2 Data capture technique for high speed signaling (Status: Active)
Publication (announcement) number: US07817585B2
Publication (announcement) date: 2010-10-19
Application number: US12191893
Application date: 2008-08-14
Applicants: Wayne M. Barrett, Dong Chen, Paul W. Coteus, Alan G. Gara, Rory Jackson, Gerard V. Kopcsay, Ben J. Nathanson, Pavlos M. Vranas
Inventors: Wayne M. Barrett, Dong Chen, Paul W. Coteus, Alan G. Gara, Rory Jackson, Gerard V. Kopcsay, Ben J. Nathanson, Pavlos M. Vranas
IPC classification: H04L5/14
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: A data capture technique for high speed signaling to allow for optimal sampling of an asynchronous data stream. This technique allows for extremely high data rates and does not require that a clock be sent with the data as is done in source synchronous systems. The present invention also provides a hardware mechanism for automatically adjusting transmission delays for optimal two-bit simultaneous bi-directional (SiBiDi) signaling.

43. Granted invention patent

US07797503B2 Configurable memory system and method for providing atomic counting operations in a memory device (Status: Active)
Publication (announcement) number: US07797503B2
Publication (announcement) date: 2010-09-14
Application number: US11768812
Application date: 2007-06-26
Applicants: Ralph E. Bellofatto, Alan G. Gara, Mark E. Giampapa, Martin Ohmacht
Inventors: Ralph E. Bellofatto, Alan G. Gara, Mark E. Giampapa, Martin Ohmacht
IPC classification: G06F12/00
CPC classification: G06F12/10, G06F9/3004, G06F9/30185, G06F9/3834, G06F9/3851, G06F9/3861, G06F12/0292, G06F12/1027, G06F2212/1044, G06F2212/206
Abstract: A memory system and method for providing atomic memory-based counter operations to operating systems and applications that make most efficient use of counter-backing memory and virtual and physical address space, while simplifying operating system memory management, and enabling the counter-backing memory to be used for purposes other than counter-backing storage when desired. The encoding and address decoding enabled by the invention provides all this functionality through a combination of software and hardware.

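44. Granted invention patent

US07782995B2 Low latency counter event indication (Status: Active)
Publication (announcement) number: US07782995B2
Publication (announcement) date: 2010-08-24
Application number: US12130724
Application date: 2008-05-30
Applicants: Alan G. Gara, Valentina Salapura
Inventors: Alan G. Gara, Valentina Salapura
IPC classification: G06M3/00
CPC classification: H03K23/54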
Abstract: A hybrid counter array device for counting events with interrupt indication includes a first counter portion comprising N counter devices, each for counting signals representing event occurrences and providing a first count value representing lower order bits. An overflow bit device associated with each respective counter device is additionally set in response to an overflow condition. The hybrid counter array includes a second counter portion comprising a memory array device having N addressable memory locations in correspondence with the N counter devices, each addressable memory location for storing a second count value representing higher order bits. An operatively coupled control device monitors each associated overflow bit device and initiates incrementing a second count value stored at a corresponding memory location in response to a respective overflow bit being set. The incremented second count value is compared to an interrupt threshold value stored in a threshold register, and, when the second counter value is equal to the interrupt threshold value, a corresponding “interrupt arm” bit is set to enable a fast interrupt indication. On a subsequent roll-over of the lower bits of that counter, the interrupt will be fired.
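
The counting scheme described in the abstract above can be illustrated with a small software model. This is only a hedged sketch, not the patented hardware: the class name `HybridCounterArray`, its method names, and the parameters `low_bits` and `threshold_high` are all hypothetical, and the model assumes the control device services each overflow bit before the same low-order counter rolls over again.

```python
class HybridCounterArray:
    """Toy model: N narrow low-order counters plus a memory array of high-order bits."""

    def __init__(self, n, low_bits=4, threshold_high=None):
        self.low_bits = low_bits
        self.low = [0] * n            # first counter portion (low-order bits)
        self.overflow = [False] * n   # one overflow bit per counter
        self.high = [0] * n           # second counter portion (memory array)
        self.armed = [False] * n      # "interrupt arm" bits
        self.threshold_high = threshold_high  # models the threshold register
        self.interrupts = []          # indices of counters whose interrupt fired

    def count_event(self, i):
        """Count one event on counter i; a roll-over sets the overflow bit."""
        self.low[i] = (self.low[i] + 1) % (1 << self.low_bits)
        if self.low[i] == 0:          # low-order bits rolled over
            if self.armed[i]:         # armed earlier: fire without reading memory
                self.interrupts.append(i)
                self.armed[i] = False
            self.overflow[i] = True

    def service_overflows(self):
        """Control device: fold pending overflows into the high-order counts."""
        for i, pending in enumerate(self.overflow):
            if pending:
                self.overflow[i] = False
                self.high[i] += 1
                if self.high[i] == self.threshold_high:
                    self.armed[i] = True   # next roll-over fires the interrupt

    def value(self, i):
        """Full count, including an overflow that has not been serviced yet."""
        high = self.high[i] + (1 if self.overflow[i] else 0)
        return (high << self.low_bits) | self.low[i]
```

With `low_bits=2` and `threshold_high=2`, and `service_overflows()` called after every event, the arm bit is set after the 8th event and the interrupt fires on the 12th, at the next roll-over of the low-order bits. The point of the arm bit in this sketch is that the firing decision needs only the small counter and one flag, not a read of the memory array.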

45. Invention patent application

US20090313439A1 MANAGING COHERENCE VIA PUT/GET WINDOWS (Status: Lapsed)
Publication (announcement) number: US20090313439A1
Publication (announcement) date: 2009-12-17
Application number: US12543890
Application date: 2009-08-19
Applicants: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht
Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht
IPC classification: G06F12/08
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activities required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.

46. Granted invention patent

US07603523B2 Method and apparatus for filtering snoop requests in a point-to-point interconnect architecture (Status: Lapsed)
Publication (announcement) number: US07603523B2
Publication (announcement) date: 2009-10-13
Application number: US12035085
Application date: 2008-02-21
Applicants: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
IPC classification: G06F12/08, G06F13/00
CPC classification: G06F12/0831, G06F12/084, Y02D10/13
Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each of the memory writing sources is directly connected to the dedicated input ports of all other snoop filter devices associated with all other processing units in a point-to-point interconnect fashion. Each snoop filter device includes a plurality of parallel operating port snoop filters in correspondence with the plurality of dedicated input ports that are adapted to concurrently filter snoop requests received from respective dedicated memory writing sources and forward a subset of those requests to its associated processing unit.

47. Granted invention patent

US07555566B2 Massively parallel supercomputer (Status: Active)
Publication (announcement) number: US07555566B2
Publication (announcement) date: 2009-06-30
Application number: US10468993
Application date: 2002-02-25
Applicants: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
IPC classification: G06F15/16
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements, each of which consists of a central processing unit (CPU) and a plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node may be used individually or simultaneously to work on any combination of computation or communication as required by the particular algorithm being solved or executed at any point in time. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that maximize packet communications throughput and minimize latency. In the preferred embodiment, the multiple networks include three high-speed networks for parallel algorithm message passing, including a Torus, a Global Tree, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. For particular classes of parallel algorithms, or parts of parallel calculations, this architecture exhibits exceptional computational performance, and may be enabled to perform calculations for new classes of parallel algorithms. Additional networks are provided for external connectivity and used for Input/Output, System Management and Configuration, and Debug and Monitoring functions. Special node packaging techniques implementing midplane and other hardware devices facilitate partitioning of the supercomputer in multiple networks for optimizing supercomputing resources.

48. Granted invention patent

US07529895B2 Method for prefetching non-contiguous data structures (Status: Lapsed)
Publication (announcement) number: US07529895B2
Publication (announcement) date: 2009-05-05
Application number: US11617276
Application date: 2006-12-28
Applicants: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
IPC classification: G06F13/28
CPC classification: G06F12/0862, G06F9/52, G06F2212/6028
Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
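
The pointer-per-line prefetching idea in the abstract above can be sketched in software as follows. This is a hedged illustration only: the class `PointerPrefetchMemory` and its fields are hypothetical, and the rule used to set the pointers (each line remembers the line that was accessed immediately after it) is an assumption, since the abstract does not say how the pointers are maintained.

```python
class PointerPrefetchMemory:
    """Toy model: every memory line carries a pointer to the line to prefetch next."""

    def __init__(self, num_lines):
        self.next_ptr = [None] * num_lines  # per-line pointer stored with the data
        self.prefetched = None              # line currently held by the prefetcher
        self.last = None                    # previously accessed line
        self.hits = 0
        self.misses = 0

    def access(self, line):
        # The access is a hit if the prefetcher already fetched this line.
        if line == self.prefetched:
            self.hits += 1
        else:
            self.misses += 1
        # Assumed training rule: the previous line now points at this one.
        if self.last is not None:
            self.next_ptr[self.last] = line
        self.last = line
        # Follow this line's pointer to decide what to prefetch next.
        self.prefetched = self.next_ptr[line]


mem = PointerPrefetchMemory(num_lines=8)
for _ in range(3):
    for line in (5, 1, 7, 3):   # non-contiguous but repetitive pattern
        mem.access(line)
```

In this run the first pass over the pattern misses on every access while the pointers are being recorded, and later passes hit on nearly every access, which is the behavior the abstract describes for access patterns that are non-contiguous but repetitive.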

49. Invention patent application

US20090006894A1 METHOD AND APPARATUS TO DEBUG AN INTEGRATED CIRCUIT CHIP VIA SYNCHRONOUS CLOCK STOP AND SCAN (Status: Lapsed)
Publication (announcement) number: US20090006894A1
Publication (announcement) date: 2009-01-01
Application number: US11768791
Application date: 2007-06-26
Applicants: Ralph E. Bellofatto, Matthew R. Ellavsky, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf A. Haring, Lance G. Hehenberger, Martin Ohmacht
Inventors: Ralph E. Bellofatto, Matthew R. Ellavsky, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf A. Haring, Lance G. Hehenberger, Martin Ohmacht
IPC classification: G06F11/00
CPC classification: G06F11/2236
Abstract: An apparatus and method for evaluating a state of an electronic or integrated circuit (IC), each IC including one or more processor elements for controlling operations of IC sub-units, and each IC supporting multiple frequency clock domains. The method comprises: generating a synchronized set of enable signals in correspondence with one or more IC sub-units for starting operation of one or more IC sub-units according to a determined timing configuration; counting, in response to one signal of the synchronized set of enable signals, a number of main processor IC clock cycles; and, upon attaining a desired clock cycle number, generating a stop signal for each unique frequency clock domain to synchronously stop a functional clock for each respective frequency clock domain; and, upon synchronously stopping all on-chip functional clocks on all frequency clock domains in a deterministic fashion, scanning out data values at a desired IC chip state. The apparatus and methodology enable construction of a cycle-by-cycle view of any part of the state of a running IC chip, using a combination of on-chip circuitry and software.

50. Invention patent application

US20090006873A1 POWER THROTTLING OF COLLECTIONS OF COMPUTING ELEMENTS (Status: Lapsed)
Publication (announcement) number: US20090006873A1
Publication (announcement) date: 2009-01-01
Application number: US11768752
Application date: 2007-06-26
Applicants: Ralph E. Bellofatto, Paul W. Coteus, Paul G. Crumley, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf Haring, Mark G. Megerian, Martin Ohmacht, Don D. Reed, Richard A. Swetz, Todd Takken
Inventors: Ralph E. Bellofatto, Paul W. Coteus, Paul G. Crumley, Alan G. Gara, Mark E. Giampapa, Thomas M. Gooding, Rudolf Haring, Mark G. Megerian, Martin Ohmacht, Don D. Reed, Richard A. Swetz, Todd Takken
IPC classification: G06F1/26
CPC classification: G06F1/3203, G06F1/206
Abstract: An apparatus and method for controlling power usage in a computer includes a plurality of computers communicating with a local control device, and a power source supplying power to the local control device and the computer. A plurality of sensors communicate with the computer for ascertaining power usage of the computer, and a system control device communicates with the computer for controlling power usage of the computer.
