    • 31. Patent application
    • Title: Method and system for managing cache injection in a multiprocessor system
    • Publication number: US20060064518A1
    • Publication date: 2006-03-23
    • Application number: US10948407
    • Filing date: 2004-09-23
    • Inventors: Patrick Bohrer, Ahmed Gheith, Peter Hochschild, Ramakrishnan Rajamony, Hazim Shafi, Balaram Sinharoy
    • IPC: G06F13/28
    • CPC: G06F13/28
    • Abstract: A method and apparatus for managing cache injection in a multiprocessor system reduces the processing time associated with direct memory access (DMA) transfers in a symmetric multiprocessor (SMP) or non-uniform memory access (NUMA) environment. The method and apparatus either detect the target processor for DMA completion or direct DMA-completion processing to a particular processor, thereby enabling cache injection into a cache coupled to the processor that executes the DMA completion routine and processes the injected data. The target processor may be identified by determining which processor handles the interrupt that occurs on completion of the DMA transfer. Alternatively, or in conjunction with target-processor identification, an interrupt handler may queue a deferred procedure call to the target processor to process the transferred data. In NUMA multiprocessor systems, the completing processor/target memory is chosen for accessibility of the target memory to the processor and its associated cache.
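The core idea of the abstract above, injecting DMA data into the cache of whichever processor will run the completion routine, can be sketched as a small simulation. All names here (`Cpu`, `dma_complete`, the interrupt-routing convention) are illustrative assumptions, not taken from the patent itself.

```python
# Hypothetical simulation: on DMA completion, pick the target CPU as the
# one that took the completion interrupt, and inject the transferred data
# into that CPU's cache so it is warm when the completion routine runs.

class Cpu:
    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.cache = {}  # address -> data; stands in for a hardware cache

def dma_complete(cpus, interrupt_cpu_id, buffer_addr, data):
    """Identify the target CPU (here: the interrupt handler's CPU) and
    inject the DMA data into its cache, per the abstract's first scheme."""
    target = cpus[interrupt_cpu_id]
    target.cache[buffer_addr] = data  # cache injection
    return target

cpus = [Cpu(i) for i in range(4)]
target = dma_complete(cpus, interrupt_cpu_id=2, buffer_addr=0x1000, data=b"payload")
assert target.cpu_id == 2 and cpus[2].cache[0x1000] == b"payload"
```

The alternative scheme in the abstract (queuing a deferred procedure call to the target processor) would replace the direct injection with enqueuing work on the chosen CPU.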
    • 37. Patent application
    • Title: Single Thread Performance in an In-Order Multi-Threaded Processor
    • Publication number: US20110265068A1
    • Publication date: 2011-10-27
    • Application number: US12767886
    • Filing date: 2010-04-27
    • Inventors: Elmootazbellah N. Elnozahy, Ahmed Gheith
    • IPC: G06F9/45
    • CPC: G06F8/456
    • Abstract: A mechanism is provided for improving single-thread performance on a multi-threaded, in-order processor core. In a first phase, a compiler analyzes the application code to identify instructions that can be executed in parallel, focusing on instruction-level parallelism and removing any register interference between the threads. The compiler inserts, as appropriate, synchronization instructions supported by the apparatus to ensure that the resulting execution of the threads is equivalent to executing the application code in a single thread. In a second phase, an operating system schedules the threads produced in the first phase on the hardware threads of a single processor core so that they execute simultaneously. In a third phase, the microprocessor core executes the threads specified in the second phase such that each hardware thread executes one application thread.
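The equivalence requirement in the first phase can be illustrated with a toy example: two instruction sequences with no register interference run on separate threads, and a join serves as the synchronization point that makes the result identical to single-threaded execution. The function names and data are ours, purely for illustration.

```python
import threading

def run_split(a, b):
    """Run two interference-free sequences on separate threads; the joins
    are the synchronization ensuring single-thread-equivalent results."""
    out = {}
    def seq1():  # touches only state derived from a
        out["x"] = sum(v * v for v in a)
    def seq2():  # touches only state derived from b: no interference
        out["y"] = sum(v + 1 for v in b)
    t1 = threading.Thread(target=seq1)
    t2 = threading.Thread(target=seq2)
    t1.start(); t2.start()
    t1.join(); t2.join()        # synchronization point
    return out["x"] + out["y"]  # combine, as the single thread would

def run_single(a, b):
    return sum(v * v for v in a) + sum(v + 1 for v in b)

a, b = [1, 2, 3], [10, 20]
assert run_split(a, b) == run_single(a, b)  # 14 + 32 = 46
```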
    • 38. Patent application
    • Title: Architecture Support for Debugging Multithreaded Code
    • Publication number: US20110258421A1
    • Publication date: 2011-10-20
    • Application number: US12762817
    • Filing date: 2010-04-19
    • Inventors: Elmootazbellah N. Elnozahy, Ahmed Gheith
    • IPC: G06F9/44, G06F9/30
    • CPC: G06F9/3824, G06F11/3636, G06F11/3648
    • Abstract: Mechanisms are provided for debugging application code using a content-addressable memory. The mechanisms receive an instruction in a hardware unit of a processor of the data processing system, the instruction having a target memory address that it is attempting to access. A content-addressable memory (CAM) associated with the hardware unit is searched for an entry corresponding to the target memory address. In response to finding such an entry, a determination is made as to whether the information in the entry identifies the instruction as an instruction of interest. In response to the entry identifying the instruction as an instruction of interest, an exception is generated and sent to an exception handler or a debugger application. In this way, multithreaded applications may be debugged efficiently.
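The CAM-assisted watchpoint flow in the abstract can be modeled as an associative lookup keyed by memory address, with a per-entry filter deciding which instructions are "of interest". The names (`WatchException`, `cam`, `check_access`) are illustrative, not from the patent.

```python
class WatchException(Exception):
    """Stands in for the hardware exception delivered to a debugger."""

# CAM entries: target address -> set of instruction names considered
# "of interest" there (an empty set means any instruction matches).
cam = {0x2000: {"store"}, 0x3000: set()}

def check_access(instr, addr):
    entry = cam.get(addr)             # associative lookup by address
    if entry is None:
        return                        # no CAM hit: execute normally
    if not entry or instr in entry:   # entry identifies instr as of interest
        raise WatchException(f"{instr} touched {hex(addr)}")

check_access("load", 0x1000)          # no entry: silent
check_access("load", 0x2000)          # entry hit, but only "store" matters
try:
    check_access("store", 0x2000)     # hit and of interest: exception fires
except WatchException as e:
    caught = str(e)
assert caught == "store touched 0x2000"
```

In hardware the lookup is a single parallel CAM search rather than a dictionary access, which is what makes per-access checking cheap enough to leave enabled while multithreaded code runs.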
    • 39. Patent application
    • Title: Multithreaded processor architecture with implicit granularity adaptation
    • Publication number: US20060230409A1
    • Publication date: 2006-10-12
    • Application number: US11101608
    • Filing date: 2005-04-07
    • Inventors: Matteo Frigo, Ahmed Gheith, Volker Strumpen
    • IPC: G06F9/46
    • CPC: G06F9/4843
    • Abstract: A method and processor architecture for achieving a high level of concurrency and latency hiding in an "infinite-thread processor architecture" with a limited number of hardware threads is disclosed. A preferred embodiment defines "fork" and "join" instructions for spawning new threads, with a novel operational semantics. If a hardware thread is available to shepherd a forked thread, the fork and join instructions have thread creation and termination/synchronization semantics, respectively. If no hardware thread is available, however, the fork and join instructions assume subroutine call and return semantics, respectively. The processor's link register is used to determine whether a given join instruction should be treated as a thread synchronization operation or as a return from a subroutine.
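The adaptive fork/join semantics above can be sketched in software: `fork` spawns a real thread only while a "hardware thread" slot remains, and otherwise degrades to a plain subroutine call whose matching join is trivial. The names and the semaphore-based slot accounting are our assumptions; the patent uses the link register, not a semaphore, to make this distinction.

```python
import threading

MAX_HW_THREADS = 2                      # illustrative hardware-thread budget
_slots = threading.Semaphore(MAX_HW_THREADS)

def fork(fn, *args):
    """Return a join handle. If a hardware-thread slot is free, fn runs on
    a real thread and join synchronizes; otherwise fn runs inline now
    (subroutine-call semantics) and join simply returns its value."""
    if _slots.acquire(blocking=False):
        result = {}
        def run():
            try:
                result["value"] = fn(*args)
            finally:
                _slots.release()        # free the slot for later forks
        t = threading.Thread(target=run)
        t.start()
        return lambda: (t.join(), result["value"])[1]  # join = synchronize
    value = fn(*args)                   # no slot free: call semantics
    return lambda: value                # join = plain return

joins = [fork(lambda i=i: i * i) for i in range(5)]  # at most 2 run concurrently
assert [j() for j in joins] == [0, 1, 4, 9, 16]
```

Either path yields the same values, which is the point of the implicit adaptation: the program is written as if threads were unlimited, and granularity falls back to sequential calls only when hardware threads run out.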