Quick Patent Search - Fast search of global patents, free patent database for commercial use - IPRDB

1. Granted invention patent

US08370575B2 Optimized software cache lookup for SIMD architectures (in force)
Publication number: US08370575B2
Publication date: 2013-02-05
Application number: US11470638
Filing date: 2006-09-07
Applicants: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Tao Zhang
Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Tao Zhang
IPC classification: G06F12/00, G06F13/00, G06F13/28
CPC classification: G06F12/0864
Abstract: Process, cache memory, computer product and system for loading data associated with a requested address in a software cache. The process includes loading address tags associated with a set in a cache directory using a Single Instruction Multiple Data (SIMD) operation, determining a position of the requested address in the set using a SIMD comparison, and determining an actual data value associated with the position of the requested address in the set.
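
The three steps named in this abstract (load the tags of one set, compare them all against the requested address, read the data at the matching position) can be modeled in a few lines. The sketch below is only an illustration, not the patented implementation: the line size, set count, 4-way layout, and every class and method name are assumptions, and a Python list comparison stands in for a real SIMD compare instruction.

```python
# Illustrative model only: sizes and names are assumptions, not taken from the patent.
LINE_SIZE = 128   # bytes per cache line (assumed)
NUM_SETS = 64     # sets in the cache directory (assumed)
WAYS = 4          # tags per set, modeled as one 4-lane SIMD register (assumed)


class SoftwareCache:
    def __init__(self):
        # directory[set] holds WAYS address tags; None marks an empty way.
        self.directory = [[None] * WAYS for _ in range(NUM_SETS)]
        # data[set][way] holds the bytes of the cached line.
        self.data = [[None] * WAYS for _ in range(NUM_SETS)]

    @staticmethod
    def split(addr):
        """Break an address into (tag, set index, byte offset within the line)."""
        line = addr // LINE_SIZE
        return line // NUM_SETS, line % NUM_SETS, addr % LINE_SIZE

    def install(self, addr, way, line_bytes):
        """Place a line in a chosen way; a real cache would pick the way itself."""
        tag, set_idx, _ = self.split(addr)
        self.directory[set_idx][way] = tag
        self.data[set_idx][way] = line_bytes

    def lookup(self, addr):
        tag, set_idx, offset = self.split(addr)
        tags = self.directory[set_idx]          # step 1: one "vector load" of the set's tags
        mask = [t == tag for t in tags]         # step 2: lane-wise compare against the tag
        if not any(mask):
            return None                         # miss: a real cache would fetch the line here
        way = mask.index(True)                  # position of the matching lane
        return self.data[set_idx][way][offset]  # step 3: data value at that position
```

In this model a hit costs one directory read and one comparison over the whole set, which is the property the SIMD formulation is meant to provide; miss handling and replacement are deliberately left out.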

2. Granted invention patent

US07493452B2 Method to efficiently prefetch and batch compiler-assisted software cache accesses (lapsed)
Publication number: US07493452B2
Publication date: 2009-02-17
Application number: US11465522
Filing date: 2006-08-18
Applicants: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien
Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien
IPC classification: G06F12/00
CPC classification: G06F12/0862, G06F12/10, G06F2212/6028, G06F2212/6082
Abstract: A method to efficiently pre-fetch and batch compiler-assisted software cache accesses is provided. The method reduces the overhead associated with software cache directory accesses. With the method, the local memory address of the cache line that stores the pre-fetched data is itself cached, such as in a register or well known location in local memory, so that a later data access does not need to perform address translation and software cache operations and can instead access the data directly from the software cache using the cached local memory address. This saves processor cycles that would otherwise be required to perform the address translation a second time when the data is to be used. Moreover, the system and method directly enable software cache accesses to be effectively decoupled from address translation in order to increase the overlap between computation and communication.
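
The idea in this abstract, translating an address once at prefetch time and reusing the resulting local address later, can be sketched as follows. This is a toy model under stated assumptions: the class, its method names, the tiny line size, and the `translations` counter are all hypothetical, and there is no eviction, so a saved local address never goes stale here as it could in a real cache.

```python
LINE_SIZE = 4  # elements per cache line (assumed, kept tiny for illustration)


class SoftwareCache:
    def __init__(self, global_memory):
        self.global_memory = global_memory
        self.local_store = []   # models local memory holding the cached lines
        self.directory = {}     # global line number -> index into local_store
        self.translations = 0   # counts directory lookups, for illustration only

    def _translate(self, addr):
        """Directory lookup: map a global address to a local one, fetching on a miss."""
        self.translations += 1
        line_no = addr // LINE_SIZE
        if line_no not in self.directory:
            start = line_no * LINE_SIZE
            self.local_store.append(self.global_memory[start:start + LINE_SIZE])
            self.directory[line_no] = len(self.local_store) - 1
        return self.directory[line_no], addr % LINE_SIZE

    def prefetch(self, addr):
        """Translate once and hand the local address back so the caller can keep it."""
        return self._translate(addr)

    def read_direct(self, local_addr):
        """Use a saved local address: no directory access and no translation."""
        line_idx, offset = local_addr
        return self.local_store[line_idx][offset]

    def read(self, addr):
        """Conventional access: translate on every use."""
        return self.read_direct(self._translate(addr))
```

A caller would issue a batch of `prefetch` calls up front, keep the returned local addresses in variables (the analogue of registers), and later call `read_direct` on each one; the counter then shows one translation per address instead of two.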

3. Granted invention patent

US08370817B2 Optimizing scalar code executed on a SIMD engine by alignment of SIMD slots (lapsed)
Publication number: US08370817B2
Publication date: 2013-02-05
Application number: US12127491
Filing date: 2008-05-27
Applicants: Alexandre E. Eichenberger, John Kevin Patrick O'Brien
Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien
IPC classification: G06F9/45, G06F15/00
CPC classification: G06F9/3885, G06F9/30032, G06F9/30036, G06F9/30109, G06F9/3824
Abstract: A mechanism is provided for optimizing scalar code executed on a single instruction multiple data (SIMD) engine by aligning the slots of SIMD registers. With the mechanism, a compiler is provided that parses source code and, for each statement in the program, generates an expression tree. The compiler inspects all storage inputs to scalar operations in the expression tree to determine their alignment in the SIMD registers. This alignment is propagated up the expression tree from the leaves. When the alignments of two operands in the expression tree are the same, the resulting alignment is the shared value. When the alignments of two operands in the expression tree are different, one operand is shifted. For shifted operands, a shift operation is inserted in the expression tree. The executable code is then generated for the expression tree and shifts are inserted where indicated.
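
The bottom-up alignment propagation described in this abstract can be sketched on a small expression tree. The node classes, the `align` function, and the policy of always shifting the right operand are assumptions made for illustration; the abstract only says that one operand is shifted when the two alignments differ.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    name: str
    slot: int          # SIMD slot the scalar occupies after its load (assumed known)


@dataclass
class Shift:
    operand: "Node"
    from_slot: int
    to_slot: int


@dataclass
class Op:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Leaf, Shift, Op]


def align(node):
    """Return (rewritten tree, slot of its result), inserting shifts where slots differ."""
    if isinstance(node, Leaf):
        return node, node.slot
    left, left_slot = align(node.left)
    right, right_slot = align(node.right)
    if left_slot != right_slot:
        # Assumed policy: always move the right operand into the left operand's slot.
        right = Shift(right, right_slot, left_slot)
    # When the slots already agree, the result simply inherits the shared slot.
    return Op(node.op, left, right), left_slot
```

For `(a * b) + c` with `a` and `b` in slot 0 and `c` in slot 2, the rewritten tree contains a single `Shift` around `c`, and the multiplication is left untouched because its operands already share a slot.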

4. Invention patent application

US20080222391A1 Apparatus and Method for Optimizing Scalar Code Executed on a SIMD Engine by Alignment of SIMD Slots (lapsed)
Publication number: US20080222391A1
Publication date: 2008-09-11
Application number: US12127491
Filing date: 2008-05-27
Applicants: Alexandre E. Eichenberger, John Kevin Patrick O'Brien
Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien
IPC classification: G06F9/44, G06F9/30, G06F9/315
CPC classification: G06F9/3885, G06F9/30032, G06F9/30036, G06F9/30109, G06F9/3824
Abstract: An apparatus and method for optimizing scalar code executed on a single instruction multiple data (SIMD) engine is provided that aligns the slots of SIMD registers. With the apparatus and method, a compiler is provided that parses source code and, for each statement in the program, generates an expression tree. The compiler inspects all storage inputs to scalar operations in the expression tree to determine their alignment in the SIMD registers. This alignment is propagated up the expression tree from the leaves. When the alignments of two operands in the expression tree are the same, the resulting alignment is the shared value. When the alignments of two operands in the expression tree are different, one operand is shifted. For shifted operands, a shift operation is inserted in the expression tree. The executable code is then generated for the expression tree and shifts are inserted where indicated.

5. Granted invention patent

US07386842B2 Efficient data reorganization to satisfy data alignment constraints (lapsed)
Publication number: US07386842B2
Publication date: 2008-06-10
Application number: US10862483
Filing date: 2004-06-07
Applicants: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Peng Wu
Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Peng Wu
IPC classification: G06F9/45
CPC classification: G06F8/4452
Abstract: An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In the framework presented herein, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirement of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residue iteration counts, and multiple statements with arbitrary alignment combinations. Beyond generating a valid simdization, a preferred embodiment further improves the quality of the generated codes. Four stream-shift placement policies are disclosed, which minimize the number of data reorganization generated by the alignment handling.
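
The kind of data reorganization this abstract refers to can be illustrated with a small model of hardware that only performs aligned loads. The sketch below is not the patented algorithm and does not implement any of the four placement policies: the register width, the function names, and the simple "shift every misaligned stream at its load" choice are assumptions, and Python lists stand in for SIMD registers.

```python
V = 4  # elements per SIMD register (assumed)


def aligned_load(memory, index):
    """Model hardware that only loads at V-aligned boundaries: the index is truncated."""
    base = (index // V) * V
    return memory[base:base + V]


def stream_shift_left(first, second, offset):
    """Combine two consecutive aligned registers and keep V elements starting at offset."""
    return (first + second)[offset:offset + V]


def load_misaligned(memory, index):
    """Produce memory[index:index + V] using only aligned loads plus one shift."""
    offset = index % V
    first = aligned_load(memory, index)
    if offset == 0:
        return first
    second = aligned_load(memory, index + V)
    return stream_shift_left(first, second, offset)


def simd_add(a, b, out, a_start, b_start, n):
    """Compute out[i] = a[a_start + i] + b[b_start + i] for i < n, V elements at a time.

    Assumes n is a multiple of V, out is written from aligned index 0, and both
    inputs extend far enough that every aligned load returns a full register.
    """
    for i in range(0, n, V):
        va = load_misaligned(a, a_start + i)
        vb = load_misaligned(b, b_start + i)
        out[i:i + V] = [x + y for x, y in zip(va, vb)]
```

Each misaligned stream costs an extra aligned load and one shift per register in this naive form; reducing that overhead across a whole loop is the problem the placement policies in the abstract address.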

6. Invention patent application

US20080077930A1 Workload Partitioning in a Parallel System with Hetergeneous Alignment Constraints (in force)
Publication number: US20080077930A1
Publication date: 2008-03-27
Application number: US11535172
Filing date: 2006-09-26
Applicants: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien, Tong Chen
Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien, Tong Chen
IPC classification: G06F9/46
CPC classification: G06F8/456, G06F8/452, G06F9/5066
Abstract: A process, compiler, computer program product and system for workload partitioning in a heterogeneous system. The process includes determining heterogeneous alignment constraints in the workload, partitioning a portion of tasks to a processing element sensitive to alignment constraints, and partitioning a remaining portion of tasks to a processing element not sensitive to alignment constraints.
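
One possible reading of this partitioning, offered purely as an assumption since the abstract does not spell out a policy, is to give the alignment-sensitive processing element the part of a loop's iteration range that starts and ends on an alignment boundary and to leave the unaligned head and tail to the other element. The function name, its parameters, and the boundary-rounding rule below are all hypothetical.

```python
def partition(start, end, boundary):
    """Split iterations [start, end) into an aligned body and unaligned remainders.

    Returns (sensitive, insensitive): the range for the alignment-sensitive element,
    or None if no aligned body exists, and the list of leftover ranges for the
    element that has no alignment constraints.
    """
    body_start = -(-start // boundary) * boundary  # round start up to a boundary
    body_end = (end // boundary) * boundary        # round end down to a boundary
    if body_start >= body_end:
        return None, [(start, end)]                # nothing aligned: keep all work together
    leftovers = [r for r in ((start, body_start), (body_end, end)) if r[0] < r[1]]
    return (body_start, body_end), leftovers
```

With a boundary of 16 iterations, the range 3 to 70 yields an aligned body of 16 to 64 for the sensitive element and the two pieces 3 to 16 and 64 to 70 for the other one.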

7. Invention patent application

US20080046657A1 System and Method to Efficiently Prefetch and Batch Compiler-Assisted Software Cache Accesses (lapsed)
Publication number: US20080046657A1
Publication date: 2008-02-21
Application number: US11465522
Filing date: 2006-08-18
Applicants: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien
Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien
IPC classification: G06F12/00
CPC classification: G06F12/0862, G06F12/10, G06F2212/6028, G06F2212/6082
Abstract: A system and method to efficiently pre-fetch and batch compiler-assisted software cache accesses are provided. The system and method reduce the overhead associated with software cache directory accesses. With the system and method, the local memory address of the cache line that stores the pre-fetched data is itself cached, such as in a register or well known location in local memory, so that a later data access does not need to perform address translation and software cache operations and can instead access the data directly from the software cache using the cached local memory address. This saves processor cycles that would otherwise be required to perform the address translation a second time when the data is to be used. Moreover, the system and method directly enable software cache accesses to be effectively decoupled from address translation in order to increase the overlap between computation and communication.

8. Granted invention patent

US08146067B2 Efficient data reorganization to satisfy data alignment constraints (in force)
Publication number: US08146067B2
Publication date: 2012-03-27
Application number: US12108056
Filing date: 2008-04-23
Applicants: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Peng Wu
Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Peng Wu
IPC classification: G06F9/45
CPC classification: G06F8/4452
Abstract: Vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores is presented. In the framework presented herein, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirement of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residue iteration counts, and multiple statements with arbitrary alignment combinations. Beyond generating a valid simdization, a preferred embodiment further improves the quality of the generated codes. Four stream-shift placement policies are disclosed, which minimize the number of data reorganization generated by the alignment handling.

9. Granted invention patent

US08006238B2 Workload partitioning in a parallel system with hetergeneous alignment constraints (in force)
Publication number: US08006238B2
Publication date: 2011-08-23
Application number: US11535172
Filing date: 2006-09-26
Applicants: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien, Tong Chen
Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien, Tong Chen
IPC classification: G06F9/45, G06F9/40, G06F9/46
CPC classification: G06F8/456, G06F8/452, G06F9/5066
Abstract: A process, compiler, computer program product and system for workload partitioning in a heterogeneous system. The process includes determining heterogeneous alignment constraints in the workload, partitioning a portion of tasks to a processing element sensitive to alignment constraints, and partitioning a remaining portion of tasks to a processing element not sensitive to alignment constraints.

10. Invention patent application

US20080201699A1 Efficient Data Reorganization to Satisfy Data Alignment Constraints (in force)
Publication number: US20080201699A1
Publication date: 2008-08-21
Application number: US12108056
Filing date: 2008-04-23
Applicants: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Peng Wu
Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Peng Wu
IPC classification: G06F9/45
CPC classification: G06F8/4452
Abstract: Vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores is presented. In the framework presented herein, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirement of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residue iteration counts, and multiple statements with arbitrary alignment combinations. Beyond generating a valid simdization, a preferred embodiment further improves the quality of the generated codes. Four stream-shift placement policies are disclosed, which minimize the number of data reorganization generated by the alignment handling.
