    • 2. Granted patent
    • Method to efficiently prefetch and batch compiler-assisted software cache accesses
    • US07493452B2
    • 2009-02-17
    • US11465522
    • 2006-08-18
    • Alexandre E. Eichenberger; John Kevin Patrick O'Brien; Kathryn M. O'Brien
    • G06F12/00
    • G06F12/0862; G06F12/10; G06F2212/6028; G06F2212/6082
    • A method to efficiently pre-fetch and batch compiler-assisted software cache accesses is provided. The method reduces the overhead associated with software cache directory accesses. With the method, the local memory address of the cache line that stores the pre-fetched data is itself cached, such as in a register or well known location in local memory, so that a later data access does not need to perform address translation and software cache operations and can instead access the data directly from the software cache using the cached local memory address. This saves processor cycles that would otherwise be required to perform the address translation a second time when the data is to be used. Moreover, the system and method directly enable software cache accesses to be effectively decoupled from address translation in order to increase the overlap between computation and communication.
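The idea in the abstract above, translating a global address once at prefetch time and keeping the resulting local address for later use, can be illustrated with a small sketch. It is not the patented implementation: the class `SoftwareCache`, its methods `prefetch` and `load_local`, the direct-mapped layout, and the line sizes are all hypothetical choices made for illustration.

```python
class SoftwareCache:
    """Toy direct-mapped software cache over a 'global' memory list (illustrative only)."""

    def __init__(self, global_memory, num_lines=4, line_size=4):
        self.global_memory = global_memory
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines               # directory: global line held by each slot
        self.local = [0] * (num_lines * line_size)   # local store holding the cached lines
        self.directory_lookups = 0                   # counts address translations

    def prefetch(self, global_addr):
        """Translate once, fetch the line on a miss, and return the local address."""
        self.directory_lookups += 1
        line_no, offset = divmod(global_addr, self.line_size)
        slot = line_no % self.num_lines
        if self.tags[slot] != line_no:               # miss: copy the line into local store
            base = line_no * self.line_size
            start = slot * self.line_size
            self.local[start:start + self.line_size] = (
                self.global_memory[base:base + self.line_size]
            )
            self.tags[slot] = line_no
        return slot * self.line_size + offset        # local address kept by the caller

    def load_local(self, local_addr):
        """Use a previously saved local address: no directory access, no translation."""
        return self.local[local_addr]


memory = list(range(100, 132))                       # 32 words of "global" memory
cache = SoftwareCache(memory)

# Batch the prefetches up front and keep the returned local addresses.
handles = [cache.prefetch(a) for a in (5, 6, 9)]
# Later uses go straight to local store through the saved addresses.
values = [cache.load_local(h) for h in handles]
```

In this sketch the directory is consulted three times, once per prefetch, and never again at the point of use. A real scheme would also have to guarantee that a prefetched line is not evicted before it is used; the toy version ignores that and relies on the three addresses not conflicting.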
    • 3. Granted patent
    • Optimizing scalar code executed on a SIMD engine by alignment of SIMD slots
    • US08370817B2
    • 2013-02-05
    • US12127491
    • 2008-05-27
    • Alexandre E. Eichenberger; John Kevin Patrick O'Brien
    • G06F9/45; G06F15/00
    • G06F9/3885; G06F9/30032; G06F9/30036; G06F9/30109; G06F9/3824
    • A mechanism is provided for optimizing scalar code executed on a single instruction multiple data (SIMD) engine by aligning the slots of SIMD registers. With the mechanism, a compiler is provided that parses source code and, for each statement in the program, generates an expression tree. The compiler inspects all storage inputs to scalar operations in the expression tree to determine their alignment in the SIMD registers. This alignment is propagated up the expression tree from the leaves. When the alignments of two operands in the expression tree are the same, the resulting alignment is the shared value. When the alignments of two operands in the expression tree are different, one operand is shifted. For shifted operands, a shift operation is inserted in the expression tree. The executable code is then generated for the expression tree and shifts are inserted where indicated.
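The bottom-up alignment propagation described in the abstract above can be sketched roughly as follows. This is a simplified illustration, not the patented compiler pass: the `Node` class, the `align` and `count_shifts` helpers, and the policy of always shifting the right operand are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    op: str                          # "load", "shift", or an arithmetic op such as "+"
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    slot: int = 0                    # SIMD slot that holds the scalar value
    name: str = ""


def align(node):
    """Propagate slot alignment from the leaves up; insert a shift where operands disagree."""
    if node.op == "load":
        return node                  # leaf: slot is fixed by the storage location
    node.left = align(node.left)
    node.right = align(node.right)
    if node.left.slot != node.right.slot:
        # Assumed policy: always move the right operand into the left operand's slot.
        node.right = Node("shift", left=node.right, slot=node.left.slot)
    node.slot = node.left.slot       # operands now agree, so the result shares that slot
    return node


def count_shifts(node):
    """Count the shift nodes that align() inserted into the tree."""
    if node is None:
        return 0
    return int(node.op == "shift") + count_shifts(node.left) + count_shifts(node.right)


# a + (b * c), with a in slot 0 and b, c both in slot 2 of their registers
a = Node("load", slot=0, name="a")
b = Node("load", slot=2, name="b")
c = Node("load", slot=2, name="c")
tree = align(Node("+", a, Node("*", b, c)))
```

Here `b * c` needs no shift because both operands already sit in slot 2, while the addition needs exactly one shift to bring the product into slot 0. Which operand to shift is a policy decision that the abstract leaves open; the fixed choice above is only for brevity.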
    • 4. Patent application
    • Apparatus and Method for Optimizing Scalar Code Executed on a SIMD Engine by Alignment of SIMD Slots
    • US20080222391A1
    • 2008-09-11
    • US12127491
    • 2008-05-27
    • Alexandre E. Eichenberger; John Kevin Patrick O'Brien
    • G06F9/44; G06F9/30; G06F9/315
    • G06F9/3885; G06F9/30032; G06F9/30036; G06F9/30109; G06F9/3824
    • An apparatus and method for optimizing scalar code executed on a single instruction multiple data (SIMD) engine is provided that aligns the slots of SIMD registers. With the apparatus and method, a compiler is provided that parses source code and, for each statement in the program, generates an expression tree. The compiler inspects all storage inputs to scalar operations in the expression tree to determine their alignment in the SIMD registers. This alignment is propagated up the expression tree from the leaves. When the alignments of two operands in the expression tree are the same, the resulting alignment is the shared value. When the alignments of two operands in the expression tree are different, one operand is shifted. For shifted operands, a shift operation is inserted in the expression tree. The executable code is then generated for the expression tree and shifts are inserted where indicated.
    • 5. Granted patent
    • Efficient data reorganization to satisfy data alignment constraints
    • US07386842B2
    • 2008-06-10
    • US10862483
    • 2004-06-07
    • Alexandre E. Eichenberger; John Kevin Patrick O'Brien; Peng Wu
    • G06F9/45
    • G06F8/4452
    • An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In the framework presented herein, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirement of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residue iteration counts, and multiple statements with arbitrary alignment combinations. Beyond generating a valid simdization, a preferred embodiment further improves the quality of the generated codes. Four stream-shift placement policies are disclosed, which minimize the number of data reorganization generated by the alignment handling.
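The core data reorganization step described in the abstract above, building a misaligned vector from aligned loads plus a shift, can be modeled in a few lines. This is only a sketch under stated assumptions: the vector length `V`, the helpers `aligned_load`, `stream_shift_load`, and `simdized_add`, and the example loop `c[i] = a[i+1] + b[i+2]` are hypothetical, and the code ignores store alignment, runtime alignments, residue iterations, and the shift placement policies the abstract mentions.

```python
V = 4  # assumed vector length in elements


def aligned_load(mem, i):
    """Model hardware that can only load whole vectors at multiples of V."""
    base = (i // V) * V
    return mem[base:base + V]


def stream_shift_load(mem, i):
    """Produce mem[i:i+V] from at most two aligned loads plus a shift (select)."""
    offset = i % V
    lo = aligned_load(mem, i)
    if offset == 0:
        return lo                        # already aligned: no reorganization needed
    hi = aligned_load(mem, i + V)        # next aligned vector in the stream
    return (lo + hi)[offset:offset + V]  # select V elements across the two vectors


def simdized_add(a, b, n):
    """Vector version of: for i in range(n): c[i] = a[i+1] + b[i+2] (n a multiple of V)."""
    c = []
    for i in range(0, n, V):
        va = stream_shift_load(a, i + 1)
        vb = stream_shift_load(b, i + 2)
        c.extend(x + y for x, y in zip(va, vb))
    return c


a = list(range(16))
b = [10 * x for x in range(16)]
result = simdized_add(a, b, 8)
expected = [a[i + 1] + b[i + 2] for i in range(8)]
```

The loop body is written as if misaligned loads were available, and the alignment handling is confined to `stream_shift_load`, which mirrors the two-phase structure the abstract describes. Each misaligned load here reloads the neighboring aligned vector on every iteration; reducing that kind of redundant reorganization is what the placement policies are said to address.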
    • 7. Patent application
    • System and Method to Efficiently Prefetch and Batch Compiler-Assisted Software Cache Accesses
    • US20080046657A1
    • 2008-02-21
    • US11465522
    • 2006-08-18
    • Alexandre E. Eichenberger; John Kevin Patrick O'Brien; Kathryn M. O'Brien
    • G06F12/00
    • G06F12/0862; G06F12/10; G06F2212/6028; G06F2212/6082
    • A system and method to efficiently pre-fetch and batch compiler-assisted software cache accesses are provided. The system and method reduce the overhead associated with software cache directory accesses. With the system and method, the local memory address of the cache line that stores the pre-fetched data is itself cached, such as in a register or well known location in local memory, so that a later data access does not need to perform address translation and software cache operations and can instead access the data directly from the software cache using the cached local memory address. This saves processor cycles that would otherwise be required to perform the address translation a second time when the data is to be used. Moreover, the system and method directly enable software cache accesses to be effectively decoupled from address translation in order to increase the overlap between computation and communication.
    • 8. Granted patent
    • Efficient data reorganization to satisfy data alignment constraints
    • US08146067B2
    • 2012-03-27
    • US12108056
    • 2008-04-23
    • Alexandre E. Eichenberger; John Kevin Patrick O'Brien; Peng Wu
    • G06F9/45
    • G06F8/4452
    • Vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores is presented. In the framework presented herein, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirement of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residue iteration counts, and multiple statements with arbitrary alignment combinations. Beyond generating a valid simdization, a preferred embodiment further improves the quality of the generated codes. Four stream-shift placement policies are disclosed, which minimize the number of data reorganization generated by the alignment handling.
    • 10. Patent application
    • Efficient Data Reorganization to Satisfy Data Alignment Constraints
    • US20080201699A1
    • 2008-08-21
    • US12108056
    • 2008-04-23
    • Alexandre E. Eichenberger; John Kevin Patrick O'Brien; Peng Wu
    • G06F9/45
    • G06F8/4452
    • Vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores is presented. In the framework presented herein, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirement of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residue iteration counts, and multiple statements with arbitrary alignment combinations. Beyond generating a valid simdization, a preferred embodiment further improves the quality of the generated codes. Four stream-shift placement policies are disclosed, which minimize the number of data reorganization generated by the alignment handling.