    • 2. Granted invention patent
    • SIMD compare instruction using permute logic for distributed register files
    • US09575753B2
    • 2017-02-21
    • US13420699
    • 2012-03-15
    • Alexandre E. Eichenberger; Bruce M. Fleischer
    • G06F9/30; G06F9/38
    • G06F9/30032; G06F9/30021; G06F9/30036; G06F9/3838
    • Mechanisms, in a data processing system comprising a single instruction multiple data (SIMD) processor, for performing a data dependency check operation on vector element values of at least two input vector registers are provided. Two calls to a simd-check instruction are performed, one with input vector registers having a first order and one with the input vector registers having a different order. The simd-check instruction performs comparisons to determine if any data dependencies are present. Results of the two calls to the simd-check instruction are obtained and used to determine if any data dependencies are present in the at least two input vector registers. Based on the results, the SIMD processor may perform various operations.
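The two-call pattern in the abstract above can be modeled in a few lines of Python. This is a minimal sketch, not the patented instruction: the abstract does not say which element pairs one simd-check call compares, so the hypothetical `simd_check` below assumes a single call covers only the pairs `a[i] == b[j]` with `i <= j`. Under that assumption, the second call with the registers swapped covers the remaining pairs. The names `simd_check` and `has_dependency` are illustrative.

```python
def simd_check(a, b):
    """Hypothetical model of one simd-check call.

    Assumption (not from the abstract): one call only compares
    a[i] with b[j] for i <= j, i.e. half of all element pairs.
    """
    n = len(a)
    return any(a[i] == b[j] for i in range(n) for j in range(i, n))


def has_dependency(va, vb):
    """Combine two calls, the second with the input order swapped."""
    first = simd_check(va, vb)   # registers in the first order
    second = simd_check(vb, va)  # registers in the swapped order
    return first or second


if __name__ == "__main__":
    print(has_dependency([1, 2, 3, 4], [4, 6, 7, 8]))
    print(has_dependency([1, 2, 3, 4], [5, 6, 7, 8]))
```

In this model a match between `va[3]` and `vb[0]` is invisible to the first call and is only found by the swapped call, which is one plausible reading of why two calls are issued.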
    • 4. Granted invention patent
    • Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture
    • US08650240B2
    • 2014-02-11
    • US12542324
    • 2009-08-17
    • Alexandre E. Eichenberger; Michael K. Gschwind; John A. Gunnels
    • G06F7/52
    • G06F17/16; G06F9/30014; G06F9/30032; G06F9/30036; G06F9/30043; G06F9/30109
    • Mechanisms for performing a complex matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the complex matrix multiplication operation to a first target vector register. The first vector operand comprises a real and imaginary part of a first complex vector value. A complex load and splat operation is performed to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register. The second complex vector value has a real and imaginary part. A cross multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored in a result vector register.
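The load, splat, and cross multiply add steps described in the abstract above can be sketched as follows. This is an illustrative model under stated assumptions, not the patented hardware: a vector register is assumed to hold four slots laid out as `[re0, im0, re1, im1]`, and the cross multiply add is modeled as a single function even though real hardware may split it into several instructions. The function names are hypothetical.

```python
def vector_load(mem, offset, width=4):
    """Load `width` consecutive values: interleaved real/imaginary parts."""
    return list(mem[offset:offset + width])


def load_and_splat(mem, offset, width=4):
    """Load one complex value (re, im) and replicate it across the register."""
    re, im = mem[offset], mem[offset + 1]
    return [re, im] * (width // 2)


def cross_multiply_add(acc, va, vb):
    """Accumulate the complex products va[k] * vb[k] into acc, pair by pair."""
    out = list(acc)
    for k in range(0, len(va), 2):
        a_re, a_im = va[k], va[k + 1]
        b_re, b_im = vb[k], vb[k + 1]
        out[k] += a_re * b_re - a_im * b_im      # real part
        out[k + 1] += a_re * b_im + a_im * b_re  # imaginary part
    return out


if __name__ == "__main__":
    a_mem = [1, 2, 3, 4]   # (1+2j), (3+4j)
    b_mem = [5, 6]         # (5+6j)
    acc = [0, 0, 0, 0]
    acc = cross_multiply_add(acc, vector_load(a_mem, 0), load_and_splat(b_mem, 0))
    print(acc)
```

Calling `cross_multiply_add` repeatedly with the same accumulator corresponds to accumulating partial products into the result vector register.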
    • 7. Invention patent application
    • Efficient Enqueuing of Values in SIMD Engines with Permute Unit
    • US20130151822A1
    • 2013-06-13
    • US13315596
    • 2011-12-09
    • Alexandre E. Eichenberger; John K.P. O'Brien; Yuan Zhao
    • G06F9/38
    • G06F9/30072; G06F9/30018; G06F9/30036
    • Mechanisms, in a data processing system having a processor, for generating enqueued data for performing computations of a conditional branch of code are provided. Mask generation logic of the processor operates to generate a mask representing a subset of iterations of a loop of the code that results in a condition of the conditional branch being satisfied. The mask is used to select data elements from an input data element vector register corresponding to the subset of iterations of the loop of the code that result in the condition of the conditional branch being satisfied. Furthermore, the selected data elements are used to perform computations of the conditional branch of code. Iterations of the loop of the code that do not result in the condition of the conditional branch being satisfied are not used as a basis for performing computations of the conditional branch of code.
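The mask-and-select idea in the abstract above can be illustrated with a small Python model. It is a sketch under assumptions: the loop is taken to be `for x in data: if x < 0: use(x * x)`, the register width is assumed to be four elements, and the permute unit is modeled by a plain list comprehension. All function names are hypothetical.

```python
def generate_mask(values, predicate):
    """One mask bit per loop iteration: True where the branch condition holds."""
    return [bool(predicate(v)) for v in values]


def select(values, mask):
    """Keep only the elements whose mask bit is set (models the permute step)."""
    return [v for v, m in zip(values, mask) if m]


def run_loop(data, width=4):
    """Enqueue the elements that take the branch, then compute on the queue only."""
    queue = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]        # one input vector register
        mask = generate_mask(chunk, lambda x: x < 0)
        queue.extend(select(chunk, mask))
    # The branch body runs on densely packed data; other iterations do no work.
    return [x * x for x in queue]


if __name__ == "__main__":
    print(run_loop([3, -1, 4, -5, -9, 2, 6, -3]))
```

Packing the selected elements first means the branch body operates on full vectors instead of mostly idle slots.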
    • 8. Granted invention patent
    • Optimizing scalar code executed on a SIMD engine by alignment of SIMD slots
    • US08370817B2
    • 2013-02-05
    • US12127491
    • 2008-05-27
    • Alexandre E. Eichenberger; John Kevin Patrick O'Brien
    • G06F9/45; G06F15/00
    • G06F9/3885; G06F9/30032; G06F9/30036; G06F9/30109; G06F9/3824
    • A mechanism is provided for optimizing scalar code executed on a single instruction multiple data (SIMD) engine by aligning the slots of SIMD registers. With the mechanism, a compiler is provided that parses source code and, for each statement in the program, generates an expression tree. The compiler inspects all storage inputs to scalar operations in the expression tree to determine their alignment in the SIMD registers. This alignment is propagated up the expression tree from the leaves. When the alignments of two operands in the expression tree are the same, the resulting alignment is the shared value. When the alignments of two operands in the expression tree are different, one operand is shifted. For shifted operands, a shift operation is inserted in the expression tree. The executable code is then generated for the expression tree and shifts are inserted where indicated.
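The alignment propagation described in the abstract above can be sketched on a toy expression tree. The sketch makes several assumptions not stated in the abstract: every interior node is binary, a leaf's alignment is given as a slot index, and when alignments differ it is always the right operand that is shifted to the left operand's slot. The `Node` class and the `shift[...]` naming are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    op: str                          # operator, or a variable name for a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    slot: Optional[int] = None       # SIMD slot holding the scalar value


def align(node):
    """Propagate slot alignment from the leaves up, inserting shifts on mismatch."""
    if node.left is None:            # leaf: alignment comes from storage
        return node.slot
    left_slot = align(node.left)
    right_slot = align(node.right)
    if left_slot != right_slot:
        # Assumption: shift the right operand into the left operand's slot.
        node.right = Node(f"shift[{right_slot}->{left_slot}]",
                          left=node.right, slot=left_slot)
    node.slot = left_slot            # shared alignment of both operands
    return node.slot


def render(node):
    """Print the tree with any inserted shift operations."""
    if node.left is None:
        return node.op
    if node.right is None:
        return f"{node.op}({render(node.left)})"
    return f"({render(node.left)} {node.op} {render(node.right)})"


if __name__ == "__main__":
    tree = Node("*", Node("+", Node("a", slot=0), Node("b", slot=2)), Node("c", slot=0))
    align(tree)
    print(render(tree))
```

Here `a` and `c` already share slot 0, so only `b` needs a shift; a real compiler would presumably choose which operand to shift so as to minimize the total number of shifts.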
    • 10. Granted invention patent
    • Method using SLP packing with statements having both isomorphic and non-isomorphic expressions
    • US08266587B2
    • 2012-09-11
    • US11964324
    • 2007-12-26
    • Alexandre E. Eichenberger; Kai-Ting Amy Wang; Peng Wu
    • G06F9/44
    • G06F8/456
    • Disclosure for using SLP in processing a plurality of statements, wherein the statements are associated with an array having a number of array positions, and each statement includes one or more expressions. Expressions are gathered for each of the statements into a structure comprising a single merge stream furnished with a location for each expression. The location for a given expression is associated with one of the array positions. A plurality of expressions are selectively identified and SLP packing operations are applied to the identified expressions to merge into one or more isomorphic sub-streams. Expressions of the isomorphic sub-streams and other expressions of the single merge stream are combined into a number of input vectors that are substantially equal in length to one another. A location vector is generated that contains the respective locations for all of the expressions in the single merge stream.
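A loose Python model of the merge stream, isomorphic sub-streams, and location vector from the abstract above is given below. It simplifies heavily and rests on assumptions: each expression is a single binary operation, two expressions count as isomorphic when they use the same operator, and locations are the unique array positions `0..n-1`. The equal-length padding of input vectors mentioned in the abstract is not modeled. The names `pack` and `execute` are hypothetical.

```python
import operator
from collections import defaultdict

OPS = {"+": operator.add, "*": operator.mul}


def pack(stream):
    """Group a merge stream of (location, op, lhs, rhs) into isomorphic sub-streams.

    Returns the packed operand vectors and one location vector that records
    the array position of every expression, in packed order.
    """
    groups = defaultdict(list)
    for loc, op, lhs, rhs in stream:
        groups[op].append((loc, lhs, rhs))
    sub_streams, location_vector = [], []
    for op, items in groups.items():
        sub_streams.append((op,
                            [lhs for _, lhs, _ in items],
                            [rhs for _, _, rhs in items]))
        location_vector.extend(loc for loc, _, _ in items)
    return sub_streams, location_vector


def execute(sub_streams, location_vector):
    """Evaluate each sub-stream as one vector op, then scatter by location."""
    results = []
    for op, lhs_vec, rhs_vec in sub_streams:
        results.extend(OPS[op](x, y) for x, y in zip(lhs_vec, rhs_vec))
    out = [None] * len(location_vector)
    for loc, value in zip(location_vector, results):
        out[loc] = value
    return out


if __name__ == "__main__":
    stream = [(0, "+", 1, 2), (1, "*", 3, 4), (2, "+", 5, 6), (3, "*", 7, 8)]
    subs, locs = pack(stream)
    print(locs, execute(subs, locs))
```

The location vector is what lets results computed in packed order be written back to the array positions the original statements targeted.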