    • 21. Invention patent application
    • Efficient Code Generation Using Loop Peeling for SIMD Loop Code with Multiple Misaligned Statements
    • US20080222623A1
    • 2008-09-11
    • US12122050
    • 2008-05-16
    • Alexandre E. Eichenberger; Kai-Ting Amy Wang; Peng Wu
    • G06F9/45
    • G06F8/447; G06F8/4441
    • An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In this framework, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirements of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residual iteration counts, and multiple statements with arbitrary alignment combinations. Loop peeling is used to reduce the computational overhead associated with misaligned data. A loop prologue and epilogue are peeled from individual iterations in the simdized loop, and vector-splicing instructions are applied to the peeled iterations, while the steady-state loop body incurs no additional computational overhead.
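The peeling idea in the abstract above can be illustrated with a minimal sketch. It shows classical alignment peeling for a single store, not the patent's actual code generation: the peeled iterations here are plain scalar iterations rather than vector-splice operations, and the loads in the steady-state loop are not forced to be aligned. The vector width `V`, the function names, and the example statement `a[i + store_offset] = b[i] + c[i]` are all assumptions made for illustration.

```python
V = 4  # hypothetical vector width, in elements


def peel_ranges(lb, ub, store_offset, v=V):
    """Split iterations [lb, ub) into (prologue, steady, epilogue) ranges.

    The store in iteration i touches element i + store_offset, so the
    steady-state loop must start at an i where that index is a multiple of v.
    """
    # First iteration whose store index is vector-aligned (capped at ub).
    first_aligned = min(lb + (-(lb + store_offset)) % v, ub)
    # Number of full vectors that fit after the prologue.
    full = (ub - first_aligned) // v
    steady_end = first_aligned + full * v
    return (range(lb, first_aligned),
            range(first_aligned, steady_end, v),
            range(steady_end, ub))


def add_arrays(a, b, c, lb, ub, store_offset):
    """Compute a[i + store_offset] = b[i] + c[i] for i in [lb, ub)."""
    prologue, steady, epilogue = peel_ranges(lb, ub, store_offset)
    for i in prologue:  # peeled iterations before the first aligned store
        a[i + store_offset] = b[i] + c[i]
    for i in steady:    # one aligned, full-width store per step
        dst = i + store_offset
        assert dst % V == 0
        a[dst:dst + V] = [x + y for x, y in zip(b[i:i + V], c[i:i + V])]
    for i in epilogue:  # residual iterations after the last full vector
        a[i + store_offset] = b[i] + c[i]
    return a
```

With `lb = 0`, `ub = 10`, and `store_offset = 3`, the prologue covers iteration 0, the steady-state loop covers iterations 1 to 8 in two vector steps, and the epilogue covers iteration 9.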
    • 22. Invention patent application
    • Apparatus and Method for Optimizing Scalar Code Executed on a SIMD Engine by Alignment of SIMD Slots
    • US20080222391A1
    • 2008-09-11
    • US12127491
    • 2008-05-27
    • Alexandre E. Eichenberger; John Kevin Patrick O'Brien
    • G06F9/44; G06F9/30; G06F9/315
    • G06F9/3885; G06F9/30032; G06F9/30036; G06F9/30109; G06F9/3824
    • An apparatus and method for optimizing scalar code executed on a single instruction multiple data (SIMD) engine is provided that aligns the slots of SIMD registers. With the apparatus and method, a compiler is provided that parses source code and, for each statement in the program, generates an expression tree. The compiler inspects all storage inputs to scalar operations in the expression tree to determine their alignment in the SIMD registers. This alignment is propagated up the expression tree from the leaves. When the alignments of two operands in the expression tree are the same, the resulting alignment is the shared value. When the alignments of two operands in the expression tree are different, one operand is shifted. For shifted operands, a shift operation is inserted in the expression tree. The executable code is then generated for the expression tree and shifts are inserted where indicated.
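The bottom-up alignment propagation described in the abstract above might look roughly like the following sketch. The node classes (`Leaf`, `Op`, `Shift`) and the function `align` are hypothetical names, and the choice to always shift the right operand is an assumption made here; the abstract only says that one operand is shifted.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    name: str
    slot: int    # SIMD register slot in which the scalar value sits


@dataclass
class Op:
    kind: str    # e.g. "+" or "*"
    left: "Node"
    right: "Node"


@dataclass
class Shift:
    child: "Node"
    amount: int  # signed number of slots to move the value


Node = Union[Leaf, Op, Shift]


def align(node):
    """Return (rewritten_tree, slot), inserting a Shift where operands disagree."""
    if isinstance(node, Leaf):
        return node, node.slot
    left, left_slot = align(node.left)
    right, right_slot = align(node.right)
    if left_slot != right_slot:
        # Assumed policy: move the right operand into the left operand's slot.
        right = Shift(right, left_slot - right_slot)
    # Both operands now share left_slot, which becomes the result's slot.
    return Op(node.kind, left, right), left_slot
```

For `a + b * c` with `a` and `b` in slot 1 and `c` in slot 3, only `c` receives a shift (by -2 slots), and the whole expression ends up in slot 1.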
    • 23. Granted invention patent
    • Efficient data reorganization to satisfy data alignment constraints
    • US07386842B2
    • 2008-06-10
    • US10862483
    • 2004-06-07
    • Alexandre E. Eichenberger; John Kevin Patrick O'Brien; Peng Wu
    • G06F9/45
    • G06F8/4452
    • An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In the framework presented herein, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirement of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residue iteration counts, and multiple statements with arbitrary alignment combinations. Beyond generating a valid simdization, a preferred embodiment further improves the quality of the generated codes. Four stream-shift placement policies are disclosed, which minimize the number of data reorganization operations generated by the alignment handling.
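As a rough illustration of the kind of data reorganization the abstract above refers to, the sketch below emulates a misaligned vector load using only aligned loads plus a splice. It models memory as a Python list and is not taken from the patent: `V`, `aligned_load`, `splice`, and `misaligned_load` are hypothetical names, and the memory is assumed to be padded so that the second aligned load stays in bounds.

```python
V = 4  # hypothetical vector width, in elements


def aligned_load(mem, addr, v=V):
    """Load the v-element vector containing addr; addr is truncated to a multiple of v."""
    base = (addr // v) * v
    return mem[base:base + v]


def splice(first, second, offset):
    """Concatenate two vectors and keep len(first) elements starting at offset."""
    return (first + second)[offset:offset + len(first)]


def misaligned_load(mem, addr, v=V):
    """Emulate a load of mem[addr:addr + v] with two aligned loads and one splice."""
    lo = aligned_load(mem, addr, v)
    hi = aligned_load(mem, addr + v, v)
    return splice(lo, hi, addr % v)
```

When `addr` is already a multiple of `V`, the splice offset is zero and the result is simply the first aligned vector, so the second load and the splice are pure overhead. That cost is why the number and placement of such reorganization operations matter.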
    • 26. Granted invention patent
    • Analyze and reduce number of data reordering operations in SIMD code
    • US08954943B2
    • 2015-02-10
    • US11340452
    • 2006-01-26
    • Alexandre E. Eichenberger; Kai-Ting Amy Wang; Peng Wu; Peng Zhao
    • G06F9/45; G06F15/00; G06F15/76
    • G06F8/443
    • A method for analyzing data reordering operations in Single Instruction Multiple Data (SIMD) source code and generating executable code therefrom is provided. Input is received. One or more data reordering operations in the input are identified and each data reordering operation in the input is abstracted into a corresponding virtual shuffle operation so that each virtual shuffle operation forms part of an expression tree. One or more virtual shuffle trees are collapsed by combining virtual shuffle operations within at least one of the one or more virtual shuffle trees to form one or more combined virtual shuffle operations, wherein each virtual shuffle tree is a subtree of the expression tree that only contains virtual shuffle operations. Then code is generated for the one or more combined virtual shuffle operations.
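The collapsing step in the abstract above can be sketched for the simplest case: a chain of single-input shuffles, each described by an index pattern. This is an assumption-laden illustration rather than the patented method; real virtual shuffle trees may also contain two-input shuffles, which this sketch does not model. The function names `shuffle`, `combine`, and `collapse` are hypothetical.

```python
def shuffle(vec, pattern):
    """Reorder a vector so that result[i] = vec[pattern[i]]."""
    return [vec[p] for p in pattern]


def combine(inner, outer):
    """Return one pattern equivalent to applying inner first, then outer."""
    return [inner[p] for p in outer]


def collapse(patterns):
    """Fold a chain of patterns, applied left to right, into a single pattern."""
    combined = list(range(len(patterns[0])))  # start from the identity pattern
    for pattern in patterns:
        combined = combine(combined, pattern)
    return combined
```

For example, reversing a 4-element vector with `[3, 2, 1, 0]` and then rotating it with `[1, 2, 3, 0]` collapses to the single pattern `[2, 1, 0, 3]`, so one shuffle can be emitted instead of two.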