    • 1. Granted patent
    • Written-domain language modeling with decomposition
    • US09460088B1
    • 2016-10-04
    • US13906654
    • 2013-05-31
    • Google Inc.
    • Hasim Sak; Yun-hsuan Sung; Cyril Georges Luc Allauzen
    • G06F17/28; G06F17/27; G10L15/26; G10L15/28; G10L15/06; G10L15/14; G10L15/04; G10L19/00; G10L21/00; G10L25/00
    • G06F17/2881; G06F17/2765; G10L15/19
    • An automatic speech recognition system and method are provided for written-domain language modeling. According to one implementation, a process includes accessing decomposed training data that results from applying rewrite grammar rules to original training data, the decomposed training data comprising (i) regular words from the original training data that have not been rewritten using the set of rewrite grammar rules, and (ii) decomposed segments that result from rewriting non-lexical entities from the original training data using the rewrite grammar rules, generating a restriction model that (i) maps language model paths for regular words to themselves, and (ii) restricts language model paths for decomposed segments for non-lexical entities, training an n-gram language model over the training data, composing the restriction model and the language model to obtain a restricted language model, and constructing a decoding network by composing a context dependency model and a pronunciation lexicon with the restricted language model.
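The decomposition step in the abstract above can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the `URL_LIKE` pattern stands in for one rewrite grammar rule, and the `<url>` and `</url>` markers are a hypothetical way of delimiting decomposed segments. The patent's actual rules and the restriction model are not reproduced.

```python
import re

# Hypothetical rewrite rule: a token shaped like a domain name is treated as
# a non-lexical entity. The pattern and the markers are illustrative only.
URL_LIKE = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)+$")


def decompose(sentence):
    """Return the sentence as a list of regular words and decomposed segments."""
    out = []
    for token in sentence.split():
        if URL_LIKE.match(token):
            # Non-lexical entity: split at dots, keeping each dot as a segment.
            out.append("<url>")
            out.extend(re.split(r"(\.)", token))
            out.append("</url>")
        else:
            # Regular word: passed through without rewriting.
            out.append(token)
    return out


print(decompose("go to nytimes.com now"))
```

An n-gram model trained on such output sees `nytimes`, `.`, and `com` as separate units, which is the property the abstract relies on when it restricts language model paths for decomposed segments.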
    • 2. Granted patent
    • Statistical unit selection language models based on acoustic fingerprinting
    • US09424835B2
    • 2016-08-23
    • US14850249
    • 2015-09-10
    • Google Inc.
    • Alexander Gutkin; Javier Gonzalvo Fructuoso; Cyril Georges Luc Allauzen
    • G10L15/08; G10L15/06; G10L19/018; G10L13/08
    • G10L15/063; G10L13/08; G10L19/018
    • Methods, systems, and apparatus, including computer programs encoded on computer storage media, for providing statistical unit selection language modeling based on acoustic fingerprinting. The methods, systems and apparatus include the actions of obtaining a unit database of acoustic units and, for each acoustic unit, linguistic data corresponding to the acoustic unit; obtaining stored data associating each acoustic unit with (i) a corresponding acoustic fingerprint and (ii) a probability of the linguistic data corresponding to the acoustic unit occurring in a text corpus; determining that the unit database of acoustic units has been updated to include one or more new acoustic units; for each new acoustic unit in the updated unit database: generating an acoustic fingerprint for the new acoustic unit; identifying an acoustic unit that (i) has an acoustic fingerprint that is indicated as similar to the fingerprint of the new acoustic unit, and (ii) has a stored associated probability.
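The matching step described in the abstract above (find a stored unit whose fingerprint is similar to that of a new unit and which already has a probability) might look like the following sketch. The fingerprint representation as a plain list of floats, the use of cosine similarity, and the names `stored` and `most_similar_unit` are all assumptions; the abstract does not say how similarity is measured or what happens after the match.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar_unit(new_fingerprint, stored):
    """Return the id of the stored unit whose fingerprint is closest.

    `stored` maps unit id -> (fingerprint, probability); this layout is
    hypothetical.
    """
    return max(stored, key=lambda uid: cosine(new_fingerprint, stored[uid][0]))


# Toy data, invented for illustration.
stored = {
    "unit_a": ([1.0, 0.0, 0.2], 0.03),
    "unit_b": ([0.0, 1.0, 0.9], 0.01),
}
match = most_similar_unit([0.9, 0.1, 0.2], stored)
print(match, stored[match][1])
```

A natural use of the match is to reuse its stored probability for the new unit, but that is an assumption here, since the abstract as listed stops at the identification step.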
    • 3. Granted patent
    • Mixture of n-gram language models
    • US09208779B2
    • 2015-12-08
    • US14019685
    • 2013-09-06
    • Google Inc.
    • Hasim Sak; Cyril Georges Luc Allauzen
    • G10L15/00; G10L15/197; G10L15/06
    • G10L15/197; G10L15/063; G10L2015/0631
    • Methods, systems, and apparatus, including computer programs encoded on computer storage media, for creating a static language model from a mixture of n-gram language models. One of the methods includes receiving a set of development sentences W, receiving a set of language models G_M, determining a set of n-gram language model weights λ_M based on the development sentences W and the set of language models G_M, determining a set of sentence cluster weights γ_C, each of the sentence cluster weights corresponding to a cluster in a set of sentence clusters, each cluster in the set of sentence clusters associated with at least one sentence from the set of development sentences W, and generating a language model from the set of language models G_M, the set of n-gram language model weights λ_M, the set of sentence clusters, and the set of sentence cluster weights γ_C.
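To make the weight-estimation idea in the abstract above concrete, here is a minimal sketch of fitting mixture weights on development data with expectation-maximization, a standard technique for linear interpolation of language models. It is not the patented method: the sentence clusters and cluster weights are omitted, and the function names and the `dev_probs` layout are assumptions.

```python
def em_mixture_weights(dev_probs, iterations=50):
    """Estimate interpolation weights for a linear mixture of models.

    dev_probs[i][m] is the probability that model m assigns to development
    item i (hypothetical layout). Every item is assumed to receive nonzero
    probability from at least one model.
    """
    num_models = len(dev_probs[0])
    weights = [1.0 / num_models] * num_models
    for _ in range(iterations):
        totals = [0.0] * num_models
        for probs in dev_probs:
            mix = sum(w * p for w, p in zip(weights, probs))
            # E-step: each model's share of responsibility for this item.
            for m in range(num_models):
                totals[m] += weights[m] * probs[m] / mix
        # M-step: new weight is the average responsibility.
        weights = [t / len(dev_probs) for t in totals]
    return weights


def mixture_prob(weights, probs):
    """Probability of one item under the weighted mixture."""
    return sum(w * p for w, p in zip(weights, probs))


# Toy development data, invented for illustration: model 0 fits better.
dev_probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
weights = em_mixture_weights(dev_probs)
print(weights, mixture_prob(weights, [0.5, 0.5]))
```

The weights stay normalized at every iteration, so the mixture remains a valid probability distribution whenever the component models are.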
    • 4. Patent application
    • MIXTURE OF N-GRAM LANGUAGE MODELS
    • US20150073788A1
    • 2015-03-12
    • US14019685
    • 2013-09-06
    • Google Inc.
    • Hasim Sak; Cyril Georges Luc Allauzen
    • G10L15/26; G10L15/18
    • G10L15/197; G10L15/063; G10L2015/0631
    • Methods, systems, and apparatus, including computer programs encoded on computer storage media, for creating a static language model from a mixture of n-gram language models. One of the methods includes receiving a set of development sentences W, receiving a set of language models G_M, determining a set of n-gram language model weights λ_M based on the development sentences W and the set of language models G_M, determining a set of sentence cluster weights γ_C, each of the sentence cluster weights corresponding to a cluster in a set of sentence clusters, each cluster in the set of sentence clusters associated with at least one sentence from the set of development sentences W, and generating a language model from the set of language models G_M, the set of n-gram language model weights λ_M, the set of sentence clusters, and the set of sentence cluster weights γ_C.
    • 5. Granted patent
    • Natural language correction for speech input
    • US09483459B1
    • 2016-11-01
    • US13799767
    • 2013-03-13
    • Google Inc.
    • Michael D Riley; Johan Schalkwyk; Cyril Georges Luc Allauzen; Ciprian Ioan Chelba; Edward Oscar Benson
    • G10L21/00; G06F17/27
    • G06F17/273; G06F17/27; G06F17/277; G06F17/30654; G06F17/30672; G10L15/18; G10L15/183; G10L15/19; G10L15/22; G10L25/48
    • A system is configured to receive a first string corresponding to an interpretation of a natural-language user voice entry; provide a representation of the first string as feedback to the natural-language user voice entry; receive, based on the feedback, a second string corresponding to a natural-language corrective user entry, where the natural-language corrective user entry may correspond to a correction to the natural-language user voice entry; parse the second string into one or more tokens; determine at least one corrective instruction from the one or more tokens of the second string; generate, from at least a portion of each of the first and second strings and based on the at least one corrective instruction, candidate corrected user entries; select a corrected user entry from the candidate corrected user entries; and output the selected, corrected user entry.
    • 6. Patent application
    • STATISTICAL UNIT SELECTION LANGUAGE MODELS BASED ON ACOUSTIC FINGERPRINTING
    • US20160093295A1
    • 2016-03-31
    • US14850249
    • 2015-09-10
    • Google Inc.
    • Alexander Gutkin; Javier Gonzalvo Fructuoso; Cyril Georges Luc Allauzen
    • G10L15/06; G10L13/08; G10L19/018
    • G10L15/063; G10L13/08; G10L19/018
    • Methods, systems, and apparatus, including computer programs encoded on computer storage media, for providing statistical unit selection language modeling based on acoustic fingerprinting. The methods, systems and apparatus include the actions of obtaining a unit database of acoustic units and, for each acoustic unit, linguistic data corresponding to the acoustic unit; obtaining stored data associating each acoustic unit with (i) a corresponding acoustic fingerprint and (ii) a probability of the linguistic data corresponding to the acoustic unit occurring in a text corpus; determining that the unit database of acoustic units has been updated to include one or more new acoustic units; for each new acoustic unit in the updated unit database: generating an acoustic fingerprint for the new acoustic unit; identifying an acoustic unit that (i) has an acoustic fingerprint that is indicated as similar to the fingerprint of the new acoustic unit, and (ii) has a stored associated probability.