Patent Quick Search: quickly search global patents, free commercial patent database - IPRDB

1. Invention application

US20160093295A1 STATISTICAL UNIT SELECTION LANGUAGE MODELS BASED ON ACOUSTIC FINGERPRINTING (Status: in force)
Publication (announcement) number: US20160093295A1
Publication (announcement) date: 2016-03-31
Application number: US14850249
Filing date: 2015-09-10
Applicant: Google Inc.
Inventors: Alexander Gutkin, Javier Gonzalvo Fructuoso, Cyril Georges Luc Allauzen
IPC classification: G10L15/06, G10L13/08, G10L19/018
CPC classification: G10L15/063, G10L13/08, G10L19/018
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for providing statistical unit selection language modeling based on acoustic fingerprinting. The methods, systems and apparatus include the actions of obtaining a unit database of acoustic units and, for each acoustic unit, linguistic data corresponding to the acoustic unit; obtaining stored data associating each acoustic unit with (i) a corresponding acoustic fingerprint and (ii) a probability of the linguistic data corresponding to the acoustic unit occurring in a text corpus; determining that the unit database of acoustic units has been updated to include one or more new acoustic units; for each new acoustic unit in the updated unit database: generating an acoustic fingerprint for the new acoustic unit; identifying an acoustic unit that (i) has an acoustic fingerprint that is indicated as similar to the fingerprint of the new acoustic unit, and (ii) has a stored associated probability.
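
Illustrative sketch (not part of the patent record): the lookup step described in the abstract above could be pictured roughly as follows. Everything here is an assumption made for illustration: the `UnitRecord` type, the bit-mask `toy_fingerprint`, the Hamming-distance similarity test, and the `max_distance` parameter are hypothetical and are not taken from the patent. The sketch stops at identifying a similar unit with a stored probability, as the abstract text shown here does.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UnitRecord:
    unit_id: str
    fingerprint: int               # toy fingerprint stored as a bit mask (assumption)
    probability: Optional[float]   # None when no probability has been stored yet


def toy_fingerprint(features: List[float]) -> int:
    """Hypothetical fingerprint: one bit per feature, set when the feature exceeds the mean."""
    mean = sum(features) / len(features)
    bits = 0
    for value in features:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def find_similar_unit(new_fp: int, stored: List[UnitRecord],
                      max_distance: int = 1) -> Optional[UnitRecord]:
    """Return the closest stored unit that is similar enough AND has a stored probability."""
    candidates = [
        record for record in stored
        if record.probability is not None
        and hamming(new_fp, record.fingerprint) <= max_distance
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda record: hamming(new_fp, record.fingerprint))
```

Units without a stored probability are skipped even when their fingerprints match, mirroring the two conditions (i) and (ii) in the abstract.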

2. Invention grant

US09542927B2 Method and system for building text-to-speech voice from diverse recordings (Status: in force)
Publication (announcement) number: US09542927B2
Publication (announcement) date: 2017-01-10
Application number: US14540088
Filing date: 2014-11-13
Applicant: Google Inc.
Inventors: Ioannis Agiomyrgiannakis, Alexander Gutkin
IPC classification: G10L13/08, G10L13/02, G10L13/06, G10L25/03
CPC classification: G10L13/02, G10L13/06, G10L25/03
Abstract: A method and system are disclosed for building a speech database for a text-to-speech (TTS) synthesis system from multiple speakers recorded under diverse conditions. For a plurality of utterances of a reference speaker, a set of reference-speaker vectors may be extracted, and for each of a plurality of utterances of a colloquial speaker, a respective set of colloquial-speaker vectors may be extracted. A matching procedure, carried out under a transform that compensates for speaker differences, may be used to match each colloquial-speaker vector to a reference-speaker vector. The colloquial-speaker vector may be replaced with the matched reference-speaker vector. The matching-and-replacing can be carried out separately for each set of colloquial-speaker vectors. A conditioned set of speaker vectors can then be constructed by aggregating all the replaced speaker vectors. The conditioned set of speaker vectors can be used to train the TTS system.
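
Illustrative sketch (not part of the patent record): the match-and-replace procedure in the abstract above might look something like this in miniature. The abstract does not say which transform compensates for speaker differences; the per-set mean shift used here is a stand-in assumption, as are the nearest-neighbour matching by squared Euclidean distance and all function names.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty set of vectors."""
    count = len(vectors)
    return tuple(sum(column) / count for column in zip(*vectors))


def match_and_replace(colloquial: Sequence[Vector],
                      reference: Sequence[Vector]) -> List[Vector]:
    """Replace each colloquial-speaker vector with its matched reference-speaker vector."""
    # Assumed transform: shift the colloquial set so its mean lines up with the reference mean.
    c_mean = mean_vector(colloquial)
    r_mean = mean_vector(reference)
    shift = tuple(r - c for r, c in zip(r_mean, c_mean))

    replaced = []
    for vec in colloquial:
        shifted = tuple(v + s for v, s in zip(vec, shift))
        # Match under the transform: nearest reference vector to the shifted vector.
        best = min(reference,
                   key=lambda ref: sum((a - b) ** 2 for a, b in zip(shifted, ref)))
        replaced.append(best)
    return replaced


def build_conditioned_set(colloquial_sets: Sequence[Sequence[Vector]],
                          reference: Sequence[Vector]) -> List[Vector]:
    """Run match-and-replace separately per set, then aggregate the replaced vectors."""
    conditioned: List[Vector] = []
    for vectors in colloquial_sets:
        conditioned.extend(match_and_replace(vectors, reference))
    return conditioned
```

Because the shift is computed per set, two recordings with very different offsets can both map onto the same reference vectors, which is the point of matching "separately for each set".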

3. Invention grant

US09424835B2 Statistical unit selection language models based on acoustic fingerprinting (Status: in force)
Publication (announcement) number: US09424835B2
Publication (announcement) date: 2016-08-23
Application number: US14850249
Filing date: 2015-09-10
Applicant: Google Inc.
Inventors: Alexander Gutkin, Javier Gonzalvo Fructuoso, Cyril Georges Luc Allauzen
IPC classification: G10L15/08, G10L15/06, G10L19/018, G10L13/08
CPC classification: G10L15/063, G10L13/08, G10L19/018
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for providing statistical unit selection language modeling based on acoustic fingerprinting. The methods, systems and apparatus include the actions of obtaining a unit database of acoustic units and, for each acoustic unit, linguistic data corresponding to the acoustic unit; obtaining stored data associating each acoustic unit with (i) a corresponding acoustic fingerprint and (ii) a probability of the linguistic data corresponding to the acoustic unit occurring in a text corpus; determining that the unit database of acoustic units has been updated to include one or more new acoustic units; for each new acoustic unit in the updated unit database: generating an acoustic fingerprint for the new acoustic unit; identifying an acoustic unit that (i) has an acoustic fingerprint that is indicated as similar to the fingerprint of the new acoustic unit, and (ii) has a stored associated probability.

4. Invention application

US20160140951A1 Method and System for Building Text-to-Speech Voice from Diverse Recordings (Status: in force)
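Publication (announcement) number: US20160140951A1
Publication (announcement) date: 2016-05-19
Application number: US14540088
Filing date: 2014-11-13
Applicant: Google Inc.
Inventors: Ioannis Agiomyrgiannakis, Alexander Gutkin
IPC classification: G10L13/02
CPC classification: G10L13/02, G10L13/06, G10L25/03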
Abstract: A method and system are disclosed for building a speech database for a text-to-speech (TTS) synthesis system from multiple speakers recorded under diverse conditions. For a plurality of utterances of a reference speaker, a set of reference-speaker vectors may be extracted, and for each of a plurality of utterances of a colloquial speaker, a respective set of colloquial-speaker vectors may be extracted. A matching procedure, carried out under a transform that compensates for speaker differences, may be used to match each colloquial-speaker vector to a reference-speaker vector. The colloquial-speaker vector may be replaced with the matched reference-speaker vector. The matching-and-replacing can be carried out separately for each set of colloquial-speaker vectors. A conditioned set of speaker vectors can then be constructed by aggregating all the replaced speaker vectors. The conditioned set of speaker vectors can be used to train the TTS system.

5. Invention application

US20150234804A1 JOINT MULTIGRAM-BASED DETECTION OF SPELLING VARIANTS (Status: under examination, published)
Publication (announcement) number: US20150234804A1
Publication (announcement) date: 2015-08-20
Application number: US14468468
Filing date: 2014-08-26
Applicant: GOOGLE INC.
Inventors: Matthew Nicholas Stuttle, Alexander Gutkin
IPC classification: G06F17/27, G06N99/00, G06N7/00
CPC classification: G06F17/273, G06N7/005, G06N20/00
Abstract: Content processing includes receiving a set of correctly spelled alert words and at least one spelling variant corresponding to each correctly spelled alert word; determining at least one alignment of joint multigrams for each correctly spelled alert word/corresponding spelling variant pair; training a model of correspondence between the set of received orthographic alert words and corresponding spelling variants using the determined alignments; and receiving a spelling variant observation from a content block. Using the trained model, the technology determines a probability that the received spelling variant observation corresponds to a received correctly spelled alert word. For a determined probability exceeding a configured threshold, the technology denies automatic acceptance of the content block.
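
Illustrative sketch (not part of the patent record): a heavily simplified picture of the flow in the abstract above. It assumes the joint-multigram alignments are already given by hand, trains a relative-frequency unigram model over the aligned chunk pairs (the abstract does not specify the training procedure), and scores an observation by the joint probability of its best co-segmentation with an alert word. That score is a stand-in for the probability the abstract mentions; all names and the `max_len` limit are hypothetical.

```python
from collections import Counter
from functools import lru_cache
from typing import Dict, Iterable, List, Tuple

Multigram = Tuple[str, str]  # (chunk of the alert word, chunk of the spelling variant)


def train(alignments: Iterable[List[Multigram]]) -> Dict[Multigram, float]:
    """Relative-frequency model over joint multigrams taken from given alignments."""
    counts = Counter(pair for alignment in alignments for pair in alignment)
    total = sum(counts.values())
    return {pair: count / total for pair, count in counts.items()}


def alignment_score(model: Dict[Multigram, float], word: str, variant: str,
                    max_len: int = 2) -> float:
    """Joint probability of the best co-segmentation of (word, variant); 0.0 if none exists."""

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> float:
        if i == len(word) and j == len(variant):
            return 1.0
        score = 0.0
        for di in range(1, max_len + 1):
            for dj in range(1, max_len + 1):
                if i + di <= len(word) and j + dj <= len(variant):
                    p = model.get((word[i:i + di], variant[j:j + dj]), 0.0)
                    if p > 0.0:
                        score = max(score, p * best(i + di, j + dj))
        return score

    return best(0, 0)


def should_deny_auto_accept(model: Dict[Multigram, float], alert_words: Iterable[str],
                            observation: str, threshold: float) -> bool:
    """Deny automatic acceptance when any alert word scores above the threshold."""
    return any(alignment_score(model, word, observation) > threshold
               for word in alert_words)
```

A production joint-multigram model would typically learn the alignments themselves rather than take them as input; this sketch only shows how a trained table of chunk pairs can drive the threshold decision.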

6. Invention grant

US08751236B1 Devices and methods for speech unit reduction in text-to-speech synthesis systems (Status: in force)
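Publication (announcement) number: US08751236B1
Publication (announcement) date: 2014-06-10
Application number: US14061118
Filing date: 2013-10-23
Applicant: Google Inc.
Inventors: Javier Gonzalvo Fructuoso, Alexander Gutkin, Ioannis Agiomyrgiannakis
IPC classification: G10L13/00, G10L13/06
CPC classification: G10L13/06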
Abstract: A device may receive a plurality of speech sounds that are indicative of pronunciations of a first linguistic term. The device may determine concatenation features of the plurality of speech sounds. The concatenation features may be indicative of an acoustic transition between a first speech sound and a second speech sound when the first speech sound and the second speech sound are concatenated. The first speech sound may be included in the plurality of speech sounds and the second speech sound may be indicative of a pronunciation of a second linguistic term. The device may cluster the plurality of speech sounds into one or more clusters based on the concatenation features. The device may provide a representative speech sound of the given cluster as the first speech sound when the first speech sound and the second speech sound are concatenated.
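
Illustrative sketch (not part of the patent record): one simple way to picture the cluster-then-pick-a-representative idea in the abstract above. The abstract does not name a clustering algorithm or define the concatenation features; the greedy radius-based grouping, the plain float vectors, and the closest-to-centroid choice of representative below are all assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple

Feature = Tuple[float, ...]  # hypothetical concatenation-feature vector for one speech sound


def cluster_by_features(features: Sequence[Feature], radius: float) -> List[List[int]]:
    """Greedy clustering: a sound joins the first cluster whose seed lies within `radius`."""
    clusters: List[List[int]] = []
    for idx, feat in enumerate(features):
        for cluster in clusters:
            seed = features[cluster[0]]
            if math.dist(feat, seed) <= radius:
                cluster.append(idx)
                break
        else:
            # No existing cluster is close enough, so this sound seeds a new one.
            clusters.append([idx])
    return clusters


def representative(features: Sequence[Feature], cluster: Sequence[int]) -> int:
    """Index of the cluster member whose features are closest to the cluster centroid."""
    members = [features[i] for i in cluster]
    centroid = tuple(sum(column) / len(members) for column in zip(*members))
    return min(cluster, key=lambda i: math.dist(features[i], centroid))
```

Keeping only each cluster's representative is what would shrink the unit inventory: sounds that behave alike at the join are treated as interchangeable.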
