Patent Quick Search - Quickly search global patents, free patent database for commercial use - IPRDB

71. Granted invention patent

US09484027B2 Using pitch during speech recognition post-processing to improve recognition accuracy (Active)
Publication No.: US09484027B2
Publication date: 2016-11-01
Application No.: US12635346
Filing date: 2009-12-10
Applicants: Xufang Zhao, Uma Arun
Inventors: Xufang Zhao, Uma Arun
IPC classification: G10L21/00, G10L25/00, G10L15/20, G10L25/90, G10L25/15
CPC classification: G10L15/20, G10L25/15, G10L25/90, G10L2015/027
Abstract: A method of automated speech recognition in a vehicle. The method includes receiving audio in the vehicle, pre-processing the received audio to generate acoustic feature vectors, decoding the generated acoustic feature vectors to produce at least one speech hypothesis, and post-processing the at least one speech hypothesis using pitch to improve speech recognition accuracy. The speech hypothesis can be accepted as recognized speech during post-processing if pitch is present in the received audio. Alternatively, a pitch count for the received audio can be determined, N-best speech hypotheses can be post-processed by comparing the pitch count to syllable counts associated with the speech hypotheses, and the speech hypothesis having a syllable count equal to the pitch count can be accepted as recognized speech.
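The pitch-count comparison described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Hypothesis` class, the `select_by_pitch_count` function, the decoder scores, and the example phrases are all hypothetical, and breaking ties by decoder score is an assumption the abstract does not state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    """One entry of an N-best list (hypothetical structure)."""
    text: str
    syllable_count: int
    score: float  # decoder confidence; higher is better (assumption)


def select_by_pitch_count(hypotheses: List[Hypothesis],
                          pitch_count: int) -> Optional[Hypothesis]:
    """Accept a hypothesis whose syllable count equals the pitch count.

    If several hypotheses match, the highest-scoring one is returned
    (a tie-breaking rule assumed for this sketch). Returns None when
    no hypothesis matches.
    """
    matches = [h for h in hypotheses if h.syllable_count == pitch_count]
    if not matches:
        return None
    return max(matches, key=lambda h: h.score)


# Made-up N-best list and a pitch count of 3 measured from the audio.
n_best = [
    Hypothesis("call home", 2, 0.61),
    Hypothesis("call Holman", 3, 0.58),
    Hypothesis("cancel", 2, 0.40),
]
chosen = select_by_pitch_count(n_best, pitch_count=3)
```

Here the top-scoring hypothesis is rejected because its syllable count does not match the pitch count, which is the kind of correction the post-processing step is meant to provide.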

72. Granted invention patent

US09467790B2 Reverberation estimator (Active)
Publication No.: US09467790B2
Publication date: 2016-10-11
Application No.: US13810877
Filing date: 2010-07-20
Applicant: Pasi Ojala
Inventor: Pasi Ojala
IPC classification: G10L25/00, H04R29/00, G10H7/00, G10L25/48, G10L19/083, G10L21/0216, G10L21/0208, G10L19/08
CPC classification: H04R29/00, G01H7/00, G10H7/00, G10L19/08, G10L19/083, G10L21/0216, G10L25/48, G10L2021/02087
Abstract: A method comprising determining a reverberation time estimate for an audio signal from a first part of an encoded audio signal representing the audio signal.

73. Granted invention patent

US09466315B2 System and method for calculating similarity of audio file (Active)
Publication No.: US09466315B2
Publication date: 2016-10-11
Application No.: US14450675
Filing date: 2014-08-04
Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Weifeng Zhao, Shenyuan Li, Liwei Zhang, Jianfeng Chen
IPC classification: G10L25/00, G10L25/54, G10L25/90
CPC classification: G10L25/00, G10L25/54, G10L25/90
Abstract: A method for calculating a similarity of audio files includes constituting a pitch sequence of a first audio file and a pitch sequence of a second audio file; calculating an eigenvector of the first audio file according to the pitch sequence of the first audio file, and calculating an eigenvector of the second audio file according to the pitch sequence of the second audio file; calculating a similarity between the first audio file and the second audio file according to the eigenvector of the first audio file and the eigenvector of the second audio file.
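The three steps in this abstract (pitch sequence, per-file vector, similarity) can be sketched as follows. The abstract does not say how the vector or the similarity is computed, so this sketch swaps in a simple stand-in: a normalized histogram of semitone steps as the feature vector and cosine similarity as the measure. The function names and the use of integer MIDI note numbers are assumptions made for illustration.

```python
import math
from typing import List


def interval_histogram(pitches: List[int], max_step: int = 12) -> List[float]:
    """Feature vector: normalized histogram of semitone steps between
    successive pitches, with steps clamped to [-max_step, max_step]."""
    bins = [0.0] * (2 * max_step + 1)
    for prev, cur in zip(pitches, pitches[1:]):
        step = max(-max_step, min(max_step, cur - prev))
        bins[step + max_step] += 1.0
    total = sum(bins)
    return [b / total for b in bins] if total else bins


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up pitch sequences as MIDI note numbers.
melody = [60, 62, 64, 65, 67]
transposed = [p + 5 for p in melody]   # same tune in another key
other = [60, 60, 60, 72, 60]           # unrelated contour

sim_same = cosine_similarity(interval_histogram(melody),
                             interval_histogram(transposed))
sim_other = cosine_similarity(interval_histogram(melody),
                              interval_histogram(other))
```

An interval-based vector is used here because it is unchanged by transposition, so the same tune in a different key still compares as similar; a real system would likely use a richer representation.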

74. Granted invention patent

US09443516B2 Far-field speech recognition systems and methods (Active)
Publication No.: US09443516B2
Publication date: 2016-09-13
Application No.: US14151554
Filing date: 2014-01-09
Applicant: Honeywell International Inc.
Inventors: SrinivasaRao Katuri, Amit Kulkarni
IPC classification: G10L15/00, G10L25/00, G10L15/22, G10L15/28, G10L15/30, G10L21/0208, G10L21/0216
CPC classification: G10L15/22, G10L15/28, G10L15/30, G10L21/0208, G10L2021/02166
Abstract: A method for far-field speech recognition can include determining a location for a plurality of sound recognition devices, communicatively coupling each of the plurality of sound recognition devices, adjusting a sound reception for the plurality of sound recognition devices to receive a voice command from a particular direction, and sending instructions to a device based on the voice command.

75. Granted invention patent

US09390706B2 Personality-based intelligent personal assistant system and methods (Active)
Publication No.: US09390706B2
Publication date: 2016-07-12
Application No.: US14309728
Filing date: 2014-06-19
Applicant: Mattersight Corporation
Inventors: David Gustafson, Christopher Danson
IPC classification: G10L15/00, G06F17/20, G06F17/27, G10L21/00, G10L25/00, G10L13/04, G10L15/26, H04M3/493, G10L25/51, G10L15/08, G10L25/63, G10L25/66
CPC classification: G10L15/22, G06F3/167, G06F17/27, G10L13/043, G10L15/19, G10L15/265, G10L15/30, G10L25/51, G10L25/63, G10L25/66, G10L2015/088, H04M3/493
Abstract: The methods, apparatus, and systems described herein assist a user with a request. The methods include receiving at least one input from a user, entering the at least one input into an algorithm trained to output a personality type of the user, and tailoring an output based on the personality type.

76. Granted invention patent

US09384759B2 Voice activity detection and pitch estimation (Active)
Publication No.: US09384759B2
Publication date: 2016-07-05
Application No.: US13590022
Filing date: 2012-08-20
Applicants: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
IPC classification: G10L21/00, G10L25/00, G10L25/93, G10L15/00, G10L15/20, G10L25/78, G10L25/90, G10L25/18
CPC classification: G10L25/78, G10L25/18, G10L25/90, G10L25/93
Abstract: Implementations include systems, methods and/or devices operable to detect voice activity in an audible signal by detecting glottal pulses. The dominant frequency of a series of glottal pulses is perceived as the intonation pattern or melody of natural speech, which is also referred to as the pitch. However, as noted above, spoken communication typically occurs in the presence of noise and/or other interference. In turn, the undulation of voiced speech is masked in some portions of the frequency spectrum associated with human speech by the noise and/or other interference. In some implementations, detection of voice activity is facilitated by dividing the frequency spectrum associated with human speech into multiple sub-bands in order to identify glottal pulses that dominate the noise and/or other interference in particular sub-bands. Additionally and/or alternatively, in some implementations the analysis is furthered to provide a pitch estimate of the detected voice activity.

77. Granted invention patent

US09373321B2 Generation of wake-up words (Active)
Publication No.: US09373321B2
Publication date: 2016-06-21
Application No.: US14093703
Filing date: 2013-12-02
Applicant: Cypress Semiconductor Corporation
Inventors: Ojas Ashok Bapat, Kenichi Kumatani
IPC classification: G10L15/00, G10L15/26, G10L17/00, G10L15/18, G10L21/00, G10L25/00, G10L15/06, G10L15/08
CPC classification: G10L15/06, G10L2015/0638, G10L2015/088
Abstract: A method, system and tangible computer readable medium for generating one or more wake-up words are provided. For example, the method can include receiving a text representation of the one or more wake-up words. A strength of the text representation of the one or more wake-up words can be determined based on one or more static measures. The method can also include receiving an audio representation of the one or more wake-up words. A strength of the audio representation of the one or more wake-up words can be determined based on one or more dynamic measures. Feedback on the one or more wake-up words is provided (e.g., to an end user) based on the strengths of the text and audio representations.

78. Granted invention patent

US09329832B2 Voice internet system and method (Active)
Publication No.: US09329832B2
Publication date: 2016-05-03
Application No.: US13467013
Filing date: 2012-05-08
Applicant: Robert Allen Blaisch
Inventor: Robert Allen Blaisch
IPC classification: G10L21/00, G10L25/00, G10L15/00, G06F3/16, G06F3/0484, G06F17/30, H04M3/493
CPC classification: G06F3/167, G06F3/04842, G06F17/30861, G06F17/3089, G10L15/00, H04M3/493, H04M3/4938
Abstract: A system and method is provided for voice activated Web based infrastructure (Voice Portal) which accepts spoken input from a variety of devices, including desktop and laptop computers, tablets, smart phones, standard mobile phones, and ordinary hard-wired telephones.

79. Granted invention patent

US09275637B1 Wake word evaluation (Active)
Publication No.: US09275637B1
Publication date: 2016-03-01
Application No.: US13670316
Filing date: 2012-11-06
Applicant: Rawles LLC
Inventors: Stan Weidner Salvador, Jeffrey Paul Lilly, Frederick V. Weber, Jeffrey Penrod Adams, Ryan Paul Thomas
IPC classification: G10L21/00, G10L25/00, G10L15/06, G10L15/00, G10L15/01
CPC classification: G10L15/01, G10L15/06, G10L15/187, G10L2015/088
Abstract: Natural language controlled devices may be configured to activate command recognition in response to one or more wake words. Techniques are provided to receive a candidate word for evaluation as a wake word that activates a natural language control functionality of a computing device. The candidate word may include one or more words or sounds. Values for multiple wake word metrics are then determined. The candidate word is evaluated based on the various wake word metrics.

80. Granted invention patent

US09263042B1 Providing pre-computed hotword models (Active)
Publication No.: US09263042B1
Publication date: 2016-02-16
Application No.: US14340833
Filing date: 2014-07-25
Applicant: Google Inc.
Inventor: Matthew Sharifi
IPC classification: G10L15/26, G10L15/00, G10L25/00, G10L15/22, G10L15/06
CPC classification: G10L15/22, G06F3/04842, G06F3/167, G10L15/063, G10L15/08, G10L15/18, G10L15/265, G10L15/30, G10L2015/0631, G10L2015/0638, G10L2015/088, G10L2015/223
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, for each of multiple words or sub-words, audio data corresponding to multiple users speaking the word or sub-word; training, for each of the multiple words or sub-words, a pre-computed hotword model for the word or sub-word based on the audio data for the word or sub-word; receiving a candidate hotword from a computing device; identifying one or more pre-computed hotword models that correspond to the candidate hotword; and providing the identified, pre-computed hotword models to the computing device.
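The flow in this abstract (train one model per word ahead of time, then look up the models that match a candidate hotword) can be sketched as below. This only illustrates the lookup structure: `train_model` is a hypothetical placeholder that does no real acoustic training, the byte strings stand in for recordings, and splitting the candidate on whitespace is an assumption about how it maps to words.

```python
from typing import Dict, List


def train_model(word: str, recordings: List[bytes]) -> dict:
    """Placeholder for training: only records how many recordings
    from different speakers were available for this word."""
    return {"word": word, "num_recordings": len(recordings)}


def precompute_models(audio_by_word: Dict[str, List[bytes]]) -> Dict[str, dict]:
    """Build one pre-computed model per word before any request arrives."""
    return {word: train_model(word, recs)
            for word, recs in audio_by_word.items()}


def models_for_candidate(candidate: str, models: Dict[str, dict]) -> List[dict]:
    """Return the pre-computed models for each word of the candidate
    hotword that has one, in the order the words appear."""
    return [models[w] for w in candidate.lower().split() if w in models]


# Made-up audio data: each word spoken by one or more users.
audio_by_word = {
    "start": [b"\x00", b"\x01"],
    "computer": [b"\x02"],
    "ok": [b"\x03", b"\x04", b"\x05"],
}
models = precompute_models(audio_by_word)

# A device submits a candidate hotword and gets back the matching models.
found = models_for_candidate("Start Computer", models)
```

Because the models are built in advance, answering a request for a candidate hotword reduces to a dictionary lookup rather than a training run.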
