Quick patent search - fast search of global patents, free patent database for commercial use - IPRDB

1. Granted invention patent

US09558741B2 Systems and methods for speech recognition (status: in force)
Publication No.: US09558741B2
Publication date: 2017-01-31
Application No.: US14291138
Filing date: 2014-05-30
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Lou Li, Li Lu, Xiang Zhang, Feng Rao, Shuai Yue, Bo Chen, Jianxiong Ma, Haibo Liu
IPC classification: G10L15/28, G10L15/08, G10L15/18, G10L15/183
CPC classification: G10L15/083, G10L15/1815, G10L15/183
Abstract: Systems and methods are provided for speech recognition. For example, audio characteristics are extracted from acquired voice signals; a syllable confusion network is identified based on at least information associated with the audio characteristics; a word lattice is generated based on at least information associated with the syllable confusion network and a predetermined phonetic dictionary; and an optimal character sequence is calculated in the word lattice as a speech recognition result.
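The last step of this abstract, calculating an optimal sequence in a word lattice, can be illustrated with a minimal sketch. It assumes the lattice is a directed acyclic graph whose nodes are integer positions and whose edges carry a word and a log-probability; the edge format, the function name `best_path`, and the example scores are hypothetical and are not taken from the patent.

```python
import math

def best_path(edges, n_nodes):
    """Return (log-score, words) of the highest-scoring path from node 0
    to node n_nodes - 1. Each edge is (start, end, word, log_prob) with
    start < end, so processing edges sorted by start visits every node
    only after all edges leading into it have been handled."""
    best = [(-math.inf, [])] * n_nodes
    best[0] = (0.0, [])
    for start, end, word, log_prob in sorted(edges):
        score, words = best[start]
        if score + log_prob > best[end][0]:
            best[end] = (score + log_prob, words + [word])
    return best[-1]

# Hypothetical lattice: two one-syllable paths and one two-syllable word.
lattice = [
    (0, 1, "ni", math.log(0.6)),
    (0, 1, "li", math.log(0.4)),
    (1, 2, "hao", math.log(0.9)),
    (0, 2, "nihao", math.log(0.7)),
]
score, words = best_path(lattice, 3)
```

Here the single edge "nihao" (probability 0.7) beats the path "ni" then "hao" (0.6 × 0.9 = 0.54), so `words` holds that one entry.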

2. Granted invention patent

US09508347B2 Method and device for parallel processing in model training (status: in force)
Publication No.: US09508347B2
Publication date: 2016-11-29
Application No.: US14108237
Filing date: 2013-12-16
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Eryu Wang, Li Lu, Xiang Zhang, Haibo Liu, Feng Rao, Lou Li, Shuai Yue, Bo Chen
IPC classification: G10L15/16, G10L15/34, G10L15/06, G06N3/02
CPC classification: G10L15/34, G06N3/02, G10L15/063, G10L15/16
Abstract: A method and a device for training a DNN model include: at a device including one or more processors and memory: establishing an initial DNN model; dividing a training data corpus into a plurality of disjoint data subsets; for each of the plurality of disjoint data subsets, providing the data subset to a respective training processing unit of a plurality of training processing units operating in parallel, wherein the respective training processing unit applies a Stochastic Gradient Descent (SGD) process to update the initial DNN model to generate a respective DNN sub-model based on the data subset; and merging the respective DNN sub-models generated by the plurality of training processing units to obtain an intermediate DNN model, wherein the intermediate DNN model is established as either the initial DNN model for a next training iteration or a final DNN model in accordance with a preset convergence condition.
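The split, train, merge, and iterate loop in this abstract can be sketched on a toy problem. The sketch below makes several assumptions that are not in the abstract: a two-parameter linear model stands in for the DNN, the workers run one after another rather than in parallel, sub-models are merged by plain parameter averaging, and convergence means the merged parameters changed by less than `tol`. All function names are hypothetical.

```python
def sgd_submodel(model, subset, lr=0.1):
    """One SGD pass over a data subset for the toy model y = w * x + b."""
    w, b = model
    for x, y in subset:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return (w, b)

def merge(submodels):
    """Merge sub-models by averaging parameters (an assumed merge rule)."""
    n = len(submodels)
    return (sum(m[0] for m in submodels) / n,
            sum(m[1] for m in submodels) / n)

def train(data, num_workers=4, max_iters=500, tol=1e-9):
    model = (0.0, 0.0)                                    # initial model
    subsets = [data[i::num_workers] for i in range(num_workers)]  # disjoint
    for _ in range(max_iters):
        # Sequential stand-in for training units that would run in parallel.
        submodels = [sgd_submodel(model, s) for s in subsets]
        merged = merge(submodels)                         # intermediate model
        change = max(abs(a - b) for a, b in zip(merged, model))
        model = merged
        if change < tol:                                  # preset convergence condition
            break
    return model

# Noise-free samples of y = 2x + 1 on [0, 1].
data = [(i / 39, 2 * (i / 39) + 1) for i in range(40)]
w, b = train(data)
```

Each worker starts every iteration from the same merged model, which is the point of the scheme: the subsets never overlap, yet the sub-models stay close enough to be combined.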

3. Invention patent application

US20190102373A1 MODEL-BASED AUTOMATIC CORRECTION OF TYPOGRAPHICAL ERRORS (status: under examination, published)
Publication No.: US20190102373A1
Publication date: 2019-04-04
Application No.: US16133440
Filing date: 2018-09-17
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Lou Li, Qiang Cheng, Feng Rao, Li Lu, Xiang Zhang, Shuai Yue, Bo Chen, Duling Lu
IPC classification: G06F17/27
Abstract: A method is performed at a computer for automatically correcting typographical errors. The computer selects a target sentence, identifies a target word therein as having a typographical error, and identifies first and second sequences of words separated by the target word as context. After identifying, among a database of grammatically correct sentences, a set of sentences having the first and second sequences of words, each sentence including a replacement word, the computer selects a set of candidate grammatically correct sentences whose corresponding replacement words have similarities to the target word above a pre-set threshold. Finally, the computer chooses, among the set of candidate grammatically correct sentences, a fittest grammatically correct sentence according to a linguistic model and replaces the target word in the target sentence with the replacement word within the fittest grammatically correct sentence.
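The three stages in this abstract (context match, similarity filter, model-based choice) can be sketched in a few lines. The sketch assumes that a sentence "has" the context when it matches word for word on both sides of the target position, that word similarity is the ratio from Python's `difflib.SequenceMatcher`, and that the "linguistic model" is simply how often a sentence occurs in the corpus. All three choices, and the name `correct_typo`, are illustrative assumptions rather than the method claimed.

```python
from difflib import SequenceMatcher

def correct_typo(sentence, target_index, corpus, threshold=0.6, score=None):
    """Replace the word at target_index using sentences from corpus (a list)."""
    words = sentence.split()
    left = words[:target_index]            # first sequence of words
    target = words[target_index]           # suspected typo
    right = words[target_index + 1:]       # second sequence of words
    candidates = []
    for candidate in corpus:
        cand_words = candidate.split()
        same_context = (
            len(cand_words) == len(words)
            and cand_words[:target_index] == left
            and cand_words[target_index + 1:] == right
        )
        if not same_context:
            continue
        replacement = cand_words[target_index]
        similarity = SequenceMatcher(None, target, replacement).ratio()
        if similarity > threshold:         # pre-set similarity threshold
            candidates.append((replacement, candidate))
    if not candidates:
        return sentence                    # nothing suitable; leave unchanged
    if score is None:
        score = corpus.count               # stand-in for a linguistic model
    best_word, _ = max(candidates, key=lambda pair: score(pair[1]))
    return " ".join(left + [best_word] + right)

corpus = ["I like green tea", "I like great tea", "I like green tea"]
fixed = correct_typo("I like gren tea", 2, corpus)
```

Both "green" and "great" pass the similarity filter for "gren"; the frequency-based score then prefers the sentence that appears twice.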

4. Granted invention patent

US09502038B2 Method and device for voiceprint recognition (status: in force)
Publication No.: US09502038B2
Publication date: 2016-11-22
Application No.: US14105110
Filing date: 2013-12-12
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Eryu Wang, Li Lu, Xiang Zhang, Haibo Liu, Lou Li, Feng Rao, Duling Lu, Shuai Yue, Bo Chen
IPC classification: G10L21/00, G10L17/18
CPC classification: G10L17/18, G10L17/02, G10L17/04, G10L17/08
Abstract: A method and device for voiceprint recognition include: establishing a first-level Deep Neural Network (DNN) model based on unlabeled speech data, the unlabeled speech data containing no speaker labels and the first-level DNN model specifying a plurality of basic voiceprint features for the unlabeled speech data; obtaining a plurality of high-level voiceprint features by tuning the first-level DNN model based on labeled speech data, the labeled speech data containing speech samples with respective speaker labels, and the tuning producing a second-level DNN model specifying the plurality of high-level voiceprint features; based on the second-level DNN model, registering a respective high-level voiceprint feature sequence for a user based on a registration speech sample received from the user; and performing speaker verification for the user based on the respective high-level voiceprint feature sequence registered for the user.

5. Granted invention patent

US09940935B2 Method and device for voiceprint recognition (status: in force)
Publication No.: US09940935B2
Publication date: 2018-04-10
Application No.: US15240696
Filing date: 2016-08-18
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Eryu Wang, Li Lu, Xiang Zhang, Haibo Liu, Lou Li, Feng Rao, Duling Lu, Shuai Yue, Bo Chen
IPC classification: G10L15/16, G10L17/18, G10L17/02, G10L17/04, G10L17/08
CPC classification: G10L17/18, G10L17/02, G10L17/04, G10L17/08
Abstract: A method is performed at a device having one or more processors and memory. The device establishes a first-level Deep Neural Network (DNN) model based on unlabeled speech data, the unlabeled speech data containing no speaker labels and the first-level DNN model specifying a plurality of basic voiceprint features for the unlabeled speech data. The device establishes a second-level DNN model by tuning the first-level DNN model based on labeled speech data, the labeled speech data containing speech samples with respective speaker labels, wherein the second-level DNN model specifies a plurality of high-level voiceprint features. Using the second-level DNN model, the device registers a first high-level voiceprint feature sequence for a user based on a registration speech sample received from the user. The device performs speaker verification for the user based on the first high-level voiceprint feature sequence registered for the user.

6. Invention patent application

US20140350934A1 Systems and Methods for Voice Identification (status: in force)
Publication No.: US20140350934A1
Publication date: 2014-11-27
Application No.: US14291138
Filing date: 2014-05-30
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Lou Li, Li Lu, Xiang Zhang, Feng Rao, Shuai Yue, Bo Chen, Jianxiong Ma, Haibo Liu
IPC classification: G10L17/22
CPC classification: G10L15/083, G10L15/1815, G10L15/183
Abstract: Systems and methods are provided for voice identification. For example, audio characteristics are extracted from acquired voice signals; a syllable confusion network is identified based on at least information associated with the audio characteristics; a word lattice is generated based on at least information associated with the syllable confusion network and a predetermined phonetic dictionary; and an optimal character sequence is calculated in the word lattice as an identification result.

7. Invention patent application

US20140214419A1 METHOD AND SYSTEM FOR AUTOMATIC SPEECH RECOGNITION (status: in force)
Publication No.: US20140214419A1
Publication date: 2014-07-31
Application No.: US14108223
Filing date: 2013-12-16
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Feng Rao, Li Lu, Bo Chen, Shuai Yue, Xiang Zhang, Eryu Wang, Dadong Xie, Lou Li, Duling Lu
IPC classification: G10L15/06
CPC classification: G10L15/063, G10L15/183, G10L15/197, G10L15/26
Abstract: An automatic speech recognition method includes, at a computer having one or more processors and memory for storing one or more programs to be executed by the processors: obtaining a plurality of speech corpus categories through classifying and calculating raw speech corpus; obtaining a plurality of classified language models that respectively correspond to the plurality of speech corpus categories through a language model training applied on each speech corpus category; obtaining an interpolation language model through implementing a weighted interpolation on each classified language model and merging the interpolated plurality of classified language models; constructing a decoding resource in accordance with an acoustic model and the interpolation language model; and decoding input speech using the decoding resource, and outputting a character string with a highest probability as a recognition result of the input speech.
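The weighted-interpolation step in this abstract amounts to forming P(w) = Σ λ_i · P_i(w) over the per-category models, with the weights λ_i summing to 1. The sketch below shows that formula on unigram models. The unigram form, the tiny corpora, the equal weights, and the function names are all assumptions made for illustration; the abstract does not specify the model order or how the weights are chosen.

```python
from collections import Counter

def train_unigram(corpus):
    """Maximum-likelihood unigram model from a list of sentences."""
    counts = Counter(word for sentence in corpus for word in sentence.split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def interpolate(models, weights):
    """Linear interpolation: P(w) = sum_i weights[i] * models[i](w)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    vocab = set().union(*models)           # union of all model vocabularies
    return {
        word: sum(lam * model.get(word, 0.0)
                  for lam, model in zip(weights, models))
        for word in vocab
    }

# Two hypothetical corpus categories.
sports = train_unigram(["goal goal match"])
tech = train_unigram(["chip match"])
mixed = interpolate([sports, tech], [0.5, 0.5])
```

With equal weights, "match" gets 0.5 × 1/3 + 0.5 × 1/2 = 5/12, and the interpolated probabilities still sum to 1 because each component model does.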

8. Granted invention patent

US09697821B2 Method and system for building a topic specific language model for use in automatic speech recognition (status: in force)
Publication No.: US09697821B2
Publication date: 2017-07-04
Application No.: US14108223
Filing date: 2013-12-16
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Feng Rao, Li Lu, Bo Chen, Shuai Yue, Xiang Zhang, Eryu Wang, Dadong Xie, Lou Li, Duling Lu
IPC classification: G10L15/06, G10L15/183, G10L15/197, G10L15/26
CPC classification: G10L15/063, G10L15/183, G10L15/197, G10L15/26
Abstract: An automatic speech recognition method includes, at a computer having one or more processors and memory for storing one or more programs to be executed by the processors: obtaining a plurality of speech corpus categories through classifying and calculating raw speech corpus; obtaining a plurality of classified language models that respectively correspond to the plurality of speech corpus categories through a language model training applied on each speech corpus category; obtaining an interpolation language model through implementing a weighted interpolation on each classified language model and merging the interpolated plurality of classified language models; constructing a decoding resource in accordance with an acoustic model and the interpolation language model; and decoding input speech using the decoding resource, and outputting a character string with a highest probability as a recognition result of the input speech.

9. Granted invention patent

US09177131B2 User authentication method and apparatus based on audio and video data (status: in force)
Publication No.: US09177131B2
Publication date: 2015-11-03
Application No.: US14262665
Filing date: 2014-04-25
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventors: Xiang Zhang, Li Lu, Eryu Wang, Shuai Yue, Feng Rao, Haibo Liu, Lou Li, Duling Lu, Bo Chen
IPC classification: H04L29/06, G06F21/32
CPC classification: G06F21/32, G06F2221/2117
Abstract: A computer-implemented method is performed at a server having one or more processors and memory storing programs executed by the one or more processors for authenticating a user from video and audio data. The method includes: receiving a login request from a mobile device, the login request including video data and audio data; extracting a group of facial features from the video data; extracting a group of audio features from the audio data and recognizing a sequence of words in the audio data; identifying a first user account whose respective facial features match the group of facial features and a second user account whose respective audio features match the group of audio features. If the first user account is the same as the second user account, the method retrieves the sequence of words associated with the user account and compares the sequences of words for authentication purposes.
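The final decision described in this abstract can be sketched as a small function, leaving the face and voice matching themselves out of scope. The sketch assumes that those matchers have already returned an account identifier each (or `None` when nothing matched) and that each account stores one expected word sequence; the function name, parameter names, and data layout are hypothetical.

```python
def authenticate(face_account, voice_account, spoken_words, stored_words):
    """Accept the login only if the face and voice matches point to the same
    account and the recognized words equal that account's stored sequence."""
    if face_account is None or face_account != voice_account:
        return False                      # the two biometric matches disagree
    expected = stored_words.get(face_account)
    return expected is not None and list(spoken_words) == list(expected)

# Hypothetical enrolled data: account id -> expected word sequence.
stored = {"alice": ["open", "sesame"]}
ok = authenticate("alice", "alice", ["open", "sesame"], stored)
mismatch = authenticate("alice", "bob", ["open", "sesame"], stored)
wrong_words = authenticate("alice", "alice", ["open", "door"], stored)
```

Requiring both biometric matches to agree before the word sequence is even looked up means a spoofed face or a replayed voice alone is not enough to reach the final comparison.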
