    • 2. Invention application
    • SYSTEMS AND METHODS FOR SPEECH TRANSCRIPTION
    • US20160171974A1
    • 2016-06-16
    • US14735002
    • 2015-06-09
    • BAIDU USA LLC
    • Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Gregory Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Adam Coates, Andrew Y. Ng
    • G10L15/06, G10L15/16, G10L15/26
    • Presented herein are embodiments of state-of-the-art speech recognition systems developed using end-to-end deep learning. In embodiments, the model architecture is significantly simpler than that of traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly in noisy environments. In contrast, embodiments of the system do not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learn a function that is robust to such effects. Neither a phoneme dictionary nor even the concept of a “phoneme” is needed. Embodiments include a well-optimized recurrent neural network (RNN) training system that can use multiple GPUs, as well as a set of novel data synthesis techniques that allow a large amount of varied training data to be obtained efficiently. Embodiments of the system can also handle challenging noisy environments better than widely used, state-of-the-art commercial speech systems.
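The abstract mentions data synthesis for obtaining varied training data but does not say how it is done. As a rough illustration only, the sketch below shows one commonly used form of synthesis: overlaying a noise track on clean speech at a chosen signal-to-noise ratio. This is an assumed technique, not necessarily the one claimed in the application, and the names `rms` and `mix_at_snr`, as well as the choice to loop the noise track, are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a non-empty sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Return speech with noise overlaid at the requested SNR (in dB).

    Hypothetical helper for illustration; not taken from the patent record.
    The noise track is looped or truncated to match the speech length.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")

    # Repeat the noise track as needed so it covers the whole speech clip.
    tiled = [noise[i % len(noise)] for i in range(len(speech))]

    noise_rms = rms(tiled)
    if noise_rms == 0.0:
        return list(speech)  # silent noise track: nothing to add

    # SNR_dB = 20 * log10(rms_speech / rms_noise), solved for the noise level.
    target_noise_rms = rms(speech) / (10.0 ** (snr_db / 20.0))
    scale = target_noise_rms / noise_rms

    return [s + scale * n for s, n in zip(speech, tiled)]
```

Calling `mix_at_snr` on the same clean utterance with different noise tracks and SNR values yields several distinct noisy training examples from a single recording, which is the general idea behind this kind of augmentation.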