Quick Patent Search - Fast search of global patents, free commercial-use patent database - IPRDB

1. Granted invention patent

US10157608B2 Device for predicting voice conversion model, method of predicting voice conversion model, and computer program product (Status: in force)
Publication No.: US10157608B2
Publication date: 2018-12-18
Application No.: US15433690
Filing date: 2017-02-15
Applicant: Kabushiki Kaisha Toshiba
Inventors: Yamato Ohtani, Yu Nasu, Masatsune Tamura, Masahiro Morita
IPC classification: G10L13/033, G10L13/047, G10L13/08, G10L21/003
Abstract: According to an embodiment, a voice processing device includes an interface system, a determining processor, and a predicting processor. The interface system is configured to receive neutral voice data representing audio in a neutral voice of a user. The determining processor is configured to determine a predictive parameter based at least in part on the neutral voice data. The predicting processor is configured to predict a voice conversion model for converting the neutral voice of the speaker to a target voice using at least the predictive parameter.

2. Granted invention patent

US09304987B2 Content creation support apparatus, method and program (Status: in force)
Publication No.: US09304987B2
Publication date: 2016-04-05
Application No.: US14301378
Filing date: 2014-06-11
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kosei Fume, Masahiro Morita
IPC classification: G10L15/00, G06F17/27, G10L13/08, G10L15/26, G10L13/033
CPC classification: G06F17/2755, G10L13/033, G10L13/08, G10L15/26
Abstract: According to one embodiment, a content creation support apparatus includes a speech synthesis unit, a speech recognition unit, an extraction unit, a detection unit, a presentation unit and a selection unit. The speech synthesis unit performs a speech synthesis on a first text. The speech recognition unit performs a speech recognition on the synthesized speech to obtain a second text. The extraction unit extracts feature values by performing a morphological analysis on each of the first and second texts. The detection unit compares a first feature value of a first difference string and a second feature value of a second difference string. The presentation unit presents correction candidate(s) according to the second feature value. The selection unit selects one of the correction candidates in accordance with an instruction from a user.

3. Granted invention patent

US09830904B2 Text-to-speech device, text-to-speech method, and computer program product (Status: in force)
Publication No.: US09830904B2
Publication date: 2017-11-28
Application No.: US15185259
Filing date: 2016-06-17
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Yu Nasu, Masatsune Tamura, Ryo Morinaka, Masahiro Morita
IPC classification: G10L13/00, G10L13/10, G10L13/06, G10L13/033
CPC classification: G10L13/10, G10L13/033, G10L13/06
Abstract: According to an embodiment, a text-to-speech device includes a context acquirer, an acoustic model parameter acquirer, a conversion parameter acquirer, a converter, and a waveform generator. The context acquirer is configured to acquire a context sequence affecting fluctuations in voice. The acoustic model parameter acquirer is configured to acquire an acoustic model parameter sequence that corresponds to the context sequence and represents an acoustic model in a standard speaking style of a target speaker. The conversion parameter acquirer is configured to acquire a conversion parameter sequence corresponding to the context sequence to convert an acoustic model parameter in the standard speaking style into one in a different speaking style. The converter is configured to convert the acoustic model parameter sequence using the conversion parameter sequence. The waveform generator is configured to generate a voice signal based on the acoustic model parameter sequence acquired after conversion.

4. Granted invention patent

US09484012B2 Speech synthesis dictionary generation apparatus, speech synthesis dictionary generation method and computer program product (Status: in force)
Publication No.: US09484012B2
Publication date: 2016-11-01
Application No.: US14606089
Filing date: 2015-01-27
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventor: Masahiro Morita
IPC classification: G10L13/00, G10L13/033
CPC classification: G10L13/033
Abstract: According to an embodiment, a speech synthesis dictionary generation apparatus includes an analyzer, a speaker adapter, a level designation unit, and a determination unit. The analyzer analyzes speech data and generates a speech database containing characteristics of utterance by an object speaker. The speaker adapter generates the model of the object speaker by speaker adaptation of converting a base model to be closer to characteristics of the object speaker based on the database. The level designation unit accepts designation of a target speaker level representing a speaker's utterance skill and/or a speaker's native level in a language of the speech synthesis dictionary. The determination unit determines a parameter related to fidelity of reproduction of speaker properties in the speaker adaptation, in accordance with a relationship between the target speaker level and a speaker level of the object speaker.

5. Invention patent application

US20160300564A1 TEXT-TO-SPEECH DEVICE, TEXT-TO-SPEECH METHOD, AND COMPUTER PROGRAM PRODUCT (Status: in force)
Publication No.: US20160300564A1
Publication date: 2016-10-13
Application No.: US15185259
Filing date: 2016-06-17
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Yu Nasu, Masatsune Tamura, Ryo Morinaka, Masahiro Morita
IPC classification: G10L13/10, G10L13/06, G10L13/033
CPC classification: G10L13/10, G10L13/033, G10L13/06
Abstract: According to an embodiment, a text-to-speech device includes a context acquirer, an acoustic model parameter acquirer, a conversion parameter acquirer, a converter, and a waveform generator. The context acquirer is configured to acquire a context sequence affecting fluctuations in voice. The acoustic model parameter acquirer is configured to acquire an acoustic model parameter sequence that corresponds to the context sequence and represents an acoustic model in a standard speaking style of a target speaker. The conversion parameter acquirer is configured to acquire a conversion parameter sequence corresponding to the context sequence to convert an acoustic model parameter in the standard speaking style into one in a different speaking style. The converter is configured to convert the acoustic model parameter sequence using the conversion parameter sequence. The waveform generator is configured to generate a voice signal based on the acoustic model parameter sequence acquired after conversion.

6. Granted invention patent

US09135910B2 Speech synthesis device, speech synthesis method, and computer program product (Status: in force)
Publication No.: US09135910B2
Publication date: 2015-09-15
Application No.: US13765012
Filing date: 2013-02-12
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune Tamura, Masahiro Morita
IPC classification: G10L13/00, G10L13/08, G10L13/06, G10L13/033, G10L15/00
CPC classification: G10L13/08, G10L13/033, G10L13/06
Abstract: According to an embodiment, a speech synthesis device includes a first storage, a second storage, a first generator, a second generator, a third generator, and a fourth generator. The first storage is configured to store therein first information obtained from a target uttered voice. The second storage is configured to store therein second information obtained from an arbitrary uttered voice. The first generator is configured to generate third information by converting the second information so as to be close to a target voice quality or prosody. The second generator is configured to generate an information set including the first information and the third information. The third generator is configured to generate fourth information used to generate a synthesized speech, based on the information set. The fourth generator is configured to generate the synthesized speech corresponding to input text using the fourth information.

7. Invention patent application

US20140350922A1 SPEECH PROCESSING DEVICE, SPEECH PROCESSING METHOD AND COMPUTER PROGRAM PRODUCT (Status: under examination, published)
Publication No.: US20140350922A1
Publication date: 2014-11-27
Application No.: US14194976
Filing date: 2014-03-03
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Yamato Ohtani, Masahiro Morita
IPC classification: G10L25/18
CPC classification: G10L25/18, G10L21/038
Abstract: According to an embodiment, a speech processing device includes an extractor, a detector, a generator, a converter, and a compensator. The extractor is configured to extract a speech parameter from a spectral envelope of input speech. The detector is configured to detect a missing band in which a component is missed in the spectral envelope. The generator is configured to generate a parameter for the missing band on the basis of a position of the missing band, statistical information created by using a parameter extracted from a spectral envelope of speech with no missing component, and the extracted speech parameter. The converter is configured to convert the generated parameter to a spectral envelope of the missing band. The compensator is configured to generate a spectral envelope supplemented with the missing band by combining the spectral envelopes of the missing band and of the input speech.

8. Granted invention patent

US10650800B2 Speech processing device, speech processing method, and computer program product (Status: in force)
Publication No.: US10650800B2
Publication date: 2020-05-12
Application No.: US15898337
Filing date: 2018-02-16
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune Tamura, Masahiro Morita
IPC classification: G10L13/06, G10L25/18, G10L13/047
Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.

9. Granted invention patent

US10109286B2 Speech synthesizer, audio watermarking information detection apparatus, speech synthesizing method, audio watermarking information detection method, and computer program product (Status: in force)
Publication No.: US10109286B2
Publication date: 2018-10-23
Application No.: US15704051
Filing date: 2017-09-14
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kentaro Tachibana, Takehiko Kagoshima, Masatsune Tamura, Masahiro Morita
IPC classification: G10L21/00, G10L19/018, G10L13/033, G10L13/02, G10L19/012
Abstract: According to an embodiment, a speech synthesizer includes a source generator, a phase modulator, and a vocal tract filter unit. The source generator generates a source signal by using a fundamental frequency sequence and a pulse signal. The phase modulator modulates, with respect to the source signal generated by the source generator, a phase of the pulse signal at each pitch mark based on audio watermarking information. The vocal tract filter unit generates a speech signal by using a spectrum parameter sequence with respect to the source signal in which the phase of the pulse signal is modulated by the phase modulator.

10. Granted invention patent

US09928828B2 Transliteration work support device, transliteration work support method, and computer program product (Status: in force)
Publication No.: US09928828B2
Publication date: 2018-03-27
Application No.: US15090776
Filing date: 2016-04-05
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kosei Fume, Yuka Kuroda, Yoshiaki Mizuoka, Masahiro Morita
IPC classification: G06F17/28, G10L13/08, G10L13/027, G06F17/22, G06F17/27
CPC classification: G10L13/08, G06F17/2223, G06F17/2264, G06F17/273, G06F17/2755, G10L13/027
Abstract: According to an embodiment, a transliteration work support device includes an analysis unit, a storage unit, an estimation unit, a construction unit, a correction unit, and an update unit. The analysis unit performs language analysis on document data and creates transliteration auxiliary information representing a way of transliteration of a word or a phrase in the document data. The storage unit stores a correction history representing a way of transliteration corrected in the past of the word or the phrase. The estimation unit estimates a correction place and a correction candidate of the document data or the transliteration auxiliary information from the history. The construction unit constructs work list information including work items corresponding to types of corrections according to the correction candidate and progress information. The correction unit corrects the document data or the transliteration auxiliary information. The update unit updates the history and the progress information according to the correction.
