    • 10. Granted invention patent
    • System and method for extraction of off-topic part from conversation
    • US09002843B2
    • 2015-04-07
    • US13740473
    • 2013-01-14
    • International Business Machines Corporation
    • Nobuyasu Itoh; Masafumi Nishimura; Yuto Yamaguchi
    • G06F7/00; G06F17/30; G06F17/27
    • G06F17/3053; G06F17/2785; Y10S707/99933
    • A system and method for extracting off-topic parts from a conversation. The system includes: a first corpus containing documents from a plurality of fields; a second corpus containing only documents from the field to which the conversation belongs; a determination means that designates as a lower-limit subject word any word whose IDF value for the first corpus and IDF value for the second corpus are each below a first threshold value; a score calculation part that calculates a TF-IDF value as the score of each word included in the second corpus; a clipping part that sequentially cuts out intervals from the text data of the conversation; and an extraction part that extracts as an off-topic part any interval in which the average score of the words it contains is larger than a second threshold value.
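
The pipeline described in the abstract (lower-limit word determination, TF-IDF scoring, interval clipping, threshold-based extraction) can be illustrated with a minimal Python sketch. This is not the patented implementation. All function names are hypothetical, and several details the abstract leaves open are filled in as labeled assumptions: the smoothed IDF formula, a corpus-wide relative frequency as TF, a fixed floor score for lower-limit subject words, a fixed score for words absent from the second corpus, and a sliding window that advances one word at a time.

```python
import math
from collections import Counter


def idf(corpus):
    """Smoothed IDF per word for a corpus given as a list of token lists.

    Assumption: the smoothing log((1 + N) / (1 + df)) + 1 is chosen for
    illustration; the abstract does not specify an IDF formula.
    """
    n_docs = len(corpus)
    doc_freq = Counter(w for doc in corpus for w in set(doc))
    return {w: math.log((1 + n_docs) / (1 + d)) + 1.0 for w, d in doc_freq.items()}


def lower_limit_words(idf_first, idf_second, first_threshold):
    """Words whose IDF is below the first threshold in BOTH corpora."""
    return {
        w
        for w, value in idf_second.items()
        if value < first_threshold and idf_first.get(w, math.inf) < first_threshold
    }


def tfidf_scores(second_corpus, idf_second, floor_words, floor_score=0.0):
    """TF-IDF score for each word in the second (domain) corpus.

    Assumptions: TF is the word's relative frequency over the whole second
    corpus, and lower-limit subject words receive a fixed floor score.
    """
    counts = Counter(w for doc in second_corpus for w in doc)
    total = sum(counts.values())
    return {
        w: floor_score if w in floor_words else (c / total) * idf_second[w]
        for w, c in counts.items()
    }


def clip_intervals(tokens, window):
    """Sequentially cut out fixed-length intervals (sliding by one word)."""
    if not tokens:
        return
    last_start = max(len(tokens) - window, 0)
    for start in range(last_start + 1):
        yield start, tokens[start:start + window]


def extract_off_topic(tokens, scores, window, second_threshold, unseen_score=0.0):
    """Return (start, end) spans whose average word score exceeds the threshold.

    Assumption: words missing from the score table count as unseen_score.
    """
    spans = []
    for start, interval in clip_intervals(tokens, window):
        average = sum(scores.get(w, unseen_score) for w in interval) / len(interval)
        if average > second_threshold:
            spans.append((start, start + len(interval)))
    return spans
```

The two thresholds, the window length, and the floor score are tuning parameters here. Overlapping spans are returned as-is, so a caller that wants contiguous off-topic regions would need to merge them.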