    • 1. Granted invention patent
    • System and method for text categorization based on ontologies
    • US08782051B2
    • 2014-07-15
    • US13872022
    • 2013-04-26
    • Kirill Chashchin, Sergey Anshukov, Valery Bardin, Simon Kordonsky
    • Kirill Chashchin, Sergey Anshukov, Valery Bardin, Simon Kordonsky
    • G06F17/30, G06F17/27
    • G06F17/277, G06F17/30616, G06F17/30707, G06F17/30734
    • A system for text categorization based on ontologies, comprising data collector software modules; a categorizer software module; and a database comprising an indexed database of documents and their categorizations, and further comprising a plurality of ontologies, each ontology comprising a plurality of hierarchical taxonomies and each hierarchical taxonomy comprising a plurality of taxons. The data collector software modules receive documents to be classified and submit them to the categorizer software module, and the categorizer performs the following steps to categorize each document: splitting the document into sentences; selecting words or phrases that are present in ontologies stored in the database server; selecting a plurality of subtrees from the ontologies based on the presence of specific subcategories in the document; determining a weight for each subcategory; pruning subcategories having a weight below a threshold; and, for each of the plurality of modified subtrees, computing a conditionality coefficient.
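The categorization steps listed in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `ONTOLOGY`, the function names, the use of raw term counts as subcategory weights, and the default threshold are all hypothetical. In particular, the abstract does not define the conditionality coefficient, so the sketch substitutes a placeholder score (each surviving subtree's share of the total remaining weight) and should not be read as the patented method.

```python
import re
from collections import Counter

# Hypothetical toy ontology: taxonomy root -> {subcategory: [terms]}.
ONTOLOGY = {
    "sports": {
        "football": ["goal", "striker"],
        "tennis": ["serve", "racket"],
    },
    "finance": {
        "banking": ["loan", "interest rate"],
        "markets": ["stock", "bond"],
    },
}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def subcategory_weights(text):
    """Weight each (root, subcategory) by how often its terms occur.

    Assumption: the weight is a plain occurrence count.
    """
    weights = Counter()
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for root, subcats in ONTOLOGY.items():
            for subcat, terms in subcats.items():
                for term in terms:
                    pattern = r"\b" + re.escape(term) + r"\b"
                    hits = len(re.findall(pattern, lowered))
                    if hits:
                        weights[(root, subcat)] += hits
    return weights


def categorize(text, threshold=2):
    """Return a placeholder score per surviving subtree (taxonomy root)."""
    weights = subcategory_weights(text)
    # Prune subcategories whose weight is below the threshold.
    kept = {key: w for key, w in weights.items() if w >= threshold}
    total = sum(kept.values())
    # Regroup the surviving subcategories into subtrees keyed by root.
    subtrees = {}
    for (root, subcat), w in kept.items():
        subtrees.setdefault(root, {})[subcat] = w
    # Placeholder for the conditionality coefficient (NOT the patent's
    # definition): the subtree's share of the total surviving weight.
    return {root: sum(subs.values()) / total for root, subs in subtrees.items()}


if __name__ == "__main__":
    sample = (
        "The striker scored a goal. Another goal followed. "
        "The bank raised the interest rate on the loan. A stock fell."
    )
    print(categorize(sample))
```

In this sample the `markets` subcategory matches only once and is pruned by the threshold, so only the `sports` and `finance` subtrees receive a score.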