    • 1. Invention application
    • COMBINING ATTRIBUTE REFINEMENTS AND TEXTUAL QUERIES
    • US20110307504A1
    • 2011-12-15
    • US12796673
    • 2010-06-09
    • Rakesh Agrawal; John Christopher Shafer; Fabian Martin Suchanek
    • G06F17/30
    • G06F16/90328
    • A user submits an unstructured query that is analyzed to determine a mapping from attributes to attribute values. One or more matching items from a structured data set are determined based on the attribute values of attributes associated with the items. The matching items are displayed. One or more refinement attributes are displayed, each with one or more attribute values. The attribute values in the refinements that correspond to the attribute values of the query are shown as selected. If the user selects any of the refinement attributes, the query is revised to incorporate the attribute values of the selected refinements. New matching items are determined using the revised structured query. The revised structured query and the new matching items are displayed. This process can be iterated, by modification of the query or the refinements. The matching items, the selected refinement attribute values and the query are synchronized.
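The query and refinement loop described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the catalog `ITEMS`, the helper names (`build_vocabulary`, `parse_query`, `match`, `refine`), and the token-lookup parsing are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical structured data set: each item maps attribute -> value.
ITEMS: List[Dict[str, str]] = [
    {"name": "A100", "brand": "canon", "type": "camera", "color": "black"},
    {"name": "B200", "brand": "nikon", "type": "camera", "color": "black"},
    {"name": "C300", "brand": "canon", "type": "printer", "color": "white"},
]


def build_vocabulary(items: List[Dict[str, str]]) -> Dict[str, str]:
    """Map every known attribute value to the attribute it belongs to."""
    vocab: Dict[str, str] = {}
    for item in items:
        for attr, value in item.items():
            if attr != "name":  # "name" identifies the item, it is not a facet
                vocab[value] = attr
    return vocab


def parse_query(query: str, vocab: Dict[str, str]) -> Dict[str, str]:
    """Turn an unstructured query into an attribute -> value mapping.

    Assumption: tokens are matched by exact lookup; unknown tokens are ignored.
    """
    mapping: Dict[str, str] = {}
    for token in query.lower().split():
        if token in vocab:
            mapping[vocab[token]] = token
    return mapping


def match(items: List[Dict[str, str]], mapping: Dict[str, str]) -> List[Dict[str, str]]:
    """Return the items whose attributes agree with every mapped value."""
    return [it for it in items if all(it.get(a) == v for a, v in mapping.items())]


def refine(mapping: Dict[str, str], attr: str, value: str) -> Tuple[Dict[str, str], str]:
    """Apply a selected refinement and return the revised mapping and query text,
    so that the query, the selected refinements, and the results stay in sync."""
    revised = dict(mapping)
    revised[attr] = value
    return revised, " ".join(revised.values())
```

For example, parsing "black Canon" yields a mapping for `color` and `brand`; selecting the refinement `type = camera` then produces a revised mapping and a revised query string that can be displayed together with the new matches.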
    • 4. Granted invention patent
    • Method and system for performing proximity joins on high-dimensional data points in parallel
    • US5884320A
    • 1999-03-16
    • US920331
    • 1997-08-20
    • Rakesh Agrawal; John Christopher Shafer
    • G06F17/30
    • G06F17/30592; G06F17/30445; G06F17/30498; Y10S707/99932; Y10S707/99945; Y10S707/99948
    • A method and system for performing spatial proximity joins on high-dimensional points representing data objects of a database in parallel in a multiprocessor system. The method comprises the steps of: partitioning the data points among the processors; creating index structures for the data points of the processors in parallel; assigning the join operations to the processors using the index structures; and simultaneously redistributing and joining the data points in the processors in parallel based on a predetermined joining condition. An efficient data structure, the ε-K-D-B tree, is used to provide fast access to the high-dimensional points and to minimize system storage requirements. The invention achieves fast response time and requires minimum storage space by having structurally identical indices among the processors, assigning workload based on the join costs, and redistributing the data points among the processors while joining the data whenever possible.
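To make the notion of a proximity join concrete, here is a small single-process sketch. It does not implement the ε-K-D-B tree or the parallel redistribution described in the abstract. Instead it swaps in a much simpler idea: bucket the points into stripes of width ε along the first dimension and compare only points in the same or adjacent stripes. The function name `epsilon_join` and the choice of Euclidean distance as the joining condition are assumptions.

```python
import math
from collections import defaultdict
from typing import List, Sequence, Tuple


def epsilon_join(points: Sequence[Sequence[float]], eps: float) -> List[Tuple[int, int]]:
    """Return sorted index pairs (i, j), i < j, whose Euclidean distance is <= eps.

    Simplification: points are bucketed by floor(x0 / eps). Two points within
    eps of each other differ by at most eps in the first coordinate, so they
    fall in the same stripe or in neighbouring stripes.
    """
    stripes = defaultdict(list)
    for idx, p in enumerate(points):
        stripes[math.floor(p[0] / eps)].append(idx)

    pairs = set()

    def check(i: int, j: int) -> None:
        # Apply the joining condition to one candidate pair.
        if math.dist(points[i], points[j]) <= eps:
            pairs.add((min(i, j), max(i, j)))

    for key, members in stripes.items():
        # Pairs inside the stripe.
        for pos, i in enumerate(members):
            for j in members[pos + 1:]:
                check(i, j)
        # Pairs with the next stripe; the previous stripe is covered
        # when that stripe is visited as `key`.
        for i in members:
            for j in stripes.get(key + 1, ()):
                check(i, j)
    return sorted(pairs)
```

Striping on a single dimension only prunes candidates along that axis. A multi-level index such as the one named in the abstract would split on further dimensions as well.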
    • 5. Granted invention patent
    • Method and system for generating a decision-tree classifier independent of system memory size
    • US5799311A
    • 1998-08-25
    • US646893
    • 1996-05-08
    • Rakesh Agrawal; Manish Mehta; John Christopher Shafer
    • G06F17/30
    • G06F17/30705; G06F17/30625; G06F2216/03; Y10S707/99943
    • A method and system are disclosed for generating a decision-tree classifier from a training set of records, independent of the system memory size. The method comprises the steps of: generating an attribute list for each attribute of the records, sorting the attribute lists for numeric attributes, and generating a decision tree by repeatedly partitioning the records using the attribute lists. For each node, split points are evaluated to determine the best split test for partitioning the records at the node. Preferably, a gini index and class histograms are used in determining the best splits. The gini index indicates how well a split point separates the records while the class histograms reflect the class distribution of the records at the node. Also, a hash table is built as the attribute list of the split attribute is divided among the child nodes, which is then used for splitting the remaining attribute lists of the node. The created tree is further pruned based on the MDL principle, which encodes the tree and split tests in an MDL-based code, and determines whether to prune and how to prune each node based on the code length of the node.
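The split evaluation mentioned in this abstract (a gini index computed from class histograms while scanning a sorted attribute list) can be illustrated with a short sketch. The entry layout `(value, class_label, record_id)`, the function names, and the use of midpoints between distinct values as candidate split points are assumptions for the example, not details confirmed by the abstract.

```python
from collections import Counter
from typing import Hashable, List, Optional, Tuple

Entry = Tuple[float, Hashable, int]  # (attribute value, class label, record id)


def gini(hist: Counter, n: int) -> float:
    """Gini index of a class histogram covering n records."""
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in hist.values())


def best_numeric_split(attribute_list: List[Entry]) -> Tuple[float, Optional[float]]:
    """Scan one numeric attribute list and return (best score, threshold).

    Two class histograms are maintained: records below the candidate split
    and records at or above it. A lower weighted gini means a purer split.
    """
    entries = sorted(attribute_list, key=lambda e: e[0])
    n = len(entries)
    below: Counter = Counter()
    above: Counter = Counter(label for _, label, _ in entries)
    best_score, best_threshold = float("inf"), None

    for k in range(n - 1):
        value, label, _ = entries[k]
        below[label] += 1
        above[label] -= 1
        next_value = entries[k + 1][0]
        if value == next_value:
            continue  # no split point between equal values
        n_below = k + 1
        n_above = n - n_below
        score = (n_below / n) * gini(below, n_below) + (n_above / n) * gini(above, n_above)
        if score < best_score:
            best_score, best_threshold = score, (value + next_value) / 2
    return best_score, best_threshold
```

Because only the two histograms are kept in memory during the scan, the attribute list itself could be streamed from disk, which is the property the patent title emphasizes.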
    • 8. Granted invention patent
    • Method and system for generating a decision-tree classifier in parallel in a multi-processor system
    • US6138115A
    • 2000-10-24
    • US245765
    • 1999-02-05
    • Rakesh Agrawal; Manish Mehta; John Christopher Shafer
    • G06F17/30
    • G06F17/30705; G06F17/30625; Y10S707/962; Y10S707/966; Y10S707/968; Y10S707/99933; Y10S707/99936; Y10S707/99937; Y10S707/99944
    • A method and system are disclosed for generating a decision-tree classifier in parallel in a multi-processor system, from a training set of records. The method comprises the steps of: partitioning the records among the processors, each processor generating an attribute list for each attribute, and the processors cooperatively generating a decision tree by repeatedly partitioning the records using the attribute lists. For each node, each processor determines its best split test and, along with other processors, selects the best overall split for the records at that node. Preferably, the gini-index and class histograms are used in determining the best splits. Also, each processor builds a hash table using the attribute list of the split attribute and shares it with other processors. The hash tables are used for splitting the remaining attribute lists. The created tree is then pruned based on the MDL principle, which encodes the tree and split tests in an MDL-based code, and determines whether to prune and how to prune each node based on the code length of the node.
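The hash-table step in this abstract can be sketched in a single process by treating each "processor" as one slice of the split attribute's list. Each slice builds a local table from record id to child node, the tables are merged (standing in for the sharing step), and the merged table is probed to split the remaining attribute lists. The function names, the `(value, class_label, record_id)` entry layout, and the `value < threshold` test are assumptions for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Entry = Tuple[object, Hashable, int]  # (attribute value, class label, record id)


def build_local_table(portion: Iterable[Tuple[float, Hashable, int]],
                      threshold: float) -> Dict[int, str]:
    """One processor's hash table: record id -> child ('left' or 'right')."""
    return {rid: ("left" if value < threshold else "right")
            for value, _label, rid in portion}


def merge_tables(tables: Iterable[Dict[int, str]]) -> Dict[int, str]:
    """Stand-in for sharing: combine all local tables into one."""
    merged: Dict[int, str] = {}
    for table in tables:
        merged.update(table)
    return merged


def split_attribute_list(attr_list: List[Entry],
                         table: Dict[int, str]) -> Tuple[List[Entry], List[Entry]]:
    """Split a non-split attribute list by probing the table with each record id.

    The relative order of entries is preserved, so a list that was sorted
    before the split stays sorted in both children.
    """
    left: List[Entry] = []
    right: List[Entry] = []
    for entry in attr_list:
        (left if table[entry[2]] == "left" else right).append(entry)
    return left, right
```

In a real multi-processor setting the merge would be a communication step between processors; here it is an in-memory dictionary update.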
    • 10. Granted invention patent
    • System and method for parallel mining of association rules in databases
    • US5842200A
    • 1998-11-24
    • US500717
    • 1995-07-11
    • Rakesh Agrawal; John Christopher Shafer
    • G06Q30/02; G06F17/30
    • G06Q30/02; G06F2216/03; Y10S707/99931; Y10S707/99933
    • A multiprocessor including a plurality of processing systems is disclosed for discovering consumer purchasing tendencies. Each processing system of the multiprocessor identifies consumer transaction itemsets that are stored in a database that is distributed among the processing systems and which appear in the database a user-defined minimum number of times, referred to as minimum support. Then, the system discovers association rules in the itemsets by comparing the ratio of the number of times each of the large itemsets appears in the database to the number of times particular subsets of the itemset appear in the database. When the ratio exceeds a predetermined minimum confidence value, the system outputs an association rule which is representative of purchasing tendencies of consumers.
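The two thresholds in this abstract, minimum support and minimum confidence, can be illustrated with a small single-process sketch. It counts itemsets by brute-force enumeration rather than by the distributed procedure the patent describes, and the function names and the `max_size` cap are assumptions for the example.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Tuple


def frequent_itemsets(transactions: Iterable[Iterable[str]],
                      min_support: int,
                      max_size: int = 3) -> Dict[FrozenSet[str], int]:
    """Count every itemset of up to max_size items and keep those that
    appear in at least min_support transactions."""
    counts: Counter = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_support}


def association_rules(frequent: Dict[FrozenSet[str], int],
                      min_confidence: float
                      ) -> List[Tuple[FrozenSet[str], FrozenSet[str], float]]:
    """Emit (antecedent, consequent, confidence) for each qualifying rule.

    Confidence is count(itemset) / count(antecedent). Every subset of a
    frequent itemset is itself frequent, so the lookup always succeeds.
    """
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                a = frozenset(antecedent)
                confidence = count / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules
```

With four transactions such as bread+milk, bread+butter, bread+milk+butter, and milk alone, a minimum support of 2 and a minimum confidence of 0.9 leave a single rule: butter implies bread.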