Quick Patent Search: search global patents quickly, free commercial-use patent database - IPRDB

1. Invention application

US20110307504A1 COMBINING ATTRIBUTE REFINEMENTS AND TEXTUAL QUERIES [pending, published]
Publication No.: US20110307504A1
Publication date: 2011-12-15
Application No.: US12796673
Filing date: 2010-06-09
Applicants: Rakesh Agrawal, John Christopher Shafer, Fabian Martin Suchanek
Inventors: Rakesh Agrawal, John Christopher Shafer, Fabian Martin Suchanek
IPC classification: G06F17/30
CPC classification: G06F16/90328
Abstract: A user submits an unstructured query that is analyzed to determine a mapping from attributes to attribute values. One or more matching items from a structured data set are determined based on the attribute values of attributes associated with the items. The matching items are displayed. One or more refinement attributes are displayed, each with one or more attribute values. The attribute values in the refinements that correspond to the attribute values of the query are shown as selected. If the user selects any of the refinement attributes, the query is revised to incorporate the attribute values of the selected refinements. New matching items are determined using the revised structured query. The revised structured query and the new matching items are displayed. This process can be iterated, by modification of the query or the refinements. The matching items, the selected refinement attribute values and the query are synchronized.
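
The loop this abstract describes (parse free text into attribute values, match items, then fold a selected refinement back into the query) can be illustrated with a minimal sketch. Everything here is hypothetical: the `VOCAB` lookup table, the toy `ITEMS` catalog, and the three function names are assumptions for illustration, not the method claimed in the application.

```python
# Hypothetical attribute vocabulary and catalog; not taken from the patent.
VOCAB = {"brand": {"canon", "nikon"}, "color": {"black", "silver"}}
ITEMS = [
    {"name": "A", "brand": "canon", "color": "black"},
    {"name": "B", "brand": "canon", "color": "silver"},
    {"name": "C", "brand": "nikon", "color": "black"},
]


def parse_query(text):
    """Map attributes to the attribute values mentioned in a free-text query."""
    mapping = {}
    for token in text.lower().split():
        for attribute, values in VOCAB.items():
            if token in values:
                mapping[attribute] = token
    return mapping


def matching_items(mapping):
    """Names of catalog items that agree with every attribute value in the mapping."""
    return [
        item["name"]
        for item in ITEMS
        if all(item[attribute] == value for attribute, value in mapping.items())
    ]


def refine(mapping, attribute, value):
    """Apply one selected refinement; return the revised mapping and query text.

    Assumption: the revised query is regenerated from the mapping alone, so
    words that map to no attribute are dropped.
    """
    revised = {**mapping, attribute: value}
    return revised, " ".join(revised[a] for a in sorted(revised))
```

Calling `parse_query("Canon camera")` yields a mapping with only `brand` set; `refine` then adds `color` and returns both the new mapping and a regenerated query string, which keeps the query, the selected refinements, and the result list in step.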

2. Invention grant

US09158813B2 Relaxation for structured queries [in force]
Publication No.: US09158813B2
Publication date: 2015-10-13
Application No.: US12796678
Filing date: 2010-06-09
Applicants: Alexandros Ntoulas, Sreenivas Gollapudi, Samuel Ieong, Stelios Paparizos, John Christopher Shafer
Inventors: Alexandros Ntoulas, Sreenivas Gollapudi, Samuel Ieong, Stelios Paparizos, John Christopher Shafer
IPC classification: G06F17/30
CPC classification: G06F17/30451, G06F17/30861, G06F17/30864
Abstract: A structured query may specify attribute values for attributes. An estimate of the number of items that will match the structured query if it is applied to a structured database is determined. If the estimated number of items is below a threshold, the structured query may be relaxed to form new candidate structured queries. The number of candidate queries may be determined based on a desired running time. Each of the candidate structured queries may be determined by changing one or more attribute values of the attributes of the structured query. Estimates of the number of items each of the candidate structured queries will match are determined, and the candidate structured query that has the highest matching estimate is used to query the database. The matching results may be output.
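
The relaxation procedure in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patented method: the toy `ITEMS` catalog is invented, an exact scan stands in for the estimator, "changing" an attribute value is modelled as dropping that constraint, and `max_candidates` is a crude stand-in for the running-time budget.

```python
from itertools import combinations

# Hypothetical toy catalog; the abstract does not specify a data model.
ITEMS = [
    {"brand": "acme", "color": "red", "size": "m"},
    {"brand": "acme", "color": "blue", "size": "m"},
    {"brand": "acme", "color": "blue", "size": "l"},
    {"brand": "zeta", "color": "red", "size": "s"},
]


def count_matches(query, items):
    """Count items that satisfy every attribute value in the query.

    Stands in for the abstract's estimate; a real system would estimate
    the count rather than scan the data.
    """
    return sum(all(item.get(a) == v for a, v in query.items()) for item in items)


def relax(query, items, threshold=1, max_candidates=10):
    """Return (query, count), relaxing the query if it matches too few items."""
    base = count_matches(query, items)
    if base >= threshold:
        return query, base
    candidates = []
    # Assumption: a candidate is formed by dropping one or more constraints.
    for k in range(1, len(query)):
        for dropped in combinations(sorted(query), k):
            candidates.append({a: v for a, v in query.items() if a not in dropped})
    candidates = candidates[:max_candidates]  # cap work, like a time budget
    if not candidates:
        return query, base
    best = max(candidates, key=lambda q: count_matches(q, items))
    return best, count_matches(best, items)
```

With the toy catalog, the query `brand=acme, color=red, size=l` matches nothing, so the sketch falls back to the candidate with the highest count. Lowering `max_candidates` limits how far the relaxation can go.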

3. Invention grant

US06592627B1 System and method for organizing repositories of semi-structured documents such as email [in force]
Publication No.: US06592627B1
Publication date: 2003-07-15
Application No.: US09329684
Filing date: 1999-06-10
Applicants: Rakesh Agrawal, Roberto Javier Bayardo, Dimitrios Gunopulos, Ching-Tien Howard Ho, Sunita Sarawagi, John Christopher Shafer, Ramakrishnan Srikant
Inventors: Rakesh Agrawal, Roberto Javier Bayardo, Dimitrios Gunopulos, Ching-Tien Howard Ho, Sunita Sarawagi, John Christopher Shafer, Ramakrishnan Srikant
IPC classification: G06F17/30
CPC classification: G06F17/30011
Abstract: A user can easily organize computerized document folders by associating a few sample documents in the document database with each folder. The present invention learns folder profiles based on the sample documents and moves the remaining documents into the folders accordingly. In this way, the user can construct new folders, or rearrange existing folders, or cause the computer to automatically rearrange and maintain the folders. This is particularly useful for managing a database of perhaps thousands of emails.
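
The abstract does not say how folder profiles are learned. As one possible illustration, the sketch below assumes a bag-of-words profile per folder and assigns each remaining document to the folder whose profile is most similar by cosine similarity. The function names and the choice of classifier are assumptions, not the patented technique.

```python
import math
from collections import Counter


def profile(docs):
    """Bag-of-words term counts over a list of document strings."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def cosine(a, b):
    """Cosine similarity between two term-count profiles."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def file_documents(samples, remaining):
    """Assign each remaining document to the folder with the closest profile.

    `samples` maps folder name -> list of sample documents for that folder.
    """
    profiles = {folder: profile(docs) for folder, docs in samples.items()}
    return {
        doc: max(profiles, key=lambda f: cosine(profiles[f], profile([doc])))
        for doc in remaining
    }
```

Re-running `file_documents` after the user moves a few samples between folders corresponds to the "rearrange existing folders" case in the abstract.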

4. Invention grant

US5884320A Method and system for performing proximity joins on high-dimensional data points in parallel [lapsed]
Publication No.: US5884320A
Publication date: 1999-03-16
Application No.: US920331
Filing date: 1997-08-20
Applicants: Rakesh Agrawal, John Christopher Shafer
Inventors: Rakesh Agrawal, John Christopher Shafer
IPC classification: G06F17/30
CPC classification: G06F17/30592, G06F17/30445, G06F17/30498, Y10S707/99932, Y10S707/99945, Y10S707/99948
Abstract: A method and system for performing spatial proximity joins on high-dimensional points representing data objects of a database in parallel in a multiprocessor system. The method comprises the steps of: partitioning the data points among the processors; creating index structures for the data points of the processors in parallel; assigning the join operations to the processors using the index structures; and simultaneously redistributing and joining the data points in the processors in parallel based on a predetermined joining condition. An efficient data structure, ε-K-D-B tree, is used to provide fast access to the high-dimensional points and to minimize system storage requirements. The invention achieves fast response time and requires minimum storage space by having structurally identical indices among the processors, assigning workload based on the join costs, and redistributing the data points among the processors while joining the data whenever possible.
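
To make the join condition concrete, here is a single-process sketch of a proximity self-join. It is not the ε-K-D-B tree or the parallel scheme from the patent: it swaps in a plain grid of cells of width `eps`, and it assumes the joining condition is Euclidean distance at most `eps`. The neighbour scan visits 3^d cells in d dimensions, so this form only suits low-dimensional data; that cost is the kind of problem a tree index is meant to avoid.

```python
import math
from collections import defaultdict
from itertools import product


def proximity_join(points, eps):
    """Return sorted index pairs (i, j), i < j, with Euclidean distance <= eps."""
    # Bucket each point by the grid cell (of width eps) that contains it.
    grid = defaultdict(list)
    for index, point in enumerate(points):
        grid[tuple(math.floor(x / eps) for x in point)].append(index)

    pairs = set()
    for cell, members in grid.items():
        # Points within eps of each other lie in the same or an adjacent cell.
        for offset in product((-1, 0, 1), repeat=len(cell)):
            neighbour = tuple(c + o for c, o in zip(cell, offset))
            for i in members:
                for j in grid.get(neighbour, ()):
                    if i < j and math.dist(points[i], points[j]) <= eps:
                        pairs.add((i, j))
    return sorted(pairs)
```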

5. Invention grant

US5799311A Method and system for generating a decision-tree classifier independent of system memory size [lapsed]
Publication No.: US5799311A
Publication date: 1998-08-25
Application No.: US646893
Filing date: 1996-05-08
Applicants: Rakesh Agrawal, Manish Mehta, John Christopher Shafer
Inventors: Rakesh Agrawal, Manish Mehta, John Christopher Shafer
IPC classification: G06F17/30
CPC classification: G06F17/30705, G06F17/30625, G06F2216/03, Y10S707/99943
Abstract: A method and system are disclosed for generating a decision-tree classifier from a training set of records, independent of the system memory size. The method comprises the steps of: generating an attribute list for each attribute of the records, sorting the attribute lists for numeric attributes, and generating a decision tree by repeatedly partitioning the records using the attribute lists. For each node, split points are evaluated to determine the best split test for partitioning the records at the node. Preferably, a gini index and class histograms are used in determining the best splits. The gini index indicates how well a split point separates the records while the class histograms reflect the class distribution of the records at the node. Also, a hash table is built as the attribute list of the split attribute is divided among the child nodes, which is then used for splitting the remaining attribute lists of the node. The created tree is further pruned based on the MDL principle, which encodes the tree and split tests in an MDL-based code, and determines whether to prune and how to prune each node based on the code length of the node.
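
The split-evaluation step (a sorted attribute list scanned once while two class histograms are updated) can be sketched for a single numeric attribute as below. This is an in-memory illustration only: the function names are hypothetical, the midpoint threshold is an assumption, and the disk-resident attribute lists, hash table, and MDL pruning from the abstract are not shown.

```python
from collections import Counter


def gini(hist):
    """Gini index of a class histogram: 1 - sum over classes of p_c squared."""
    n = sum(hist.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in hist.values())


def best_numeric_split(records):
    """Find the best test `value <= threshold` for one numeric attribute.

    `records` is a list of (value, class_label) pairs. Returns
    (threshold, weighted_gini), or (None, inf) if no split is possible.
    """
    attribute_list = sorted(records, key=lambda r: r[0])  # sorted attribute list
    n = len(attribute_list)
    below = Counter()                                      # classes at or below the split
    above = Counter(label for _, label in attribute_list)  # classes above the split
    best = (None, float("inf"))
    for i in range(n - 1):
        value, label = attribute_list[i]
        below[label] += 1
        above[label] -= 1
        next_value = attribute_list[i + 1][0]
        if value == next_value:
            continue  # no threshold separates equal values
        n_below = i + 1
        score = (n_below / n) * gini(below) + ((n - n_below) / n) * gini(above)
        if score < best[1]:
            # Assumption: the candidate threshold is the midpoint of adjacent values.
            best = ((value + next_value) / 2, score)
    return best
```

A lower weighted gini means the two sides of the split are purer; a score of 0 means each side holds a single class.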

6. Invention application

US20110307517A1 RELAXATION FOR STRUCTURED QUERIES [in force]
Publication No.: US20110307517A1
Publication date: 2011-12-15
Application No.: US12796678
Filing date: 2010-06-09
Applicants: Alexandros Ntoulas, Sreenivas Gollapudi, Samuel Ieong, Stelios Paparizos, John Christopher Shafer
Inventors: Alexandros Ntoulas, Sreenivas Gollapudi, Samuel Ieong, Stelios Paparizos, John Christopher Shafer
IPC classification: G06F17/30
CPC classification: G06F17/30451, G06F17/30861, G06F17/30864
Abstract: A structured query may specify attribute values for attributes. An estimate of the number of items that will match the structured query if it is applied to a structured database is determined. If the estimated number of items is below a threshold, the structured query may be relaxed to form new candidate structured queries. The number of candidate queries may be determined based on a desired running time. Each of the candidate structured queries may be determined by changing one or more attribute values of the attributes of the structured query. Estimates of the number of items each of the candidate structured queries will match are determined, and the candidate structured query that has the highest matching estimate is used to query the database. The matching results may be output.

7. Invention grant

US06633885B1 System and method for web-based querying [in force]
Publication No.: US06633885B1
Publication date: 2003-10-14
Application No.: US09477257
Filing date: 2000-01-04
Applicants: Rakesh Agrawal, John Christopher Shafer
Inventors: Rakesh Agrawal, John Christopher Shafer
IPC classification: G06F17/30
CPC classification: G06F17/30864, Y10S707/99933, Y10S707/99935, Y10S707/99943
Abstract: A system and method for exploring a web-accessible database includes providing a GUI that a user can manipulate to quickly modify the results of a query to expand or contract the results set, without requiring additional querying. Attribute controls can be manipulated to impose restrictions on the results set, including by designating example records the attributes of which are used to restrict the records displayed to the user. Only records that can be displayed are instantiated, to further increase the speed of the system.

8. Invention grant

US6138115A Method and system for generating a decision-tree classifier in parallel in a multi-processor system [in force]
Publication No.: US6138115A
Publication date: 2000-10-24
Application No.: US245765
Filing date: 1999-02-05
Applicants: Rakesh Agrawal, Manish Mehta, John Christopher Shafer
Inventors: Rakesh Agrawal, Manish Mehta, John Christopher Shafer
IPC classification: G06F17/30
CPC classification: G06F17/30705, G06F17/30625, Y10S707/962, Y10S707/966, Y10S707/968, Y10S707/99933, Y10S707/99936, Y10S707/99937, Y10S707/99944
Abstract: A method and system are disclosed for generating a decision-tree classifier in parallel in a multi-processor system, from a training set of records. The method comprises the steps of: partitioning the records among the processors, each processor generating an attribute list for each attribute, and the processors cooperatively generating a decision tree by repeatedly partitioning the records using the attribute lists. For each node, each processor determines its best split test and, along with other processors, selects the best overall split for the records at that node. Preferably, the gini-index and class histograms are used in determining the best splits. Also, each processor builds a hash table using the attribute list of the split attribute and shares it with other processors. The hash tables are used for splitting the remaining attribute lists. The created tree is then pruned based on the MDL principle, which encodes the tree and split tests in an MDL-based code, and determines whether to prune and how to prune each node based on the code length of the node.

9. Invention grant

US5870735A Method and system for generating a decision-tree classifier in parallel in a multi-processor system [lapsed]
Publication No.: US5870735A
Publication date: 1999-02-09
Application No.: US641404
Filing date: 1996-05-01
Applicants: Rakesh Agrawal, Manish Mehta, John Christopher Shafer
Inventors: Rakesh Agrawal, Manish Mehta, John Christopher Shafer
IPC classification: G06F17/30
CPC classification: G06F17/30705, G06F17/30625, Y10S707/962, Y10S707/966, Y10S707/968, Y10S707/99933, Y10S707/99936, Y10S707/99937, Y10S707/99944
Abstract: A method and system are disclosed for generating a decision-tree classifier in parallel in a multi-processor system, from a training set of records. The method comprises the steps of: partitioning the records among the processors, each processor generating an attribute list for each attribute, and the processors cooperatively generating a decision tree by repeatedly partitioning the records using the attribute lists. For each node, each processor determines its best split test and, along with other processors, selects the best overall split for the records at that node. Preferably, the gini-index and class histograms are used in determining the best splits. Also, each processor builds a hash table using the attribute list of the split attribute and shares it with other processors. The hash tables are used for splitting the remaining attribute lists. The created tree is then pruned based on the MDL principle, which encodes the tree and split tests in an MDL-based code, and determines whether to prune and how to prune each node based on the code length of the node.

10. Invention grant

US5842200A System and method for parallel mining of association rules in databases [lapsed]
Publication No.: US5842200A
Publication date: 1998-11-24
Application No.: US500717
Filing date: 1995-07-11
Applicants: Rakesh Agrawal, John Christopher Shafer
Inventors: Rakesh Agrawal, John Christopher Shafer
IPC classification: G06Q30/02, G06F17/30
CPC classification: G06Q30/02, G06F2216/03, Y10S707/99931, Y10S707/99933
Abstract: A multiprocessor including a plurality of processing systems is disclosed for discovering consumer purchasing tendencies. Each processing system of the multiprocessor identifies consumer transaction itemsets that are stored in a database that is distributed among the processing systems and which appear in the database a user-defined minimum number of times, referred to as minimum support. Then, the system discovers association rules in the itemsets by comparing the ratio of the number of times each of the large itemsets appears in the database to the number of times particular subsets of the itemset appear in the database. When the ratio exceeds a predetermined minimum confidence value, the system outputs an association rule which is representative of purchasing tendencies of consumers.
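
The support and confidence tests in this abstract can be worked through in a small single-process sketch. It counts itemsets by brute force rather than with the parallel, distributed scheme the patent claims. The function names and the `max_size` cap are assumptions, and the confidence test here uses "at least" where the abstract says "exceeds".

```python
from itertools import combinations


def support_counts(transactions, max_size):
    """Count how many transactions contain each itemset of up to max_size items."""
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                key = frozenset(itemset)
                counts[key] = counts.get(key, 0) + 1
    return counts


def association_rules(transactions, min_support, min_confidence, max_size=3):
    """Return (antecedent, consequent, confidence) for rules meeting both thresholds."""
    counts = support_counts(transactions, max_size)
    # "Large" itemsets appear in at least min_support transactions.
    large = {s: c for s, c in counts.items() if c >= min_support}
    rules = []
    for itemset, count in large.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                # Confidence = support(itemset) / support(antecedent subset).
                confidence = count / large[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules
```

For example, if butter appears in two baskets and both also contain bread, the rule butter → bread has confidence 2/2 = 1.0, while bread → butter has only 2/3 when bread appears in three baskets.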
