    • 2. Granted invention patent
    • Full and semi-batch clustering
    • US08880525B2
    • 2014-11-04
    • US13437079
    • 2012-04-02
    • Matthias Galle; Jean-Michel Renders
    • Matthias Galle; Jean-Michel Renders
    • G06F17/30
    • G06F17/30; G06F17/30707
    • A method for clustering documents is provided. Each document is represented by a multidimensional data point. The data points are initially assigned to a respective cluster and serve as their initial representative points. Thereafter, in an iterative process, the data points are clustered among the clusters, by assigning the data points to the clusters based on a comparison measure of each data point with the cluster or its representative point, and a threshold of the comparison measure. Based on this clustering, a new representative point for each of the clusters can be computed. Optionally, overlapping clusters are merged. For the next iteration, the new representative points are used as the representative points. An assignment of the documents to the clusters is output, based on a clustering of the data points in the latest iteration. Multiple batches may be processed, retaining the initial clusters to which the original batch was assigned.
    • 4. Invention patent application
    • FULL AND SEMI-BATCH CLUSTERING
    • US20130262465A1
    • 2013-10-03
    • US13437079
    • 2012-04-02
    • Matthias Galle; Jean-Michel Renders
    • Matthias Galle; Jean-Michel Renders
    • G06F17/30
    • G06F17/30; G06F17/30707
    • A method for clustering documents is provided. Each document is represented by a multidimensional data point. The data points are initially assigned to a respective cluster and serve as their initial representative points. Thereafter, in an iterative process, the data points are clustered among the clusters, by assigning the data points to the clusters based on a comparison measure of each data point with the cluster or its representative point, and a threshold of the comparison measure. Based on this clustering, a new representative point for each of the clusters can be computed. Optionally, overlapping clusters are merged. For the next iteration, the new representative points are used as the representative points. An assignment of the documents to the clusters is output, based on a clustering of the data points in the latest iteration. Multiple batches may be processed, retaining the initial clusters to which the original batch was assigned.
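The iterative procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract does not name a comparison measure, a way to compute representative points, or a merge rule, so the sketch assumes cosine similarity, the mean of a cluster's members, and a simplified merge that only collapses clusters with identical membership. The function names (`cosine`, `mean`, `threshold_cluster`) and the default values are hypothetical, and the multi-batch case is not covered.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def threshold_cluster(points, threshold=0.9, max_iter=20):
    """Return clusters as sorted lists of point indices.

    Assumptions (not stated in the abstract): cosine similarity is the
    comparison measure, the mean is the representative point, and only
    clusters with identical membership are merged.
    """
    # Every data point starts as the representative of its own cluster.
    reps = [list(p) for p in points]
    clusters = []
    for _ in range(max_iter):
        # A point joins every cluster whose representative is similar enough,
        # so clusters may overlap. Collecting the memberships in a set
        # collapses clusters that ended up with exactly the same members.
        assigned = {
            frozenset(i for i, p in enumerate(points) if cosine(p, r) >= threshold)
            for r in reps
        }
        new_clusters = sorted(sorted(m) for m in assigned if m)
        if new_clusters == clusters:
            break  # memberships stopped changing
        clusters = new_clusters
        # The new representatives are used in the next iteration.
        reps = [mean([points[i] for i in m]) for m in clusters]
    return clusters
```

For example, `threshold_cluster([[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]], 0.9)` groups the first two and the last two points and returns `[[0, 1], [2, 3]]`. Because assignment is threshold-based, a point can belong to several clusters, or to none once the representatives move away from it.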