    • 2. Invention grant
    • Method, apparatus and programmed medium for clustering databases with categorical attributes
    • US6049797A
    • 2000-04-11
    • US55940
    • 1998-04-07
    • Sudipto Guha; Rajeev Rastogi; Kyuseok Shim
    • Sudipto Guha; Rajeev Rastogi; Kyuseok Shim
    • G06F17/30; G06K9/62
    • G06F17/30598; G06K9/6218; Y10S707/99932; Y10S707/99933; Y10S707/99935; Y10S707/99936; Y10S707/99942; Y10S707/99945
    • The present invention relates to a computer method, apparatus and programmed medium for clustering databases containing data with categorical attributes. The present invention assigns a pair of points to be neighbors if their similarity exceeds a certain threshold. The similarity value for pairs of points can be based on non-metric information. The present invention determines the total number of links between each cluster and every other cluster based upon the neighbors of the clusters. A goodness measure between each pair of clusters is then calculated from the total number of links between the two clusters and the total number of points within each of them. The present invention merges the two clusters with the best goodness measure. Thus, clustering is performed accurately and efficiently by merging data based on the number of links between the data to be clustered.
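The steps in the abstract (threshold-based neighbors, link counts, a goodness measure, greedy merging) can be illustrated with a minimal Python sketch. Several pieces are assumptions, not taken from the abstract: the function names `jaccard` and `link_cluster` are hypothetical, Jaccard similarity is just one possible similarity, a link is counted as a neighbor shared by two points, and the normalization inside `goodness` (including the exponent derived from `theta`) is only one plausible way to combine link totals with cluster sizes. The sketch also assumes `0 <= theta < 1` and favors readability over efficiency.

```python
from itertools import combinations


def jaccard(a, b):
    """Similarity of two sets of categorical values (an assumed choice)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_cluster(points, theta, k):
    """Greedy link-based clustering; returns clusters as lists of point indices."""
    n = len(points)
    # Two points are neighbors when their similarity exceeds the threshold.
    neighbors = [
        {j for j in range(n) if j != i and jaccard(points[i], points[j]) > theta}
        for i in range(n)
    ]
    # Assumption: link[i][j] counts the neighbors that points i and j share.
    link = [[len(neighbors[i] & neighbors[j]) for j in range(n)] for i in range(n)]

    # Assumed normalization exponent; always > 1 when 0 <= theta < 1.
    exponent = 1 + 2 * (1 - theta) / (1 + theta)

    def goodness(ci, cj):
        # Total links between the two clusters, scaled by their sizes.
        cross = sum(link[p][q] for p in ci for q in cj)
        ni, nj = len(ci), len(cj)
        return cross / ((ni + nj) ** exponent - ni ** exponent - nj ** exponent)

    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        # Pick the pair of clusters with the best goodness measure.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda ab: goodness(clusters[ab[0]], clusters[ab[1]]),
        )
        if goodness(clusters[a], clusters[b]) == 0:
            break  # no remaining pair of clusters shares any links
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters
```

As a usage example, calling `link_cluster` with `theta=0.4` and `k=2` on six records such as `{"a","b","c"}`, `{"a","b","d"}`, `{"a","c","d"}`, `{"x","y","z"}`, `{"x","y","w"}`, `{"x","z","w"}` groups the first three records together and the last three together, because records only accumulate links with others that share neighbors.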