    • 1. Granted invention patent
    • Methods and systems for matching records and normalizing names
    • US08190538B2
    • 2012-05-29
    • US12363057
    • 2009-01-30
    • Ling Qin Zhang; Mark Wasson; Valentina Templar
    • Ling Qin Zhang; Mark Wasson; Valentina Templar
    • G06F15/18
    • G06F17/278; G06F17/30985
    • Methods and systems are provided for normalizing strings and for matching records. In one implementation, a string is tokenized into components. Sequences of tags are generated by assigning tags to the components. A sequence of states is determined based on the sequences of tags. A normalized string is generated by normalizing the sequence of the states. A key record including key fields is extracted from a first data source. A candidate record including candidate fields is extracted from a second data source. A numerical record including numerical fields is computed by comparing the key fields and the candidate fields using comparison functions. Matching functions determined by an additive logistic regression method are applied to the numerical fields. Whether the key record and the candidate record are a match is determined based on a sum of results of the matching functions.
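
The string-normalization half of the abstract (tokenize, tag, derive states, normalize) can be illustrated with a minimal Python sketch. It is not the patented method: the tag lexicons, the state names (`GIVEN`, `MIDDLE`, `FAMILY`, `SUFFIX`), and the rule-based state assignment are all assumptions made for illustration, and only a single tag sequence is produced where the abstract speaks of several.

```python
import re

# Assumed lexicons; the source does not list any tag vocabulary.
TITLES = {"dr", "mr", "mrs", "ms", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii"}
# Assumed canonical output order; titles are dropped from the result.
ORDER = ["GIVEN", "MIDDLE", "FAMILY", "SUFFIX"]


def tokenize(raw):
    """Split a raw name into components: alphabetic words and commas."""
    return re.findall(r"[A-Za-z]+|,", raw)


def tag(token):
    """Assign one surface tag to a component."""
    low = token.lower()
    if token == ",":
        return "COMMA"
    if low in TITLES:
        return "TITLE"
    if low in SUFFIXES:
        return "SUFFIX"
    return "INITIAL" if len(token) == 1 else "WORD"


def states_from_tags(tags):
    """Map a tag sequence to a state sequence using simple hand-written rules."""
    states = list(tags)
    name_idx = [i for i, t in enumerate(tags) if t in ("WORD", "INITIAL")]
    comma = tags.index("COMMA") if "COMMA" in tags else -1
    after = [i for i in name_idx if i > comma] if comma >= 0 else []
    if after:
        # "Family, Given Middle" layout: names before the comma are the family name.
        family, rest = [i for i in name_idx if i < comma], after
    else:
        # "Given Middle Family" layout: the last name token is the family name.
        family, rest = name_idx[-1:], name_idx[:-1]
    for i in family:
        states[i] = "FAMILY"
    for k, i in enumerate(rest):
        states[i] = "GIVEN" if k == 0 else "MIDDLE"
    return states


def normalize(raw):
    """Return the lowercased name with its parts in canonical order."""
    tokens = tokenize(raw)
    states = states_from_tags([tag(t) for t in tokens])
    parts = []
    for wanted in ORDER:
        parts += [tok.lower() for tok, st in zip(tokens, states) if st == wanted]
    return " ".join(parts)
```

Under these assumptions, `normalize("Wasson, Mark")` and `normalize("Dr. Mark Wasson")` both yield `mark wasson`, so differently formatted variants of one name reduce to the same string before records are compared.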
    • 2. Invention patent application
    • METHODS AND SYSTEMS FOR MATCHING RECORDS AND NORMALIZING NAMES
    • US20100198756A1
    • 2010-08-05
    • US12363057
    • 2009-01-30
    • Ling Qin Zhang; Mark Wasson; Valentina Templar
    • Ling Qin Zhang; Mark Wasson; Valentina Templar
    • G06F15/18; G06F17/30; G06N5/02
    • G06F17/278; G06F17/30985
    • Methods and systems are provided for normalizing strings and for matching records. In one implementation, a string is tokenized into components. Sequences of tags are generated by assigning tags to the components. A sequence of states is determined based on the sequences of tags. A normalized string is generated by normalizing the sequence of the states. A key record including key fields is extracted from a first data source. A candidate record including candidate fields is extracted from a second data source. A numerical record including numerical fields is computed by comparing the key fields and the candidate fields using comparison functions. Matching functions determined by an additive logistic regression method are applied to the numerical fields. Whether the key record and the candidate record are a match is determined based on a sum of results of the matching functions.
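
The record-matching half of the abstract (compare key and candidate fields, apply per-field matching functions, decide on the sum) might look roughly like the Python sketch below. Everything specific in it is an assumption: the field names, the comparison functions, and especially the threshold-style matching functions, which are hand-set stand-ins for functions that an additive logistic regression procedure would learn from labeled record pairs.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Comparison function: string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def exact(a, b):
    """Comparison function: 1.0 on equality, otherwise 0.0."""
    return 1.0 if a == b else 0.0


# Hypothetical fields and the comparison function used for each.
COMPARATORS = {"name": name_similarity, "city": exact, "zip": exact}


def numerical_record(key, candidate):
    """Compare key fields with candidate fields to obtain numerical fields."""
    return {f: cmp(key[f], candidate[f]) for f, cmp in COMPARATORS.items()}


def stump(threshold, above, below):
    """One-variable step function, a common building block of additive models."""
    return lambda x: above if x >= threshold else below


# Hand-set values for illustration only; a trained model would supply these.
MATCHING_FUNCTIONS = {
    "name": stump(0.8, 1.5, -2.0),
    "city": stump(0.5, 0.5, -0.5),
    "zip": stump(0.5, 1.0, -1.5),
}


def match_score(numeric):
    """Sum the results of the per-field matching functions."""
    return sum(MATCHING_FUNCTIONS[f](v) for f, v in numeric.items())


def is_match(key, candidate):
    """Declare a match when the summed score is positive (assumed threshold)."""
    return match_score(numerical_record(key, candidate)) > 0


# Made-up records for illustration.
key = {"name": "ann lee", "city": "Springfield", "zip": "00001"}
same_person = {"name": "ann lee", "city": "Shelbyville", "zip": "00001"}
other_person = {"name": "bartholomew king", "city": "Springfield", "zip": "00001"}
```

With these stand-in functions, `is_match(key, same_person)` is true even though the city differs, because the name and zip contributions outweigh the city penalty, while `is_match(key, other_person)` is false because the name mismatch dominates the sum.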