    • 1. Granted invention patent
    • Method for automatically extracting by-line information
    • US07464078B2
    • 2008-12-09
    • US11259608
    • 2005-10-25
    • Stephen Dill; Madhukar R. Korupolu; Andrew S. Tomkins
    • G06F17/30
    • G06F17/30719; Y10S707/99932; Y10S707/99933
    • A by-line extraction method detects a set of potential headlines from a title meta-tag of a crawled document, selects a candidate headline from the set of potential headlines, and extracts the by-line information from the document using the location of the selected candidate headline. The method constructs the set of potential headlines based on the title meta-tag. The method selects a candidate headline by evaluating the set of potential headlines in order of the lengths of the potential headlines. The method extracts the by-line information from the document by using the location of the selected candidate headline to extract a string representing a date, a name, or a source located within a minimum distance from the location of the potential headline.
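
The flow described in the abstract above (build candidate headlines from the title, try them in length order, then look for by-line strings near the matched headline) can be illustrated with a minimal Python sketch. This is not the patented implementation. The separator regex, the two by-line patterns, the longest-first ordering, the 200-character window, and all function names are assumptions made for illustration, and extraction of a "source" string is left out.

```python
import re

# Assumed separators often seen in page titles, e.g. "Headline - Site Name".
TITLE_SEPARATORS = re.compile(r"\s+[-|:]\s+")

# Deliberately simple by-line patterns (assumptions; real by-lines vary widely).
BYLINE_PATTERNS = {
    "name": re.compile(r"\bBy[ \t]+([A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+)+)"),
    "date": re.compile(
        r"\b((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?"
        r"\s+\d{1,2},\s+\d{4})"
    ),
}


def potential_headlines(title):
    """Return the full title plus its separator-delimited parts, longest first."""
    parts = [p.strip() for p in TITLE_SEPARATORS.split(title) if p.strip()]
    candidates = {title.strip(), *parts}
    # Longest-first is an assumed reading of "in order of the lengths".
    return sorted(candidates, key=lambda s: (-len(s), s))


def select_headline(title, body):
    """Pick the first candidate that literally occurs in the document body."""
    for candidate in potential_headlines(title):
        pos = body.find(candidate)
        if pos != -1:
            return candidate, pos
    return None


def extract_byline(title, body, window=200):
    """Extract the name and date strings closest to the selected headline."""
    found = select_headline(title, body)
    if found is None:
        return {}
    headline, pos = found
    end = pos + len(headline)
    region_start = max(0, pos - window)
    region = body[region_start:end + window]

    result = {}
    for field, pattern in BYLINE_PATTERNS.items():
        best = None
        for match in pattern.finditer(region):
            start = region_start + match.start(1)
            # Distance to the nearer edge of the headline.
            distance = min(abs(start - pos), abs(start - end))
            if best is None or distance < best[0]:
                best = (distance, match.group(1))
        if best is not None:
            result[field] = best[1]
    return result
```

For example, with the title "Storm hits coast - Example News" and a body that contains the line "Storm hits coast" followed by "By Jane Doe" and "Oct 25, 2005", the full title is not found in the body, so the sketch falls back to the next-longest candidate and reads the name and date from the lines next to it.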
    • 2. Granted invention patent
    • Automatically extracting by-line information
    • US08321396B2
    • 2012-11-27
    • US12192917
    • 2008-08-15
    • Stephen Dill; Madhukar R. Korupolu; Andrew S. Tomkins
    • G06F7/00; G06F17/30
    • G06F17/30719; Y10S707/99932; Y10S707/99933
    • A by-line extraction system detects a set of potential headlines from a title meta-tag of a crawled document, selects a candidate headline from the set of potential headlines, and extracts the by-line information from the document using the location of the selected candidate headline. The system constructs the set of potential headlines based on the title meta-tag. The system selects a candidate headline by evaluating the set of potential headlines in order of the lengths of the potential headlines. The system extracts the by-line information from the document by using the location of the selected candidate headline to extract a string representing a date, a name, or a source located within a minimum distance from the location of the potential headline.
    • 3. Granted invention patent
    • System and method for searching dates efficiently in a collection of web documents
    • US07730013B2
    • 2010-06-01
    • US11259664
    • 2005-10-25
    • Stephen Dill; Madhukar R. Korupolu
    • G06F17/30; G06F17/00
    • G06F17/30616
    • A date querying system processes free-form text in documents to identify and locate some or all of the dates in the documents using extended regular expression matching to capture various date formats. The system packages a canonicalized format of each identified date to support various types of queries such as, for example, specific date querying, hierarchical date querying, range date querying, proximity queries comprising a date and any keywords, and any combination of types of queries. The system scans a document to identify the various format dates occurring in the document, disambiguates the resulting occurrences of dates, and canonicalizes the dates according to one or more predetermined formats.
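
The scan, disambiguate, and canonicalize steps in the abstract above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the system the patent claims. The three date patterns, the month-first reading of numeric dates (a stand-in for real disambiguation), the ISO 8601 canonical form, and the names `find_dates` and `in_range` are all hypothetical.

```python
import re
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["jan", "feb", "mar", "apr", "may", "jun",
         "jul", "aug", "sep", "oct", "nov", "dec"],
        start=1,
    )
}

# Each entry pairs a pattern with how its month group should be read.
DATE_PATTERNS = [
    # ISO style: 2005-10-25
    (re.compile(r"\b(?P<y>\d{4})-(?P<m>\d{1,2})-(?P<d>\d{1,2})\b"), "numeric"),
    # Slash style: 10/25/2005 (assumed month-first; truly ambiguous in general)
    (re.compile(r"\b(?P<m>\d{1,2})/(?P<d>\d{1,2})/(?P<y>\d{4})\b"), "numeric"),
    # Month name: October 25, 2005 or Oct. 25, 2005
    (re.compile(r"\b(?P<m>[A-Za-z]{3,9})\.?\s+(?P<d>\d{1,2}),\s+(?P<y>\d{4})\b"), "named"),
]


def find_dates(text):
    """Return (offset, canonical ISO date) pairs for every valid date found."""
    hits = []
    for pattern, kind in DATE_PATTERNS:
        for match in pattern.finditer(text):
            if kind == "named":
                month = MONTHS.get(match["m"][:3].lower())
            else:
                month = int(match["m"])
            if month is None:
                continue  # the word before the day was not a month name
            try:
                value = date(int(match["y"]), month, int(match["d"]))
            except ValueError:
                continue  # impossible date such as month 13
            hits.append((match.start(), value.isoformat()))
    return sorted(hits)


def in_range(hits, start, end):
    """Range query: ISO strings compare correctly as plain text."""
    return [(pos, d) for pos, d in hits if start <= d <= end]
```

Keeping the character offset with each canonical date is what would allow a proximity query against nearby keywords, and a hierarchical query (all dates in a given month or year) reduces to a prefix test on the ISO string, for example `d.startswith("2005-10")`.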