Quick Patent Search: fast search of global patents, free commercial patent database - IPRDB

1. Granted patent

US08880525B2 Full and semi-batch clustering (in force)
Publication No.: US08880525B2
Publication date: 2014-11-04
Application No.: US13437079
Filing date: 2012-04-02
Applicant: Matthias Galle, Jean-Michel Renders
Inventor: Matthias Galle, Jean-Michel Renders
IPC classification: G06F17/30
CPC classification: G06F17/30, G06F17/30707
Abstract: A method for clustering documents is provided. Each document is represented by a multidimensional data point. The data points are initially assigned to a respective cluster and serve as their initial representative points. Thereafter, in an iterative process, the data points are clustered among the clusters, by assigning the data points to the clusters based on a comparison measure of each data point with the cluster or its representative point, and a threshold of the comparison measure. Based on this clustering, a new representative point for each of the clusters can be computed. Optionally, overlapping clusters are merged. For the next iteration, the new representative points are used as the representative points. An assignment of the documents to the clusters is output, based on a clustering of the data points in the latest iteration. Multiple batches may be processed, retaining the initial clusters to which the original batch was assigned.
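
The iterative scheme in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the cosine comparison measure, the mean as representative point, the simplified merge step (only clusters with identical membership are merged), and all function names are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity, used here as the (assumed) comparison measure."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean(points):
    """Component-wise mean, used here as the (assumed) representative point."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def threshold_cluster(points, threshold=0.9, max_iter=20):
    # Every data point starts as the representative of its own cluster.
    reps = [list(p) for p in points]
    clusters = []
    for _ in range(max_iter):
        # A point joins every cluster whose representative it resembles
        # at or above the threshold, so clusters may overlap.
        members = [
            frozenset(i for i, p in enumerate(points) if cosine(p, r) >= threshold)
            for r in reps
        ]
        # Simplified merge: identical clusters collapse, empty ones are dropped.
        merged = sorted({m for m in members if m}, key=lambda m: sorted(m))
        if merged == clusters:
            break  # assignments are stable
        clusters = merged
        # New representative points for the next iteration.
        reps = [mean([points[i] for i in m]) for m in clusters]
    return [sorted(m) for m in clusters]
```

Calling `threshold_cluster([[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]])` groups the first two and the last two points. The semi-batch variant mentioned in the abstract is not covered by this sketch.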

2. Granted patent

US09483463B2 Method and system for motif extraction in electronic documents (in force)
Publication No.: US09483463B2
Publication date: 2016-11-01
Application No.: US13608312
Filing date: 2012-09-10
Applicant: Matthias Galle, Jean-Michel Renders
Inventor: Matthias Galle, Jean-Michel Renders
IPC classification: G10L15/22, G10L15/187, G06Q30/04, G06Q30/02, G06Q40/00, G06F17/27, G06F17/24, G06F19/18, G06F19/22, G06F7/24
CPC classification: G06F17/2775, G06F17/248
Abstract: A method, system, and computer program product for extracting text motifs from electronic documents are disclosed. A user provides a largest-maximal repeat or a super-maximal repeat as a first text block. The occurrences of the first text block are detected to identify second text blocks in the vicinity of those occurrences on the basis of pre-defined parameters. The text motifs are determined by combining the first text block and the second text block. Finally, the text motifs are extracted from the electronic documents.

3. Patent application

US20140074455A1 METHOD AND SYSTEM FOR MOTIF EXTRACTION IN ELECTRONIC DOCUMENTS (in force)
Publication No.: US20140074455A1
Publication date: 2014-03-13
Application No.: US13608312
Filing date: 2012-09-10
Applicant: Matthias Galle, Jean-Michel Renders
Inventor: Matthias Galle, Jean-Michel Renders
IPC classification: G06F17/27
CPC classification: G06F17/2775, G06F17/248
Abstract: A method, system, and computer program product for extracting text motifs from electronic documents are disclosed. A user provides a largest-maximal repeat or a super-maximal repeat as a first text block. The occurrences of the first text block are detected to identify second text blocks in the vicinity of those occurrences on the basis of pre-defined parameters. The text motifs are determined by combining the first text block and the second text block. Finally, the text motifs are extracted from the electronic documents.

4. Patent application

US20130262465A1 FULL AND SEMI-BATCH CLUSTERING (in force)
Publication No.: US20130262465A1
Publication date: 2013-10-03
Application No.: US13437079
Filing date: 2012-04-02
Applicant: Matthias Galle, Jean-Michel Renders
Inventor: Matthias Galle, Jean-Michel Renders
IPC classification: G06F17/30
CPC classification: G06F17/30, G06F17/30707
Abstract: A method for clustering documents is provided. Each document is represented by a multidimensional data point. The data points are initially assigned to a respective cluster and serve as their initial representative points. Thereafter, in an iterative process, the data points are clustered among the clusters, by assigning the data points to the clusters based on a comparison measure of each data point with the cluster or its representative point, and a threshold of the comparison measure. Based on this clustering, a new representative point for each of the clusters can be computed. Optionally, overlapping clusters are merged. For the next iteration, the new representative points are used as the representative points. An assignment of the documents to the clusters is output, based on a clustering of the data points in the latest iteration. Multiple batches may be processed, retaining the initial clusters to which the original batch was assigned.

5. Granted patent

US09189473B2 System and method for resolving entity coreference (in force)
Publication No.: US09189473B2
Publication date: 2015-11-17
Application No.: US13475250
Filing date: 2012-05-18
Applicant: Matthias Gallé, Jean-Michel Renders, Guillaume Jacquet
Inventor: Matthias Gallé, Jean-Michel Renders, Guillaume Jacquet
IPC classification: G06F17/30, G06F17/27
CPC classification: G06F17/278, G06F17/2795
Abstract: A method and a system for coreference resolution are provided. The method includes receiving a set of document clusters, each cluster in the set of document clusters including a set of text documents. Instances of each of a set of candidate named entities are identified in the document clusters. For pairs of the candidate named entities, at least one socio-temporal feature is computed that is based on the similarity of the distributions of identified instances of the respective candidate named entities among the document clusters. A decision on merging the candidate named entities into a common real named entity is based on the socio-temporal features.
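
The distribution-based feature in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: documents are modelled as lists of already-identified mention strings, cosine similarity stands in for the unspecified similarity of distributions, and the merge decision is reduced to a single hypothetical threshold.

```python
import math


def cluster_distribution(entity, clusters):
    """Share of an entity's identified instances falling in each document cluster."""
    counts = [sum(doc.count(entity) for doc in cluster) for cluster in clusters]
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(clusters)


def distribution_similarity(p, q):
    """Cosine similarity between two distributions (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def should_merge(entity_a, entity_b, clusters, threshold=0.8):
    """Merge two candidate named entities if they occur in similar clusters."""
    sim = distribution_similarity(
        cluster_distribution(entity_a, clusters),
        cluster_distribution(entity_b, clusters),
    )
    return sim >= threshold


# Three document clusters; each document is a list of mention strings.
clusters = [
    [["B. Obama", "Barack Obama"], ["Barack Obama"]],
    [["Angela Merkel"]],
    [["B. Obama"], ["Angela Merkel", "Barack Obama"]],
]
print(should_merge("B. Obama", "Barack Obama", clusters))
print(should_merge("B. Obama", "Angela Merkel", clusters))
```

In practice such a feature would be one input among several to the merge decision; the single-threshold rule here only shows how cluster distributions can separate coreferent names from unrelated ones.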

6. Granted patent

US08892562B2 Categorization of multi-page documents by anisotropic diffusion (in force)
Publication No.: US08892562B2
Publication date: 2014-11-18
Application No.: US13558814
Filing date: 2012-07-26
Applicant: Jean-Michel Renders, François Ragnet, Damien Cramet
Inventor: Jean-Michel Renders, François Ragnet, Damien Cramet
IPC classification: G06F17/30
CPC classification: G06F17/30265, G06F17/30256, G06K9/00483
Abstract: A computer-implemented system and method are provided for refining category scores for pages of a sequence of document pages that potentially includes document boundaries. The method uses initial category scores provided by a categorizer that considers one page at a time or concatenated pairs of pages (called bipages). The category scores represent the probability that a page belongs to a particular category. The method uses anisotropic diffusion to refine the initial page category scores using the scores of neighboring pages as a function of the probability that there is a boundary between the pages. The method may be performed iteratively.
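
The refinement described in this abstract can be illustrated with a small sketch. It is one possible reading, not the patented update rule: the linear conductance `1 - boundary_prob`, the `rate` parameter, and the function name are assumptions.

```python
def diffuse(scores, boundary_prob, rate=0.25, iterations=10):
    """Smooth per-page category scores along the page sequence.

    scores[i][c]     : initial probability that page i belongs to category c
    boundary_prob[k] : probability of a document boundary between pages k and k+1
    """
    n = len(scores)
    current = [list(s) for s in scores]
    for _ in range(iterations):
        updated = []
        for i, s in enumerate(current):
            new = list(s)
            for j in (i - 1, i + 1):
                if 0 <= j < n:
                    # Little diffusion across a likely boundary, full diffusion
                    # between pages that probably belong to the same document.
                    conductance = 1.0 - boundary_prob[min(i, j)]
                    for c in range(len(s)):
                        new[c] += rate * conductance * (current[j][c] - s[c])
            updated.append(new)
        current = updated
    return current


pages = [[0.9, 0.1], [0.5, 0.5], [0.9, 0.1], [0.1, 0.9], [0.2, 0.8]]
boundaries = [0.0, 0.0, 1.0, 0.0]  # a certain boundary between pages 2 and 3
refined = diffuse(pages, boundaries)
```

With `rate` at most 0.5 each update is a convex combination of a page's score and its neighbours' scores, so every row still sums to 1. In the example the ambiguous second page is pulled toward the first category, while the certain boundary keeps the last two pages from influencing the first three.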

7. Granted patent

US08165974B2 System and method for assisted document review (in force)
Publication No.: US08165974B2
Publication date: 2012-04-24
Application No.: US12479972
Filing date: 2009-06-08
Applicant: Caroline Privault, Jacki O'Neill, Jean-Michel Renders, Victor Ciriza, Yves Hoppenot
Inventor: Caroline Privault, Jacki O'Neill, Jean-Michel Renders, Victor Ciriza, Yves Hoppenot
IPC classification: G06F15/18, G06F5/00, G06F17/00
CPC classification: G06N5/043, G06Q10/10, G06Q50/18
Abstract: A system and method for reviewing documents are provided. A collection of documents is partitioned into sets of documents for review by a plurality of reviewers. For each set, documents in the set are displayed on a display device for review by a reviewer and temporarily organized through grouping and sorting. The reviewer's labels for the displayed documents are received. Based on the reviewer's labels, a class from a plurality of classes is assigned to each of the reviewed documents. A classifier model stored in computer memory is progressively trained, based on features extracted from the reviewed documents in the set and their assigned classes. Prior to review of all documents in the set, a calculated subset of documents for which the classifier model assigns a class different from the one assigned based on the reviewer's label is returned for a second review by a reviewer. Models generated from one or more other document sets can be used to assess the review of a first of the sets.

8. Patent application

US20140032558A1 CATEGORIZATION OF MULTI-PAGE DOCUMENTS BY ANISOTROPIC DIFFUSION (in force)
Publication No.: US20140032558A1
Publication date: 2014-01-30
Application No.: US13558814
Filing date: 2012-07-26
Applicant: Jean-Michel Renders, François Ragnet, Damien Cramet
Inventor: Jean-Michel Renders, François Ragnet, Damien Cramet
IPC classification: G06F17/30
CPC classification: G06F17/30265, G06F17/30256, G06K9/00483
Abstract: A computer-implemented system and method are provided for refining category scores for pages of a sequence of document pages that potentially includes document boundaries. The method uses initial category scores provided by a categorizer that considers one page at a time or concatenated pairs of pages (called bipages). The category scores represent the probability that a page belongs to a particular category. The method uses anisotropic diffusion to refine the initial page category scores using the scores of neighboring pages as a function of the probability that there is a boundary between the pages. The method may be performed iteratively.

9. Patent application

US20100313124A1 MANIPULATION OF DISPLAYED OBJECTS BY VIRTUAL MAGNETISM (in force)
Publication No.: US20100313124A1
Publication date: 2010-12-09
Application No.: US12480002
Filing date: 2009-06-08
Applicant: Caroline Privault, Jacki O'Neill, Jean-Michel Renders, Victor Ciriza, Yves Hoppenot, Gregory Bauduin, Ana Fucs, Ye Deng, Gregoire Gerard, Mathieu Knibiehly
Inventor: Caroline Privault, Jacki O'Neill, Jean-Michel Renders, Victor Ciriza, Yves Hoppenot, Gregory Bauduin, Ana Fucs, Ye Deng, Gregoire Gerard, Mathieu Knibiehly
IPC classification: G06F3/01, G06F3/041
CPC classification: G06F3/0488, G06F3/04812
Abstract: A computer-implemented tactile user interface (TUI) and a method of manipulating objects with a virtual magnet are provided. The TUI includes a display comprising a touch-screen. The display is configured for displaying a set of graphic objects, each graphic object representing a respective one of a set of items, such as documents, e.g., text documents or images. A virtual magnet is caused to move on the display, in response to touching on the touch-screen, e.g., by dragging a finger or other implement across. The magnet is associated with a particular function command such that a subset of the graphic objects exhibits a response to the virtual magnet (e.g., is caused to move, relative to the virtual magnet or exhibits another visible response), each graphic object in the subset moving or otherwise responding as a function of an attribute of the underlying item represented by the graphic object.

10. Patent application

US20100014762A1 CATEGORIZER WITH USER-CONTROLLABLE CALIBRATION (in force)
Publication No.: US20100014762A1
Publication date: 2010-01-21
Application No.: US12174721
Filing date: 2008-07-17
Applicant: Jean-Michel Renders, Caroline Privault, Eric H. Cheminot
Inventor: Jean-Michel Renders, Caroline Privault, Eric H. Cheminot
IPC classification: G06K9/62
CPC classification: G06K9/6277
Abstract: A calibrated categorizer comprises: a multi-class categorizer configured to output class probabilities for an input object corresponding to a set of classes; a class probabilities rescaler configured to rescale class probabilities to generate rescaled class probabilities; and a rescaling model learner configured to learn calibration parameters for the class probabilities rescaler based on (i) class probabilities output by the multi-class categorizer for a calibration set of class-labeled objects, (ii) confidence measures output by the multi-class categorizer for the calibration set of class-labeled objects, and (iii) class labels of the calibration set of class-labeled objects, the class probabilities rescaler calibrated by the learned calibration parameters defining a calibrated class probabilities rescaler. In a method embodiment, class probabilities are generated for an input object corresponding to a set of classes using a classifier trained on a first set of objects, and are rescaled to form rescaled class probabilities using a rescaling algorithm calibrated using a second set of objects different from the first set of objects. The method may further entail thresholding the rescaled class probabilities using thresholds calibrated using the second set of objects or a third set of objects.
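
The idea of learning a rescaler on a separate calibration set can be sketched as below. The abstract does not specify the rescaling model, so this example swaps in plain temperature-style power rescaling fitted by grid search; it also ignores the confidence measures the abstract mentions. The function names, the grid, and the toy data are all assumptions.

```python
import math


def rescale(probs, temperature):
    """Rescale class probabilities: temperature > 1 flattens, < 1 sharpens."""
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


def neg_log_likelihood(prob_rows, labels, temperature):
    """Calibration loss of the rescaled probabilities on class-labeled objects."""
    return -sum(
        math.log(max(rescale(row, temperature)[label], 1e-12))
        for row, label in zip(prob_rows, labels)
    )


def learn_temperature(prob_rows, labels, grid=None):
    """Pick the calibration parameter that best fits the calibration set."""
    grid = grid or [0.25 * k for k in range(1, 41)]  # 0.25 .. 10.0
    return min(grid, key=lambda t: neg_log_likelihood(prob_rows, labels, t))


# Calibration set: an overconfident categorizer that is right 3 times out of 4.
calibration_probs = [[0.99, 0.01], [0.99, 0.01], [0.99, 0.01], [0.01, 0.99]]
calibration_labels = [0, 0, 1, 1]
temperature = learn_temperature(calibration_probs, calibration_labels)
calibrated = rescale([0.99, 0.01], temperature)
```

Here the learned temperature comes out above 1, so the rescaled top probability drops below the raw 0.99, moving it closer to the categorizer's observed accuracy on the calibration set. Thresholds on the rescaled probabilities could then be chosen on the same or a further held-out set, as the abstract suggests.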
