    • 1. Granted invention patent
    • Method and apparatus for enhanced automatic determination of text line dependent parameters
    • US5513304A
    • 1996-04-30
    • US191895
    • 1994-02-04
    • A. Lawrence Spitz; Antonio P. Dias
    • G06K9/20; G06K9/34
    • G06K9/348; G06K2209/01
    • An automatic character cell determining apparatus automatically determines the character cells within the text image of a document. A connected component generator means generates connected components from the pixels comprising the text image. An aligning device aligns skewed and warped lines to the proper image axes. A bounding box generator generates a bounding box surrounding each connected component. A character cell determining device, which locates character cells that include one or more connected components, has a vertical splaying device and a horizontal splaying device for ensuring white spaces between lines and between connected components, a vertical profile device for determining the vertical positions of a line, a splitting device for splitting ligatures of two or more connected components, and a character cell generator for generating character cells that group together one or more connected components.
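The pipeline in the abstract above (connected components, then bounding boxes, then character cells) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the 8-connectivity choice, and the rule of merging boxes whose column ranges overlap are assumptions made for illustration, and the deskewing, splaying, and ligature-splitting steps are omitted. The image is assumed to be a single text line given as a list of rows of 0/1 pixels.

```python
from collections import deque


def bounding_boxes(image):
    """Return one (top, left, bottom, right) box per 8-connected component of 1-pixels."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited black pixel.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def character_cells(boxes):
    """Merge boxes whose column ranges overlap into one cell (hypothetical grouping rule)."""
    cells = []
    for top, left, bottom, right in sorted(boxes, key=lambda b: b[1]):
        if cells and left <= cells[-1][3]:
            t, l, b, r = cells[-1]
            cells[-1] = (min(t, top), l, max(b, bottom), max(r, right))
        else:
            cells.append((top, left, bottom, right))
    return cells


# A tiny "i" (dot plus stem) in column 0 and an "l" in column 3.
sample = [
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]
print(character_cells(bounding_boxes(sample)))
```

In this sample the dot and the stem of the "i" are separate connected components, and the grouping step places them in the same character cell.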
    • 2. Granted invention patent
    • Method and apparatus for automatic language determination of Asian language documents
    • US5425110A
    • 1995-06-13
    • US47673
    • 1993-04-19
    • A. Lawrence Spitz
    • G06K9/20; G06F17/27; G06K9/62; G06K9/68; G06K9/46
    • G06F17/275; G06K9/6807; G06K2209/011
    • An automatic language determining apparatus automatically determines the particular Asian language of the text image of a document when the gross script-type is known to be, or is determined to be, an Asian script-type. A connected component generating means generates connected components from the pixels comprising the text image. A character cell generating means generates a character cell surrounding at least one connected component. An optical density determining means determines the optical density, in absolute numbers or percentage of pixels, of the pixels within each character cell. A script feature determining means first generates a histogram, then converts the histogram, by linear discriminant analysis, to a point in a new coordinate space. A language determining means compares the determined point of the text portion in the new coordinate space to predetermined regimes in the new coordinate space corresponding to at least one Asian language to determine the particular Asian language of the text image.
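The optical-density and histogram steps in the abstract above can be sketched as follows. This is a hedged illustration, not the patented method: the function names, the bin count, and the projection weights are hypothetical, and in practice the discriminant axes would come from a linear discriminant analysis fitted on labelled training documents. Each character cell is assumed to be a small list of rows of 0/1 pixels.

```python
def optical_density(cell):
    """Fraction of pixels in a character cell that are black (1)."""
    total = sum(len(row) for row in cell)
    black = sum(sum(row) for row in cell)
    return black / total


def density_histogram(cells, bins=4):
    """Normalised histogram of per-cell optical densities over [0, 1]."""
    counts = [0] * bins
    for cell in cells:
        # A density of exactly 1.0 falls into the last bin.
        index = min(int(optical_density(cell) * bins), bins - 1)
        counts[index] += 1
    return [count / len(cells) for count in counts]


def project(histogram, axes):
    """Project the histogram onto discriminant axes (one weight vector per axis)."""
    return [sum(w * h for w, h in zip(axis, histogram)) for axis in axes]


cells = [
    [[1, 1], [1, 1]],  # density 1.0
    [[1, 0], [0, 0]],  # density 0.25
    [[1, 1], [0, 0]],  # density 0.5
]
histogram = density_histogram(cells)
# Hypothetical weights standing in for axes learned by discriminant analysis.
point = project(histogram, axes=[[0.0, 0.0, 0.0, 3.0], [1.0, 1.0, 0.0, 0.0]])
print(histogram, point)
```

The resulting point would then be compared against per-language regions in the projected space, which this sketch leaves out.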
    • 3. Granted invention patent
    • Method and apparatus for automatic language determination of European script documents
    • US5377280A
    • 1994-12-27
    • US47539
    • 1993-04-19
    • Takehiro Nakayama
    • G06K9/62; G06F17/27; G06F17/28; G06K9/46
    • G06F17/275
    • An automatic language-determining apparatus automatically determines the particular European language of the text image of a document when the gross script-type is known to be, or is determined to be, a European script-type. A word token generating means generates word tokens from the text image. A feature determining means determines the frequency of appearance of word tokens of the text portion which correspond to predetermined word tokens. A language determining means converts the determined frequencies of appearance to a point in a new coordinate space, then determines which predetermined region of the new coordinate space the point is closest to, to determine the language of the text portion.
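The frequency-and-nearest-region idea in the abstract above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented method: the token strings, the reference token list, and the per-language centroids are invented for the example, the frequency vector is used directly as the point (the conversion to a new coordinate space is skipped), and each region is approximated by a single centroid with Euclidean distance.

```python
import math
from collections import Counter


def token_frequencies(tokens, reference_tokens):
    """Relative frequency of each predetermined token within the text portion."""
    counts = Counter(tokens)
    return [counts[token] / len(tokens) for token in reference_tokens]


def closest_language(point, centroids):
    """Name of the language whose (hypothetical) region centroid is nearest to the point."""
    return min(centroids, key=lambda name: math.dist(point, centroids[name]))


# Hypothetical word tokens extracted from a text image.
tokens = ["AAx", "xx", "AAx", "Axx"]
reference_tokens = ["AAx", "xx", "Ax"]

# Hypothetical centroids; real ones would be estimated from training documents.
centroids = {
    "English": [0.5, 0.2, 0.0],
    "French": [0.1, 0.1, 0.4],
}

point = token_frequencies(tokens, reference_tokens)
print(point, closest_language(point, centroids))
```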
    • 4. Granted invention patent
    • Method and apparatus for automatic character script determination
    • US5444797A
    • 1995-08-22
    • US47515
    • 1993-04-19
    • A. Lawrence Spitz; David A. Hull
    • G06K9/20; G06K9/62; G06K9/68; G06K9/46; G06K9/34
    • G06K9/6807
    • An automatic script determining apparatus automatically determines the gross script-type of the text image of a document. A connected component generating means generates connected components from the pixels comprising the text image. A bounding box generating means generates a bounding box surrounding each connected component. A centroid determining means determines a centroid for each bounding box. A script feature determining means determines the locations, relative to the centroid, of one or more predetermined types of features, for each bounding box. A script determining means determines a spatial distribution of the located script features for the entire text image, and compares the determined spatial distribution to a predetermined distribution for at least one script-type to determine the script-type of the text image.
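The centroid-relative feature distribution in the abstract above might be sketched as follows. Everything here is an assumption made for illustration: the centroid is taken as the geometric centre of the bounding box, only the vertical offset of each feature is histogrammed, the feature detector itself is not shown, and the reference distributions and the L1 comparison are hypothetical stand-ins for whatever the patent actually specifies.

```python
def box_centroid(box):
    """Geometric centre (y, x) of a (top, left, bottom, right) bounding box."""
    top, left, bottom, right = box
    return ((top + bottom) / 2, (left + right) / 2)


def vertical_feature_distribution(components, bins=4):
    """Normalised histogram of feature heights relative to each box centroid.

    `components` is a list of (box, features) pairs, where `features` holds the
    (y, x) locations of already-detected features inside that box.
    """
    counts = [0] * bins
    for box, features in components:
        top, _, bottom, _ = box
        height = bottom - top + 1
        centre_y, _ = box_centroid(box)
        for y, _x in features:
            offset = (y - centre_y) / height  # lies strictly between -0.5 and 0.5
            index = min(int((offset + 0.5) * bins), bins - 1)
            counts[index] += 1
    total = sum(counts)
    return [count / total for count in counts] if total else counts


def closest_script(distribution, references):
    """Name of the reference distribution with the smallest L1 distance."""
    def l1(name):
        return sum(abs(a - b) for a, b in zip(distribution, references[name]))
    return min(references, key=l1)


components = [((0, 0, 9, 4), [(0, 2), (9, 1), (5, 3)])]
references = {  # hypothetical per-script reference distributions
    "script_a": [0.5, 0.0, 0.0, 0.5],
    "script_b": [0.0, 0.5, 0.5, 0.0],
}
distribution = vertical_feature_distribution(components)
print(distribution, closest_script(distribution, references))
```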
    • 5. Granted invention patent
    • Method for matching text images and documents using character shape codes
    • US5438628A
    • 1995-08-01
    • US220926
    • 1994-03-31
    • A. Lawrence Spitz; Antonio P. Dias
    • G06K9/62; G06K9/68; G06K9/00
    • G06K9/6807
    • A first method for exact and inexact matching of documents stored in a document database includes the step of converting the documents in the database to a compacted tokenized form. A search string or search document is then converted to the compact tokenized form and compared to determine if the test string occurs in the documents of the database or whether the documents in the database correspond to the test document. A second method for inexact matching of a test document to the documents in the database includes generating sets of one or more floating point values for each document in the database and for the test document. The sets of floating point numbers for the database are then compared to the set for the test document to determine a degree of matching. A threshold value is established, and each document in the database which generates a matching value closer to the test document than the threshold is considered to be an inexact match of the test document.
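Both methods in the abstract above can be illustrated with a short Python sketch. It is not the patented encoding: the three-class shape-code mapping (ascender, descender, x-height), the function names, the use of Euclidean distance, and the sample vectors are all assumptions chosen to keep the example small.

```python
import math

ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def shape_code(text):
    """Map each character to a coarse shape class (hypothetical three-class scheme)."""
    codes = []
    for ch in text:
        if ch.isupper() or ch.isdigit() or ch in ASCENDERS:
            codes.append("A")  # tall characters
        elif ch in DESCENDERS:
            codes.append("g")  # characters that drop below the baseline
        elif ch.isalpha():
            codes.append("x")  # x-height characters
        else:
            codes.append(ch)   # spaces and punctuation are kept as-is
    return "".join(codes)


def occurs_in(search_string, document):
    """First method: compare in the tokenized form rather than on raw characters."""
    return shape_code(search_string) in shape_code(document)


def inexact_matches(test_vector, database, threshold):
    """Second method: documents whose value set lies within `threshold` of the test set."""
    return sorted(
        name for name, vector in database.items()
        if math.dist(test_vector, vector) < threshold
    )


print(shape_code("dog"), occurs_in("dog", "the dog barks"))
print(inexact_matches([0.0, 0.0], {"a": [0.1, 0.0], "b": [1.0, 1.0]}, threshold=0.5))
```

Because many words share a shape code under this mapping ("dog" and "bog" both encode as "Axg"), a match in the tokenized form is coarser than a character-level match.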