    • 1. Granted patent
    • Methods and apparatus for optimizing keyword data analysis
    • US08001166B2
    • 2011-08-16
    • US12057559
    • 2008-03-28
    • Hirobumi Toyoshima; Daisuke Takuma; Hiroki Oya
    • Hirobumi Toyoshima; Daisuke Takuma; Hiroki Oya
    • G06F17/30; G06F17/00; G06Q10/00
    • G06Q10/06
    • Techniques for analyzing keyword data for quality management purposes are provided. One or more keywords are selected. Each of the one or more keywords represents a category of quality management. A keyword time series is prepared for each of the one or more selected keywords. A set of fixed form time series is prepared for each of the one or more selected keywords. The set of fixed form time series comprises one or more fixed form time series representing statistical data related to the one or more selected keywords. One or more correction sets comprising one or more correction parameters are obtained. Each of the one or more correction parameters corresponds to one of the one or more fixed form time series within each set of fixed form time series. A set of corrected time series is generated for each of the one or more correction sets. The set of corrected time series comprises a combination of the keyword time series and the set of fixed form time series for each of the one or more selected keywords, the combination being in accordance with the one or more correction sets. A similarity score is calculated for each set of corrected time series. The set of corrected time series with the highest similarity score is selected. The selected set of corrected time series comprises optimized keyword data for quality management purposes.
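The selection loop described in the abstract above can be sketched roughly as follows. The abstract does not say how the series are combined or how similarity is scored, so two assumptions are made here purely for illustration: the combination is the keyword series minus a weighted sum of the fixed form series, and the similarity score is the mean pairwise Pearson correlation between the corrected series of the different keywords. All function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5


def corrected_series(keyword_ts, fixed_forms, params):
    """Assumed combination: subtract each fixed form series scaled by its parameter."""
    return [
        value - sum(p * f[t] for p, f in zip(params, fixed_forms))
        for t, value in enumerate(keyword_ts)
    ]


def similarity(series_set):
    """Assumed score: mean pairwise correlation across the keywords' corrected series."""
    pairs = list(combinations(series_set, 2))
    return mean(pearson(a, b) for a, b in pairs) if pairs else 0.0


def select_best(keyword_ts, fixed_forms, correction_sets):
    """Return (score, params, corrected) for the highest-scoring correction set."""
    best = None
    for params in correction_sets:
        corrected = [
            corrected_series(keyword_ts[k], fixed_forms[k], params)
            for k in keyword_ts
        ]
        score = similarity(corrected)
        if best is None or score > best[0]:
            best = (score, params, corrected)
    return best
```

Here `keyword_ts` maps each keyword to its time series and `fixed_forms` maps each keyword to its list of fixed form series; each correction set supplies one parameter per fixed form series.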
    • 2. Patent application
    • Methods and Apparatus for Optimizing Keyword Data Analysis
    • US20090248628A1
    • 2009-10-01
    • US12057559
    • 2008-03-28
    • Hirobumi Toyoshima; Daisuke Takuma; Hiroki Oya
    • Hirobumi Toyoshima; Daisuke Takuma; Hiroki Oya
    • G06F7/10
    • G06Q10/06
    • Techniques for analyzing keyword data for quality management purposes are provided. One or more keywords are selected. Each of the one or more keywords represents a category of quality management. A keyword time series is prepared for each of the one or more selected keywords. A set of fixed form time series is prepared for each of the one or more selected keywords. The set of fixed form time series comprises one or more fixed form time series representing statistical data related to the one or more selected keywords. One or more correction sets comprising one or more correction parameters are obtained. Each of the one or more correction parameters corresponds to one of the one or more fixed form time series within each set of fixed form time series. A set of corrected time series is generated for each of the one or more correction sets. The set of corrected time series comprises a combination of the keyword time series and the set of fixed form time series for each of the one or more selected keywords, the combination being in accordance with the one or more correction sets. A similarity score is calculated for each set of corrected time series. The set of corrected time series with the highest similarity score is selected. The selected set of corrected time series comprises optimized keyword data for quality management purposes.
    • 3. Granted patent
    • System, method and program for creating index for database
    • US08190613B2
    • 2012-05-29
    • US12132301
    • 2008-06-03
    • Daisuke Takuma; Issei Yoshida
    • Daisuke Takuma; Issei Yoshida
    • G06F17/00
    • G06F17/30616
    • A computer implemented method for creating indices for a database having a plurality of documents each being associated with one or more keywords. The method includes the steps of: dividing the database into a plurality of subsets; separating the keywords into a plurality of keyword groups based upon modulo G of the hash value of the keyword for each subset; reading each document of each subset to create a first sub-index and writing same to a storage device of the computer; reading the first sub-indices to merge the first sub-indices into a second sub-index for each keyword group to write same to the storage device; and reading the second sub-indices from the storage device to merge the second sub-indices into an index for the database and write same on the storage device. A program and a system for creating indices are also provided.
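The three-stage build in the abstract above (per-subset sub-indices grouped by hash modulo G, a per-group merge, then a final merge) might look like the following in-memory sketch. Several things are assumptions: the patent writes each sub-index to a storage device, whereas this version keeps everything in dictionaries; `zlib.crc32` stands in for the unspecified hash function; and subsets are formed by simple striding. Function names are hypothetical.

```python
import zlib
from collections import defaultdict


def group_of(keyword, G):
    """Keyword group = hash value modulo G (crc32 is an assumed, stable hash)."""
    return zlib.crc32(keyword.encode("utf-8")) % G


def build_first_sub_indices(subset, G):
    """One sub-index per keyword group for a single subset of documents."""
    groups = [defaultdict(list) for _ in range(G)]
    for doc_id, keywords in subset:
        for kw in set(keywords):
            groups[group_of(kw, G)][kw].append(doc_id)
    return groups


def merge(indices):
    """Merge keyword -> postings maps, keeping postings sorted."""
    merged = defaultdict(list)
    for index in indices:
        for kw, postings in index.items():
            merged[kw].extend(postings)
    return {kw: sorted(postings) for kw, postings in merged.items()}


def build_index(documents, num_subsets, G):
    """documents: list of (doc_id, keywords). Returns keyword -> sorted doc ids."""
    # Step 1: divide the database into disjoint subsets.
    subsets = [documents[i::num_subsets] for i in range(num_subsets)]
    # Step 2: first sub-indices, one per (subset, keyword group).
    first = [build_first_sub_indices(subset, G) for subset in subsets]
    # Step 3: second sub-indices, merging across subsets within each group.
    second = [merge(sub[g] for sub in first) for g in range(G)]
    # Step 4: merge the per-group indices into the index for the whole database.
    return merge(second)
```

Because every keyword falls into exactly one group, the per-group merges in step 3 touch disjoint keyword sets, which is presumably what lets each merge work on a bounded slice of the vocabulary.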
    • 4. Granted patent
    • System of effectively searching text for keyword, and method thereof
    • US07945552B2
    • 2011-05-17
    • US12055420
    • 2008-03-26
    • Daisuke Takuma; Issei Yoshida; Yuta Tsuboi
    • Daisuke Takuma; Issei Yoshida; Yuta Tsuboi
    • G06F7/00
    • G06F17/3061; Y10S707/99935
    • A system of the present invention stores: a first index which designates lists of keywords contained in texts from identifications of the respective texts; a second index which designates lists of texts containing keywords from identifications of the respective keywords; and the number of texts containing the respective keywords. Then, upon receiving an input of a text search condition, the system calculates an estimation of search time by the first index and an estimation of search time by the second index, and determines which one of the first and second indexes makes a search faster. Then, by using the index which has been determined to make the search faster, the system searches for keywords which appear in texts satisfying the text search condition with higher frequency.
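As a rough illustration of the index choice described in the abstract above, the sketch below estimates a cost for each index and uses the cheaper one to count keywords in the matching texts. The cost formulas are assumptions, not the patent's: the first (text to keywords) index is charged a hypothetical per-text `seek_cost` plus the average keyword-list length, and the second (keyword to texts) index is charged a full scan of all postings. Ranking is by raw count within the matching texts, which is also a simplification.

```python
from collections import Counter


def top_keywords(hit_ids, forward, inverted, doc_freq, k=3, seek_cost=4.0):
    """Return (index_used, top-k keywords) for the texts in hit_ids.

    forward:  text id -> list of keywords (the "first index")
    inverted: keyword -> list of text ids (the "second index")
    doc_freq: keyword -> number of texts containing it
    """
    hits = set(hit_ids)
    total_postings = sum(doc_freq.values())
    avg_keywords = total_postings / len(forward)

    # Assumed estimates: random access per matching text vs. one full scan.
    cost_forward = len(hits) * (seek_cost + avg_keywords)
    cost_inverted = float(total_postings)

    counts = Counter()
    if cost_forward <= cost_inverted:
        used = "forward"
        for text_id in hits:
            counts.update(forward[text_id])
    else:
        used = "inverted"
        for keyword, texts in inverted.items():
            n = len(hits.intersection(texts))
            if n:
                counts[keyword] = n
    return used, counts.most_common(k)
```

With these estimates, a narrow search condition favors the first index and a broad one favors the second; both paths produce the same counts.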
    • 5. Granted patent
    • Document data retrieval and reporting
    • US07571383B2
    • 2009-08-04
    • US11180328
    • 2005-07-13
    • Hiroshi Nomiyama; Daisuke Takuma
    • Hiroshi Nomiyama; Daisuke Takuma
    • G06F17/30; G06F7/00
    • G06F17/30634; Y10S707/99933; Y10S707/99934; Y10S707/99935
    • Enables retrieval of document data that appropriately reflects the content of a retrieval statement, and detection of problems in sequentially added document data. A retrieval system retrieves, from a plurality of document data, the document data whose content is specified by an input retrieval statement. The system includes: a document database storing the plurality of document data; a concept database storing a plurality of concepts in a hierarchical structure; a document data concept extraction section that extracts, from the keywords contained in each document data, the document concepts corresponding to that document data; a retrieval statement concept extraction section that extracts a retrieval statement concept from a keyword contained in the retrieval statement; a concept retrieving section that retrieves the document data for which the retrieval statement concept lies in a higher or lower layer than the document concept; and a retrieval result output section that outputs the retrieved document data as the document data containing the content specified by the retrieval statement.
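The concept-based matching described in the abstract above can be sketched as follows. The data layout is assumed for illustration: the concept hierarchy is a child-to-parent map (a tree without cycles), keywords map to concepts through a plain dictionary, and keyword extraction is whitespace tokenization. Treating identical concepts as a match is also an assumption, since the abstract mentions only higher and lower layers.

```python
def ancestors(concept, parent):
    """All concepts above `concept` in the hierarchy (assumes no cycles)."""
    chain = []
    while concept in parent:
        concept = parent[concept]
        chain.append(concept)
    return chain


def related(a, b, parent):
    """True if one concept equals, or lies above or below, the other."""
    return a == b or a in ancestors(b, parent) or b in ancestors(a, parent)


def extract_concepts(text, keyword_to_concept):
    """Map the keywords found in a text to their concepts."""
    return {
        keyword_to_concept[word]
        for word in text.lower().split()
        if word in keyword_to_concept
    }


def retrieve(query, documents, keyword_to_concept, parent):
    """Return ids of documents whose concepts are related to the query's concepts."""
    query_concepts = extract_concepts(query, keyword_to_concept)
    results = []
    for doc_id, text in documents.items():
        doc_concepts = extract_concepts(text, keyword_to_concept)
        if any(related(q, d, parent) for q in query_concepts for d in doc_concepts):
            results.append(doc_id)
    return results
```

For example, with "laptop" placed under "computer" in the hierarchy, a query whose keyword maps to "computer" would also return documents whose keywords map to "laptop", but not documents about a sibling branch such as "phone".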
    • 6. Patent application
    • SYSTEM OF EFFECTIVELY SEARCHING TEXT FOR KEYWORD, AND METHOD THEREOF
    • US20090030892A1
    • 2009-01-29
    • US12055420
    • 2008-03-26
    • Daisuke Takuma; Issei Yoshida; Yuta Tsuboi
    • Daisuke Takuma; Issei Yoshida; Yuta Tsuboi
    • G06F17/30
    • G06F17/3061; Y10S707/99935
    • A system of the present invention stores: a first index which designates lists of keywords contained in texts from identifications of the respective texts; a second index which designates lists of texts containing keywords from identifications of the respective keywords; and the number of texts containing the respective keywords. Then, upon receiving an input of a text search condition, the system calculates an estimation of search time by the first index and an estimation of search time by the second index, and determines which one of the first and second indexes makes a search faster. Then, by using the index which has been determined to make the search faster, the system searches for keywords which appear in texts satisfying the text search condition with higher frequency.
    • 7. Patent application
    • SYSTEM, METHOD AND PROGRAM FOR CREATING INDEX FOR DATABASE
    • US20080319987A1
    • 2008-12-25
    • US12132301
    • 2008-06-03
    • Daisuke Takuma; Issei Yoshida
    • Daisuke Takuma; Issei Yoshida
    • G06F17/30
    • G06F17/30616
    • An entire document set is decomposed into a union of disjoint subsets. Next, the set of keywords appearing in each of these subsets is categorized into groups on the basis of the remainder obtained by dividing the hash value of each keyword by a certain fixed integer value, and an index file is created for each group. Among the index files prepared in this manner for the respective subsets of the document set, those having the same group number are merged, creating one integrated index file per group number. There are, however, as many such index files as there are group numbers, and they do not yet form an index for the entire document set. These index files are therefore merged into one, creating an index file corresponding to the entire document set.
    • 8. Patent application
    • Word boundary probability estimating, probabilistic language model building, kana-kanji converting, and unknown word model building
    • US20080228463A1
    • 2008-09-18
    • US12126980
    • 2008-05-26
    • Shinsuke Mori; Daisuke Takuma
    • Shinsuke Mori; Daisuke Takuma
    • G06F17/28; G06F17/21
    • G06F17/2863; G06F17/2715
    • Calculates a word n-gram probability with high accuracy in a situation where a first corpus, which is a relatively small corpus containing manually segmented word information, and a second corpus, which is a relatively large corpus, are given as a training corpus, that is, storage containing vast quantities of sample sentences. Vocabulary including contextual information is expanded from words occurring in the first corpus of relatively small size to words occurring in the second corpus of relatively large size by using a word n-gram probability estimated from an unknown word model and the raw corpus. The first corpus (word-segmented) is used for calculating n-grams and the probability that the boundary between two adjacent characters is a word boundary (segmentation probability). The second corpus (word-unsegmented), in which probabilistic word boundaries are assigned based on information in the first corpus (word-segmented), is used for calculating word n-grams.
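To make the idea in the abstract above concrete, the sketch below estimates segmentation probabilities from a small segmented corpus and then uses them to compute expected word counts over an unsegmented sentence. It is a simplified illustration under stated assumptions: the boundary probability is conditioned on the adjacent character pair (the abstract does not say what it is conditioned on), unseen pairs fall back to a hypothetical `default`, and only unigram (n = 1) counts are computed. Function names are hypothetical.

```python
from collections import Counter, defaultdict


def boundary_probs(segmented_sentences):
    """Estimate P(word boundary) for each adjacent character pair.

    segmented_sentences: list of sentences, each a list of words.
    """
    boundary = Counter()
    total = Counter()
    for words in segmented_sentences:
        text = "".join(words)
        cuts = set()
        pos = 0
        for word in words[:-1]:
            pos += len(word)
            cuts.add(pos)
        for i in range(1, len(text)):
            pair = text[i - 1:i + 1]
            total[pair] += 1
            if i in cuts:
                boundary[pair] += 1
    return {pair: boundary[pair] / total[pair] for pair in total}


def expected_unigram_counts(raw_sentence, probs, default=0.5, max_len=4):
    """Expected count of every substring being a word under probabilistic boundaries."""
    n = len(raw_sentence)
    # p[i] = probability of a boundary just before character i;
    # the sentence start and end are certain boundaries.
    p = [1.0]
    p += [probs.get(raw_sentence[i - 1:i + 1], default) for i in range(1, n)]
    p += [1.0]

    counts = defaultdict(float)
    for i in range(n):
        no_inner_boundary = 1.0
        for j in range(i + 1, min(n, i + max_len) + 1):
            # Word raw_sentence[i:j]: boundary at i, none strictly inside, boundary at j.
            counts[raw_sentence[i:j]] += p[i] * no_inner_boundary * p[j]
            no_inner_boundary *= 1.0 - p[j]
    return dict(counts)
```

Summing these fractional counts over every sentence of the large unsegmented corpus gives word frequencies without committing to a single segmentation; extending the same product to sequences of adjacent words would give higher-order n-gram counts.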
    • 10. Granted patent
    • Word boundary probability estimating, probabilistic language model building, kana-kanji converting, and unknown word model building
    • US07917350B2
    • 2011-03-29
    • US12126980
    • 2008-05-26
    • Shinsuke Mori; Daisuke Takuma
    • Shinsuke Mori; Daisuke Takuma
    • G06F17/28; G06F17/20; G06F17/27; G06F17/21
    • G06F17/2863; G06F17/2715
    • Calculates a word n-gram probability with high accuracy in a situation where a first corpus, which is a relatively small corpus containing manually segmented word information, and a second corpus, which is a relatively large corpus, are given as a training corpus, that is, storage containing vast quantities of sample sentences. Vocabulary including contextual information is expanded from words occurring in the first corpus of relatively small size to words occurring in the second corpus of relatively large size by using a word n-gram probability estimated from an unknown word model and the raw corpus. The first corpus (word-segmented) is used for calculating n-grams and the probability that the boundary between two adjacent characters is a word boundary (segmentation probability). The second corpus (word-unsegmented), in which probabilistic word boundaries are assigned based on information in the first corpus (word-segmented), is used for calculating word n-grams.