Keio University Academic Information Repository (KOARA): KeiO Associated Repository of Academic resources


Detail

Item Type Article
ID
AN00003152-00000039-0031  
Full text
AN00003152-00000039-0031.pdf
Type :application/pdf
Size :1.7 MB
Last updated :Apr 20, 2007
Total downloads since Apr 20, 2007 : 2784
 
Release Date
 
Title
Title 図書をNDCカテゴリに分類する試み  
Kana トショ オ NDC カテゴリ ニ ブンルイ スル ココロミ  
Romanization Tosho o NDC kategori ni bunrui suru kokoromi  
Other Title
Title An experiment of automatic classification of books using Nippon Decimal Classification  
Kana  
Romanization  
Creator
Name 石田, 栄美  
Kana イシダ, エミ  
Romanization Ishida, Emi  
Affiliation 慶應義塾大学大学院文学研究科図書館・情報学専攻  
Affiliation (Translated) Graduate School of Library and Information Science, Keio University  
Role  
Link  
Edition
 
Place
 
Publisher
Name 三田図書館・情報学会  
Kana ミタ トショカン ジョウホウ ガッカイ  
Romanization Mita toshokan joho gakkai  
Date
Issued (from:yyyy) 1998  
Issued (to:yyyy)  
Created (yyyy-mm-dd)  
Updated (yyyy-mm-dd)  
Captured (yyyy-mm-dd)  
Physical description
 
Source Title
Name Library and information science  
Name (Translated)  
Volume  
Issue 39  
Year 1998  
Month  
Start page 31  
End page 45  
ISSN
03734447  
ISBN
 
DOI
URI
JaLCDOI
NII Article ID
 
Ichushi ID
 
Other ID
 
Doctoral dissertation
Dissertation Number  
Date of granted  
Degree name  
Degree grantor  
Abstract
In information retrieval, texts are usually retrieved by matching them with queries. In this study, an approach is suggested in which texts are automatically classified into categories and retrieved by matching them with queries classified in the same way. For efficient information retrieval using automatic classification, the methods for extracting words from texts and the matching methods are essential. Several methods for extracting words from Japanese texts have been suggested in natural language processing. However, it is difficult to extract significant words from Japanese texts because they are written without blank spaces separating words. As for matching methods, many weighting methods have been suggested, as well as vector space models and probabilistic models.
This article reports the results of an experiment in classifying Japanese texts into Nippon Decimal Classification (NDC) categories based on the title information in Japanese MARC records. In this experiment, three extraction methods (juman, MHSA, and n-gram) are tested on a set of 1,000 books. Four weighting methods, including relative term frequency between categories, tf・idf, and tf (max)・idf, are tested. The results indicate that extraction with juman performed best and that the best weighting method was the relative term frequency between categories, which selected the correct classification category (the upper three digits of NDC) for about 55.9% of the 1,000 books.
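As a rough illustration of the kind of pipeline the abstract describes, the sketch below extracts character n-grams from titles (which needs no word segmentation) and scores categories with a tf・idf weight. It is not the article's implementation: the abstract does not give the weighting formulas, so a common textbook tf・idf variant is assumed, with each category treated as one document. The function names (`char_ngrams`, `train`, `classify`) and the four toy titles with their category labels are hypothetical and exist only for this example.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return overlapping character n-grams, ignoring whitespace."""
    text = "".join(text.split())
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def train(labeled_titles, n=2):
    """Build tf-idf weights per category (assumed variant, not the article's).

    Each category is treated as a single document made of all its titles.
    """
    tf = defaultdict(Counter)
    for title, category in labeled_titles:
        tf[category].update(char_ngrams(title, n))

    # Document frequency: in how many categories each n-gram appears.
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())

    num_categories = len(tf)
    return {
        category: {
            term: freq * math.log(num_categories / df[term])
            for term, freq in counts.items()
        }
        for category, counts in tf.items()
    }


def classify(title, weights, n=2):
    """Return the category whose summed n-gram weights are highest."""
    terms = char_ngrams(title, n)
    scores = {
        category: sum(term_weights.get(t, 0.0) for t in terms)
        for category, term_weights in weights.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy training data: (title, three-digit category label).
TOY_TITLES = [
    ("日本の歴史", "210"),
    ("日本史入門", "210"),
    ("化学の基礎", "430"),
    ("有機化学入門", "430"),
]

weights = train(TOY_TITLES)
print(classify("有機化学", weights))
```

N-grams shared by every category (here 入門) receive a weight of zero under this variant, so only category-specific n-grams contribute to the score.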
 
Table of contents

 
Keyword
 
NDC
 
Note

 
Language
Japanese
Type of resource
text  
Genre
Journal Article  
Text version
publisher  
Related DOI
Access conditions

 
Last modified date
May 05, 2024 16:35:01  
Creation date
Apr 20, 2007 10:27:03  
Registered by
mediacenter
 
History
 
Index
/ Public / Faculty of Letters / Library and information science / 39 (1998)
 
Related to