An Attempt to Classify Books into NDC Categories

Ishida, Emi

Item details

Item type

Article

ID

AN00003152-00000039-0031 　

Full text

AN00003152-00000039-0031.pdf

Type	application/pdf
Size	1.7 MB
Last updated	Apr 20, 2007
Downloads	3231

Title

Title	図書をＮＤＣカテゴリに分類する試み
Kana	トショオＮＤＣカテゴリニブンルイスルココロミ
Romaji	Tosho o NDC kategori ni bunrui suru kokoromi

Alternative title

Name	An experiment of automatic classification of books using Nippon Decimal Classification

Author

Name	石田, 栄美
Kana	イシダ, エミ
Romaji	Ishida, Emi
Affiliation	慶應義塾大学大学院文学研究科図書館・情報学専攻
Affiliation (translated)	Graduate School of Library and Information Science, Keio University

Publisher

Name	三田図書館・情報学会
Kana	ミタトショカンジョウホウガッカイ
Romaji	Mita toshokan joho gakkai

Date

Year of publication (from: yyyy)	1998

Parent title

Name	Library and information science
Issue	39
Year	1998
Start page	31
End page	45

ISSN

03734447 　

Abstract

In information retrieval, texts are usually retrieved by matching them with queries. In this study, an approach is suggested in which texts are automatically classified into categories and retrieved by matching them with queries classified in the same way. For efficient information retrieval using automatic classification, the methods for extracting words from texts and the matching methods are essential. Several methods for extracting words from Japanese texts have been suggested in natural language processing. However, it is difficult to extract significant words from Japanese texts because Japanese is written without blank spaces separating words. As for matching methods, many weighting methods have been suggested, as well as vector space models and probabilistic models.

This article reports the results of an experiment in classifying Japanese texts into Nippon Decimal Classification (NDC) categories based on the title information in Japanese MARC records. In this experiment, three extraction methods (juman, MHSA, and n-gram) are tested on a set of 1,000 books. Four weighting methods, including relative term frequency between categories, tf·idf, and tf(max)·idf, are tested. The results indicate that the extraction method using juman performed best and that the best weighting method was the relative term frequency between categories, which selected the correct classification category (upper three digits of NDC) for about 55.9% of the 1,000 books.
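The n-gram extraction and tf·idf weighting named in the abstract can be illustrated with a minimal Python sketch. It is not the article's implementation. The function names, the choice of character bigrams, the textbook tf·idf formula (term count times log of the inverse category frequency, with each category treated as one document), the summed-weight scoring rule, and the toy titles are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return overlapping character n-grams of text, ignoring whitespace."""
    chars = "".join(text.split())
    return [chars[i:i + n] for i in range(len(chars) - n + 1)]


def train(labelled_titles, n=2):
    """Build tf-idf term weights per category (assumed textbook formula)."""
    # Term frequency: n-gram counts pooled over all titles in a category.
    tf = defaultdict(Counter)
    for title, category in labelled_titles:
        tf[category].update(char_ngrams(title, n))
    # Document frequency: number of categories in which a term occurs.
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())
    num_categories = len(tf)
    return {
        category: {
            term: count * math.log(num_categories / df[term])
            for term, count in counts.items()
        }
        for category, counts in tf.items()
    }


def classify(title, weights, n=2):
    """Return the category with the largest summed weight over the title's n-grams."""
    terms = char_ngrams(title, n)
    scores = {
        category: sum(term_weights.get(term, 0.0) for term in terms)
        for category, term_weights in weights.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data: (title, three-digit NDC class). Not from the article.
TRAINING = [
    ("図書館情報学入門", "010"),
    ("図書館の歴史", "010"),
    ("日本の歴史", "210"),
    ("日本史概説", "210"),
    ("数学入門", "410"),
    ("線形代数学", "410"),
]

weights = train(TRAINING)
print(classify("図書館学概論", weights))  # prints 010
```

Character n-grams sidestep the word-segmentation problem described in the abstract, because they need no dictionary or morphological analyzer. A word-based variant would replace `char_ngrams` with the output of a segmenter.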

Keywords

NDC

Language

Japanese

Resource type

text

Genre

Journal Article

Author version flag

publisher