慶應義塾大学学術情報リポジトリ (KOARA): KeiO Associated Repository of Academic resources

Item Type Article
ID
AN00003152-00000049-0033  
Full text
AN00003152-00000049-0033.pdf
Type: application/pdf
Size: 2.9 MB
Last updated: Nov 13, 2008
Downloads: 3703

 
Title
Title 文書クラスタリングの技法 : 文献レビュー  
Kana ブンショ クラスタリング ノ ギホウ : ブンケン レビュー  
Romanization Bunsho kurasutaringu no giho : bunken rebyu  
Other Title
Title Techniques of document clustering : a review  
Creator
Name 岸田, 和明  
Kana キシダ, カズアキ  
Romanization Kishida, Kazuaki  
Affiliation 駿河台大学文化情報学部  
Affiliation (Translated) Surugadai University  
Publisher
Name 三田図書館・情報学会  
Kana ミタ トショカン ジョウホウ ガッカイ  
Romanization Mita toshokan joho gakkai  
Date
Issued (from:yyyy) 2003  
Source Title
Name Library and information science  
Issue 49  
Year 2003  
Start page 33  
End page 75  
ISSN
03734447  
Abstract
The document clustering technique is widely recognized as a useful tool for information retrieval, organizing web documents, text mining and so on. The purpose of this paper is to review various document clustering techniques, and to discuss research issues for enhancing effectiveness or efficiency of the clustering methods. We explore extensive literature on non-hierarchical methods (single-pass methods), hierarchical methods (single-link, complete-link, etc.), dimensional reduction methods (LSI, principal component analysis, etc.), probabilistic methods, data mining techniques, and so on. In particular, this paper focuses on typical techniques, such as the k-means algorithm, the leader-follower algorithm, self-organizing map (SOM), single- or complete-link methods, bisecting k-means methods, latent semantic indexing (LSI), Gaussian mixture model and so on. After reviewing the techniques and algorithms, we discuss research issues on document clustering: computational complexity, feature extraction (selection of words), methods for defining term weights and similarity, and evaluation of results.
 
Note
Review article
 
Language
Japanese
Type of resource
text  
Genre
Journal Article  
Text version
publisher  
Last modified date
Nov 12, 2008 17:02:27  
Creation date
Apr 20, 2007 10:19:05  
Registered by
mediacenter
 
History
Nov 12, 2008    Changed free keywords and full text
 
Index
/ Public / Faculty of Letters / Library and information science / 49 (2003)
 