Item Type |
Article |
ID |
|
Full text |
AN00003152-00000049-0033.pdf
Type |
:application/pdf |
|
Size |
:2.9 MB
|
Last updated |
:Nov 13, 2008 |
Downloads |
: 3703 |
|
|
Release Date |
|
Title |
Title |
文書クラスタリングの技法 : 文献レビュー
|
Kana |
ブンショ クラスタリング ノ ギホウ : ブンケン レビュー
|
Romanization |
Bunsho kurasutaringu no giho : bunken rebyu
|
|
Other Title |
Title |
Techniques of document clustering : a review
|
Kana |
|
Romanization |
|
|
Creator |
Name |
岸田, 和明
|
Kana |
キシダ, カズアキ
|
Romanization |
Kishida, Kazuaki
|
Affiliation |
駿河台大学文化情報学部
|
Affiliation (Translated) |
Surugadai University
|
Role |
|
Link |
|
|
Edition |
|
Place |
|
Publisher |
Name |
三田図書館・情報学会
|
Kana |
ミタ トショカン ジョウホウ ガッカイ
|
Romanization |
Mita toshokan joho gakkai
|
|
Date |
Issued (from:yyyy) |
2003
|
Issued (to:yyyy) |
|
Created (yyyy-mm-dd) |
|
Updated (yyyy-mm-dd) |
|
Captured (yyyy-mm-dd) |
|
|
Physical description |
|
Source Title |
Name |
Library and information science
|
Name (Translated) |
|
Volume |
|
Issue |
49
|
Year |
2003
|
Month |
|
Start page |
33
|
End page |
75
|
|
ISSN |
|
ISBN |
|
DOI |
|
URI |
|
JaLCDOI |
|
NII Article ID |
|
Ichushi ID |
|
Other ID |
|
Doctoral dissertation |
Dissertation Number |
|
Date of granted |
|
Degree name |
|
Degree grantor |
|
|
Abstract |
The document clustering technique is widely recognized as a useful tool for information retrieval, organizing web documents, text mining and so on. The purpose of this paper is to review various document clustering techniques, and to discuss research issues for enhancing effectiveness or efficiency of the clustering methods. We explore extensive literature on non-hierarchical methods (single-pass methods), hierarchical methods (single-link, complete-link, etc.), dimensional reduction methods (LSI, principal component analysis, etc.), probabilistic methods, data mining techniques, and so on. In particular, this paper focuses on typical techniques, such as the k-means algorithm, the leader-follower algorithm, self-organizing map (SOM), single- or complete-link methods, bisecting k-means methods, latent semantic indexing (LSI), Gaussian mixture model and so on. After reviewing the techniques and algorithms, we discuss research issues on document clustering: computational complexity, feature extraction (selection of words), methods for defining term weights and similarity, and evaluation of results.
|
|
Table of contents |
|
Keyword |
|
NDC |
|
Note |
|
Language |
|
Type of resource |
|
Genre |
|
Text version |
|
Related DOI |
|
Access conditions |
|
Last modified date |
|
Creation date |
|
Registered by |
|
History |
Nov 12, 2008 | | Changed free keywords, full text |
|
|
Index |
|
Related to |
|