Item Type |
Article |
ID |
|
Full text |
AN00003152-00000047-0027.pdf
Type |
:application/pdf |
|
Size |
:778.3 KB
|
Last updated |
:Nov 13, 2008 |
|
|
Release Date |
|
Title |
Title |
大規模文献集合に対して階層的クラスタ分析法を適用するための単連結法アルゴリズム
|
Kana |
ダイキボ ブンケン シュウゴウ ニ タイシテ カイソウテキ クラスタ ブンセキホウ オ テキヨウ スル タメ ノ タンレンケツホウ アルゴリズム
|
Romanization |
Daikibo bunken shugo ni taishite kaisoteki kurasuta bunsekiho o tekiyo suru tame no tanrenketsuho arugorizumu
|
|
Other Title |
Title |
A single-link method algorithm for clustering large document collections
|
Kana |
|
Romanization |
|
|
Creator |
Name |
岸田, 和明
|
Kana |
キシダ, カズアキ
|
Romanization |
Kishida, Kazuaki
|
Affiliation |
駿河台大学文化情報学部
|
Affiliation (Translated) |
Surugadai University
|
Role |
|
Link |
|
|
Edition |
|
Place |
|
Publisher |
Name |
三田図書館・情報学会
|
Kana |
ミタ トショカン ジョウホウ ガッカイ
|
Romanization |
Mita toshokan joho gakkai
|
|
Date |
Issued (from:yyyy) |
2002
|
Issued (to:yyyy) |
|
Created (yyyy-mm-dd) |
|
Updated (yyyy-mm-dd) |
|
Captured (yyyy-mm-dd) |
|
|
Physical description |
|
Source Title |
Name |
Library and information science
|
Name (Translated) |
|
Volume |
|
Issue |
47
|
Year |
2002
|
Month |
|
Start page |
27
|
End page |
38
|
|
ISSN |
|
ISBN |
|
DOI |
|
URI |
|
JaLCDOI |
|
NII Article ID |
|
Ichushi ID |
|
Other ID |
|
Doctoral dissertation |
Dissertation Number |
|
Date of granted |
|
Degree name |
|
Degree grantor |
|
|
Abstract |
In the 1960s and 1970s, techniques for clustering a set of documents in order to improve the effectiveness or efficiency of information retrieval systems were widely explored. Similar attempts have recently been made by many researchers to allow the visualisation of search results, to provide browsing-based search modes, or to enhance performance in searching very large collections. The purpose of this paper is to develop an algorithm for hierarchical clustering that can work for very large document collections. The algorithm is based on a combination of two ideas proposed by other researchers to save time and space in the process of hierarchical clustering: (1) the use of an inverted file for reducing the number of document pairs for which a similarity degree is calculated, and (2) a procedure for constructing a dendrogram based on the single-link method from similarity data recorded on disk rather than in main memory. In this paper, the algorithm is experimentally applied to a document set consisting of about 10,000 bibliographic records, and the processing time is analyzed empirically. In addition, the effects of removing words that appear frequently in documents are examined. As a result, we find that removing such words enables us to greatly reduce the processing time without significant change in the resulting set of clusters. Finally, an empirical comparison between the single-link method and the single-pass algorithm (leader-follower algorithm) is attempted.
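
The two ideas combined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the binary cosine similarity, and the toy documents are all assumptions made for illustration, and the similarity list is kept in memory here, whereas the abstract describes reading it from disk. An inverted file limits similarity calculation to document pairs that share at least one term, and the single-link merges are then found by scanning the pairs in descending order of similarity.

```python
from collections import defaultdict
from itertools import combinations
import math


def candidate_pairs(docs):
    """Use an inverted file so that only documents sharing a term are paired."""
    inverted = defaultdict(set)  # term -> ids of documents containing it
    for doc_id, terms in enumerate(docs):
        for term in terms:
            inverted[term].add(doc_id)
    pairs = set()
    for postings in inverted.values():
        pairs.update(combinations(sorted(postings), 2))
    return pairs


def cosine(a, b):
    """Cosine similarity of two documents treated as binary term sets (assumed measure)."""
    a, b = set(a), set(b)
    return len(a & b) / math.sqrt(len(a) * len(b))


def single_link(docs):
    """Return single-link merges as (similarity, doc_i, doc_j), most similar first."""
    sims = sorted(
        ((cosine(docs[i], docs[j]), i, j) for i, j in candidate_pairs(docs)),
        reverse=True,
    )
    parent = list(range(len(docs)))  # union-find forest over document ids

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    merges = []
    for sim, i, j in sims:
        ri, rj = find(i), find(j)
        if ri != rj:  # the pair links two different clusters: one dendrogram merge
            parent[rj] = ri
            merges.append((sim, i, j))
    return merges


# Hypothetical toy collection: each document is a list of index terms.
docs = [
    ["cluster", "document"],
    ["cluster", "retrieval"],
    ["disk", "memory"],
    ["disk", "memory", "file"],
]
for sim, i, j in single_link(docs):
    print(f"merge documents {i} and {j} at similarity {sim:.3f}")
```

Of the six possible pairs among the four toy documents, only the two pairs that share a term are ever scored. Documents with no term in common with any other document are left as singleton clusters in this sketch.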
|
|
Table of contents |
|
Keyword |
|
NDC |
|
Note |
|
Language |
|
Type of resource |
|
Genre |
|
Text version |
|
Related DOI |
|
Access conditions |
|
Last modified date |
|
Creation date |
|
Registered by |
|
History |
Nov 12, 2008 | | Changed free keywords, full text |
|
|
Index |
|
Related to |
|