Keio University Academic Information Repository (KOARA): KeiO Associated Repository of Academic resources


Detail

Item Type Article
ID
AN00003152-00000047-0027  
Full text
AN00003152-00000047-0027.pdf
Type: application/pdf
Size: 778.3 KB
Last updated: Nov 13, 2008
Total downloads since Nov 13, 2008: 2476
 
Release Date
 
Title
Title 大規模文献集合に対して階層的クラスタ分析法を適用するための単連結法アルゴリズム  
Kana ダイキボ ブンケン シュウゴウ ニ タイシテ カイソウテキ クラスタ ブンセキホウ オ テキヨウ スル タメ ノ タンレンケツホウ アルゴリズム  
Romanization Daikibo bunken shugo ni taishite kaisoteki kurasuta bunsekiho o tekiyo suru tame no tanrenketsuho arugorizumu  
Other Title
Title A single-link method algorithm for clustering large document collections  
Kana  
Romanization  
Creator
Name 岸田, 和明  
Kana キシダ, カズアキ  
Romanization Kishida, Kazuaki  
Affiliation 駿河台大学文化情報学部  
Affiliation (Translated) Surugadai University  
Role  
Link  
Edition
 
Place
 
Publisher
Name 三田図書館・情報学会  
Kana ミタ トショカン ジョウホウ ガッカイ  
Romanization Mita toshokan joho gakkai  
Date
Issued (from:yyyy) 2002  
Issued (to:yyyy)  
Created (yyyy-mm-dd)  
Updated (yyyy-mm-dd)  
Captured (yyyy-mm-dd)  
Physical description
 
Source Title
Name Library and information science  
Name (Translated)  
Volume  
Issue 47  
Year 2002  
Month  
Start page 27  
End page 38  
ISSN
03734447  
ISBN
 
DOI
URI
JaLCDOI
NII Article ID
 
Ichushi ID
 
Other ID
 
Doctoral dissertation
Dissertation Number  
Date of granted  
Degree name  
Degree grantor  
Abstract
In the 1960s and 1970s, techniques for clustering a set of documents in order to improve the effectiveness or efficiency of information retrieval systems were widely explored. Similar attempts have recently been made by many researchers to allow the visualisation of search results, to provide browsing-based search modes, or to enhance performance in searching very large collections. The purpose of this paper is to develop an algorithm for hierarchical clustering that can work for very large document collections. The algorithm is based on a combination of two ideas proposed by other researchers to save time and space in the process of hierarchical clustering: (1) the use of an inverted file for reducing the number of document pairs for which a similarity degree is calculated, and (2) a procedure for constructing a dendrogram based on the single-link method from similarity data recorded on disk rather than in main memory. In this paper, the algorithm is experimentally applied to a document set consisting of about 10,000 bibliographic records, and the processing time is analyzed empirically. In addition, the effects of removing words that appear frequently in documents are examined. As a result, we find that removing such words enables us to greatly reduce the processing time without significant change in the resulting set of clusters. Finally, an empirical comparison between the single-link method and the single-pass algorithm (leader-follower algorithm) is attempted.
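
The two ideas combined in the abstract can be illustrated with a small Python sketch. It is not the paper's implementation: the function names, the Dice coefficient as the similarity measure, the `max_df` cutoff for frequent words, and the in-memory sort (the paper keeps similarity data on disk) are all assumptions made for illustration. Single-link merging is done here by scanning candidate pairs in decreasing order of similarity with a union-find structure, one common way to obtain the single-link merge order.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(docs, max_df=None):
    """Use an inverted file to count shared terms for document pairs.

    Only pairs that share at least one term are ever generated.
    max_df (hypothetical parameter): skip terms occurring in more than
    max_df documents, mimicking the removal of frequent words.
    """
    inverted = defaultdict(set)  # term -> ids of documents containing it
    for doc_id, terms in enumerate(docs):
        for term in set(terms):
            inverted[term].add(doc_id)

    shared = defaultdict(int)  # (i, j) with i < j -> number of shared terms
    for posting in inverted.values():
        if max_df is not None and len(posting) > max_df:
            continue
        for i, j in combinations(sorted(posting), 2):
            shared[(i, j)] += 1
    return shared


def dice(docs, shared):
    """Dice coefficient on term sets (an assumed similarity measure)."""
    sizes = [len(set(d)) for d in docs]
    return {(i, j): 2 * c / (sizes[i] + sizes[j]) for (i, j), c in shared.items()}


def single_link(n_docs, sims):
    """Return single-link merges as (doc_i, doc_j, similarity) tuples.

    Pairs are visited from most to least similar; a pair causes a merge
    only if its documents are still in different clusters.
    """
    parent = list(range(n_docs))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    merges = []
    for (i, j), s in sorted(sims.items(), key=lambda kv: (-kv[1], kv[0])):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            merges.append((i, j, s))
    return merges
```

Calling `candidate_pairs` on a list of tokenized records, passing the result to `dice`, and then to `single_link` yields the merge sequence from which a dendrogram could be drawn. Documents that share no term with any other document never appear in a pair and remain singletons.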
 
Table of contents

 
Keyword
 
NDC
 
Note
Short report
 
Language
Japanese
Type of resource
text  
Genre
Journal Article  
Text version
publisher  
Related DOI
Access conditions

 
Last modified date
Nov 12, 2008 17:13:46  
Creation date
Apr 20, 2007 10:20:35  
Registered by
mediacenter
 
History
Nov 12, 2008    Free keywords and full text changed
 
Index
/ Public / Faculty of Letters / Library and information science / 47 (2002)
 
Related to