Eprint ID
19639
FullText URL
Author
Yeh Chi-Yuan
Ouyang Jeng
Lee Shie-Jue
Abstract
Finding an efficient data reduction method for large-scale problems is an important task. In this paper, we propose a similarity-based self-constructing fuzzy clustering algorithm that samples instances for the classification task. Instances that are similar to each other are grouped into the same cluster. Once all the instances have been fed in, a number of clusters have been formed automatically. The statistical mean of each cluster is then taken to represent all the instances covered by that cluster. This approach has two advantages. First, it can be faster and use less storage. Second, the number of representative instances need not be specified in advance by the user. Experiments on real-world datasets show that our method can run faster and obtain a better reduction rate than other methods.
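The procedure outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the similarity function, the threshold, or the update rule, so everything specific here is an assumption made for illustration: a Gaussian similarity to each cluster mean, a fixed width `sigma`, a similarity threshold `rho`, and the hypothetical function name `reduce_instances`. It is not the authors' implementation.

```python
import math


def reduce_instances(instances, rho=0.5, sigma=1.0):
    """One-pass clustering sketch; returns one mean vector per cluster.

    rho (similarity threshold) and sigma (Gaussian width) are assumed
    parameters, not values taken from the paper.
    """
    means = []   # running mean of each cluster
    counts = []  # number of instances absorbed by each cluster
    for x in instances:
        # Assumed Gaussian similarity between x and every cluster mean.
        sims = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, m)) / sigma ** 2)
            for m in means
        ]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is None or sims[best] < rho:
            # No cluster is similar enough: start a new one at x.
            means.append(list(x))
            counts.append(1)
        else:
            # Absorb x into the most similar cluster; update its mean.
            counts[best] += 1
            n = counts[best]
            means[best] = [m + (a - m) / n for m, a in zip(means[best], x)]
    return means


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]
    print(reduce_instances(data))
```

The number of clusters is not passed in; it follows from the data and the threshold, which mirrors the second advantage stated in the abstract. For a classification task, one plausible usage (again an assumption) is to run the routine separately on the instances of each class and label each resulting mean with that class.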
Keywords
Large-scale dataset
fuzzy similarity
data reduction
prototype reduction
instance-filtering
instance-abstraction
Published Date
2009-11-12
Publication Title
Proceedings : Fifth International Workshop on Computational Intelligence & Applications
Volume
2009
Issue
1
Publisher
IEEE SMC Hiroshima Chapter
Start Page
65
End Page
70
ISSN
1883-3977
NCID
BB00577064
Content Type
Conference Paper
Language
English
Copyright Holders
IEEE SMC Hiroshima Chapter
Event Title
5th International Workshop on Computational Intelligence & Applications IEEE SMC Hiroshima Chapter : IWCIA 2009
Event Location
東広島市
Event Location Alternative
Higashi-Hiroshima City
File Version
publisher
Refereed
True
Eprints Journal Name
IWCIA