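While working on my graduation project I did some basic research on clustering and classification problems. Below is a summary of the datasets I used, for anyone who might find it useful.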

Clustering datasets

| No. | Dataset | Records | Features | Classes | Class distribution | Overlap | Source |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | iris | 150 | 4 | 3 | 50/50/50 | No | UCI <http://archive.ics.uci.edu/ml/datasets/Iris> |
| 2 | wine | 178 | 13 | 3 | 59/71/48 | No | UCI <http://archive.ics.uci.edu/ml/datasets/Wine> |
| 3 | emotions (music) | 593 | 72 | 6 | 173/166/264/148/168/189 | Yes | sourceforge <http://mulan.sourceforge.net/datasets.html> |
| 4 | yeast | 2417 | 103 | 14 | mixed | Yes | sourceforge <http://mulan.sourceforge.net/datasets.html> |
| 5 | scene | 2407 | 294 | 6 | 427/364/397/433/533/431 | Yes | sourceforge <http://mulan.sourceforge.net/datasets.html> |
| 6 | wdbc | 569 | 30 | 2 | 212/357 | No | UCI <http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29> |
| 7 | breasttissue | 106 | 9 | 6 | 21/15/18/16/14/22 | No | UCI <http://archive.ics.uci.edu/ml/datasets/Breast+Tissue> |
| 8 | seeds | 210 | 7 | 3 | 70/70/70 | No | UCI <http://archive.ics.uci.edu/ml/datasets/seeds> |
| 9 | glass | 214 | 9 | 6(7) | 70/76/17/13/9/29 | No | UCI <http://archive.ics.uci.edu/ml/datasets/Glass+Identification> |
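
The overlap column can be cross-checked against the class distribution: when records may belong to several classes at once, the per-class counts add up to more than the number of records. The sketch below runs that check on a few rows copied from the clustering table. The helper names `parse_distribution` and `has_overlap` are made up for this illustration, and reading "overlap" as "one record can carry several class labels" is an assumption based on the multi-label datasets hosted by Mulan.

```python
def parse_distribution(text):
    """Split an 'a/b/c' class-distribution string into a list of ints."""
    return [int(part) for part in text.split("/")]


def has_overlap(records, distribution):
    """Return True when the per-class counts sum to more than the record count."""
    return sum(parse_distribution(distribution)) > records


# Rows copied from the clustering table: (dataset, records, class distribution).
rows = [
    ("iris", 150, "50/50/50"),
    ("wine", 178, "59/71/48"),
    ("emotions", 593, "173/166/264/148/168/189"),
    ("scene", 2407, "427/364/397/433/533/431"),
    ("glass", 214, "70/76/17/13/9/29"),
]

for name, records, distribution in rows:
    total = sum(parse_distribution(distribution))
    label = "overlap" if has_overlap(records, distribution) else "no overlap"
    print(f"{name}: {total} class memberships over {records} records -> {label}")
```

Rows whose distribution is not a list of numbers, such as yeast, cannot be checked this way and are left out.
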

Classification datasets

| No. | Dataset | Records | Features | Classes | Class distribution | Missing values | Source |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | appendicitis | 106 | 7 | 2 | 21/85 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=183> |
| 2 | balance | 625 | 4 | 3 | 288/49/288 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=54>, UCI <http://archive.ics.uci.edu/ml/datasets/Balance+Scale> |
| 3 | banana | 5300 | 2 | 2 | 2924/2376 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=182#inicio> |
| 4 | bands | 365(539) | 19 | 2 | 230/135 | Yes | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=89>, UCI <http://archive.ics.uci.edu/ml/datasets/Cylinder+Bands> |
| 5 | bupa | 345 | 6 | 2 | 145/200 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=55>, UCI <http://archive.ics.uci.edu/ml/datasets/Liver+Disorders> |
| 6 | cleveland | 297(303) | 13 | 5 | 160/54/35/35/13 | Yes | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=57>, UCI <http://archive.ics.uci.edu/ml/datasets/Heart+Disease> |
| 7 | dermatology | 358(366) | 34 | 6 | 111/60/71/48/48/20 | Yes | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=60>, UCI <http://archive.ics.uci.edu/ml/datasets/Dermatology> |
| 8 | haberman | 306 | 3 | 2 | 225/81 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=62>, UCI <http://archive.ics.uci.edu/ml/datasets/Haberman%27s+Survival> |
| 9 | hayes-roth | 160 | 4 | 3 | 65/64/31 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=186>, UCI <http://archive.ics.uci.edu/ml/datasets/Hayes-Roth> |
| 10 | heart | 270 | 13 | 2 | 150/120 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=99>, UCI <http://archive.ics.uci.edu/ml/datasets/Statlog+%28Heart%29> |
| 11 | hepatitis | 80(155) | 19 | 2 | 13/67 | Yes | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=100>, UCI <http://archive.ics.uci.edu/ml/datasets/Hepatitis> |
| 12 | ionosphere | 351 | 34 | 2 | 225/126 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=101>, UCI <http://archive.ics.uci.edu/ml/machine-learning-databases/ionosphere/> |
| 13 | iris | 150 | 4 | 3 | 50/50/50 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=18>, UCI <http://archive.ics.uci.edu/ml/datasets/Iris> |
| 14 | led7digit | 500 | 7 | 10 | 45/37/51/57/52/52/47/57/53/49 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=63>, UCI <http://archive.ics.uci.edu/ml/datasets/LED+Display+Domain> |
| 15 | mammographic | 830(961) | 5 | 2 | 427/403 | Yes | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=86>, UCI <http://archive.ics.uci.edu/ml/datasets/Mammographic+Mass> |
| 16 | marketing | 6876(8993) | 13 | 9 | 1255/529/505/618/527/846/784/1069/743 | Yes | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=163>, biolab <http://orange.biolab.si/> |
| 17 | monks2 | 432 | 7 | 2 | 290/142 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=65>, UCI <http://archive.ics.uci.edu/ml/datasets/MONK%27s+Problems> |
| 18 | movement_libras | 360 | 90 | 15 | 24/…/24 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=165>, UCI <http://archive.ics.uci.edu/ml/datasets/Libras+Movement> |
| 19 | newthyroid | 215 | 5 | 3 | 150/35/30 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=66>, UCI <http://archive.ics.uci.edu/ml/datasets/Thyroid+Disease> |
| 20 | pageblocks | 5473 | 10 | 5 | 4913/329/28/88/115 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=104>, UCI <http://archive.ics.uci.edu/ml/datasets/Page+Blocks+Classification> |
| 21 | penbased | 10992 | 16 | 10 | … | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=70>, UCI <http://archive.ics.uci.edu/ml/datasets/Pen-Based+Recognition+of+Handwritten+Digits> |
| 22 | phoneme | 5404 | 5 | 2 | 3818/1586 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=105#sub1>, UCL <https://www.elen.ucl.ac.be/neural-nets/Research/Projects/ELENA/databases/REAL/phoneme/> |
| 23 | pima | 768 | 8 | 2 | 500/268 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=21>, UCI <http://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes> |
| 24 | ring | 7400 | 20 | 2 | 3664/3736 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=106>, UTO <http://www.cs.utoronto.ca/~delve/data/ringnorm/desc.html> |
| 25 | satimage | 6435 | 36 | 7 | 1533/703/1358/626/707/0/1508 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=71>, UCI <http://archive.ics.uci.edu/ml/datasets/Statlog+%28Landsat+Satellite%29> |
| 26 | segment | 2310 | 19 | 7 | 330/…/330 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=107>, UCI <http://archive.ics.uci.edu/ml/datasets/Image+Segmentation> |
| 27 | sonar | 208 | 60 | 2 | 97/111 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=85>, UCI <http://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+%28Sonar%2C+Mines+vs.+Rocks%29> |
| 28 | spambase | 4597(4601) | 57 | 2 | 2788/1813 | Yes | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=109>, UCI <http://archive.ics.uci.edu/ml/datasets/Spambase> |
| 29 | spectfheart | 267 | 44 | 2 | 55/212 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=185>, UCI <http://archive.ics.uci.edu/ml/datasets/SPECTF+Heart> |
| 30 | tae | 151 | 5 | 3 | 49/50/52 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=188>, UCI <http://archive.ics.uci.edu/ml/datasets/Teaching+Assistant+Evaluation> |
| 31 | texture | 5500 | 40 | 11 | 500/…/500 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=72>, UCL <https://www.elen.ucl.ac.be/neural-nets/Research/Projects/ELENA/databases/REAL/texture/> |
| 32 | thyroid | 7200 | 21 | 3 | 166/368/6666 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=67>, UCI <http://archive.ics.uci.edu/ml/datasets/Thyroid+Disease> |
| 33 | titanic | 2201 | 3 | 2 | 1490/711 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=189>, TOR <http://www.cs.toronto.edu/~delve/data/titanic/titanicDetail.html> |
| 34 | twonorm | 7400 | 20 | 2 | 3703/3697 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=110>, UTO <http://www.cs.utoronto.ca/~delve/data/twonorm/desc.html> |
| 35 | vehicle | 846 | 18 | 4 | 212/218/199/217 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=68>, UCI <http://archive.ics.uci.edu/ml/datasets/Statlog+%28Vehicle+Silhouettes%29> |
| 36 | vowel | 990 | 13 | 11 | 90/…/90 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=113>, UCI <http://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+%28Vowel+Recognition+-+Deterding+Data%29> |
| 37 | wdbc | 569 | 30 | 2 | 212/357 | No | UCI <http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29> |
| 38 | wine | 178 | 13 | 3 | 59/71/48 | No | UCI <http://archive.ics.uci.edu/ml/datasets/Wine> |
| 39 | winequality-red | 1599 | 11 | 11 | 10/53/681/638/199/18 | No | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=210>, UCI <http://archive.ics.uci.edu/ml/datasets/Wine+Quality> |
| 40 | wisconsin | 683(699) | 9 | 2 | 444/239 | Yes | KEEL <http://sci2s.ugr.es/keel/dataset.php?cod=73>, UCI <http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Original%29> |
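
The class distribution column also gives a quick sense of how skewed each classification dataset is. As a rough sketch, the snippet below computes the ratio of the largest class to the smallest non-empty class for a few rows copied from the table. The function name `imbalance_ratio` is a hypothetical helper, and ignoring empty classes (such as the 0 in satimage) is a choice made for this example, not something stated in the post.

```python
def imbalance_ratio(distribution):
    """Ratio of the largest class count to the smallest non-empty class count."""
    counts = [int(part) for part in distribution.split("/")]
    present = [count for count in counts if count > 0]  # skip empty classes
    return max(present) / min(present)


# Rows copied from the classification table: (dataset, class distribution).
rows = [
    ("iris", "50/50/50"),
    ("haberman", "225/81"),
    ("satimage", "1533/703/1358/626/707/0/1508"),
    ("thyroid", "166/368/6666"),
    ("pageblocks", "4913/329/28/88/115"),
]

# Print the datasets from most to least skewed.
for name, distribution in sorted(rows, key=lambda row: -imbalance_ratio(row[1])):
    print(f"{name}: imbalance ratio {imbalance_ratio(distribution):.2f}")
```

A ratio of 1 means perfectly balanced classes, as in iris. Rows whose distribution is abbreviated with "…" are left out because the full counts are not given.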
