Why does the WSS plot line (used to optimize a cluster analysis) look so jagged?

Question

I have a cluster plot produced in R, and I want to use a WSS (within-cluster sum of squares) plot to apply the "elbow criterion" to the clustering. I drew a WSS plot for my clustering, but it looks really strange and I cannot tell how many clusters to choose. Could anyone help?

Here is my data:

Friendly<-c(0.533,0.854,0.9585,0.925,0.9125,0.9815,0.9645,0.981,0.9935,0.9585,0.996,0.956,0.9415)
Polite<-c(0,0.45,0.977,0.9915,0.929,0.981,0.9895,0.9875,1,0.96,0.996,0.873,0.9125)
Praising<-c(0,0,0.437,0.9585,0.9415,0.9605,0.998,0.998,0.8915,1,1,1,0.977)
Joking<-c(0,0,0,0.617,0.942,0.9665,0.9935,0.992,0.935,0.987,0.975,0.9915,0.9665)
Sincere<-c(0,0,0,0,0.617,0.8335,0.985,0.9895,0.977,0.9205,1,0.9585,0.8895)
Serious<-c(0,0,0,0,1,0.642,0.975,0.9605,0.9645,0.9895,0.8125,0.9605,0.925)
Hostile<-c(0,0,0,0,0,0,0.629,0.656,0.948,0.9705,0.9645,0.998,0.9685)
Rude<-c(0,0,0,0,0,0,0,0.687,0.979,0.954,0.954,0.996,0.956)
Irony<-c(0,0,0,0,0,0,0,0,0.354,0.9815,0.996,1,0.971)
Insincere<-c(0,0,0,0,0,0,0,0,1,0.396,0.996,0.9915,0.9415)
Commanding<-c(0,0,0,0,0,0,0,0,0,1,0.462,0.9605,0.9165)
Suggesting<-c(0,0,0,0,0,0,0,0,0,0,0,0.867,0.775)
Neutral<-c(0,0,0,0,0,0,0,0,0,0,0,0,0.283)

data <- data.frame(Friendly,Polite,Praising,Joking,Sincere,Serious,Hostile,Rude,Irony,Insincere,Commanding,Suggesting,Neutral)

And here is my clustering code. The method is the one given by Gavin in the last line of: How to draw the plot of within-cluster sum-of-squares for a cluster?

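## Cluster analysis
## as.dist() reads the lower triangle of `data` as precomputed dissimilarities
d <- as.dist(data)
hc <- hclust(d, method = "average")
plot(hc, main = "", sub = 'Method="Average"', ann = TRUE, axes = TRUE, hang = 0.2)
## Draw a WSS plot; `wrap` is the helper from Gavin's answer and must be defined first
res <- sapply(seq.int(1, 13), wrap, h = hc, x = data)
plot(seq_along(res), res, type = "b", pch = 19)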
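
The code above calls `wrap` without showing its definition. Below is a minimal sketch of what such a helper could look like, assuming it cuts the tree into `i` clusters and sums the squared deviations from each cluster's column means. The names `wss` and `wrap`, their signatures, and the toy data are assumptions for illustration and may differ from the helper in the linked answer.

```r
# Hypothetical helpers; the question does not show how `wrap` is defined.

# Sum of squares for one cluster: squared deviations of every value
# from its column mean, summed over all cells.
wss <- function(d) {
  sum(scale(d, scale = FALSE)^2)
}

# Total within-cluster sum of squares when the tree `hc` is cut
# into `i` clusters; `x` holds one observation per row.
wrap <- function(i, hc, x) {
  cl <- cutree(hc, k = i)   # cluster label for each row of x
  spl <- split(x, cl)       # one data frame per cluster
  sum(sapply(spl, wss))
}

# Toy data (an assumption, not the data from the question)
set.seed(1)
toy <- data.frame(a = rnorm(20), b = rnorm(20))
toy_hc <- hclust(dist(toy), method = "average")

# WSS for every possible number of clusters, 1 to nrow(toy)
toy_res <- sapply(seq_len(nrow(toy)), wrap, hc = toy_hc, x = toy)
plot(seq_along(toy_res), toy_res, type = "b", pch = 19,
     xlab = "Number of clusters", ylab = "Within-cluster sum of squares")
```

With a helper of this shape, the value for one cluster is the total sum of squares of the data, and the value drops to zero once every observation sits in its own cluster. How evenly it falls in between depends on the data.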

But it looks like this. Can anyone explain why this happens and how to apply the "elbow criterion"?

Recommended answer

Why do you expect the WSS to decline smoothly as the number of clusters increases? It need not, as you found out. I have only seen nicely behaved scree plots with well-behaved data.

There is a big drop in the WSS at 7 clusters, which might suggest stopping there. However, you should also look at the dendrogram when you evaluate this.
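
One way to follow this advice is to compute the drop in WSS at each step and then check a candidate number of clusters against the dendrogram. The sketch below does so on the built-in `USArrests` data rather than the data from the question. All `demo_*` names are hypothetical, and picking the single largest drop is only one heuristic, not a rule.

```r
# Toy example on a built-in dataset (an assumption; not the question's data)
demo_x <- as.data.frame(scale(USArrests))
demo_hc <- hclust(dist(demo_x), method = "average")

# Total within-cluster sum of squares for a cut into k clusters
demo_wss <- function(k) {
  cl <- cutree(demo_hc, k = k)
  sum(sapply(split(demo_x, cl),
             function(d) sum(scale(d, scale = FALSE)^2)))
}

ks <- 1:10
demo_res <- sapply(ks, demo_wss)

# Drop in WSS when going from k - 1 to k clusters, labelled by k
drops <- -diff(demo_res)
names(drops) <- ks[-1]
print(round(drops, 2))

# Candidate: the k that produced the largest single drop
k_candidate <- as.integer(names(which.max(drops)))

# Compare the candidate with the dendrogram by boxing its clusters
plot(demo_hc, hang = 0.2, main = "", sub = "", xlab = "")
rect.hclust(demo_hc, k = k_candidate)
```

If the boxed groups do not correspond to clearly separated branches of the tree, a different cut may be more defensible than the one the WSS drop suggests.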

