R: igraph, matching members of a "known" cluster to members of observed clusters and returning the % match

Question

I'm using the Walktrap community detection method to return a number (19 in this case) of clusters. I have a list of members which belong to one or more of these clusters.


  1. I need a method to search each cluster for the presence of the members and return the percentage of matches found (e.g. cluster[0] = 0%, cluster[1] = Y%, ..., cluster[18] = Z%), thus selecting the optimum cluster that best represents the members on the list.

  2. Once the optimum cluster is found, I need a method to count its members and then, from the remaining 18 (19 - 1) clusters, select the cluster that is nearest to it in size (number of members).

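library(igraph)

# Edge list of the network and the list of "known" members
edges <- read.csv('http://dl.dropbox.com/u/23776534/Facebook%20%5BEdges%5D.csv')
list <- read.csv("http://dl.dropbox.com/u/23776534/knownlist.csv")

# Build the graph and run Walktrap community detection
all <- graph.data.frame(edges)
summary(all)
all_wt <- walktrap.community(all, steps=6, modularity=TRUE, labels=TRUE)

# Cut the merge tree at the step with maximum modularity and report cluster sizes
all_wt_memb <- community.to.membership(all, all_wt$merges, steps=which.max(all_wt$modularity)-1)
all_wt_memb$csize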

>[1] 176  13 204  24   9 263  16   2   8   4  12   8   9  19  15   3   6   2   1



Recommended answer

The %in% function, used as a %in% b, determines which elements of vector a are also present in vector b. So for each cluster, I would:


  • Extract the members of that cluster.
  • Given the list of members you're interested in, calculate which ones are %in% this cluster, which returns a Boolean vector.
  • Use sum() on the Boolean vector to count the number of TRUE elements (i.e. the number of elements in your initial vector which are present in this cluster).
  • (Optionally) normalize by the size of the cluster to get the percentage of this cluster which is made up of your list of interest, or by the length of your list to get the percentage of the members in your list which are present in this cluster.

You can iterate over the clusters with a for() loop or with one of the apply variants.
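The steps above can be sketched in base R as follows. This is only an illustration on toy data: the names vertex_names, membership and known are hypothetical stand-ins for the vertex labels, the per-vertex cluster ids and the known-member list, and the 0-based cluster ids simply mirror the numbering used in the question.

```r
# Toy data standing in for the real objects (all names here are hypothetical).
# membership[i] is the cluster id (0-based, as in the question) of vertex i.
vertex_names <- c("ann", "bob", "cat", "dan", "eve", "fay", "gus", "hal")
membership   <- c(0, 0, 0, 1, 1, 1, 1, 2)
known        <- c("dan", "eve", "fay", "ann")   # the "known" member list

# Split vertex names by cluster id: one character vector per cluster.
clusters <- split(vertex_names, membership)

# For each cluster, count how many known members it contains.
hits <- sapply(clusters, function(members) sum(known %in% members))

# Two possible normalisations, both expressed as percentages.
pct_of_known   <- 100 * hits / length(known)             # share of the list found in the cluster
pct_of_cluster <- 100 * hits / sapply(clusters, length)  # share of the cluster made up of list members

print(round(pct_of_known, 1))

# Label of the cluster containing the largest share of the known list.
best <- names(which.max(pct_of_known))
print(best)
```

With this toy data the known list is split 25% / 75% / 0% across clusters 0, 1 and 2, so cluster "1" is selected. Note that which.max() returns only the first maximum, so ties would need separate handling.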

Then, given all_wt_memb$csize, the size of the optimum cluster is your target value, and you want to find the cluster size nearest to it. This amounts to finding the minimum absolute difference:

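x <- 1:100            # candidate values (e.g. the cluster sizes)
your.number <- 5.43   # target value
# index (or indices, on ties) of the value(s) closest to the target
which(abs(x - your.number) == min(abs(x - your.number)))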
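Applied to the cluster sizes, the same idea needs one extra step: the optimum cluster has to be excluded, otherwise it would match itself with a difference of zero. A minimal sketch, using the csize values printed in the question and a hypothetical best_pos (here the 4th cluster is simply assumed to be the optimum):

```r
# Cluster sizes from the question's output; cluster ids are 0-based there,
# so position i in this vector corresponds to cluster i - 1.
csize <- c(176, 13, 204, 24, 9, 263, 16, 2, 8, 4, 12, 8, 9, 19, 15, 3, 6, 2, 1)

# Hypothetical: assume the 4th cluster (size 24) was the best match.
best_pos <- 4

# Absolute size difference to the optimum cluster.
diffs <- abs(csize - csize[best_pos])

# Exclude the optimum itself so it cannot be selected.
diffs[best_pos] <- Inf

# Position(s) of the nearest-sized cluster; ties return several positions.
nearest_pos <- which(diffs == min(diffs))
print(nearest_pos)
print(csize[nearest_pos])
```

Under that assumption the nearest cluster is at position 14, with 19 members. Using which() with == rather than which.min() keeps all tied clusters, which seems useful here because several clusters share the same size.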

10-24 15:48