Modularity:

Modularity is a measure used in community detection to assess how good a given division of a network into communities is.
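
To make the idea concrete: in the Newman and Girvan paper discussed in these notes, modularity is Q = Σ_i (e_ii − a_i²), where e_ii is the fraction of edges that fall inside community i and a_i is the fraction of edge ends attached to community i. Below is a minimal sketch in C, assuming an undirected graph stored as a symmetric 0/1 adjacency matrix without self-loops. The names `modularity`, `MAXN` and `community` are my own, not from the paper.

```c
#include <assert.h>

#define MAXN 100   /* illustrative upper bound on vertices and communities */

/* Q = sum over communities c of  L_c/m - (D_c/(2m))^2, where
   L_c = number of edges inside c, D_c = total degree of c, m = number of edges.
   community[v] is the community index of vertex v, in the range [0, k). */
double modularity(int n, int adj[MAXN][MAXN], const int *community, int k)
{
    int inside[MAXN] = {0};
    int degree[MAXN] = {0};
    int m = 0;
    double q = 0.0;
    int i, j, c;

    assert(n >= 0 && n <= MAXN && k >= 1 && k <= MAXN);
    for (i = 0; i < n; i++)
        assert(community[i] >= 0 && community[i] < k);

    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {      /* visit each undirected edge once */
            if (!adj[i][j])
                continue;
            m++;
            degree[community[i]]++;
            degree[community[j]]++;
            if (community[i] == community[j])
                inside[community[i]]++;
        }
    }
    if (m == 0)
        return 0.0;                        /* no edges: nothing to measure */

    for (c = 0; c < k; c++) {
        double e_cc = (double)inside[c] / m;
        double a_c = (double)degree[c] / (2.0 * m);
        q += e_cc - a_c * a_c;
    }
    return q;
}
```

For two triangles joined by a single edge, putting each triangle in its own community gives Q = 5/14 ≈ 0.357, while putting all six vertices in one community gives Q = 0.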

Finding and evaluating community structure in networks: the 2003 paper in which Newman (with Girvan) first proposed modularity. My reading notes on it follow:

A property that seems to be common to many networks is community structure, the division of network nodes into groups within which the network connections are dense, but between which they are sparser—see Fig. 1.

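Hierarchical clustering: methods based on the similarity or connection strength between vertices fall into two classes, agglomerative and divisive, depending on whether they add edges to the network or remove them. In an agglomerative method, a similarity is computed for each pair of vertices by some measure, and edges are added to an initially empty network, starting with the most similar pairs. The procedure can be stopped at any point, and the components at that point are taken as the communities. The whole progression from empty graph to complete graph can be drawn as a tree like the one in Fig. 2, where horizontal cuts correspond to different stopping points:

As edges are added to an initially isolated set of vertices, the vertices merge into larger and larger communities, until at the top all of them form a single community. Read from the top down, one community splits into smaller and smaller ones. A horizontal cut gives the communities at that level. The vertical height of a split point in the tree only indicates the order in which the splits or joins happen, although more elaborate dendrograms can be built in which the heights carry other information.

The first problem is that, on networks whose community structure is already known, the method quite often fails to find the correct communities, which undermines trust in it. The second is that it favors the cores of communities and neglects the periphery: core vertices are strongly similar to each other and are joined early, while peripheral vertices are not strongly similar to anything. In practice a peripheral vertex may be attached to a clearly defined community by a single edge, and with such weak similarity it is not placed in that community properly.

The above is the bottom-up (agglomerative) approach. There is also a top-down (divisive) approach: find the least similar connected pair of vertices, remove the edge between them, and repeat. This divides the network into smaller and smaller parts, and again the process can be stopped at any point to take the current division as the communities.

This suggests a new idea: peripheral vertices often link different communities, so their role should not be played down too much in real problems. The rest of the paper builds on this idea.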

II. FINDING COMMUNITIES IN A NETWORK

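Divisive algorithms: instead of removing edges between the least similar pairs of vertices, the paper looks for the edges with the highest betweenness. High betweenness marks what lies between two communities rather than what lies inside one. Two definitions: betweenness comes in a vertex version and an edge version. Vertex betweenness is the fraction of all shortest paths in the network that pass through a given vertex; edge betweenness is the fraction of all shortest paths that pass through a given edge. The idea of the paper is that some edges run between two communities, and if we compute the edge betweenness of every edge, those inter-community edges have the highest values, so finding them reveals the communities. Three betweenness measures are considered:

1. Shortest-path betweenness: O(mn) time, where m is the number of edges and n the number of vertices; the fastest of the three to compute.

2. Random-walk betweenness: Sec. III C.

3. Current-flow betweenness: Sec. III B.

The overall algorithm is roughly: compute the betweenness of every edge, remove the edge with the highest score, recompute the betweenness of the remaining edges, and repeat.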

III. IMPLEMENTATION

A. Shortest-path betweenness

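Background: breadth-first search, time complexity O(m):

Breadth-first search (BFS) first visits the start vertex v, then each unvisited neighbor w1, w2, ..., wn of v, then the unvisited neighbors of w1, and so on in the same order until every vertex has been visited. Below is C source for depth-first and breadth-first search on an adjacency matrix. It took me an evening and a morning to get working, after which I went back to the paper.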

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX 100

/* Undirected graph stored as an adjacency matrix; each vertex is named by one letter. */
typedef struct _graph {
    char vex[MAX];        /* vertex labels */
    int vexnum;           /* number of vertices */
    int edgnum;           /* number of edges */
    int matrix[MAX][MAX]; /* matrix[i][j] == 1 when i and j are adjacent */
} Graph, *PGraph;

/* Index of the vertex labelled ch, or -1 if there is no such vertex. */
static int get_position(const Graph *g, char ch)
{
    int i;
    for (i = 0; i < g->vexnum; i++) {
        if (g->vex[i] == ch)
            return i;
    }
    return -1;
}

/* Next letter on stdin, skipping everything else; EOF when input ends. */
static int read_char(void)
{
    int ch;
    do {
        ch = getchar();
    } while (ch != EOF && !isalpha(ch));
    return ch;
}

/* Read a graph from stdin; returns NULL on invalid input or allocation failure. */
Graph *make_graph(void)
{
    int c1, c2;
    int e, v, i, p1, p2;
    Graph *pg;

    printf("Number of vertices: ");
    if (scanf("%d", &v) != 1)
        return NULL;
    printf("Number of edges: ");
    if (scanf("%d", &e) != 1)
        return NULL;
    /* A simple undirected graph on v vertices has at most v*(v-1)/2 edges. */
    if (v < 1 || v > MAX || e < 1 || e > v * (v - 1) / 2)
        return NULL;
    if ((pg = calloc(1, sizeof(Graph))) == NULL)
        return NULL;
    pg->vexnum = v;
    pg->edgnum = e;
    for (i = 0; i < pg->vexnum; i++) {
        printf("Enter vertex %d: ", i);
        if ((c1 = read_char()) == EOF) {
            free(pg);
            return NULL;
        }
        pg->vex[i] = (char)c1;
    }
    for (i = 0; i < pg->edgnum; i++) {
        printf("Enter edge %d: ", i);
        c1 = read_char();
        c2 = read_char();
        p1 = (c1 == EOF) ? -1 : get_position(pg, (char)c1);
        p2 = (c2 == EOF) ? -1 : get_position(pg, (char)c2);
        if (p1 == -1 || p2 == -1) {
            free(pg);
            return NULL;
        }
        pg->matrix[p1][p2] = 1;
        pg->matrix[p2][p1] = 1;
    }
    return pg;
}

/* First neighbor of v, or -1 if v has none. */
static int first_ver(const Graph *g, int v)
{
    int i;
    if (v < 0 || v > (g->vexnum - 1))
        return -1;
    for (i = 0; i < g->vexnum; i++) {
        if (g->matrix[v][i] == 1)
            return i;
    }
    return -1;
}

/* Next neighbor of v after w, or -1 if there are no more. */
static int next_ver(const Graph *g, int v, int w)
{
    int i;
    if (v < 0 || v > (g->vexnum - 1) || w < 0 || w > (g->vexnum - 1))
        return -1;
    for (i = w + 1; i < g->vexnum; i++) {
        if (g->matrix[v][i] == 1)
            return i;
    }
    return -1;
}

/* Recursive depth-first visit starting at vertex i. */
static void DFS(const Graph *g, int i, int *visited)
{
    int w;
    visited[i] = 1;
    printf("%c", g->vex[i]);
    for (w = first_ver(g, i); w >= 0; w = next_ver(g, i, w)) {
        if (!visited[w])
            DFS(g, w, visited);
    }
}

/* Depth-first traversal of every component. */
static void DFSTrivel(const Graph *g)
{
    int i;
    int visited[MAX];
    for (i = 0; i < g->vexnum; i++)
        visited[i] = 0;
    printf("DFS: ");
    for (i = 0; i < g->vexnum; i++) {
        if (!visited[i])
            DFS(g, i, visited);
    }
    printf("\n");
}

/* Breadth-first traversal of every component, using an array as a FIFO queue. */
void BFS(const Graph *g)
{
    int head = 0;
    int rear = 0;
    int queue[MAX];
    int i, j, k;
    int visited[MAX];
    for (i = 0; i < g->vexnum; i++)
        visited[i] = 0;
    printf("BFS: ");
    for (i = 0; i < g->vexnum; i++) {
        if (!visited[i]) {
            visited[i] = 1;
            printf("%c", g->vex[i]);
            queue[rear++] = i;
        }
        while (head != rear) {
            j = queue[head++];
            for (k = first_ver(g, j); k >= 0; k = next_ver(g, j, k)) {
                if (!visited[k]) {
                    visited[k] = 1;
                    printf("%c", g->vex[k]);
                    queue[rear++] = k;
                }
            }
        }
    }
    printf("\n");
}

/* Print the adjacency matrix. */
void print_dodo(const Graph *g)
{
    int i, j;
    printf("Matrix Graph:\n");
    for (i = 0; i < g->vexnum; i++) {
        for (j = 0; j < g->vexnum; j++)
            printf("%d", g->matrix[i][j]);
        printf("\n");
    }
}

int main(void)
{
    Graph *pg = make_graph();
    if (pg == NULL) {
        fprintf(stderr, "Invalid input\n");
        return 1;
    }
    print_dodo(pg);
    DFSTrivel(pg);
    BFS(pg);
    free(pg);
    return 0;
}

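When there is only a single shortest path from the source vertex to any other (we will consider other cases in a moment) the resulting set of paths forms a shortest-path tree, see Fig. 4a.

That is, when there is exactly one shortest path from the source to every other vertex, the set of shortest paths forms a tree.

We first find the leaves of the tree. No shortest path from s to another vertex passes through a leaf, so each edge attached to a leaf gets a score of 1. Then, starting from the vertices farthest from the source and working upward, each edge gets a score equal to 1 plus the sum of the scores of the edges immediately below it. Once the whole tree has been traversed, the scores are the betweenness counts for the paths that start at s.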
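
A small sketch of this tree case in C. It assumes the shortest-path tree is already available as a `parent` array plus the BFS visiting order; those two inputs, and the function name `tree_edge_scores`, are my own framing and not something the paper prescribes.

```c
#include <assert.h>

/* parent[v] is the parent of v in the shortest-path tree rooted at s
   (parent[s] == -1). order[] lists all n vertices in BFS order from s,
   so order[0] == s and children always come after their parent.
   On return, score[v] is the score of the tree edge (parent[v], v);
   score[s] is set to 0 because s has no parent edge. */
void tree_edge_scores(int n, const int *parent, const int *order, double *score)
{
    int idx, v, p, s;

    assert(n >= 1 && parent[order[0]] == -1);
    s = order[0];

    for (v = 0; v < n; v++)
        score[v] = 1.0;            /* the "plus 1" every edge starts with */
    score[s] = 0.0;

    /* Walk from the farthest vertices back toward s. By the time v is
       reached, the scores of all edges below v are already added in. */
    for (idx = n - 1; idx >= 1; idx--) {
        v = order[idx];
        p = parent[v];
        if (p != s)
            score[p] += score[v];
    }
}
```

In a tree the score of the edge above v is simply the number of vertices in the subtree rooted at v, which is a handy sanity check.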

Repeating the process for all possible vertices s and summing the scores, we arrive at the full betweenness scores for shortest paths between all pairs. The breadth-first search and the process of working up through the tree both take worst-case time O(m) and there are n vertices total, so the entire calculation takes time O(mn) as claimed.

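Repeat this for every possible source s and sum the scores over all s. The total covers the shortest paths between all pairs of vertices, which gives the betweenness of every edge. A breadth-first search takes O(m) time in the worst case, and doing it for n vertices gives O(mn) overall.

Fig. 4b shows a case where some vertices are reached by more than one shortest path of the same length. In that case the score has to be shared equally: with a single shortest path the weight is 1, and with three equally short paths each gets 1/3. For a case like Fig. 4b the breadth-first search proceeds as follows: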

1. The initial vertex s is given distance ds = 0 and a weight ws = 1.

2. Every vertex i adjacent to s is given distance di = ds + 1 = 1, and weight wi = ws = 1.

3. For each vertex j adjacent to one of those vertices i we do one of three things:

(a) If j has not yet been assigned a distance, it is assigned distance dj = di + 1 and weight wj = wi .

(b) If j has already been assigned a distance and dj = di + 1, then the vertex's weight is increased by wi, that is wj ← wj + wi.

(c) If j has already been assigned a distance and dj < di + 1, we do nothing.

4. Repeat from step 3 until no vertices remain that have assigned distances but whose neighbors do not have assigned distances.

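In other words (the weight wi ends up being the number of distinct shortest paths from s to i):

1. The source s gets distance ds = 0 and weight ws = 1.

2. Each neighbor i of s gets di = ds + 1 = 1 and wi = ws = 1.

3. For each vertex j adjacent to such a vertex i: (a) if j has no distance yet, set dj = di + 1 and wj = wi; (b) if j already has a distance and dj = di + 1, set wj = wj + wi; (c) if j already has a distance and dj < di + 1, do nothing.

4. Repeat step 3 until every vertex that has a distance also has all of its neighbors assigned.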
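
Steps 1 to 4 can be sketched in C roughly as follows. This is my own illustration, assuming a symmetric 0/1 adjacency matrix; `bfs_weights`, `MAXN` and the `order` output are names I made up. Note that scanning a matrix row costs O(n) per vertex, so this version is O(n²) per search; reaching the O(m) quoted in the paper would need adjacency lists.

```c
#include <assert.h>

#define MAXN 100   /* illustrative upper bound on the number of vertices */

/* Breadth-first search from s.
   dist[v]   = distance from s, or -1 if v is unreachable.
   weight[v] = number of distinct shortest paths from s to v.
   order[]   = reached vertices in the order they were dequeued.
   Returns the number of vertices reached (including s). */
int bfs_weights(int n, int adj[MAXN][MAXN], int s,
                int *dist, double *weight, int *order)
{
    int head = 0, tail = 0;
    int i, j;

    assert(n >= 1 && n <= MAXN && s >= 0 && s < n);
    for (i = 0; i < n; i++) {
        dist[i] = -1;
        weight[i] = 0.0;
    }
    dist[s] = 0;                 /* step 1 */
    weight[s] = 1.0;
    order[tail++] = s;           /* order[] doubles as the FIFO queue */

    while (head < tail) {
        i = order[head++];
        for (j = 0; j < n; j++) {
            if (!adj[i][j])
                continue;
            if (dist[j] < 0) {                    /* steps 2 and 3(a) */
                dist[j] = dist[i] + 1;
                weight[j] = weight[i];
                order[tail++] = j;
            } else if (dist[j] == dist[i] + 1) {  /* step 3(b) */
                weight[j] += weight[i];
            }
            /* step 3(c): dist[j] < dist[i] + 1, nothing to do */
        }
    }
    return tail;
}
```

On a four-vertex "diamond" (edges 0-1, 0-2, 1-3, 2-3) searched from vertex 0, vertex 3 ends up with distance 2 and weight 2, since it can be reached through either 1 or 2.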

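As with an ordinary breadth-first search, this is implemented most efficiently with a queue. The edge scores are then computed as follows:

1. Find every "leaf" vertex t, i.e., a vertex such that no paths from s to other vertices go through t.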

2. For each vertex i neighboring t assign a score to the edge from t to i of wi/wt.

3. Now, starting with the edges that are farthest from the source vertex s—lower down in a diagram such as Fig. 4b—work up towards s. To the edge from vertex i to vertex j, with j being farther from s than i, assign a score that is 1 plus the sum of the scores on the neighboring edges immediately below it (i.e., those with which it shares a common vertex), all multiplied by wi/wj .

4. Repeat from step 3 until vertex s is reached.
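
Putting the search and the scoring pass together, a minimal sketch of the per-source scores and their sum over all sources might look like this. The function names, `MAXN` and the adjacency-matrix layout are my own choices rather than the paper's, and with a matrix each search costs O(n²) instead of O(m). Leaves need no special case here: a leaf has nothing below it, so the general formula reduces to wi/wt.

```c
#include <assert.h>

#define MAXN 100   /* illustrative upper bound on the number of vertices */

/* Add to score[][] the edge scores for shortest paths that start at s. */
void add_scores_from(int n, int adj[MAXN][MAXN], int s,
                     double score[MAXN][MAXN])
{
    int dist[MAXN], order[MAXN];
    double weight[MAXN], below[MAXN];
    int head = 0, tail = 0, i, j, idx;

    assert(n >= 1 && n <= MAXN && s >= 0 && s < n);
    for (i = 0; i < n; i++) {
        dist[i] = -1;
        weight[i] = 0.0;
        below[i] = 0.0;          /* sum of scores on edges just below i */
    }
    dist[s] = 0;
    weight[s] = 1.0;
    order[tail++] = s;

    /* First pass: distances and shortest-path counts (weights). */
    while (head < tail) {
        i = order[head++];
        for (j = 0; j < n; j++) {
            if (!adj[i][j])
                continue;
            if (dist[j] < 0) {
                dist[j] = dist[i] + 1;
                weight[j] = weight[i];
                order[tail++] = j;
            } else if (dist[j] == dist[i] + 1) {
                weight[j] += weight[i];
            }
        }
    }

    /* Second pass: farthest vertices first, working up toward s. */
    for (idx = tail - 1; idx >= 1; idx--) {
        j = order[idx];
        for (i = 0; i < n; i++) {
            if (adj[i][j] && dist[i] == dist[j] - 1) {
                double sc = (1.0 + below[j]) * weight[i] / weight[j];
                below[i] += sc;
                score[i][j] += sc;
                score[j][i] += sc;
            }
        }
    }
}

/* Sum the per-source scores over every source. Each unordered pair of
   vertices is counted twice (once from each end), which does not change
   which edge has the highest score. */
void edge_betweenness(int n, int adj[MAXN][MAXN], double score[MAXN][MAXN])
{
    int i, j, s;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            score[i][j] = 0.0;
    for (s = 0; s < n; s++)
        add_scores_from(n, adj, s, score);
}
```

For two triangles joined by one bridge edge, the bridge gets a score of 18 (9 pairs of vertices on opposite sides, each counted twice) while an edge such as 0-1 inside a triangle gets only 2, so the bridge is the first edge to be removed.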

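It turns out that removing an edge only affects the betweenness of edges in the same component as the removed edge, so only those need to be recalculated. Networks with strong community structure break into separate components early in the run, which saves a lot of work later on. This does not help in every case, but for the examples in the paper it cuts the running time substantially.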
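
One way to exploit this, sketched in C under my own assumptions (symmetric 0/1 adjacency matrix; `reach`, `remove_edge` and `affected` are hypothetical names): before deleting an edge, mark the component it lives in, so that a later recalculation only needs to use the marked vertices as sources.

```c
#include <assert.h>

#define MAXN 100   /* illustrative upper bound on the number of vertices */

/* Mark every vertex reachable from start; returns how many were marked. */
static int reach(int n, int adj[MAXN][MAXN], int start, int *mark)
{
    int queue[MAXN];
    int head = 0, tail = 0, i, j;

    for (i = 0; i < n; i++)
        mark[i] = 0;
    mark[start] = 1;
    queue[tail++] = start;
    while (head < tail) {
        i = queue[head++];
        for (j = 0; j < n; j++) {
            if (adj[i][j] && !mark[j]) {
                mark[j] = 1;
                queue[tail++] = j;
            }
        }
    }
    return tail;
}

/* Remove the edge (u, v). affected[x] is set to 1 for every vertex x that
   was in the same component as the edge before removal, 0 otherwise.
   Returns 1 if the removal split that component in two, else 0. */
int remove_edge(int n, int adj[MAXN][MAXN], int u, int v, int *affected)
{
    int after[MAXN];

    assert(n >= 1 && n <= MAXN);
    assert(u >= 0 && u < n && v >= 0 && v < n);
    assert(adj[u][v] && adj[v][u]);

    reach(n, adj, u, affected);      /* component that contains the edge */
    adj[u][v] = 0;
    adj[v][u] = 0;
    reach(n, adj, u, after);         /* is v still reachable from u? */
    return !after[v];
}
```

A recalculation step would then run the per-source scoring only for vertices with `affected[x] == 1` and leave the scores of all other components untouched.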

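For a directed graph, shortest paths should in principle follow the direction of the edges. However, ignoring direction often gives better community results: in most cases an edge just indicates that two vertices are related, and its direction matters little in that sense. In a food web, for example, direction indicates which species eats which, yet the species are still separated well into communities when direction is ignored. Many other examples suggest that edge direction is not that important for community division.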
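
Ignoring direction amounts to symmetrizing the adjacency matrix before running the algorithm. A tiny sketch in C, assuming `adj[i][j] == 1` means a directed edge from i to j (the function name `symmetrize` and `MAXN` are mine):

```c
#include <assert.h>

#define MAXN 100   /* illustrative upper bound on the number of vertices */

/* Turn a directed 0/1 adjacency matrix into an undirected one in place:
   i and j become neighbors if there is an edge in either direction. */
void symmetrize(int n, int adj[MAXN][MAXN])
{
    int i, j;

    assert(n >= 0 && n <= MAXN);
    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {
            if (adj[i][j] || adj[j][i]) {
                adj[i][j] = 1;
                adj[j][i] = 1;
            }
        }
    }
}
```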

B. Resistor networks

To be continued.

For now, here are some links I plan to consult:

https://blog.csdn.net/marywbrown/article/details/62059231

https://blog.csdn.net/xuanyuansen/article/details/68941507

https://blog.csdn.net/u011604052/article/details/49685239

http://www.yalewoo.com/modularity_community_detection.html#comments

https://en.wikipedia.org/wiki/Modularity_(networks)
