In this paper, i will give a summary about the article A nonparametric view of network models and Newman – Girvan and other modularities written by Peter J. Bickela and Aiyou Chen \cite{nonp}.

The purpose is to identificate community structure in a network, where we look at networks as a set of vertices with connections among them, and vertices whitin a community have the same densities of connecting with vertices in their own community as well as different ones with other communities. The analysis is asymptotic, insofar the number of vertices go to infinity. The networks can be described by parametric models, the so called blockmodels, as follows.

There are unknown subcommunities. The connections among the vertices are represented by undirected edges. This gives us an undirected graph and the corresponding adjacency matrix with the following properties.

- Individuals independently belong to community with probability
- If an individual belongs to the community , then the probability of is in connection with the individual from the community is . This gives a symmetric matrix of probabilities.

** — Newman–Girvan and Likelihood Modularities — **

In order to find the communities we have to give an assignment for the vertices where . In the case of an observed graph is a realization of independent variables. To measure the adequacy of an assignment we can use the matrix

where . If we would get

where and . In the definition of the Newman-Girvan modularity we use the following quantities.

It can be seen, that is twice the number of edges among the members of the -th community. Let , so is the sum of degrees in the -th community. Let , so is twice the number of edges in the network. Now we can define the Newman-Girvan modularity as

The Newman-Girvan algorithm searches for the assignment vector that maximizes . To express this in a shorter form we define as

The other studied modularity is the likelihood modularity, which we get by maximizing the log-likelihood of over .

where if , , and .

In general we think of modularities in the form,

where , and is the matrix , and is a real valued function. We will give conditions under which the assignment is consistent. The consistency of a sequance of assignments means that

It can be seen that

where .

We assume that can be approximated with a function . We also assume first that . Let We need the following conditions.

- with is the right assignment for the limiting problem.
- does not have two identical columns, and
- The third condition contains technical conditions.

Suppose and satisfy 1-3 and . Suppose . Then,

It can be shown that the likelihood modularity is always consistent, but the Newman-Girvan is not.

** — Simulation — **

To perform an empirical comparison of Newman-Girvan, likelihood modularities and spectral clustering,where , random matrices were generated and the modularities were maximized on them. The percent of incorrect labeling was plotted depending on the average node degree.

\leavevmode \includegraphics[width=0.7\textwidth]{simu.png}

\caption{Empirical comparison} On figure 0 we can see that the likelihood method gives much less incorrect labeling than Newman-Girvan and spectral clustering.

** — Data example — **

The L-M and N-G modularity algorithms were also compared with applications to real data sets. One of them was Zachary’s Karate Club Network, which is often used to test community detection algorithms. In this network the edges mean friendship among the members of the karate club. Shortly after the observation of the friendships the club split into two disjoint parts as a result of internal agreement. This is marked whit a dashed line on the figure. The left side of figure 0 shows the result given by the likelihood algorithm, and the right side shows the result given by the N-G algorithm. The shapes of the nodes give the assignment for the vertices.

\leavevmode \includegraphics[width=0.8\textwidth]{karate.png}

\caption{Zachary’s karate club network}

{9} \bibitem{nonp} Peter J. Bickel, Aiyou Chen: *A nonparametric view of network models and Newman-Girvan and other modularities*, Proceedings of the National Academy of Sciences, Vol. 106, No. 50. (15 December 2009), pp. 21068-21073.

## Hozzászólás

Comments feed for this article