Abstract

Proceedings Abstracts of the Twenty-Fourth International Joint Conference on Artificial Intelligence

CEIL: A Scalable, Resolution Limit Free Approach for Detecting Communities in Large Networks / 2097
Vishnu Sankar, Balaraman Ravindran, Shivashankar S

Real-world networks typically exhibit non-uniform edge densities, with a higher concentration of edges within modules or communities. Various scoring functions have been proposed to quantify the quality of such communities. In this paper, we argue that the popular scoring functions suffer from certain limitations. We identify the necessary features that a scoring function should incorporate in order to characterize good community structure and propose a new scoring function, CEIL (Community detection using External and Internal scores in Large networks), which conforms closely to our characterization. We also demonstrate experimentally the superiority of our scoring function over the existing scoring functions. Modularity, a very popular scoring function, exhibits a resolution limit, i.e., one cannot find communities that are much smaller than the network itself. In many real-world networks, community size does not grow in proportion to the network size. This implies that the resolution limit is a serious problem in large networks. Modularity nevertheless remains very popular since it offers many advantages, such as fast algorithms for maximizing the score and non-trivial community structures corresponding to the maxima. We show analytically that the CEIL score does not suffer from the resolution limit. We also modify the Louvain method, one of the fastest greedy algorithms for maximizing modularity, to maximize the CEIL score. We show that our algorithm gives the expected communities in synthetic networks, whereas maximizing modularity does not. We also show that the community labels given by our algorithm match closely with the ground-truth community labels in real-world networks. Our algorithm is on par with the Louvain method in computation time and hence scales well to large networks.
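
The resolution limit of modularity described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the ring-of-cliques test graph, the function names, and the parameters (30 cliques of 5 nodes) are assumptions chosen for illustration, and the code uses the standard Newman-Girvan modularity formula. It does not implement the CEIL score, whose definition is not given in the abstract.

```python
from itertools import combinations


def ring_of_cliques(num_cliques, clique_size):
    """Hypothetical test graph: cliques arranged in a ring, with one edge
    joining each clique to the next one."""
    edges = []
    for c in range(num_cliques):
        base = c * clique_size
        # All edges inside clique c.
        edges.extend(combinations(range(base, base + clique_size), 2))
        # A single edge to the first node of the next clique in the ring.
        next_base = ((c + 1) % num_cliques) * clique_size
        edges.append((base, next_base))
    return edges


def modularity(edges, community_of):
    """Newman-Girvan modularity: sum over communities of
    (internal edges / m) - (total degree / 2m)^2."""
    m = len(edges)
    internal = {}  # community -> number of edges with both ends inside it
    degree = {}    # community -> sum of degrees of its nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree.items()
    )


num_cliques, clique_size = 30, 5  # assumed sizes, for illustration only
edges = ring_of_cliques(num_cliques, clique_size)
num_nodes = num_cliques * clique_size

# Partition A: every clique is its own community (the intuitive answer).
one_per_clique = {v: v // clique_size for v in range(num_nodes)}
# Partition B: adjacent cliques are merged in pairs.
merged_pairs = {v: (v // clique_size) // 2 for v in range(num_nodes)}

q_single = modularity(edges, one_per_clique)
q_pairs = modularity(edges, merged_pairs)
print(f"one community per clique: Q = {q_single:.4f}")
print(f"cliques merged in pairs:  Q = {q_pairs:.4f}")
```

In this setting the merged partition scores higher (about 0.888 versus 0.876), so a method that maximizes modularity prefers to merge the cliques even though each clique is the natural community.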