Differentially Private Community Detection over Stochastic Block Models
We are happy to co-host the talk by Prof. Ravi Tandon as a part of Trust Lab Colloquium.
Community detection is a fundamental problem arising in graph mining and machine learning, with numerous interesting applications in social networks, image segmentation and biological networks. The goal of community detection over graphs is to recover underlying labels/attributes of users (e.g., political affiliation) given the connectivity between users (represented by adjacency matrix of the graph). There has been significant recent progress on understanding the fundamental limits of community detection when the graph is generated from a stochastic block model (SBM). Specifically, sharp information theoretic limits and efficient algorithms have been obtained for SBMs as a function of p and q, which represent the intra-community and inter-community connection probabilities. In this talk, we will give an overview of our recent work on differentially private community detection. In many problem scenarios, one may wishto perform clustering/community detection while protecting privacy of sensitive information (such as connectivity between two users which could be sensitive information). SBMs provide a fertile ground to design and study such privacy preserving mechanisms while facilitating utility analysis. Focusing on the notion of edge-differential privacy (edge-DP), we seek to understand the fundamental tradeoffs between SBMs network connectivity (p, q), privacy budget and computational efficiency for the recovery of community labels. To this end, I will discuss and present the associated information-theoretic tradeoffs for three broad classes of differentially private community recovery mechanisms: a) stability based mechanism; b) sampling based mechanisms; and c) graph perturbation mechanisms. Our main findings are that stability and sampling based mechanisms lead to a superior privacy-recovery tradeoff however this comes at the expense of higher computational complexity. On the other hand, albeit low complexity, graph perturbation mechanisms require the privacy budget to scale logarithmically with the graph size for exact recovery. I will conclude with several open problems and challenges in this topic. This talk is based on joint work with Mohamed Seif, Dung Nguyen and Anil Vullikanti and was presented at ICML.
Ravi Tandon is the Litton Industries John M. Leonis Distinguished Associate Professor in the Department of ECE at the University of Arizona. He received the B.Tech. degree in Electrical Engineering from IIT Kanpur in 2004 and the Ph.D. degree in Electrical and Computer Engineering from the University of Maryland, College Park (UMCP) in 2010. >From 2010 to 2012, he was a post-doctoral research associate at Princeton University. Prior to joining the University of Arizona in Fall 2015, he was a Research Assistant Professor at Virginia Tech with positions in the Bradley Department of ECE, Hume Center for National Security and Technology and at the Discovery Analytics Center in the Department of Computer Science. He is a recipient of the 2018 Keysight Early Career Professor Award, NSF CAREER Award in 2017, and a Best Paper Award at IEEE GLOBECOM 2011. He is a Senior Member of IEEE and currently serves as an Editor for IEEE Transactions on Information Theory, IEEE Transactions on Wireless Communications, and IEEE Transactions on Communications. His current research interests include information theory and its applications to wireless networks, signal processing, communications, security and privacy, machine learning and data mining.