Nicolás Kim


About me

I am a Ph.D. candidate in the Department of Statistics & Data Science at Carnegie Mellon University. Previously, I was at Boston University, where I graduated with a B.A. in Mathematics.

Alessandro Rinaldo is my advisor. My current research interests are elements of:
Statistics ⋂ (Networks ⋃ Algebra).

My email is [AndrewID]
My AndrewID is nicolask.



  5. Nicolas Kim, Alessandro Rinaldo (2017). Edge-Induced Sampling from Graphons. Preprint. [pdf].
  4. Nicolas Kim, Alessandro Rinaldo (2017). Community Detection on Ego Networks via Mutual Friend Counts. Preprint. [pdf].
  3. Jin Cao, Sining Chen, Sean Kennedy, Nicolas Kim, Lisa Zhang (2017). Extracting Mobile User Behavioral Similarity via Cell-Level Location Trace. 20th IEEE Global Internet Symposium (GI 2017).
  2. Nicolas Kim*, Dane Wilburne*, Sonja Petrović, Alessandro Rinaldo (2016). On the Geometry and Extremal Properties of the Edge-Degeneracy Model. Third SDM Workshop on Mining Networks and Graphs. [arXiv].
  1. Nicolas Kim (2016). The Effect of Data Swapping on Analyses of American Community Survey Data. Journal of Privacy and Confidentiality: Vol. 7: Iss. 1, Article 3. [Published] [arXiv] [Code].

Social networks

Global community detection for giant social networks is intractable. In recent work, we proposed an algorithm that solves a local version of the community detection problem, and also established statistical guarantees for this procedure. (Publication 4)

Extending this idea, we described a theoretical framework for sampling via connectivity for a general class of random graph models. (Publication 5)

Algebraic statistics

I wrote some slides to introduce the ideas of Diaconis and Sturmfels (1998) along with some basic computational algebraic geometry. These are intended for a 30-minute talk.
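
To give a flavor of the Diaconis and Sturmfels idea, here is a minimal sketch of a random walk over two-way contingency tables with fixed row and column sums, using the "basic moves" that form a Markov basis for the independence model. The function name `basic_move_walk` and the example table are hypothetical, chosen only for illustration. The walk as written targets the uniform distribution over such tables; targeting the hypergeometric distribution used in exact tests would need an additional Metropolis acceptance step, which is omitted here.

```python
import random


def basic_move_walk(table, steps, seed=0):
    """Random walk over nonnegative integer tables that share the
    row sums and column sums of `table` (needs at least 2 rows and 2 columns)."""
    rng = random.Random(seed)
    t = [row[:] for row in table]  # work on a copy, leave the input untouched
    n_rows, n_cols = len(t), len(t[0])
    for _ in range(steps):
        # Pick two distinct rows and two distinct columns (order matters,
        # so both signs of each move are proposed with equal probability).
        i, k = rng.sample(range(n_rows), 2)
        j, l = rng.sample(range(n_cols), 2)
        # Basic move: +1 at (i, j) and (k, l), -1 at (i, l) and (k, j).
        # Every row sum and column sum is unchanged by this move.
        if t[i][l] > 0 and t[k][j] > 0:
            t[i][j] += 1
            t[k][l] += 1
            t[i][l] -= 1
            t[k][j] -= 1
        # Otherwise the move would create a negative cell, so the walk stays put.
    return t


start = [[3, 1, 2], [0, 4, 1], [2, 2, 5]]
sampled = basic_move_walk(start, steps=1000, seed=1)
print(sampled)
```

The point of the Markov basis is connectivity: any table with these margins can be reached from any other through such moves without leaving the set of nonnegative tables.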

For those interested in learning more about this new field (which is at the intersection of statistics and algebraic geometry), I recommend the book "Lectures on Algebraic Statistics" by Drton, Sturmfels, and Sullivant.

We characterized the extremal asymptotics of a particular exponential random graph model (ERGM), the edge-degeneracy ERGM. (Publication 2)

Data privacy

I am interested in improving the quality of publicly released Census data for the Decennial Census as well as non-census surveys. In particular, I analyzed the statistical properties of a class of privacy-inducing algorithms. (Publication 1)
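
As a toy illustration of data swapping, the technique named in the title of the American Community Survey paper, the sketch below exchanges one attribute between randomly paired records. It is a simplified assumption-laden version, not the procedure used by the Census Bureau or the one analyzed in the paper: real swapping schemes typically match records on key variables before swapping. The function name `swap_attribute`, the field names, and the swap rate are all hypothetical.

```python
import random


def swap_attribute(records, key, rate, seed=0):
    """Return a copy of `records` in which roughly a fraction `rate` of the
    records have had their `key` value exchanged with a random partner."""
    rng = random.Random(seed)
    out = [dict(r) for r in records]  # copy so the input stays untouched
    n_pairs = int(rate * len(out) / 2)
    # Draw 2 * n_pairs distinct record indices and pair them up.
    chosen = rng.sample(range(len(out)), 2 * n_pairs)
    for a, b in zip(chosen[::2], chosen[1::2]):
        out[a][key], out[b][key] = out[b][key], out[a][key]
    return out


# Made-up records purely for demonstration.
records = [{"id": i, "tract": i % 4, "income": 10 * i} for i in range(20)]
swapped = swap_attribute(records, key="tract", rate=0.5, seed=3)
```

Swapping leaves the one-way counts of the swapped attribute unchanged, but it can distort its joint distribution with the other attributes, which is why the effect on downstream analyses is worth studying.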

I gave a talk in Washington, D.C., at the NSF-Census Research Network Fall 2015 meeting; here are the slides.

This research was partially funded by Stephen E. Fienberg and William F. Eddy through NSF grant SES 1130706.


Meta-analysis and biostatistics

I was a statistics research intern at Lilly during the summer of 2017. There, I developed a Bayesian model for predicting which diseases a new drug may be effective in treating. In simultaneously modeling various diseases, this work extends existing meta-analytic methodology for predicting the outcome of untested comparisons between drugs for a single disease.

Spatiotemporal modeling

I was a statistics research intern at Bell Labs during the summer of 2016. There, I modeled routines in human location data.

The CMU wordmark as a scatterplot of data, clustered with DBSCAN

Code repositories

My directory listing where you can download code, resources, etc.

My GitHub profile.

Learn You a Data Analysis!

Check out my new project, Learn You a Data Analysis!

CMU Statistics T-shirt

I wrote the code to generate the Department of Statistics' T-shirt: data-ify your logo or wordmark!
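
For readers unfamiliar with DBSCAN, the clustering method mentioned in the wordmark caption, here is a minimal pure-Python sketch of the algorithm on 2-D points. It is not the T-shirt code; the function name `dbscan`, the parameters `eps` and `min_pts`, and the sample points are assumptions made for illustration, and the quadratic-time neighbor search is only suitable for small inputs.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""

    def neighbors(i):
        # All points within distance eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1  # point i is a core point, so it starts a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue  # already assigned, nothing more to do
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is also a core point, so keep expanding
    return labels


# Two small clumps and one isolated point, purely for demonstration.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
```

With these made-up points, each clump forms its own cluster and the isolated point is labeled as noise.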

My résumé, LinkedIn, blog, and ORCID.

I am a member of the Networkshop research group at CMU.

Updated August 2017