Tim LaRock

I am currently a postdoctoral research associate at Princeton University working in the Complex Infrastructure Systems group with Professor Jürgen Hackl. Previously, I was a postdoctoral research associate at the Oxford Mathematical Institute with Professor Renaud Lambiotte where I developed methods for modeling multi-way interaction networks. In December 2021 I earned my PhD in Network Science at the Northeastern University Network Science Institute, where I was advised by Professor Tina Eliassi-Rad and specialized in the analysis of higher-order sequential correlations in network data. Prior to joining the Institute in 2016, I completed a BS in Computer Science and Applied Mathematics with a minor in Philosophy at the State University of New York at Albany, where I conducted research on load balancing in cellular networks and unsupervised transmitter detection in wireless frequency spectrum data, under the supervision of Prof. Petko Bogdanov and Prof. Mariya Zheleva.

You can find a copy of my CV here and my Google Scholar page here. I have also written for a general audience on science, politics, and more; you can find clips on my Writing page.

Research

My research falls at the intersection of network science, data mining and machine learning. In particular, my work seeks to identify and understand “higher-order” patterns in network data. This includes both the structure of group interactions (usually modeled with hypergraphs) and sequential patterns and dependencies in network data, such as passengers moving through public transit systems, goods moving through logistics networks, or users navigating the Web. I am also interested in evolutionary game theory, in particular the co-evolution of group interaction networks with evolutionary game dynamics, as a way to understand both the dynamic and structural features of important phenomena, especially altruistic cooperation. Finally, some of my early work used machine learning to improve partially observed (sampled) network data, and I remain interested in the limitations of machine learning for solving network problems.

Projects

Non-uniqueness of Node Co-occurrence Matrices of Hypergraphs

Hypergraphs extend traditional networks by capturing multi-way or group interactions. Given the complexity of hypergraph data and the wide range of methodology available for pairwise network analysis, hypergraph data is often projected onto a weighted and undirected network. The simplest of these projections, often referred to as a node co-occurrence matrix, is known to be non-unique, as distinct non-isomorphic hypergraphs can produce the same weighted adjacency matrix. This non-uniqueness raises important questions about the structural information lost during the projection and how to efficiently quantify the complexity of the original hypergraph. Here we develop a search algorithm to identify all hypergraphs corresponding to a given projection, analyze its runtime, and explore its parallelisability. Applying this algorithm to projections derived from a random hypergraph model, we characterize conditions under which projections are non-unique. Our findings provide a new framework and set of computational tools to investigate projections of hypergraphs.
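
As a minimal illustration of the non-uniqueness described above (not the search algorithm from the project), the sketch below builds a node co-occurrence projection for two small hypergraphs. The function name `co_occurrence` and the choice to store only off-diagonal pair counts are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(hyperedges):
    """Count, for each unordered node pair, how many hyperedges contain both nodes."""
    weights = Counter()
    for edge in hyperedges:
        # Sort so that each pair is always stored as (smaller, larger).
        for u, v in combinations(sorted(set(edge)), 2):
            weights[(u, v)] += 1
    return weights


# Two non-isomorphic hypergraphs on the same three nodes:
# one three-way interaction versus three pairwise interactions.
one_group = [{1, 2, 3}]
three_pairs = [{1, 2}, {1, 3}, {2, 3}]

same_projection = co_occurrence(one_group) == co_occurrence(three_pairs)
```

Here `same_projection` is true: both hypergraphs project to a triangle in which every pair has weight 1, so the projection alone cannot tell them apart.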

Encapsulation Structure and Campfire Dynamics in Hypergraphs

We explore the properties of real-world hypergraphs, focusing on the encapsulation of their hyperedges, which is the extent to which smaller hyperedges are subsets of larger hyperedges. Building on the concept of line graphs, our measures quantify the relations existing between hyperedges of different sizes and, as a byproduct, the compatibility of the data with a simplicial complex representation, whose encapsulation would be maximal. We then turn to the impact of the observed structural patterns on diffusive dynamics, focusing on a variant of threshold models, called encapsulation dynamics, and demonstrate that non-random patterns can accelerate the spreading in the system.
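
To make the idea of encapsulation concrete, here is a simplified sketch. It is not the line-graph-based measure from the project; it only computes, for each hyperedge, the fraction of its proper subsets with at least two nodes that also appear as hyperedges. The function name `encapsulation_fraction` and the decision to ignore single-node subsets are assumptions for this example.

```python
def encapsulation_fraction(hyperedges):
    """Map each hyperedge to the fraction of its proper subsets
    (with at least two nodes) that are also observed hyperedges."""
    edges = {frozenset(e) for e in hyperedges}
    result = {}
    for e in edges:
        k = len(e)
        # All subsets (2**k) minus the empty set, the k singletons, and e itself.
        possible = 2 ** k - k - 2
        if possible <= 0:
            continue  # hyperedges of size 2 or less have no such subsets
        # f < e is the proper-subset test for frozensets.
        observed = sum(1 for f in edges if len(f) >= 2 and f < e)
        result[e] = observed / possible
    return result


partial = encapsulation_fraction([{1, 2, 3}, {1, 2}, {1, 3}, {4, 5, 6}])
full = encapsulation_fraction([{1, 2, 3}, {1, 2}, {1, 3}, {2, 3}])
```

In `partial`, the hyperedge {1, 2, 3} contains two of its three possible pairs and {4, 5, 6} contains none. In `full`, every pair is present, which is what a simplicial complex would require under this simplified definition.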

Understanding Higher Order Correlations in Pathway Data

Data representing pathways or sequences of nodes traversed in a network, such as people moving through a public transit system or navigating hyperlinks on the Web, is commonly studied in Network Science. Traditionally, network scientists studied such data by aggregating it into weighted networks, destroying sequential or temporal correlations in the process. More recently, researchers have begun to dig into these temporal correlations to understand mechanisms of pathway generation and how this generation impacts network structure. I am interested in studying “higher order networks” (specifically De Bruijn graph representations) to better understand pathway data on its own terms. I am also interested in connecting the sequential pattern mining literature, developed in large part by the computer science/data mining community, with perspectives and approaches developed more recently by network scientists.

Sequential Motifs from Pathway Data

We use De Bruijn graphs to extend the concept of motifs as building blocks of complex networks to pathway data, studying sequential motifs. We use the fact that a weighted edge in a kth-order De Bruijn graph represents the frequency of a path of length k through a network, and that these edges can be mapped into a common motif space. We show that analyzing motifs based on pathway data using traditional static-network techniques can be misleading if the static structure encodes patterns that are possible based on the structure alone, but do not actually appear in the pathway data. Beyond counting, we can also compare the overall frequency of motif structures with their frequency after applying HYPA, a null model for De Bruijn graphs that identifies paths observed significantly more or less often than expected. This analysis provides insight into the mesoscale navigation patterns that drive microscale interactions between nodes.
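
The relationship between edge weights and path frequencies can be sketched in a few lines. This is an illustrative construction, assuming the common convention that a node of the kth-order graph is a tuple of k consecutive original nodes; the function name `de_bruijn_edges` is hypothetical, and the sketch does not include motif mapping or the HYPA null model.

```python
from collections import Counter


def de_bruijn_edges(paths, k):
    """Return weighted edges of a kth-order De Bruijn graph built from paths.

    Each edge joins two k-tuples that overlap in k - 1 nodes, so it
    corresponds to one observed path with k steps (k + 1 nodes).
    """
    weights = Counter()
    for path in paths:
        # Slide a window of k + 1 nodes along the path.
        for i in range(len(path) - k):
            window = tuple(path[i:i + k + 1])
            weights[(window[:-1], window[1:])] += 1
    return weights


paths = [["a", "b", "c"], ["a", "b", "c"], ["a", "b", "d"]]
first_order = de_bruijn_edges(paths, 1)
second_order = de_bruijn_edges(paths, 2)
```

In `second_order`, the edge from ("a", "b") to ("b", "c") has weight 2 and the edge from ("a", "b") to ("b", "d") has weight 1, which are exactly the counts of the two-step paths a-b-c and a-b-d in the data.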

Higher-order Analysis of Global Shipping Network Data

We build on previous work studying global liner shipping service route data, contributing a path-centric approach that takes advantage of sequential information in the shipping routes. In place of the shortest paths that were central to previous work, we focus on minimum-route paths, or paths that use the minimum number of transfers between shipping routes. We find that previous work overestimated the role of nodes and edges through the “structural core” of the network as defined by that work.
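
The difference between shortest paths and minimum-route paths can be sketched with a breadth-first search over routes rather than over port-to-port hops. This is a simplified illustration, not the project's implementation: it assumes each route is a list of ports and that any port on a route can reach any other port on the same route (ignoring direction). The route names, port names, and the function `min_routes` are all hypothetical.

```python
from collections import deque


def min_routes(routes, source, target):
    """Return the minimum number of routes needed to travel from source
    to target, or None if target is unreachable. Transfers = routes - 1."""
    if source == target:
        return 0
    port_to_routes = {}
    for name, ports in routes.items():
        for port in ports:
            port_to_routes.setdefault(port, []).append(name)

    seen_ports = {source}
    seen_routes = set()
    queue = deque([(source, 0)])
    while queue:
        port, used = queue.popleft()
        for name in port_to_routes.get(port, []):
            if name in seen_routes:
                continue  # boarding a route twice never helps
            seen_routes.add(name)
            for nxt in routes[name]:
                if nxt == target:
                    return used + 1
                if nxt not in seen_ports:
                    seen_ports.add(nxt)
                    queue.append((nxt, used + 1))
    return None


routes = {
    "R1": ["A", "B", "C", "D", "E"],  # one long service
    "R2": ["A", "X"],
    "R3": ["X", "E"],
}
a_to_e = min_routes(routes, "A", "E")
b_to_x = min_routes(routes, "B", "X")
```

In this toy network the path A, X, E has the fewest hops but needs two routes, while staying on R1 reaches E with a single route and no transfer, so `a_to_e` is 1.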

Resampling Partially Observed Network Data

In network science, we often deal with partially observed data, such as sampled interactions on social media gathered from Twitter. In many circumstances, we have some resource-limited ability to resample the data, for example by accessing an API. In our work, we develop methods for the following scenario: you are given a sample of a larger network, the ability to query nodes in the sample to learn more accurate information about them (such as their true neighborhood or attribute labels), and a function that provides a mathematical reward given the outcome of a query. The goal of our methods is to learn to predict which nodes one should query to maximize reward in their sample.
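
The scenario above can be sketched as a query loop. The example below is only a fixed baseline, not the learned methods from the project: it always queries the unqueried node with the highest observed degree, and it assumes the reward is the number of previously unseen nodes a query reveals. The names `query_loop` and `oracle`, and the toy network, are hypothetical.

```python
def query_loop(sample_edges, oracle, budget):
    """Spend `budget` queries on a sampled network; return (queried nodes, total reward)."""
    observed = {}
    for u, v in sample_edges:
        observed.setdefault(u, set()).add(v)
        observed.setdefault(v, set()).add(u)

    queried = []
    total_reward = 0
    for _ in range(budget):
        candidates = sorted(u for u in observed if u not in queried)
        if not candidates:
            break
        # Baseline heuristic: highest observed degree, ties broken by name.
        node = max(candidates, key=lambda u: len(observed[u]))
        known_before = set(observed)
        for v in oracle(node):  # the query reveals the true neighborhood
            observed[node].add(v)
            observed.setdefault(v, set()).add(node)
        total_reward += len(set(observed) - known_before)
        queried.append(node)
    return queried, total_reward


# Hypothetical full network, hidden behind the oracle.
true_neighbors = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "f"},
    "c": {"a"}, "d": {"a"}, "e": {"a"}, "f": {"b"},
}
queried, reward = query_loop([("a", "b"), ("a", "c")], true_neighbors.__getitem__, budget=2)
```

With a budget of two queries, this baseline queries "a" (revealing "d" and "e") and then "b" (revealing "f"), for a total reward of 3. A learned approach would replace the degree heuristic with a model that predicts the reward of each candidate query.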

Human Mobility and Physical Distancing during the COVID-19 Pandemic

Collective physical distancing has been one of the most important tools used to slow the spread of COVID-19. In a collaboration with the MOBS Lab at Northeastern (among many others), we analyzed mobility in the United States via large-scale anonymized cell phone GPS data.