I hold a joint position in the Department of Linguistics and the Program in Indo-European Studies at UCLA, along with a courtesy appointment in the Department of Classics. My research centers on two main areas. One line of work applies computational phylogenetics and statistical modeling to the study of language change and sociocultural evolution. The other explores how language change informs linguistic theory, with a focus on syntactic and semantic change in Indo-European.
From December 2022 through July 2023, I was a Visiting Fellow at Clare Hall, University of Cambridge, and during Easter Term 2023 I served as a Lewis-Gibson Fellow at the Cambridge Centre for Greek Studies. For 2025-26, I will be a Fellow at the Swedish Collegium for Advanced Study (SCAS).
I am honored to be a member of the 2021 cohort of Guggenheim Fellows.
Download my CV.
Ph.D., 2010
University of California, Berkeley
M.A., 2004
University of California, Berkeley
M.Phil., 2002
University of Oxford, Corpus Christi College
B.A., 2000
Amherst College
The relationship between inflectional case marking and the emergence of definite and indefinite articles has been widely invoked but rarely tested quantitatively. This study provides a large-scale statistical evaluation across 94 Indo-European languages. It uses Bayesian logistic regression models that account for phylogenetic and spatiotemporal autocorrelation and integrate over phylogenetic uncertainty. The results establish a robust inverse association between case-inventory size and the presence of articles. Estimated probabilities indicate that lower case-inventory sizes are associated with higher probabilities of definite articles and, conditional on their presence, of indefinite articles. The posterior median predicted probability exceeds 0.5 at four cases for definite articles and between three and four cases for indefinite articles. These thresholds correspond to reduced case inventories composed primarily of grammatical cases. The presence of a definite article further increases the probability of an indefinite article independent of case-inventory size. The analysis shows that the observed association is compatible with direct, mediated, and common-cause accounts and does not uniquely identify a causal pathway. The study sharpens the empirical basis of the debate and identifies the causal and evolutionary questions that remain open.
Linguistic phylogenies are commonly inferred from abstract cognate classifications that encode relationships among lexemes. Although widespread, this practice has well-recognized limitations: it discards the phylogenetic signal contained in segmental word forms; restricts the range of evolutionary questions that can be addressed; and treats cognacy judgments, which are hypotheses, as observed data. We introduce a comparative framework that addresses these limitations by modeling the evolution of aligned cognate word forms directly. Our approach adapts the TKF91 model of molecular evolution, originally developed to account for insertion and deletion events in DNA sequences, to the domain of linguistic data. By operating on segmental strings rather than abstract character codings, the framework enables phylogenetic inference from observable word forms and supports quantitative investigation of sound change. We demonstrate its utility through analyses that illuminate patterns of segmental stability and the evolution of phonological inventories.
Divergence-time estimation is one of the most important endeavors in historical linguistics. Its importance is matched only by its difficulty. As Bayesian methods of divergence-time estimation have become more common over the past two decades, a number of critical issues have come to the fore, including model sensitivity, the dependence of root-age estimates on uncertain interior-node ages, and the relationship between ancient languages and their modern counterparts. This study addresses these issues in an investigation of a particularly fraught case within Indo-European, the diversification of Latin into the Romance languages. The results support a gradualist account in which the formation of the Romance languages most likely begins after 300 CE. They also bolster the view that Classical Latin is a sampled ancestor of the Romance languages (i.e., it lies along the branch leading to the Romance languages).