Ravindranath Abtahian, Maya; Cohn, Abigail C.; Pepinsky, Thomas
This article introduces a quantitative approach to modeling language shift in communities with millions of speakers. Using Indonesia as a case study and drawing on a large body of data from the Indonesian population census, we document how factors such as urbanization, ethnicity, economic development, gender, and religion correlate with the shift from local languages (such as Javanese and Sundanese) to the national language, Bahasa Indonesia. Our findings inform ongoing research on the sociological foundations of language shift in both small and large communities. Methodologically, we introduce a statistical approach that borrows from other social sciences and show how to exploit large, previously untapped bodies of linguistic, demographic, and sociological data.