Innovation and Decolonization: Decolonizing Cutting-Edge Machine Learning and Natural Language Processing Advancements for Non-Western Contexts

Rapid innovation within machine learning (ML) and natural language processing (NLP) has revolutionized various domains, including the social sciences. However, these methods often remain Eurocentric, presenting significant challenges when applied to non-European languages and contexts. Addressing this gap requires urgent efforts to decolonize ML and NLP: reconfiguring these technologies to serve diverse cultural and linguistic landscapes more equitably.

This talk highlights the importance of adapting ML and NLP innovations to non-Western contexts, focusing on their social impacts. Particular emphasis is placed on Chinese-language contexts (Chinese has more native speakers than any other language in the world), where the integration of these technologies has led to the emergence of a cutting-edge interdisciplinary subfield: Chinese computational sociology. This evolving field resides at the exciting nexus of China studies, computer science, and sociology, offering unique insights into societal issues in Chinese-speaking regions and diasporas.

A critical review of extant literature underscores the unique challenges associated with applying ML and NLP to Chinese-language data, including linguistic complexity, contextual nuance, and limitations in algorithmic design. This talk introduces potential novel solutions to overcome these barriers, paving the way for more inclusive and culturally sensitive applications of machine learning.

Chinese computational sociology has the potential to transform the relationship between technology and social impact, fostering meaningful innovations that address the needs of non-Western contexts. In an age of rapid tech innovation, it is essential that foundational ML and NLP algorithms be just, inclusive, and representative.

Linda Hong Cheng is an award-winning researcher and public speaker, a Computational Sociology PhD researcher and Clarendon Scholar at Oxford, and a WEAI Fellow at Columbia University. Linda is the Founding Director of Girlpane, an art-tech startup dedicated to uplifting women artists and revolutionizing creative spaces through groundbreaking exhibitions and summits at the cutting edge of art, tech, academia, and business, and the Co-Founding CEO of Mung!, the world’s first AI-driven AgeTech startup dedicated to making digital tech inclusive of all ages.

At Oxford, Linda’s research grapples with the contentious relationship between gender and tech, focusing on global digital gaps, social demographic trends, and contentious politics using novel computational methods (e.g., machine learning, agent-based modeling, NLP). Her dissertation, generously supported by the Gates Foundation-funded Digital Gender Gaps project, establishes ‘digital gender circularity’: the bidirectional relationship between increasing digital gender equality and offline gender equality, which encompasses educational equality, economic empowerment, political representation, and health outcomes.

Linda’s talk is based on her chapter for the Oxford Handbook of the Sociology of Machine Learning, to which she was the youngest scholar invited to contribute. Her work takes a critical decolonial approach to natural language processing (NLP) as both conceptual framework and practical toolkit, pointing out its inherent Eurocentrism and Anglocentrism. Establishing a new subfield situated at the intersection of sociology, computational methods, and China studies, which she terms ‘Chinese computational sociology’, Linda suggests new and exciting avenues for the incorporation of non-European languages, particularly Chinese, into NLP frameworks. You can find her on LinkedIn and Instagram @ Linda Hong Cheng.