Ethics in AI Lunchtime Research Seminar: Governing AI Datastructures

Professor Julie E Cohen: Both in the U.S. and in Europe, initiatives for AI governance have focused principally on identifying and mitigating the risks created by AI models and their downstream uses rather than on those created by the datasets on which the models are trained. However, some of the most intractable dysfunctions of generative AI systems involve datasets. In particular, the very large datasets amassed by dominant providers of generative AI and related services are rapidly taking on infrastructural characteristics and importance. Effective AI governance therefore requires an infrastructural turn in thinking about data. After explaining the significance of the infrastructure lens, the lecture will sketch some of the distinctive implications of data infrastructures in particular for the governance of networked digital processes and the social and economic activities that they facilitate. Next, it will explore two interrelated problems manifesting within generative AI systems, simulation and sociopathy, that illustrate the extent to which the project of AI governance is, unavoidably, a data governance project. In brief, generative AI models trained on mass content from the open internet are also trained on data infrastructures that have been developed for behaviorist, extractive purposes and that encourage the production and spread of particular kinds of content and particular styles of communication. Finally, the lecture will outline some needed directions for an infrastructural turn in AI governance.