Recently, the National Science Foundation (NSF) together with the Office of Science and Technology Policy (OSTP) issued a Request for Information to inform the development of federal government infrastructure to support research with AI (see here).
What does this request have to do with archaeology in general, and AAI/Open Context programs in particular?
For the most part, our organization has had little direct involvement in artificial intelligence (AI) projects inside or outside of archaeology. We mainly focus on issues related to the publication and reuse of archaeological data. But AI can be used with archaeological data, so it needs to be on our radar.
Before we explore this topic further, we should clarify what we mean by “AI”. Wikipedia says:
…Leading AI textbooks define the field as the study of “intelligent agents”: any system that perceives its environment and takes actions that maximize its chance of achieving its goals. Some popular accounts use the term “artificial intelligence” to describe machines that mimic “cognitive” functions that humans associate with the human mind, such as “learning” and “problem solving”.
Various “machine learning” (ML) technologies fall under the AI umbrella, and machine learning is increasingly a part of archaeology’s data analysis toolkit. AI technologies are also deployed in computer games and simulations, including archaeology-themed games (see Meghan’s presentation).
We see the NSF/OSTP request for information as an important opportunity to advocate for computing infrastructure whose governance promotes equity and ethics. AI is often deployed on private computing platforms to monitor and manipulate people. The so-called “invisible hand” of the marketplace is increasingly manipulated by the invisible machine-minds of big tech monopolies with little public accountability or scrutiny. We should take this opportunity to articulate how an AI infrastructure managed on behalf of the public can better align with public interests.
Some issues to consider:
- Environmental impacts: Some approaches to machine learning consume vast computing resources, which translates to energy consumption that further exacerbates climate change. To be environmentally justified, machine learning projects should lead to outcomes that overall reduce energy and other environmental costs.
- Equitable participation and benefits: AI and machine learning can be powerful tools to advance the agendas of certain organizations and individuals. As such, they can (and do) further entrench existing inequalities in wealth and power. A public AI and machine learning infrastructure needs governance and leadership inclusive of underrepresented communities.
- Reduced risks: AI and machine learning technologies are powerful tools and can lead to a host of negative outcomes if deployed without careful consideration of risks. Projects that propose to use public AI infrastructure need something like an IRB (Institutional Review Board) process to demonstrate how they will reduce risks that would harm people, especially people from more vulnerable populations.
- Transparency, reproducibility: AI and machine learning have well-known “black box” concerns. Biased and prejudicially collected data often inform AI and machine learning algorithms in ways that reinforce and further entrench systemic biases and prejudices. All aspects of AI development, including selection and use of training datasets, data analysis steps and assumptions, software, and the like, need to be made transparent and evaluated in open and inclusive forums.
- Stakeholder-driven agendas: Public AI infrastructure should seek the guidance and leadership of wider and more inclusively represented stakeholders to shape research agendas, risk evaluation processes, and benefit sharing processes. More inclusively shaped research agendas will not only promote more ethical use of AI and machine learning technologies; they will also help build stronger foundations of trust, understanding, and public support for technologies that serve the interests of broader publics.
Again, Open Context does not currently use any AI technologies, and we have no special expertise about the risks and opportunities of AI. However, as a publisher of open data about the human past, we have a responsibility to advocate for the ethically responsible use of these and other research datasets, including their use in AI applications. In general, the research community needs to devote more attention to the governance of the information infrastructure that powers its work. Publishing services, libraries, archives, social media, search engines, and more make up that infrastructure. We should use this opportunity to voice how AI infrastructure can and should be governed, and also to reimagine how other infrastructures could be run with greater fairness and equity.