April 19, 2017
This is a cross-posting from an April 17, 2017 post on Heritage Bytes, the Open Context blog:
For the week of April 17-21, we’re joining a large community-wide effort to raise greater awareness of “endangered data”. In light of all of the other crises in the world, highlighting endangered data may seem silly. After all, given the daily news onslaught of increasing authoritarianism, kleptocracy, war, bigotry, poverty and environmental problems, the fate of abstract electronic databases seems low on the priority list.
However, we argue that safeguarding data represents a need to safeguard our civil liberties, civil society, future environment, and broader understanding of our world. This last point is key. Data are often integral to how we try to understand the world.
As authoritarianism takes hold, data become increasingly politicized and precarious. Authoritarians attempt to dictate what is and is not true. Truth must conform to the needs of vested interests or ideologies or it will be suppressed. The current administration’s assault on climate science is a stunning attack on an “Inconvenient Truth” (so aptly named by Al Gore). Beyond climate science, researchers create data key to understanding social, historical, and governance issues. Like climate science, better understanding in these other domains can threaten powerful and entrenched interests, which is why authoritarians may seek to suppress or corrupt data documenting such topics.
Unfortunately, we don’t really understand the full scope and magnitude of what data may be under threat. We also don’t have a good understanding of what threats may be more immediate and where to prioritize our “data rescue” efforts. But here are some (incomplete) thoughts about what threatens data:
- Outright Suppression: Some datasets may be suppressed and destroyed overtly. This is a digital equivalent of burning books or even whole libraries.
- Lack of Funding: Creating, maintaining, curating and preserving data all require effort, often by dedicated professionals and institutions. Cutting off funding to these professionals and organizations will quickly endanger data.
- Lack of Time: People need time to dedicate their attention to work on data. Badly structured rewards, incentive systems, and other bureaucratic pressures in academic research force many researchers to neglect data. Researchers need intellectual freedom to devote their time to data, where the rewards are still uneven and uncertain.
- Lack of Access: Hiding data away from wider scrutiny makes it easier to delete, alter or corrupt. It also makes it easier to make spurious claims (and harder to refute them).
- Collection Biases: Political and ideological agendas shape how we collect data and what data we collect. We’ve already seen Republican attempts to cease collecting data about housing discrimination, no doubt with a motivation to make the problem “disappear”.
- Analytical Biases: Data need analysis to be interpreted and used. People apply different models and analytic methods that may (or may not) explicitly or implicitly bias understanding of data.
- Filter Biases: The past several months have provided a hard education on the problem of “fake news” (propaganda) in the contemporary news media. Even if we manage to preserve some integrity in our data and analyses, we face the steep challenge of communicating our understandings in an overtly hostile and ideologically charged media environment.
In arguing for the importance of data, we’re not suggesting that data are wholly objective or empirical. Data are never complete, perfect, or objective. As brilliantly discussed by Cathy O’Neil, data reflect our incomplete and often biased views of the world. Because data, like other forms of knowledge, are imperfect, they need to be a part of open conversations and debates in civil society. If we do a better job at making data more open to critique and evaluation from people with a wider variety of perspectives, we can improve both the data themselves and our understandings derived from them.
Over the past several months, we have taken part in “data rescue” events organized across the nation. There is a strong focus on climate data, but our participation involved endangered data from National Park Service websites. Working with Max Ogden and colleagues at the California Digital Library, we safeguarded more than a terabyte of data from a National Park Service database, as well as some 20,000 web pages, especially those that bring US national parks to underrepresented communities (African American, Asian American, Native American, LGBTQ).
As we move forward with Endangered Data Week, we will post more about the need to protect public data, the importance of public data for a healthy civil society, and some of our broader collaborations to make public data better protected and understood.
January 23, 2017
Early next month, the AAI will participate in a conference at Harvard University on Critical Perspectives on the Practice of Digital Archaeology. Hosted by Harvard University’s Standing Committee on Archaeology, the February 3-4 event will cover topics related to digital technologies and how they are transforming archaeological practice.
Conference co-organizers Eric Kansa (Program Director for Open Context at the AAI) and Rowan Flad (Professor of Anthropology and Chair of the Standing Committee on Archaeology, Harvard University) ask participants to consider how current research data management and curation practices can better support new scholarship, instruction and engagement in archaeology. Speakers hail from the Harvard community and from institutions across North America and include partners from the DINAA project and the Secret Life of Data (SLO-data) project, funded by the Institute of Museum and Library Services and the National Endowment for the Humanities, respectively.
An overarching theme of the conference is the need for new skills, professional roles, and professional incentives to make data more meaningful to scholarship. On the first day, attendees will hear presentations and panel discussions on the impact of digital technologies on the entire life-cycle of archaeological data, from the process of data capture and creation to the challenges of data curation and reuse.
On the second morning, a workshop led by Anne Austin (Stanford University) and Eric Kansa will introduce archaeologists to the fundamentals of good data practices, open source software tools for data cleanup, and practices for better sharing and preserving research data.
Kansa, who for more than a decade has led programs to preserve and share archaeology’s digital record through AAI’s Open Context data publishing service, explains that the industry is at a crossroads with most archaeologists, historians, and other social scientists uninformed about how to make their research accessible. “There is an urgent need for this conference to improve the application and integrity of stored research data,” Kansa said. “We have a tremendous responsibility to the public to share our understanding about what’s factual, what’s uncertain, and do so in a way that builds more trust and confidence in research. That’s why data skills are so critical in the 21st century.”
Visit the conference webpage to view the full program and panelist bios: http://archaeology.harvard.edu/critical-perspectives-practice-digital-archaeology. The conference is free and open to all, but attendees are requested to register on the website by January 25.
December 30, 2016
“…though the past never really repeats itself, it offers lessons about the risks and opportunities of periods of fundamental change.”
The end of the year invites reflection about changes and prospects for the future. Obviously, 2016 brought considerable turmoil and uncertainty to key institutions. The past year highlighted issues of “fake-news” and disinformation, as well as widespread suspicion of science and rationality.
As social scientists, we recognize the complexity and challenge of understanding the forces behind the events of 2016. As historians, we also recognize that though the past never really repeats itself, it offers lessons about the risks and opportunities of periods of fundamental change. If anything, 2016 highlights how the conventional wisdom of recent years got so much so wrong.
Conventional wisdom saw technology and innovation as keys to prosperity and global competitiveness. Public policy promoted science, technology and engineering for narrow, short-term goals. The social science jargon term for this is “instrumentalism,” meaning learning and knowledge have to have immediate practical application to be of value, while learning for the sake of curiosity or the joy of discovery represents nothing but waste. However, instrumentalism has costs. In the process of ignoring and marginalizing learning about culture and history, we lost perspective. Technology is not divorced from society and culture, and if we fail to put as much energy and care into understanding and improving society and culture, we’ll lose or badly warp any gains we make from technical innovation.
Moving forward, we urgently need humanistic and historically-informed perspectives. Archaeology provides a lens to explore past cultures and alternate ways of organizing society. In learning about the past, we broaden our horizons and that broadens our imagination about better futures.
AAI helps fill this gap. We unite learning and excellence in technology with the exploration of culture and history through the archaeological past. Just as importantly, we also work to imagine new institutional frameworks to support research and learning. Currently, the vast majority of our understanding about the human past is locked in the Ivory Tower behind expensive pay-walls. This hinders important breakthroughs and stifles progress.
Much of this research is publicly funded, paid for with your tax dollars, but has been privatized and made accessible only to a few. AAI democratizes knowledge by opening up access to databases, images, and other excavation discoveries – primary sources that are rarely shared and vulnerable to loss.
Through our open-data publishing program, Open Context, we work to break down those pay-wall barriers by pioneering ways to make heritage accessible to everyone.
As 2016 comes to a close, we invite you to share in our mission by making a 100% tax-deductible contribution and joining some of the major institutions that have supported our organization, including the U.S. National Endowment for the Humanities, the Alfred P. Sloan Foundation, Google (via Eric Kansa), and the William and Flora Hewlett Foundation.
We would greatly appreciate any donation amount. Every act of support will help us develop a better way to unlock our past, and inspire the future.
December 15, 2016
We are happy to announce that Dr. Federico Buccellati will join the AAI in 2017 as a Research Fellow, thanks to the generous support of his project by the National Endowment for the Humanities (NEH) and the Andrew W. Mellon Foundation. In an announcement from the NEH this week, Buccellati is named as one of 86 recipients in the Fellowships grant program. His grant is a Fellowship for Digital Publication, supported jointly by the NEH and the Mellon Foundation.
Federico is the principal investigator of Calculating the Costs of Ancient Buildings, an innovative publication project that makes the study of ancient architecture and the logistics of constructing monumental buildings more reproducible.
“Architecture is one of the main elements of material culture that archaeologists find in the archaeological record. One of the most important aspects of architecture is the process of construction leading up to the first use of the building. Cost-calculation-algorithms can be applied to the volumes of ancient architecture to explore the temporal, material or energetic ‘cost’ of the steps of that process. Up to now this has been done on an ad-hoc basis, with scholars finding appropriate comparisons. This project will produce an interactive interface where scholars enter volumetric data from their research. The algorithms draw from a wide variety of sources from across diverse cultural spheres. The final result will be a web-based interface published on GitHub so that future scholars can add to the algorithms and sources.”
We are very excited about this project because it ties together the types of data we work to curate with the kinds of reproducible analyses needed to strengthen the rigor of our knowledge about the past. The AAI and Open Context will provide Federico with technical assistance and support in developing, disseminating and preserving his publication outcomes.
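To illustrate the general idea of applying a cost calculation to architectural volumes, here is a minimal sketch in Python. It is not the project's algorithm: the `ConstructionStep` class, the `labor_cost` function, the step names, and every rate below are hypothetical placeholders chosen only to show the arithmetic, and real rates would have to come from the kinds of comparative sources the project describes.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConstructionStep:
    """One step of a building process (hypothetical structure)."""
    name: str
    volume_m3: float            # volume of material handled in this step
    person_days_per_m3: float   # assumed labor rate, NOT a published figure


def labor_cost(steps: Iterable[ConstructionStep]) -> float:
    """Total labor in person-days: sum of volume times rate per step."""
    return sum(step.volume_m3 * step.person_days_per_m3 for step in steps)


# Placeholder values for illustration only.
example_steps = [
    ConstructionStep("producing mudbricks", volume_m3=100.0, person_days_per_m3=0.5),
    ConstructionStep("laying walls", volume_m3=40.0, person_days_per_m3=2.0),
]

total_person_days = labor_cost(example_steps)
print(f"Estimated labor: {total_person_days} person-days")
```

Keeping each step's volume and rate explicit is what makes such an estimate reproducible: another scholar can substitute a different rate and see exactly how the total changes.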
December 12, 2016
This past summer, we kicked off a 3-year project aimed at improving the flow of information from the moment of discovery through to publication and beyond. This project, funded by a grant from the NEH, takes a unique approach of exploring the data lifecycle through a series of interviews with data creators and reusers, in excavation and laboratory settings. Work this past summer included interviewing archaeologists about their field data collection procedures and visiting excavations to conduct ethnographic observations on data documentation processes in the field. This work resulted in an abundance of unstructured and semi-structured observation and interview content for our team to analyze. In order to do this analysis, however, we had to develop a codebook, a set of terms that we could use to mark up and analyze the transcriptions. In September, our team came together for four days to develop the codebook and to begin coding interview transcriptions to determine our rate of inter-researcher reliability. We are fortunate to have team members with experience in ethnographic research, interview coding and qualitative social sciences. We will share the interview protocols and codebooks that our team develops over the course of this project on the project webpage, with the hope that others will find these tools and approaches useful for the analysis of qualitative data.
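For readers unfamiliar with reliability checks in interview coding, the sketch below shows one common way to quantify agreement between two coders: Cohen's kappa, which compares observed agreement with the agreement expected by chance. This is an assumption for illustration; the post does not say which statistic the team used, and the function name and the code labels in the example are hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(codes_a: Sequence[str], codes_b: Sequence[str]) -> float:
    """Cohen's kappa for two coders who each assigned one code per segment."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must label the same non-empty set of segments.")

    n = len(codes_a)
    # Share of segments where both coders chose the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n

    # Agreement expected by chance, from each coder's code frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical code throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes assigned to six transcript segments.
coder_a = ["capture", "capture", "curation", "reuse", "curation", "capture"]
coder_b = ["capture", "curation", "curation", "reuse", "curation", "capture"]

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))  # 0.739
```

A value of 1 means perfect agreement and 0 means agreement no better than chance; teams typically refine the codebook and recode until the value is acceptably high.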
(At left: NEH project team members and codebook developers Anne Austin, Ixchel Faniel and Jennifer Jacobs;
Top right: A view of one of the excavations where we carried out our field data collection)
In November, the DINAA project held a face-to-face meeting in Berkeley, CA, supported by a new grant from the IMLS. DINAA team members attended the Phoebe A. Hearst Museum’s Native American Advisory Council meeting to present the DINAA project and discuss the promises and challenges we face as DINAA expands. Currently, DINAA has published almost half a million sites from 15 states. With new funding from the IMLS and the NSF, DINAA’s coverage will expand to include most of the continental US over the next two years. The coverage is comprehensive enough already to enable innovative visualizations that can help understand important issues such as the impact of projected sea level rise on coastal archaeological sites (shown in the example below). We also made progress using DINAA for linked data applications, as discussed in this recent blog post.
Anderson, D.G., S.J. Yerka, E.C. Kansa, S.W. Kansa, J.J. Wells, T.G. Bissett, R.C. DeMuth, and K.N. Myers. 2015a. Big Data & Big Picture Research: DINAA (The Digital Index of North American Archaeology) and the Things Half a Million Archaeological Sites Can Tell Us. Poster presented in the session “The Acid Test: Exploring the Utility of the Digital Index of North American Archaeology (DINAA) for Use in Applied Research” (Sponsored by Digital Index of North American Archaeology), organized by Stephen Yerka and Kelsey Noack Myers, at the 80th Annual Meeting of the Society for American Archaeology 16 April 2015, San Francisco, California.
Anderson, D.G., S.J. Yerka, J.J. Wells, E.C. Kansa, and S.W. Kansa. 2015b. Climate Change and the Destruction of History: Documenting Sea Level Change and Site Loss Using DINAA (Digital Index of North American Archaeology). Paper presented in the session ‘Responses to Climate Change’ at the Second Disasters, Displacement, and Human Rights Conference, Knoxville, Tennessee. 26 September 2015.