On December 17, 2016, the Faculty of Information at the University of Toronto hosted a Guerrilla Archiving Event to preserve American environmental data ahead of the incoming President's inauguration. In the lead-up to the inauguration, there was concern that the new administration under Trump would be hostile to evidence-based environmental studies. I had the pleasure of connecting with two people who were integral to the event, Patrick Keilty and Sam-Chin Li, to discuss their roles in preserving environmental data that might otherwise have been lost.
The event, which took place on a Saturday afternoon in the Faculty’s library, the Inforum, brought together people from various academic backgrounds who were concerned about the deletion of environmental data. Coders, environmental scientists, librarians and information professionals, as well as a number of volunteers, gathered to participate in a hackathon to save data from the EPA.
Initially, participants identified vulnerable programs and data on the United States Environmental Protection Agency (EPA) website for safekeeping. Then the more technically skilled volunteers conducted web crawls, scraping that data and sending it to the Internet Archive’s End of Term Project. By the end of the day, thousands of URLs had been sent to the Internet Archive, including 192 at-risk programs and data sets. The event gained national and even worldwide recognition, bringing attention to the concept of ‘guerrilla archiving’, and has encouraged many people to get involved.
“Guerrilla archiving” is a relatively new term that essentially refers to archiving material that is political in nature. Patrick Keilty, Assistant Professor in the Faculty of Information at the University of Toronto (U of T) and one of the main organizers of the December event, believes that guerrilla archiving actions are inherently activist and involve a grassroots element. Keilty notes, however, that the procedures of guerrilla archiving can vary: “There is normally not a found process for archiving the things we focused on. With the EPA, data would normally be subjected to a standard internet crawl but the fear is that some data, like that contained in spreadsheets, would be missed.”
Sam-Chin Li, the Reference/Government Publications Librarian at U of T and one of the event’s facilitators, noted that “Content on the Web is continually being updated, replaced, or lost so it is important to ensure the continued ability to access valuable content like government information, political candidates’ campaign websites and etc.” Guerrilla archiving is not limited to protecting data from the EPA; it applies to any organization where there is real concern that data could be lost.
Interested in getting involved? If so, check out the Environmental Data & Governance Initiative (EDGI), which grew out of the Guerrilla Archiving Event at U of T. EDGI offers resources, ways to participate, a digital toolkit, and a calendar of upcoming events. “While technical skills, like knowledge of coding, are an asset, they are not necessary,” Keilty explained. Above all else, research skills are desired. Knowing what to look for, what’s important to preserve, and what might be missed by standard web crawls is just as important as having technical skills, if not more so. Keilty’s advice: “Archive information you know about, or are passionate about.” If Canadian contexts on the subject spark your interest, be sure to check out the Erasure of Information post by Bryan Short.