Abbie Grotke, Web Archiving Team Lead, Library of Congress
Kathleen Murray, Post-Doctoral Research Fellow, University of North Texas
In the spring of 2008, an ad hoc collaboration was formed to build a comprehensive archive of the United States Federal Government Web domain before, during, and immediately after the transition to a new presidency. The Library of Congress, the Internet Archive, the California Digital Library, the University of North Texas, and the Government Printing Office worked together to assemble a comprehensive list of sites, provide a nomination tool that engaged federal documents experts in site selection, and distribute the work of harvesting content. This presentation will cover several aspects of the ongoing collaboration, including recent work to give researchers access to the archive, which comprises more than 3,000 sites, and plans now under way for collecting in 2012 and 2013. The archive will be demonstrated at this session. The speakers will also discuss a two-year grant from the Institute of Museum and Library Services (IMLS) that funds research comparing machine clustering of Web pages with classification by subject matter experts.
http://eotarchive.cdlib.org/index.html
Presentation (PDF)