Christopher (Cal) Lee
Professor
University of North Carolina
Kam Woods
University of North Carolina
This presentation will explore open-source software (OSS) tools and methods that libraries, archives and museums (LAMs) can use to identify email in born-digital collections, review email sources for sensitive or restricted materials, and perform appraisal and triage tasks to identify and annotate records. We’ll focus on products of the Review, Appraisal and Triage of Mail (RATOM) project, which uses machine learning to separate records from non-records and natural language processing methods to identify entities of interest within those records. In addition to seeing the tools described and demonstrated, participants will learn about the rationale for their development, how they relate to other available software, and how processing of email can fit into larger digital curation workflows.