Leveraging Artificial Intelligence for Improved Transcription in Special Collections and Academic Libraries
Rachel Senese Myers
Digital Projects Coordinator
Georgia State University
The Georgia State University Library houses an extensive collection of audio-visual assets that serve as invaluable resources for research and education, including oral histories, television and radio broadcasts, labor union meetings, and more. Historically, transcribing these sensitive materials has been labor-intensive and time-consuming. This project briefing will explore an in-progress effort to create a custom user interface for OpenAI’s Whisper automatic speech recognition system to improve the processing of these materials. It will discuss the motivations and needs assessment for the project, project planning, development and challenges, and system testing and refinement, and will report on any findings regarding the project’s impact on efficiency, accessibility, and cost savings.
https://github.com/gsu-library/whisper-scribe
Leveraging Consumer-Level Artificial Intelligence for Descriptive Metadata Creation in Archival Collections
Hope Dunbar
University Archivist
University at Buffalo
The University Archives at the University at Buffalo is leveraging consumer-grade artificial intelligence (AI) to enhance the creation of descriptive metadata for over 2,000 hours of audio in the UB-WBFO Radio Archive. Using Microsoft Copilot, this initiative aims to produce concise and detailed program descriptions from audio transcriptions, facilitating inclusion in the University Libraries, Digital Collections, and the National Archive of Public Broadcasting. By developing targeted command prompts paired with transcription files, archivists have drastically reduced processing time, generating generic summaries efficiently. This approach improves access to archival content, exemplifies the impact of AI on archival practices, and demonstrates how entry-level or consumer-grade AI tools can be integrated successfully into project workflows.
Designing SpeakEZ: An AI System to Transcribe and Process Audio and Video Collections
Douglas Boyd
Director, Louie B. Nunn Center for Oral History
University of Kentucky
Over the past five years, the Louie B. Nunn Center for Oral History in the University of Kentucky Libraries has accessioned an annual average of over 1,000 new oral history interviews into the archival collection, exceeding the staff’s capacity to process them. This presentation reflects on designing the SpeakEZ system, which uses AI and natural language processing to transcribe and process the Center’s rapidly growing oral history collection, totaling over 20,000 interviews. SpeakEZ consists of automated transcription; the generation of new dimensions of descriptive metadata; the OHMS-ifier, which prepares draft versions of time logs/indexes for use in the Oral History Metadata Synchronizer (OHMS); and the Riskalizer, which assesses and evaluates content for various points of potential sensitivity. The session will include discussion of the system’s successes and role in addressing accessibility requirements, some of the design and workflow challenges introduced by the system, and possible applications of SpeakEZ for libraries beyond archived oral history collections.