Efficient search for information in large volumes of oral history audiovisual data came into prominence roughly at the turn of the century, when substantial amounts of archive materials (previously stored on film, video tapes and various analogue audio storage devices) started to be digitized and, at the same time, digital personal recording devices became so affordable that the amount of newly generated content also grew at a geometric rate. One of the first digital archives in need of efficient search capabilities was the collection of recorded testimonies given by Holocaust survivors, gathered by the Survivors of the Shoah Visual History Foundation (now USC Shoah Foundation). A consortium of research teams from the US and the Czech Republic was established in 2001 and started to build a system that used automatic speech recognition and information retrieval techniques to give users an effective and user-friendly way of accessing the information contained in the archive.
The aim of the talk is to provide a detailed overview of the core methods that our team has been using in both automatic speech recognition (from HMM-based systems used at the beginning to modern end-to-end neural models that are used today) and information retrieval (from the heuristic approach that was implemented about 10 years ago as a proof-of-concept to the current transformer-based solution). We will also show the progress in the system's performance over the two decades of work on this task, as well as the evolution of the graphical user interface used for the actual access to the collection.
Pavel Ircing is an associate professor at the Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia (UWB). His research interests include speech recognition and information retrieval from speech data. He is the author or co-author of over 80 scientific publications in those areas. He spent over a year in total as a visiting scholar at the Center for Language and Speech Processing, Johns Hopkins University, in 1999, 2000 and 2004. Among other projects, he served as UWB’s principal investigator of the NSF-funded project MALACH (2001-2007), whose aim was to employ speech recognition and information retrieval techniques to improve access to large archives of testimonies given by Holocaust survivors. The international cooperation and research efforts that started during this project are still active today.
Jan Švec is a researcher at the Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia. His research focuses on spoken dialog systems, speech recognition and spoken term detection. He applies state-of-the-art neural models trained using the transfer learning paradigm in many practical applications, including speech recognition for oral history archives and spoken language understanding. He designed the overall architecture of the web-based audiovisual archive technology used in many projects, such as the MALACH project and the archives of the Czech Institute for the Study of Totalitarian Regimes.
The seminar program consists of a one-hour lecture followed by a discussion. The lecture is based on an (internationally) exceptional or remarkable achievement of the lecturer, presented in a way that is comprehensible and interesting to a broad computer science community. The lectures are in English.
The seminar is organized by an organizational committee consisting of Roman Barták (Charles University, Faculty of Mathematics and Physics), Jaroslav Hlinka (Czech Academy of Sciences, Computer Science Institute), Michal Chytil, Pavel Kordík (CTU in Prague, Faculty of Information Technologies), Michal Koucký (Charles University, Faculty of Mathematics and Physics), Jan Kybic (CTU in Prague, Faculty of Electrical Engineering), Michal Pěchouček (CTU in Prague, Faculty of Electrical Engineering), Jiří Sgall (Charles University, Faculty of Mathematics and Physics), Vojtěch Svátek (University of Economics, Faculty of Informatics and Statistics), Michal Šorel (Czech Academy of Sciences, Institute of Information Theory and Automation), Tomáš Werner (CTU in Prague, Faculty of Electrical Engineering), and Filip Železný (CTU in Prague, Faculty of Electrical Engineering).
The idea of organizing this seminar emerged from discussions among representatives of several research institutes on how to avoid the undesired fragmentation of the Czech computer science community.