Kirjasampo: adventures in reading and challenges in research

According to the latest study on the use of public libraries in Finland, libraries remain the dominant source of fiction for the public; more than 80% of the respondents stated that the library is their most important channel for obtaining fiction. Patrons see the opportunity to borrow fiction from libraries as important and useful.

Developing information services for fiction

Fiction has always been an essential part of the collections in public libraries in Finland. Thanks to this continuum, libraries are able to offer a diverse range of literature, and therefore opportunities to find new adventures in reading are always available. The long tradition of working with fiction is also evident in the fact that new search tools for finding fiction are constantly being developed in Finland. For example, the first version of the Kaunokki Finnish fiction thesaurus was available as early as 1996 and at the end of the 1990s the indexing of novels in Finnish began.

This development continues in the Kirjasampo database project, a nationwide project funded by the Ministry of Education and Culture. The project involves collaboration to create an online service for fiction that also includes older fiction, the content of which has not yet been systematically classified and indexed. The various criteria and practices that patrons use to search for fiction have been taken into consideration during the planning process.

Associations, intuition and similarity between works of literature are of great significance when searching for literature. Tools that make the connections between books more evident, and that appeal to the adventure of reading, are being developed for the Kirjasampo. These tools call for new types of computer applications, and for this reason the Kirjasampo will be realized with Semantic Web technologies.

The Semantic Web

The Semantic Web is a conglomerate of technologies whose primary purpose is to make it easier to create connections between pieces of information. This technology was utilized early in the development of the Kirjasampo database; the book information in the HelMet classification system was combined with the author information in the author databases of three provincial libraries and used as the basis of the Kirjasampo database. In addition, the project has had easy access to the coordinates of the locations mentioned in the books.

Since the Semantic Web is based on concepts, creating a multi-language system is easy. After converting the HelMet data records, the indexed material, which was originally in Finnish, was immediately available in Swedish and English. This was made possible through the ontologization of the Kaunokki glossary, and its connection to the General Finnish Ontology.
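
The idea of indexing by concept rather than by word can be illustrated with a minimal sketch. It assumes that each book is indexed with language-neutral concept identifiers and that each concept carries a label per language; the identifiers, the labels and the function name below are hypothetical and are not taken from Kaunokki or the General Finnish Ontology.

```python
from typing import Dict, List

# Hypothetical concept identifiers with illustrative labels in three languages.
LABELS: Dict[str, Dict[str, str]] = {
    "concept:kings": {"fi": "kuninkaat", "sv": "kungar", "en": "kings"},
    "concept:pharaohs": {"fi": "faaraot", "sv": "faraoner", "en": "pharaohs"},
}


def index_terms(concept_ids: List[str], lang: str, fallback: str = "fi") -> List[str]:
    """Render a book's index terms in the requested language.

    If a concept has no label in that language, fall back to the
    original indexing language (assumed here to be Finnish).
    """
    terms = []
    for concept_id in concept_ids:
        labels = LABELS[concept_id]
        terms.append(labels.get(lang, labels[fallback]))
    return terms


# A record indexed once, by concept, can be shown in any labelled language.
record = ["concept:kings", "concept:pharaohs"]
print(index_terms(record, "fi"))
print(index_terms(record, "sv"))
print(index_terms(record, "en"))
```

Because the record stores only identifiers, adding a new language in this sketch means adding labels to the concepts, not re-indexing the books.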

Ontologies are also useful to the Kirjasampo when making searches and calculating automatic book recommendations. Ontologies enable the Kirjasampo to recognize that emperors, shahs, moguls, kings, pharaohs and so on are all types of rulers, and in this way to find connections between books that have been indexed with these more specific terms.
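
A rough sketch of this kind of reasoning follows. It assumes a simple hierarchy in which each term points to one broader term; the hierarchy, the book titles and the function names are hypothetical illustrations and do not describe how Kirjasampo is actually implemented.

```python
from typing import Dict, List, Set

# Hypothetical "narrower term -> broader term" links.
BROADER: Dict[str, str] = {
    "emperors": "rulers",
    "shahs": "rulers",
    "moguls": "rulers",
    "kings": "rulers",
    "pharaohs": "rulers",
    "rulers": "persons",
}

# Hypothetical books, each indexed with its most specific terms.
BOOK_INDEX: Dict[str, Set[str]] = {
    "Book A": {"pharaohs"},
    "Book B": {"kings"},
    "Book C": {"detectives"},
}


def ancestors(concept: str) -> Set[str]:
    """Return the concept itself plus every broader concept above it."""
    seen = {concept}
    while concept in BROADER:
        concept = BROADER[concept]
        if concept in seen:  # guard against accidental cycles in the data
            break
        seen.add(concept)
    return seen


def books_about(concept: str) -> List[str]:
    """Find books indexed with the concept or with any narrower concept."""
    return sorted(
        title
        for title, terms in BOOK_INDEX.items()
        if any(concept in ancestors(term) for term in terms)
    )


# Books about pharaohs and kings are both found under the broader term.
print(books_about("rulers"))
```

The same ancestor sets could serve as a crude similarity signal for recommendations: two books whose terms share a broader concept are related even when they share no index term.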

Challenges to research

From the point of view of a researcher in semantic computing, the Kirjasampo database has been an interesting topic because of the diverse and high-quality information it contains. Although a great deal of Linked Open Data is now available around the world, its quality is often poor and the classification and indexing of the material is not very detailed, semantically speaking. The enthusiasm in the Kirjasampo project for high-quality and diverse classification and indexing work has offered an excellent foundation on which to build research into the real opportunities and advantages of semantic computing.

The development of the data model used in Kirjasampo has been an area of special interest during the project. Firstly, it has been delightful to see how well semantic computing technology adapts to a procedure in which a model is modified and expanded to meet new challenges. This was particularly evident in the Kirjasampo project when fiction in Swedish was added to the database. To avoid repeating the classification and indexing work for each translation, and to ensure that the different languages and versions of a work were linked to each other, the entire data model had to be broken down again, bringing it a step closer to the four-level division of the FRBRoo model.
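
The motivation for this restructuring can be sketched as follows. The example is loosely in the spirit of separating an abstract work from its language-specific expressions; it covers only two levels, and the class names, identifiers and placeholder titles are assumptions made for illustration, not the Kirjasampo data model.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Work:
    """The abstract work: content description is recorded once, here."""
    work_id: str
    subjects: Set[str] = field(default_factory=set)


@dataclass
class Expression:
    """One language version of a work, e.g. the original or a translation."""
    work: Work
    language: str
    title: str

    @property
    def subjects(self) -> Set[str]:
        # Index terms are not copied; every expression reads them from its work.
        return self.work.subjects


# Hypothetical work with two language versions (placeholder titles).
work = Work("work:1", {"concept:kings"})
finnish = Expression(work, "fi", "Esimerkkiromaani")
swedish = Expression(work, "sv", "Exempelroman")

# Indexing the work once updates what every language version reports.
work.subjects.add("concept:pharaohs")
print(sorted(swedish.subjects))
```

Under these assumptions, both versions stay linked through the shared work, and no index term needs to be entered twice.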

Making this model more complex also put new pressure on semantic computing technology and on the indexing and search interfaces, which had previously been tested only with simpler materials. This opened up a new and interesting realm of research, the challenges of which were luckily overcome.

Dimensions of open information

The Kirjasampo project not only gave rise to a search tool for fiction, but it also provided much more. A new type of environment was created to access the library’s metadata and traditional working methods were developed, for example, by revising glossaries and making them into ontologies. This has taken more time than expected, but the work can also be utilized in other systems.

On 23 May 2011, we saw an example of this when the material in Kirjasampo was utilized in the experimental HS Open event, which involved the use of open information. The analyses and visualization of the materials reveal, for example, that international detective stories have become longer since the beginning of the 1980s – from 200 pages to 370 pages – but Finnish detective stories did not become longer until the 2000s. Even now, the average number of pages for Finnish detective stories is less than 300. Other results included analyses of the types of topics most likely to receive funding or awards and of how the number of debut authors has increased, as well as an application that located the places associated with Finnish fiction on a map.

The Kirjasampo database can be further developed and its features made more diverse. The more descriptions of works it contains and the more readers comment on them, the better the recommendations are and the easier it is to choose a book. This way, the collections in libraries are more readily visible and available to readers.

Kirjasampo-beta is available in the

Kaisa Hypén
Service manager
Turku City Library

Eetu Mäkelä
Researcher, D.Sc.
Semantic Computing Research Group (SeCo)

Translated by Turun Täyskäännös
