The future of metadata: Open, linked and multilingual

With an overwhelming amount of material and data being published globally in a wide range of formats, locating and accessing relevant information can be a challenge. Both the tools and the practices of metadata production and management should be developed to meet the needs, and make use of the possibilities, of the open Semantic Web environment.

There is also a need for a new working perspective and new practices: firstly, the focus has shifted from records to entities, and secondly, the spectrum of users, needs and formats has expanded. Cataloguing must therefore be designed to serve a variety of contexts, i.e. the entire Semantic Web.

Language plays a key role in participating in the global community. Through multilingual and open linked metadata, information can be located and retrieved not only across different collection providers, but also across languages so that resources indexed using one language can be retrieved using another.

In other words, the semantic dimension of open linked data can bring together resources across linguistic and organizational barriers.

Monolingual terms – multilingual concepts

In a country like Finland, operating in a trilingual context is part of everyday life. Finnish and Swedish are both official languages and global participation also requires the command and use of English. The multilingual environment is inevitably challenging, and managing such settings requires the development of shared tools and practices.

In Finland, vocabulary tools for indexing in the Semantic Web are published and developed in the Finto project. Finto (http://finto.fi) is a service for the publication and utilisation of ontologies, thesauri, vocabularies and classifications. It provides a user interface for browsing the vocabularies and open interfaces for utilising them in other applications.

The service also aims to provide high-quality metadata tools not only for libraries but for institutions across the public sector. Furthermore, in order to promote open science and free access to information, the service is being developed in an open manner and all its contents are available free-of-charge as open linked data.

The service is based on the Skosmos (http://skosmos.org) platform, a web-based tool providing access to controlled vocabularies that has been developed as open source software. Finto is being developed as a joint venture between the National Library of Finland, the Ministry of Finance, and the Ministry of Education and Culture.

YSO multilingual ontology

Libraries have extensive expertise in working with controlled vocabularies and thesauri, but the new operating environment clearly calls for new tools and practices. The transition from a controlled vocabulary or thesaurus to an ontology means two major changes: firstly, the focus needs to shift more firmly from a term-based approach to a concept-based approach, and secondly, the hierarchical structure must be made complete and consistent.

YSO is constructed by merging together the General Finnish Thesaurus and its counterpart in Swedish into a single hierarchical structure that explicitly specifies the concepts of a given domain and their relationships in a machine-readable format.
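As a rough illustration of what a concept-based, machine-readable hierarchy can look like, the following minimal Python sketch models each concept once and attaches one preferred label per language to it. The identifiers (ex:reefs, ex:coral-reefs), the two-level hierarchy and the function name broader_labels are assumptions made for this example only; they are not taken from YSO or from the Finto interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A language-independent concept with one preferred label per language."""
    concept_id: str                 # hypothetical identifier, not a real YSO URI
    pref_label: dict                # language code -> preferred label
    broader: list = field(default_factory=list)  # ids of parent concepts


# Illustrative data only: the hierarchy shown here is assumed, not copied from YSO.
ONTOLOGY = {
    "ex:reefs": Concept(
        "ex:reefs",
        {"fi": "riutat", "sv": "rev", "en": "reefs"},
    ),
    "ex:coral-reefs": Concept(
        "ex:coral-reefs",
        {"fi": "koralliriutat", "sv": "korallrev", "en": "coral reefs"},
        broader=["ex:reefs"],
    ),
}


def broader_labels(concept_id, lang):
    """Return the labels of all broader concepts in one language, nearest first."""
    labels = []
    seen = set()
    queue = list(ONTOLOGY[concept_id].broader)
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue  # guard against visiting a shared parent twice
        seen.add(current)
        labels.append(ONTOLOGY[current].pref_label[lang])
        queue.extend(ONTOLOGY[current].broader)
    return labels


print(broader_labels("ex:coral-reefs", "sv"))
```

Because the hierarchy is attached to the concept rather than to any single term, the same structure can be displayed in Finnish, Swedish or English simply by choosing a different language code.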

The hierarchical and thematic structure also provides rich contextual information for the indexer (see Figure 1). Furthermore, the resulting ontology is currently being translated into English and linked to the Library of Congress Subject Headings. Through multilingual subject access and links from YSO to LCSH, materials indexed with YSO will become a part of a global network of metadata.

Fig. 1: ‘coral reefs’ in YSO

A trilingual environment

As each culture conceptualises the world from its own viewpoint, meanings are seldom symmetrical across languages. Therefore, the aim has been not to pursue exact equivalence between languages but, instead, to lead the information retriever towards relevant search results regardless of which language is used in the query.

However, the trilingual environment poses a number of language- and culture-related challenges, and building a harmonious and understandable hierarchy in more than one language is a complex process and requires compromises.

For example, both Finnish and Swedish have two separate concepts that both translate into English as dreams: the dreams one has while sleeping (unet (fi) / drömmar (sv)) and dreams referring to desires and aspirations (unelmat (fi) / önskedrömmar (sv)).

Equal status

Both are treated as separate entries in the hierarchy. Because the ontology must keep the two concepts apart even where English uses a single word for both, the English translations carry qualifiers: dreams (sleeping) and dreams (aspirations). This follows from the premise that Finnish and Swedish have equal status as the foundation of the hierarchy.

English, on the other hand, has a secondary role as the translated language which aims to relay concepts inherent to the Finnish culture into the language of a foreign culture (see Figure 2).
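The dreams example can be sketched in a few lines of Python to show how a resource indexed in one language can be retrieved in another. This is a minimal illustration under stated assumptions: the concept identifiers, the record identifiers and the function names build_label_index and search are hypothetical, and the two catalogue records are made up. Only the labels themselves come from the example above.

```python
# Labels per concept. The identifiers are hypothetical placeholders.
LABELS = {
    "ex:dreams-sleeping": {
        "fi": "unet", "sv": "drömmar", "en": "dreams (sleeping)",
    },
    "ex:dreams-aspirations": {
        "fi": "unelmat", "sv": "önskedrömmar", "en": "dreams (aspirations)",
    },
}

# Made-up catalogue: each record is indexed with concepts, not with terms.
RECORDS = {
    "rec-1": ["ex:dreams-sleeping"],
    "rec-2": ["ex:dreams-aspirations"],
}


def build_label_index(labels):
    """Map (language, lower-cased label) to the concept it names."""
    index = {}
    for concept_id, by_lang in labels.items():
        for lang, label in by_lang.items():
            index[(lang, label.casefold())] = concept_id
    return index


LABEL_INDEX = build_label_index(LABELS)


def search(query, lang):
    """Return the ids of records indexed with the concept that `query` names."""
    concept_id = LABEL_INDEX.get((lang, query.casefold()))
    if concept_id is None:
        return []
    return sorted(
        record_id
        for record_id, concepts in RECORDS.items()
        if concept_id in concepts
    )


print(search("drömmar", "sv"))
print(search("dreams (aspirations)", "en"))
```

In this sketch the Swedish query drömmar and the English query dreams (sleeping) resolve to the same concept and therefore to the same record, whichever language the indexer worked in. The unqualified English word dreams matches nothing, which mirrors the need for qualifiers in the English labels.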

Fig. 2: ‘dreams (aspirations)’ in YSO

Moreover, translating the complete ontology into English and linking the concepts to LCSH when applicable equivalents are available involves connecting the languages of two very different cultural spheres and requires a clear definition of the acceptable level of equivalence. Currently, YSO comprises nearly 28,000 concepts, of which approximately one third can be linked to LCSH concepts.

Conclusion

The challenge of constructing and harmonising multilingual metadata is a crucial element in the context of the global open linked data environment. However, this cannot be achieved without acknowledging the differences between the specific characteristics of different languages.

Furthermore, subject access is a powerful gateway to various types of materials. By providing efficient tools and developing shared practices, we can ensure the long-term accessibility of these materials in a dynamic and changing information environment.

Information Specialist, Politices Doktor, The National Library of Finland
Translator, The National Library of Finland