Metadata are usually defined as data about data. Although this is a rather narrow definition, it is the one used in an IT context. More generally, metadata are data about data and objects (including analogue books and pictures). In a broader context, metadata have over the past few years become a familiar concept through intelligence organisations' collection of data on individual persons' use of digital services, for example by the NSA in the USA.
With the arrival of the internet, and particularly the World Wide Web, the situation changed completely. To begin with, the explosively increasing volume of accessible resources on the net meant that traditional registration of everything was obviously unrealistic.
The net itself offered opportunities for a quite different degree of interoperability between domains than had been possible until then. This prompted many initiatives to develop new metadata formats, and it quickly became apparent that Dublin Core had the greatest impact internationally.
The point of departure was a vision that all documents should have embedded metadata, and the simple Dublin Core format with 15 elements was developed with the intention of being cross-sectorial. The vision of embedded metadata did not catch on, probably because, following a brief flirtation with metadata, the search engines ignored them for a number of years.
On the other hand, Dublin Core was successful in a number of sectors in connection with projects where registration of data on the net was the primary task. Within a large number of domains, Dublin Core has become the basis for the individual domain’s metadata specifications.
In a broader context
Metadata is now a concept used in many different contexts. A quick web search turns up metadata used in fields such as geographical information, public information, and the management of building and construction work.
The libraries of course continue to use the bibliographic metadata, but expanding digitisation of the resources offered by the libraries has necessitated using metadata in a broader context. Even before the digital age, the research libraries handled registration at different levels: from no registration other than location to very meticulous registration of manuscripts.
The libraries’ administration of digital resources is wide-ranging: their own collections, external databases and, in the case of national libraries, harvesting of the national parts of the internet. Each of these tasks places its own demands on the metadata.
Descriptive metadata are used for location and identification, and to describe the intellectual content, the intellectual originator and details of the individual object. The various MARC formats are used for traditional media, while for other types of resources Dublin Core is used, both in its simple original form and in the expanded form, dcterms.
For slightly more detailed registration, MODS is used in a number of cases. MODS is a simplified version of MARC21 that uses named elements, for example ‘Title’ instead of the field tag ‘245’.
For more specialised descriptions, formats such as PBCore (Public Broadcast Core) for radio and television programmes and VRA Core (Visual Resources Association) for visual objects are used.
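To make the simple form concrete, here is a minimal sketch in Python of a descriptive record restricted to the 15 elements of simple Dublin Core and serialised as XML. The sample values, the `record` wrapper element and the function name `to_dc_xml` are assumptions made for illustration only; a real system would normally wrap the elements in whatever container its own exchange format prescribes.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15 simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"

DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_dc_xml(record):
    """Serialise a dict of {element name: [values]} as simple Dublin Core.

    Every element is optional and repeatable, so each key maps to a list.
    The <record> wrapper is a hypothetical container, not part of Dublin Core.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")
    for name, values in record.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a simple Dublin Core element: {name}")
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample record for a digitised aerial photo.
sample = {
    "title": ["Aerial photo of a farm"],
    "date": ["1954"],
    "type": ["Image"],
    "subject": ["Farms", "Aerial photography"],
}
print(to_dc_xml(sample))
```

Mapping each element to a list reflects the fact that Dublin Core elements may be repeated or left out entirely, which is a large part of what makes the format easy to reuse across sectors.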
An example of supplementary metadata is the user-created data in ‘Denmark seen from the air – before Google’, which now contains about a quarter of a million digitised aerial photos. Here users can add information such as the name of a farm, its owners, etc. It is important to collect this kind of data, which in practice cannot be obtained in any other way.
User-created metadata are expected to become a source for the description of some of the libraries’ resources, not least the digitised cultural heritage. Another kind of supplement is to integrate metadata from external databases in the library’s user interface.
Administrative and technical metadata
The administrative metadata hold administration information, such as the date on which the object was digitised or altered, technical metadata, and information about rights and licences. Access to data is governed by metadata on which user categories have access to which material.
Preservation metadata are a kind of administrative metadata that are necessary for the long-term preservation of data. They include metadata about, among other things, the origin of an object and the changes made to it, which are used to determine the authenticity of the data and to document that a digital object is intact. They also include metadata about the systems and programs needed to read and reproduce the data correctly. PREMIS (Preservation Metadata: Implementation Strategies) is the most important standard in the field.
Structural metadata are used for displaying and navigating an object, for example the photos and text associated with a subset of an object, or the order of the files that make up the pages of a digitised book.
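One common way to document that a digital object is intact is to store a checksum when the object is ingested and to recompute it later. The following Python sketch illustrates the idea under simple assumptions: the dictionary keys and function names are hypothetical and are not taken from the PREMIS schema, and SHA-256 is merely one possible choice of algorithm.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the object's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def record_fixity(data: bytes) -> dict:
    """Create the fixity metadata to store alongside the object at ingest."""
    return {
        "algorithm": "SHA-256",
        "digest": sha256_of(data),
        "size_bytes": len(data),
    }


def is_intact(data: bytes, fixity: dict) -> bool:
    """Check the object against the fixity metadata recorded earlier."""
    return (
        len(data) == fixity["size_bytes"]
        and sha256_of(data) == fixity["digest"]
    )


# Hypothetical object content; in practice this would be read from a file.
original = b"scanned page, version 1"
fixity = record_fixity(original)

print(is_intact(original, fixity))                    # unchanged object
print(is_intact(b"scanned page, version 2", fixity))  # altered object
```

A check like this only shows that the bytes have not changed. Origin and change history still have to be recorded separately in order to judge authenticity.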
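The page-order case can be sketched in a few lines of Python. The `PageFile` class, its field names and the file names below are assumptions chosen for illustration and do not follow any particular structural metadata standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageFile:
    order: int      # position of the page in the reading sequence
    label: str      # what the reader sees, e.g. "Title page" or "1"
    filename: str   # image file that holds the scanned page


def reading_order(pages):
    """Return the file names in the order the pages should be displayed."""
    return [page.filename for page in sorted(pages, key=lambda p: p.order)]


# Hypothetical digitised book whose files were registered out of sequence.
book = [
    PageFile(order=3, label="2", filename="scan_0107.tif"),
    PageFile(order=1, label="Title page", filename="scan_0042.tif"),
    PageFile(order=2, label="1", filename="scan_0106.tif"),
]
print(reading_order(book))
```

Keeping the sequence in the metadata rather than in the file names means that a rescanned or inserted page only requires a change to the record, not a renaming of files.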
The handling of metadata is essential to ensuring effective and economical dissemination of information.
High-quality metadata don’t just drop out of the sky. They must be created either automatically, by registering facts (time, file size, titles from data fields, etc.), or with the help of a librarian. However, not all metadata need to be of the same quality, and user-created data will be a useful supplement.
The research libraries’ use of metadata has grown over the past 20 years: from bibliographic description of their own collections to many types of metadata, covering not only their own collections but also articles etc. accessed via licence, as well as the administration of metadata about researchers and research results. The development continues with new types of metadata for interacting with Linked Open Data generally, and with new challenges posed by BIBFRAME, the expected successor to MARC.
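As a rough illustration of automatic registration of facts, the Python sketch below reads the time, file size and a guessed format directly from a file. The function name and dictionary keys are hypothetical, and guessing the format from the file extension is a simplification; dedicated format-identification tools inspect the file content instead.

```python
import mimetypes
import os
from datetime import datetime, timezone


def technical_metadata(path):
    """Collect facts that can be registered without human judgement."""
    info = os.stat(path)
    mime, _encoding = mimetypes.guess_type(path)  # based on the extension only
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "filename": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": modified.isoformat(),
        "format": mime or "application/octet-stream",
    }
```

Calling `technical_metadata` on a scanned image would yield its size, its last modification time in UTC and a MIME type. Fields such as subject or creator still need a librarian or a user to supply them.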
For further information about non-bibliographic metadata, see: http://digitalbevaring.dk/metadata/