Research

Reconceiving metadata: language documentation through thick and thin

Authors:

  • David Nathan
  • Peter K. Austin

Abstract

Metadata can be described as ‘data about data’. As a result of recent activities and discussion regarding the documentation of endangered languages through projects such as OLAC (Open Language Archives Community) (Simons and Bird 2003) and IMDI (ISLE Meta Data Initiative), metadata within language documentation is now coming to be understood as information that is attached to a file or document for cataloguing purposes (see Johnson, this volume). We call this focus on cataloguing metadata ‘thin metadata’. It not only risks being a simplistic view of the role of metadata in language documentation but is also likely, in the longer term, to limit the accomplishments of the field. Thus, a richer, ‘thick metadata’ approach that operates at all levels of linguistic analysis should be central to our field. What has emerged is a ‘metadata gap’: on the one hand we find minimalist cataloguing schemas promoted for the endangered languages field, and on the other are the rich descriptions that fieldworking linguists write as they create and analyse their data. What is needed to support language documentation is a metadata methodology that provides flexible, richly articulated knowledge representation schemas to encode linguists’ cascading layers of data and metadata.

Keywords:

language documentation, endangered languages, metadata, cataloguing, linguistic analysis, methodology, flexibility, priorities, communities of interest, knowledge, resources, expression, accessibility
  • Year: 2004
  • Volume: 2
  • Page/Article: 179-188
  • DOI: 10.25894/ldd299
  • Published on 31 Jul 2014