13 November 2009
8:15 – 9:30 Opening lecture: Steven Bird (University of Melbourne, Australia; University of Pennsylvania, USA)
Corpus Linguistics and Language Preservation

There is a pressing need to document the world's linguistic heritage while there is still time. The consequence of language shift is that many genres -- and many whole languages -- are quickly falling out of use. Digital technologies speed up the task, yet the work is not covering a sufficient number of languages, in sufficient depth, at a sufficient rate. What would it take to compile a million-word corpus consisting of speech recordings and transcriptions, for 5,000 languages within the space of a decade?

This presentation will describe a new approach to corpus creation called "basic oral language documentation" (BOLD). I will describe BOLD and report on a pilot study with Usarufa, a moribund language spoken by approximately 1,200 people in the Eastern Highlands Province of Papua New Guinea. Local literacy teachers were trained in the use of digital voice recorders for capturing linguistic events, and then for adding spoken transcriptions and interpretations into a language of wider communication. I will describe a variety of technical and sociological challenges, and speculate on how the BOLD methodology could be adopted for preserving the languages of Brazil.

The presentation will also describe the Open Language Archives Community (OLAC), an international partnership of institutions and individuals who are creating a worldwide virtual library of language resources by: (i) developing consensus on best current practice for the digital archiving of language resources, and (ii) developing a network of interoperating repositories and services for housing and accessing such resources. The software services of OLAC will be demonstrated, with an emphasis on the languages of Brazil.