Arabic Cataloging and the WDL Metadata Process

OSI | WEB SERVICES
World Digital Library www.wdl.org
Constructing Better Metadata, Building Intercultural Understanding:
Arabic Cataloging and the WDL Metadata Process
Erin Hawkins
Metadata Specialist
World Digital Library
ehawkins@loc.gov
With metadata: Structured data, useful object
Metadata is structured information that describes, explains, locates,
or otherwise makes it easier to retrieve, use or manage an
information resource.
Cartographer
Etcher
Scale
Locations shown
Author
Title
Year of publication
Place of publication
Language
Topics/subjects covered
Without metadata: What is it?
Unless you read the language of the resource, you can’t identify the
author or subject. Unless the information is written on the work, you
can’t identify the author, publisher, or date.
Map
17th Century
Arabian Peninsula
??????
Book
20th Century
??????
Without metadata: Why is it important?
Even knowing the language and having bibliographic information
about the item, you may not be able to tell why that item was
important to a particular culture.
Famous map maker’s first image of the Arabian Peninsula.
From the collection of an important scientist. The annotations tell us
about her thought process when she designed experiments.
Without metadata: How can I find it?
Metadata provides access points for discovery during search, whether
in a structured search of controlled vocabulary or a free-text search.
If there is nothing to say about an item, it is unlikely to be discovered,
associated with a similar item, or used in research.
Map
17th Century
Arabian Peninsula
??????
Book
20th Century
??????
Metadata drives:
• Description
• Organization
• Discovery
Metadata is simply data about data
Cataloging is an art and a science: there is room for disagreement
(but be sure of what the item is!)
More information and structured data:
metadata, authorities, classification
Culture,
language,
personal experience
Global diversity in writing systems
WDL Languages
The reality in a multilingual
project: transliteration can
compromise uniformity and
accessibility but is necessary
for display, search, and overall
understanding.
Transliteration (Romanization) in
libraries has traditionally been
dependent on two things:
1. Style guidelines
2. Technology
MARC extensions accommodated non-Roman scripts in the 1980s.
Unicode as an international standard in 1993 allowed
for incorporation of script in authority records. It is still
not as widely adopted as it should be.
Script and transliteration: working together
viaf.org
Arabic and Islamic content and
Western cataloging
Christian materials
Islam, Babism & Bahai Faith (297)
WDL Metadata Element Set
• Dublin Core variant
• Metadata accepted as a WDL
spreadsheet, MARC, or XML file
• Dublin Core, MODS, and MARC are
the most common submissions
Element: Title
• Title of resource, two versions
– Title in original language: Transcribe title from item, if
written or printed title exists. Use original language
and script, no transliteration.
– WDL title (English): translation of original, the way a
work is most commonly known, or a descriptive title if
there is no original title. If you provide a descriptive
title, indicate this (we may not be able to tell it is
descriptive rather than original).
– Provide title language either as written in MARC
Language list or as a MARC Language code:
www.loc.gov/marc/languages/language_name.html
– Mapping: MARC field 245; Dublin Core Title, MODS
Title Info (subelement Title)
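The title rules above can be sketched as a small record check. This is a minimal illustration in Python, not WDL's actual schema: the class name `TitleRecord`, its field names, and the `problems` method are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TitleRecord:
    """Hypothetical container for the two title versions described above."""
    wdl_title: str                  # English title: translation, common name, or descriptive
    original_title: Optional[str]   # transcribed in original language and script, or None
    original_language: str          # MARC language name or code, e.g. "ara"
    is_descriptive: bool = False    # True when the cataloger supplied a descriptive title

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means none were found."""
        issues = []
        if not self.wdl_title.strip():
            issues.append("WDL title (English) is missing")
        if self.original_title is None and not self.is_descriptive:
            issues.append("no original title: indicate that the WDL title is descriptive")
        if self.original_title is not None and self.is_descriptive:
            issues.append("descriptive flag is set although an original title exists")
        return issues


# An untitled manuscript with a cataloger-supplied descriptive title.
record = TitleRecord("Manuscript on the Care of Horses", None, "zxx", is_descriptive=True)
print(record.problems())
```

The explicit `is_descriptive` flag reflects the request above to say when a title is descriptive, since a reader of the record may not be able to tell.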
Title in original language:
صبح الاعشى في صناعة الإنشاء
Title:
Dawn for the Blind in the Craft of Composition
Original title language: ara [Arabic]
Title in original language:
The most lamentable Romaine tragedie
of Titus Andronicus As it was plaide by
the right honourable the Earle of
Darbie, Earle of Pembrooke, and Earle
of Sussex their seruants
Title: Titus Andronicus
Original title language: eng [English]
Title in original language: [none]
Title: Manuscript on the Care of Horses
Original title language: zxx [No linguistic
content]
Element: Language
• Language(s) included in the resource
– Provide language either as written in MARC
Language list or as a MARC Language code:
http://www.loc.gov/marc/languages/language_name.html
– If the item has no language (a photograph or
drawing), use “zxx” or “No linguistic content”
– Be as specific as possible. Don’t use “English” if it’s
really “English, Middle (1100-1500)” or “German” if it
is really “Swiss German.”
– Mapping: MARC field 546 or 041; Dublin Core
Language; MODS Language
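As a rough illustration of accepting either a MARC language name or a MARC language code, the sketch below normalizes both to the `code [Name]` form used in this deck's examples. The function names are hypothetical, and the dictionary is only a tiny excerpt of the MARC Code List for Languages; a real check would load the full list.

```python
# Tiny excerpt of the MARC Code List for Languages (assumption: a real
# workflow would load the complete list rather than hard-code entries).
MARC_LANGUAGES = {
    "ara": "Arabic",
    "eng": "English",
    "enm": "English, Middle (1100-1500)",
    "ger": "German",
    "gsw": "Swiss German",
    "rus": "Russian",
    "zxx": "No linguistic content",
}


def normalize_language(value):
    """Accept a MARC code or a language name and return (code, name)."""
    text = value.strip()
    code = text.lower()
    if code in MARC_LANGUAGES:
        return code, MARC_LANGUAGES[code]
    for known_code, name in MARC_LANGUAGES.items():
        if name.lower() == text.lower():
            return known_code, name
    raise ValueError(f"not found in the language excerpt: {value!r}")


def format_language(value):
    """Format a language the way the example records in this deck show it."""
    code, name = normalize_language(value)
    return f"{code} [{name}]"


print(format_language("Arabic"))
print(format_language("zxx"))
```

Raising an error for unknown values, rather than guessing, leaves the choice of the most specific language to the cataloger.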
Title in original language:
صبح الاعشى في صناعة الإنشاء
Title:
Dawn for the Blind in the Craft of Composition
Original title language: ara [Arabic]
Language: ara [Arabic]
Title in original language:
Мелочная торговля. Продажа
ситцу
Title: Peddling. Selling Printed
Cotton
Original title language: rus
[Russian]
Language: zxx [No linguistic
content]
Element: Description
• Summary of content of item
– Note features of historical or cultural interest,
especially for a user outside of your own culture or
country. Why was this item important enough to
provide to us?
– Mapping: MARC field 520; Dublin Core Description;
MODS Abstract or Note
Element: Contributors
• Persons, groups, or institutions associated with the
physical or intellectual creation of the historical resource.
– Use an authority; very important for disambiguation.
VIAF: http://viaf.org/
– Include all contributors, which may include copyist,
author, author of commentary, author of gloss, author
of marginal notes, illuminator, calligrapher, and more.
– Use Relator Terms, either codes or titles, to specify
contributors’ roles. MARC Code List for Relators:
http://id.loc.gov/vocabulary/relators.html
– Scanner or scanning institution is not a contributor.
– Mapping: MARC Personal Author field 100, Corporate
Body field 110, Meeting Name field 111, Added Personal Name field 700;
Dublin Core Contributor; MODS Name and Role
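The contributor and role pairing above might be sketched as follows. This is only an illustration: the function names are hypothetical, and `RELATORS` is a short excerpt of the MARC Code List for Relators rather than the full vocabulary.

```python
# Short excerpt of the MARC Code List for Relators; the full list is at
# http://id.loc.gov/vocabulary/relators.html
RELATORS = {
    "aut": "Author",
    "aui": "Author of introduction, etc.",
    "cll": "Calligrapher",
    "ilu": "Illuminator",
    "scr": "Scribe",
}


def resolve_relator(role):
    """Accept a relator code or a relator title and return (code, title)."""
    text = role.strip()
    if text.lower() in RELATORS:
        code = text.lower()
        return code, RELATORS[code]
    for code, title in RELATORS.items():
        if title.lower() == text.lower():
            return code, title
    raise ValueError(f"not found in the relator excerpt: {role!r}")


def contributor_lines(contributors):
    """Turn (name, role) pairs into numbered contributor and role lines."""
    lines = []
    for number, (name, role) in enumerate(contributors, start=1):
        code, title = resolve_relator(role)
        lines.append(f"Contributor {number}: {name}")
        lines.append(f"Contributor Role {number}: {code} [{title}]")
    return lines


for line in contributor_lines([
    ("Qalqashandī, Aḥmad ibn ʻAlī, 1355 or 1356-1418", "aut"),
    ("Ibrāhīm, Muḥammad ʻAbd al-Rasūl", "Author of introduction, etc."),
]):
    print(line)
```

Names are passed through unchanged here; checking them against an authority such as VIAF is a separate step that this sketch does not attempt.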
viaf.org
Contributor 1: Qalqashandī, Aḥmad ibn ʻAlī,
1355 or 1356-1418
Contributor Role 1: aut [Author]
ID and permalink
http://id.loc.gov/vocabulary/relators.html
Contributor 2: Ibrāhīm, Muḥammad ʻAbd al-Rasūl
Contributor Role 2: aui [Author of introduction, etc.]
http://id.loc.gov/vocabulary/relators.html
Relator code
Definition of term, if needed
Element: Publisher
• Name of publisher
• Place of publication
– Not a linked field on the site; simply informational.
– Only applicable for printed or published items, not
manuscripts or drawings.
– Scanning institution is not the publisher.
– Mapping: MARC field 260 $b and $a; Dublin Core
Publisher (place not included); MODS Origin Info with
subelements place and publisher
Element: Date Created
• Dates (Western/Gregorian) of creation, completion, or
printing of this physical item.
– Could be a single year, month, day, or a range. Be as
exact as possible.
– Use numerical dates in year/month/day format.
Example: 1888/4/1
– Express centuries as numerical date ranges.
Example: 19th century: 1801-1900
– If this is a photograph of a physical item, use creation
date of the item, not the photograph.
– Mapping: MARC field 260 $c or $g; Dublin Core Date
(although no indication of what the date means unless
you use a subfield); MODS Origin Info with
subelement Date Created
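The two formatting rules above (year/month/day order, and centuries as numerical ranges) can be sketched in a few lines of Python. The function names are hypothetical, and the sketch only handles Gregorian dates AD.

```python
import datetime


def format_date(year, month=None, day=None):
    """Return a numerical date in year/month/day order, e.g. '1888/4/1'."""
    if day is not None and month is None:
        raise ValueError("a day needs a month")
    if month is not None and day is not None:
        datetime.date(year, month, day)  # raises ValueError for impossible dates
    elif month is not None and not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    parts = [year, month, day]
    return "/".join(str(part) for part in parts if part is not None)


def century_range(century):
    """Return the year range for an ordinal century AD, e.g. 19 -> '1801-1900'."""
    if century < 1:
        raise ValueError("only centuries AD are handled in this sketch")
    start = (century - 1) * 100 + 1
    end = century * 100
    return f"{start}-{end}"


print(format_date(1888, 4, 1))   # 1888/4/1
print(century_range(19))         # 1801-1900
```

The century arithmetic follows the example above: the Nth century runs from year (N-1)*100+1 through year N*100.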
Title in original language: صبح الاعشى في صناعة الإنشاء
Title:
Dawn for the Blind in the Craft of Composition
Original title language: ara [Arabic]
Language: ara [Arabic]
Contributor 1: Qalqashandī, Aḥmad ibn ʻAlī,
1355 or 1356-1418
Contributor Role 1: aut [Author]
Contributor 2: Ibrāhīm, Muḥammad ʻAbd al-Rasūl
Contributor Role 2: aui [Author of introduction,
etc.]
Publisher: Dār al-Kutub al-Miṣrīyah and Al-Matba’ah al-Amiriyah
Place of Publication: Cairo
Date created: Around 1913-1922
Date of printing of this set of books.
Note that this item is a reprint, so the
creation of this physical item is more
recent than the original date of creation.
Date of cave painting’s
creation, not
photograph’s creation
Element: Time
• Temporal subject of the resource.
– May be the same as or different than creation date;
same formatting applies.
– If work has no specific temporal subject (math,
science, language, religious material), we use life
dates of author (if known) or date of creation (if
known). These items still exist within the context of
the time period they were created. Even if they are
not “about” a time period, they are “of” the time.
– If item is a reprint, use original production dates or
author’s life dates, if known.
– Mapping: MARC field 045 and 6XX $y; Dublin Core
Coverage; MODS Subject with subelement Temporal
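The fallback rules above could be expressed roughly as the function below. It is a sketch under assumptions: the function and parameter names are hypothetical, dates are passed as already formatted strings, and the order of preference within each rule (for example, author's life dates before date of creation) is this sketch's choice, since the rules above do not rank them.

```python
def subject_date(temporal_subject=None, author_life_dates=None,
                 date_created=None, original_production_dates=None,
                 is_reprint=False):
    """Pick a value for the Time element, or None if nothing is known."""
    # 1. A work that is "about" a period uses that period.
    if temporal_subject:
        return temporal_subject
    # 2. A reprint falls back to the original production dates, then to the
    #    author's life dates (never to the reprint's own creation date).
    if is_reprint:
        return original_production_dates or author_life_dates
    # 3. Otherwise use the author's life dates, then the date of creation.
    #    (Assumed order; the slide lists both without ranking them.)
    return author_life_dates or date_created


# A 20th-century reprint of a work whose original creation date is unknown.
print(subject_date(author_life_dates="1355-1418",
                   date_created="1913-1922",
                   is_reprint=True))
```

Returning `None` when nothing is known keeps the gap visible instead of filling the field with a guess.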
Original title: صبح الاعشى في صناعة الإنشاء
Title:
Dawn for the Blind in the Craft of Composition
Original title language: ara [Arabic]
Language: ara [Arabic]
Contributor 1: Qalqashandī, Aḥmad ibn ʻAlī,
1355 or 1356-1418
Contributor Role 1: aut [Author]
Contributor 2: Ibrāhīm, Muḥammad ʻAbd al-Rasūl
Contributor Role 2: aui [Author of introduction,
etc.]
Publisher: Dār al-Kutub al-Miṣrīyah and Al-Matba’ah al-Amiriyah
Place of Publication: Cairo
Date created: Around 1913-1922
Subject date: Around 1355-1418
No temporal subject so used the
author’s life dates, during which the
original work had to be created.
Actual date created unknown.
Although it deals with
events in the 1st century AD,
this is not a document
discussing history; it is a
religious document.
Used date of creation.
Date of cave painting’s
creation, not
photograph’s creation
Element: Place
• The geographic subject of the resource
– We use modern place names and a particular
geographic hierarchy: region, country, first-level
administrative division, and city.
– Geonames: http://www.geonames.org/
– If no geographic subject (religious works, science,
math, language), use region or country of origin of
author/creator or where that person primarily worked.
These items still exist in a geographic context, even if
they are not “about” a particular place.
– Mapping: MARC field 752; Dublin Core Coverage; MODS
Subject with subelement Hierarchical Geographic and
each place subelement defined
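The geographic hierarchy above (region, country, first-level administrative division, city) might be assembled as in this sketch. The function name is hypothetical, and the `>` separator simply follows the way hierarchies are written in this deck's examples.

```python
def place_hierarchy(region, country=None, admin_division=None, city=None):
    """Join the known levels, broadest first, stopping at the first missing one."""
    parts = []
    for level in (region, country, admin_division, city):
        if not level:
            break  # narrower levels are ignored once a level is missing
        parts.append(level)
    return ">".join(parts)


print(place_hierarchy("Middle East and North Africa", "Algeria",
                      "Constantine", "Constantine"))
print(place_hierarchy("East Asia", "China"))
```

Stopping at the first missing level means a work with no specific geographic subject can still be placed at the region or country level of its author.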
The Natural Arch, Constantine, Algeria
1899
Detroit Photographic Company
geonames.org
Middle East and North Africa>Algeria>Constantine>Constantine
A Sketch of the Islamic Law
Ma, Boliang, 1640–1711
Islamic scholar from China
geonames.org
East Asia>China
Element: Topic
• Dewey Decimal Classification:
http://bpeck.com/references/DDC/ddc.htm
• Site shows two topics to three digits; check site
• Dewey is not a perfect system; not all books or items fit
neatly into a single number.
• Some books or items can legitimately be catalogued in
several numbers.
• Mapping: MARC field 082; MODS Classification with
attributes defined as DDC
Element: Additional Subjects
• What or who is this item about?
• Use terms from controlled vocabularies
• How are similar items cataloged?
– Search on our site
– Check existing books or resources on WorldCat:
http://www.worldcat.org/
– OCLC SearchFast subject heading guide:
http://fast.oclc.org/searchfast/
– Library of Congress Authorities, subject headings and
names: http://authorities.loc.gov/
– Mapping: MARC field 650, 6XX, 653; Dublin Core
Subject; MODS Subject with subelements
Element: Type of Item
• Please check site for item types
– Available under Browse: Type of item
– Mapping: MARC field 008; Dublin Core Format; MODS Form
(subelement of Physical Description with defined
attribute Form Authority) or Type of Resource
Element: Notes
• Space for any additional information about item
– This information may end up in the description or be
cut, but it is helpful to have.
– We will likely not keep overly specialized information
because it will not make sense to or help our diverse
users.
– Mapping: MARC field 5XX; MODS Note with various
attributes defined
Element: Physical Description
• The physical description of the resource
– We are trying to describe the item to a user who will
never touch this item. Cataloging notation that may
confuse people will not be included in final record.
– Provide dimensions (inches, centimeters, feet,
meters), extent (pages, folios), and information about
the item (paper, ink, color, condition).
– No abbreviations
– Mapping: MARC field 5XX; MODS Note with various
attributes defined
10 final thoughts:
1. Don’t assume the recipient of the catalog record knows what
you know. Our translators and researchers are not
catalogers and even abbreviations can be confusing.
2. Always provide as much information as possible. Fill out
every field you can.
3. Names: If similar or confusing, provide multiple versions
with dates (often the only way of differentiating a name). Check
VIAF and give the ID if possible.
4. Creation dates: Even if it is a broad range, your best guess
is better than mine.
5. DDC: Presents challenges for Islamic content. However, we
can add additional subjects to showcase more complex
aspects of the work.
10 final thoughts, continued:
6. Description: Explain why this item is important to your
culture or country as if you were speaking to an outsider.
(See: number 2)
7. More background: Any informational links are helpful,
including Wikipedia links, links to online encyclopedias,
PDFs of academic articles, etc.
8. Translation: Translators can’t skip portions of a description
that they don’t understand or that are vague. If it can’t be clearly
described by the owning institution, a stranger will not be
able to make sense of it.
9. Our site (Arabic and English versions) is a good resource
for transliterations, translation, metadata, and overall
description of items.
10 final thoughts, continued:
10. All of us at WDL strive to enable our users to more easily
discover, use, and appreciate your wonderful contributions.
If you have minimal information on an item, please
consider sending a different item. Your metadata always
matters.
Your content is a gift to the world. Make sure you’ve provided
enough information to make it worth giving.
Questions? Comments?
Erin Hawkins
ehawkins@loc.gov
www.wdl.org
@WDLorg