
BIBFRAME: A Manual for Understanding Version 2.0 and Related Tools

Non-Latin Scripts in BIBFRAME

The MARC Multiscript Model A technique for encoding non-Latin script bibliographic metadata is used in current MARC cataloging. Model A was developed at a time when many library systems could not handle data in multiple scripts. To accommodate MARC users whose systems could not handle non-Latin scripts, Model A (linked 880 fields) allows the non-Latin data for any field to be segregated into field 880. System limitations have since eased dramatically. Most library systems currently in use in research libraries are assumed to be Unicode-based and able to handle multiple scripts. Today the main limitations are the lack of integrated Input Method Editors (IMEs) for creating and searching data in some scripts, and the inability of some Integrated Library Systems (ILSs) to use non-Latin scripts in modules such as acquisitions and circulation.
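The Model A linkage can be illustrated with a short sketch. In MARC 21, a regular field and its 880 counterpart are tied together through subfield $6: the regular field carries a value such as 880-01, and the 880 field carries the tag of the regular field plus the same occurrence number, such as 245-01, optionally followed by a script code. The Python below is a minimal sketch under simplifying assumptions, not a full MARC parser: the Field type, the helper names, and the sample Thai title are all invented for illustration, and each field is reduced to one value per subfield code.

```python
from typing import NamedTuple


class Field(NamedTuple):
    """Simplified, hypothetical stand-in for a MARC variable field."""
    tag: str
    subfields: dict  # subfield code -> value (one value per code, for brevity)


def parse_linkage(value: str):
    """Split a $6 value such as '880-01' or '245-01/(3/r' into (tag, occurrence)."""
    head = value.split("/", 1)[0]          # drop script and orientation codes
    tag, _, occurrence = head.partition("-")
    return tag, occurrence


def pair_linked_fields(fields):
    """Return (regular field, 880 field) pairs matched through subfield $6."""
    alternates = {}
    for field in fields:
        if field.tag == "880" and "6" in field.subfields:
            linked_tag, occurrence = parse_linkage(field.subfields["6"])
            if occurrence != "00":         # 00 means the 880 has no partner
                alternates[(linked_tag, occurrence)] = field

    pairs = []
    for field in fields:
        if field.tag != "880" and "6" in field.subfields:
            _, occurrence = parse_linkage(field.subfields["6"])
            partner = alternates.get((field.tag, occurrence))
            if partner is not None:
                pairs.append((field, partner))
    return pairs


# Invented sample data: a romanized title and its Thai-script 880 counterpart.
record = [
    Field("245", {"6": "880-01", "a": "Prawattisāt Thai"}),
    Field("880", {"6": "245-01", "a": "ประวัติศาสตร์ไทย"}),
    Field("264", {"a": "Krung Thēp"}),
]

pairs = pair_linked_fields(record)
for regular, alternate in pairs:
    print(regular.tag, regular.subfields["a"], "<->", alternate.subfields["a"])
```

Field 264 in the sample has no $6 and is therefore left unpaired, which mirrors how Model A records can mix linked and unlinked fields.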

The MARC Multiscript Model B technique for encoding non-Latin script bibliographic metadata is used in current name authority (NACO/LC-NACO Authority File) MARC cataloging. Model B, which allows non-Latin script data to be recorded in most MARC 21 variable fields, is already used by many library systems. The Library of Congress began adding non-Latin data to name authority records in 2008, following the Model B approach, for cross references only. The 1XX authorized access point in name authority records is provided only in the Latin script. The linkage between parallel fields containing the original non-Latin script bibliographic data and transliterations of that same data does not benefit meaningfully from the pairing of linking subfields that is currently part of Model A.

Experimentation with non-Latin Scripts in BIBFRAME

In the BIBFRAME Pilots, experimentation is taking place with the bibliographic description of materials in non-Latin scripts. The results of this experimentation will be analyzed as the pilots progress.

BIBFRAME Pilot participants working with non-Latin resources are providing minimal romanization in the description of those resources. Access points are romanized, but other parts of the bibliographic description are recorded in the script of the resource.

Access points are:

  • Creators (1XX field in MARC)
  • Preferred Title (240 field in MARC; when no 240 field is present, the 245 field may represent both a Preferred Title and a Title Proper)
  • Genre/Form (655 in MARC)
  • Subjects (6XX in MARC)
  • Contributors (7XX in MARC)
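The list above can be read as a simple rule for deciding which MARC fields correspond to romanized access points. The Python below is a rough sketch of that rule, assuming three-digit tag strings. The function name and the returned labels are hypothetical, and the treatment of 245 is one reading of the parenthetical note about the Preferred Title, not a statement of pilot policy.

```python
from typing import Optional


def access_point_role(tag: str, record_has_240: bool = False) -> Optional[str]:
    """Return the access point role for a MARC tag, or None if it is not one.

    Simplification: every 6XX and 7XX tag is treated alike, as in the list,
    although some 7XX fields in real records serve other purposes.
    """
    if len(tag) != 3 or not tag.isdigit():
        return None
    if tag == "240":
        return "Preferred Title"
    if tag == "245":
        # Without a 240, the 245 may carry the Preferred Title
        # as well as the Title Proper.
        return None if record_has_240 else "Preferred Title / Title Proper"
    if tag == "655":
        return "Genre/Form"
    if tag.startswith("1"):
        return "Creator"
    if tag.startswith("6"):
        return "Subject"
    if tag.startswith("7"):
        return "Contributor"
    return None


# Classify the tags of a hypothetical record that has no 240 field.
tags = ["100", "245", "264", "650", "655", "700"]
roles = {tag: access_point_role(tag) for tag in tags}
for tag, role in roles.items():
    print(tag, role or "not an access point (kept in the script of the resource)")
```

Fields for which the function returns None, such as 264 here, are the parts of the description that pilot participants record in the script of the resource.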

Examples for Southeast Asian Scripts: Thai, Lao, and Burmese

Thai Monograph: Instance

Screenshot showing transcribed instance descriptive metadata in original Thai script

Database View

BIBFRAME Database view of the Thai resource

Lao Monograph: Instance

Screenshot showing transcribed instance descriptive metadata in original Lao script

Database View

BIBFRAME database view of Lao resource

Burmese Monograph: Instance

Screenshot showing transcribed descriptive metadata in Burmese script

Database View

BIBFRAME database view of Burmese resource