Multiscript Information Processing on Crossroads: Demands for Shifting from Diverse Character Code Sets to the Unicode Standard in Library Applications

Publication Title

IFLA Journal

library, information processing, coding, standards, implementation, China, information representation, Asia


Library and Information Science


An essential component of any library application is an encoding method that allows computers to process the characters and symbols used to represent language in written form. For years, encoding mechanisms were not developed under a unified framework, nor did they serve all languages equally. Without a standard unified character code, users must rely on different software and terminals to display or enter data in different languages, especially when dealing with more than a few scripts. The development of the Unicode Standard is a milestone in international computing because it supports the creation of global software that can be easily adapted to local needs. This is good news for library professionals. However, as the authors have observed, implementation of the Unicode Standard has not received full attention or strong support from some library communities, such as the Chinese library communities in Asia (Mainland China, Hong Kong, and Taiwan, as well as other multilingual regions and countries in Asia), which present special and unique issues for multiscript information processing. The purpose of this paper is to analyze and explain those issues to both librarians and Unicode developers in order to encourage the shift from diverse character code sets to the Unicode Standard in library applications. The authors focus on what they perceive to be obstacles to using the Unicode Standard in library applications. While the examples and observations come from current CJK (Chinese, Japanese, Korean) information processing practice, librarians from other regions, especially third-world countries, may find them of interest. The authors believe that Unicode is the best solution for truly multiscript processing in library applications; however, librarians worldwide need to work together with the Unicode Consortium and ISO on the further development and implementation of such a unified character set.