- Universal Standard Book Code
The increased use of computers in handling bibliographic data, and the accumulation of collections running into millions of items, will mean less and less involvement of the human element in processes such as manual key allocation and quality control. This trend is now established, at least among computer professionals, and it is accepted as an axiom that the more human involvement is eliminated from the internal technical retrieval mechanisms of an information system, the more successful and error-free the system will be. Our interest here is the automatic control of large collections of database records, with particular emphasis on unique identification and quality control.

Today, the identification and control of bibliographic items is primarily based on an arbitrarily allocated key which accompanies the corresponding record throughout its processing history. Typical keys are the ISBN (International Standard Book Number) and the ISSN (International Standard Serial Number). The USBC (Universal Standard Book Code), by contrast, is generated automatically from pertinent bibliographic data elements, independent of centralised bodies such as the SBN (Standard Book Number) agency.

USBC Criteria
The USBC is an alphanumeric code produced by an algorithm that does not require any a priori information about the bibliographic item. The universality of the code implies that it can be regenerated at any time and in any part of the world by an algorithm which conforms to the following criteria:

# Unique items receive unique codes.
# The algorithm is independent of source input.
# The code is as short as possible.
# The algorithm is easy to implement.
# The code is regenerable so that the same code is derived for the same item at different times.
# The code can be fixed or variable in length, depending on the operational requirements for record identification.
# It is possible to verify the code manually.

Theoretical Basis
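As a concrete anchor for this section, the Python sketch below first checks numerically that Shannon entropy is largest when all symbols are equally likely, and then builds a toy 7-character key from the rarest characters of a title. It is an illustration under stated assumptions, not the published USBC algorithm: the names `shannon_entropy` and `toy_key`, the three-title sample corpus, and the alphabetical tie-breaking rule are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(probabilities):
    """Shannon entropy in bits: H = -sum(p * log2(p)) over non-zero p."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Four symbols: a skewed distribution versus an equiprobable one.
skewed = [0.7, 0.1, 0.1, 0.1]
uniform = [0.25, 0.25, 0.25, 0.25]
print(shannon_entropy(skewed) < shannon_entropy(uniform))  # uniform is larger


def toy_key(text, corpus_counts, length=7):
    """Hypothetical derived key: up to `length` of the rarest characters in `text`.

    Rarity is judged against `corpus_counts`. Ties are broken alphabetically
    so that the same input always regenerates the same key.
    """
    chars = {c for c in text.upper() if c.isalnum()}
    ranked = sorted(chars, key=lambda c: (corpus_counts.get(c, 0), c))
    return "".join(ranked[:length])


# Hypothetical mini-corpus standing in for real bibliographic frequency data.
corpus = ["A HISTORY OF ENGLAND", "THE ORIGIN OF SPECIES", "PRINCIPLES OF GEOLOGY"]
counts = Counter(c for title in corpus for c in title if c.isalnum())

key = toy_key("The Origin of Species", counts)
print(key)
```

Because the toy key depends only on the input text and the frequency table, any site holding the same table regenerates the same key, which is the regenerability property listed among the criteria above.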
The theoretical basis for the derivation of the code is sound, since it rests on well-established information theory. More specifically, a principle of information science states that the entropy of a set of symbols is maximised when the probability of occurrence of each becomes the same. The USBC algorithm utilises this principle to construct codes (keys) from pertinent fields in order to locate and retrieve unique records as well as clusters of records with lexically homogeneous information. The codes derived offer a very high discriminating strength of over 98% with the use of only 7 bytes per code, where each byte is selected from the least frequent characters found in pertinent bibliographic fields.

Research Information
The original research was carried out by Professor [http://www.aueb.gr/Users/yannakoudakis/english/index.htm E. J. Yannakoudakis] at the Postgraduate School of Computer Science, University of Bradford, W. Yorkshire, England, between 1975 and 1978. The project has received funding from the British Library, the Ministry of Education, the European Union and several other organisations.

References
* Yannakoudakis E. J., Ayres F. H. & Huggill J. A. W., Character coding for bibliographical record control, [http://comjnl.oxfordjournals.org/ Computer Journal], Vol. 23, No. 1, pp. 53-60, 1980.
* Yannakoudakis E. J., Derived search keys for bibliographic retrieval, Proc. 6th ACM Conference on Research and Development in Information Retrieval, Washington, SIGIR, pp. 220-237, USA, 5-12 June 1983.
* Ayres F. H., Huggill J. A. W. & Yannakoudakis E. J., The Universal Standard Bibliographic Code (USBC): Its use for cleaning, merging and controlling large databases, Program, Vol. 22, No. 2, pp. 117-132, 1988.
* Yannakoudakis E. J. & Wu A. K. P., Quasi-Equifrequent group generation and evaluation, [http://comjnl.oxfordjournals.org/ Computer Journal], Vol. 25, No. 2, pp. 183-187, 1982.
* Yannakoudakis E. J., A universal record identification scheme, Computer Bulletin, Vol. 2, No. 33, p. 20, September 1982.
* Yannakoudakis E. J., Intelligent matching and retrieval for electronic document manipulation, Text Processing & Document Manipulation, J. C. van Vliet (Ed.), [http://www.cambridge.org/ Cambridge University Press], pp. 65-77, April 1986.
* Yannakoudakis E. J. & Ridley M. J., The DOCMATCH Project: Automating document delivery by linking references to full text databases, Journal of Outlook on Research Libraries, Vol. 11, No. 9, pp. 3-7, 1989.
* Yannakoudakis E. J., A formal coding structure for database record processing, International Journal of Cybernetics & General Systems KYBERNETES, Vol. 18, No. 1, pp. 60-70, 1989.
* Yannakoudakis E. J., Ayres F. H. & Huggill J. A. W., An expert system for quality control in cataloguing and document identification, International Journal of Expert Systems for Information Management, Vol. 2, No. 2, pp. 119-139, 1989.
* Yannakoudakis E. J., Ayres F. H. & Huggill J. A. W., Matching of citations between non-standardized databases, International Journal of the American Society for Information Science, Vol. 41, No. 8, pp. 599-610, 1990.
* Yannakoudakis E. J. & Ridley M. J., DOCMATCH II: Automated linking between bibliographic and full-text databases, Bibliographic Access in Europe, Lorcan Dempsey (Ed.), Gower, pp. 232-240, 1990.
* Ayres F. H., Huggill J. A. W., Ridley M. J. & Yannakoudakis E. J., DOCMATCH: Automated input to ADONIS, Journal of Interlending and Document Supply, Vol. 18, No. 3, pp. 92-97, 1990.
* Ayres F. H., Ellis D., Huggill J. A. W. & Yannakoudakis E. J., The USBC and control of the bibliographic data base, Journal of Information Technology and Libraries, Vol. 1, No. 1, pp. 44-48, March 1982.
* Ayres F. H., Ellis D., Huggill J. A. W. & Yannakoudakis E. J., Coding for Union File Creation: A National Database, British Library Bibliographic Services Division, London, April 1984, ISBN 0-7123-1020-7 (Review In: Program, Vol. 19, No. 4, pp. 391-394, 1985).
Wikimedia Foundation. 2010.