GEDCOM

GEDCOM, an acronym for GEnealogical Data COMmunication, is a specification for exchanging genealogical data between different genealogy software. GEDCOM was developed by The Church of Jesus Christ of Latter-day Saints as an aid to genealogical research.

A GEDCOM file is plain text (usually either ANSEL or ASCII) containing genealogical information about individuals, and metadata linking these records together. Most genealogy software supports importing from or exporting to GEDCOM format, or both. However, some genealogy programs use proprietary extensions to the GEDCOM format, which are not always recognized by other genealogy programs. The [https://www.ngsgenealogy.org/ngsgentech/projects/TestBook2001/index.cfm GEDCOM TestBook Project] evaluates how well [https://www.ngsgenealogy.org/ngsgentech/projects/TestBook2001/sumchart.cfm popular genealogy programs] conform to the GEDCOM 5.5 standard. Additionally, many tools exist to convert GEDCOM files to HTML pages.

GEDCOM model

GEDCOM uses a lineage-linked data model. This data model is based on the nuclear family and the individual. This contrasts with evidence models, where data is structured to reflect the discovered and supporting evidence. In the GEDCOM lineage-linked data model, all data is structured to reflect the believed reality, that is, actual (or hypothesized) nuclear families and individuals.

Commsoft [http://sonic.net/~commsoft/rstory.html], the authors of the Roots series of genealogy software and Ultimate Family Tree, defined a version called [http://www.saintclair.org/ftp/pub/pafutils/eged10ww.zip Event GEDCOM] [http://archiver.rootsweb.com/th/read/TMG/2000-06/0962255126]. Although it is event-based, it is still a model built on assumed reality rather than evidence. Event GEDCOM was more flexible, as it allowed some separation between believed events and the participants. Roots and Ultimate Family Tree are no longer available, so very few people now use Event GEDCOM.

GEDCOM file structure

GEDCOM files are somewhat similar to MARC, an interchange format for bibliographic data.

A GEDCOM file consists of a header section, records, and a trailer section.

Records represent people (INDI records), families (FAM records), sources of information (SOUR records), and other miscellaneous records, including notes.

Every line of a GEDCOM file begins with a level number. All top-level records (HEAD, TRLR, SUBN, and each INDI, FAM, OBJE, NOTE, REPO, SOUR, and SUBM) begin with a line with level 0. All other level numbers are positive integers. Although it is theoretically possible to write a GEDCOM file by hand, the format was designed to be used with software and thus is not especially human-friendly. A [http://phpgedview.svn.sourceforge.net/viewvc/*checkout*/phpgedview/trunk/phpGedView/gedcheck.php GEDCOM validator] that can be used to validate the structure of a GEDCOM file is included as part of the PhpGedView project, though it is not meant to be a standalone validator.
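The line layout described above can be illustrated with a short Python sketch. It is not taken from the specification or from any particular program: the name `parse_line`, the `GedcomLine` tuple, and the regular expression are assumptions made for illustration, and the sketch ignores details such as line-length limits, continuation lines, and character encodings.

```python
import re
from typing import NamedTuple, Optional


class GedcomLine(NamedTuple):
    level: int            # 0 for top-level records, positive for nested lines
    xref: Optional[str]   # record ID such as "@I1@", only on some level-0 lines
    tag: str              # e.g. HEAD, INDI, NAME, FAMS
    value: str            # rest of the line, possibly empty


# Hypothetical pattern: level, optional @ID@, tag, optional value.
_LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@\s]+@)\s+)?(\w+)(?: (.*))?$")


def parse_line(text: str) -> GedcomLine:
    """Split one GEDCOM line into level, optional ID, tag, and value."""
    match = _LINE_RE.match(text.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not a GEDCOM line: {text!r}")
    level, xref, tag, value = match.groups()
    return GedcomLine(int(level), xref, tag, value or "")
```

For instance, `parse_line("0 @I1@ INDI")` yields level 0 with the ID `@I1@` and the tag `INDI`, while `parse_line("1 HUSB @I1@")` yields the tag `HUSB` with `@I1@` as its value, because a pointer that follows the tag is a value rather than a record ID.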

Example

The following is a sample GEDCOM file. The number that begins each line is its level, which indicates how deeply the line is nested.

The header (HEAD) includes the source program and version (Reunion, V8.0), the GEDCOM version (5.5), and the character encoding (MACINTOSH).

The individual records (INDI) define Bob Cox (ID 1, written @I1@), Joann Para (ID 2), and Bobby Jo Cox (ID 3).

The family record (FAM) links the husband (HUSB), wife (WIFE), and child (CHIL) by their ID numbers.

0 HEAD
1 SOUR Reunion
2 VERS V8.0
2 CORP Leister Productions
1 DEST Reunion
1 DATE 11 FEB 2006
1 FILE test
1 GEDC
2 VERS 5.5
1 CHAR MACINTOSH
0 @I1@ INDI
1 NAME Bob /Cox/
1 SEX M
1 FAMS @F1@
1 CHAN
2 DATE 11 FEB 2006
0 @I2@ INDI
1 NAME Joann /Para/
1 SEX F
1 FAMS @F1@
1 CHAN
2 DATE 11 FEB 2006
0 @I3@ INDI
1 NAME Bobby Jo /Cox/
1 SEX M
1 FAMC @F1@
1 CHAN
2 DATE 11 FEB 2006
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
1 CHIL @I3@
0 TRLR
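To show how the family record ties the individuals together, the following Python sketch groups the lines of a shortened copy of the sample into records and follows the HUSB, WIFE, and CHIL pointers. The function names (`read_records`, `first_value`, `describe_family`) are hypothetical, and the parsing is deliberately naive: it assumes well-formed lines separated by single spaces.

```python
SAMPLE = """\
0 HEAD
1 GEDC
2 VERS 5.5
0 @I1@ INDI
1 NAME Bob /Cox/
1 FAMS @F1@
0 @I2@ INDI
1 NAME Joann /Para/
1 FAMS @F1@
0 @I3@ INDI
1 NAME Bobby Jo /Cox/
1 FAMC @F1@
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
1 CHIL @I3@
0 TRLR
"""


def read_records(text):
    """Return {record ID: [(level, tag, value), ...]} for records with an ID."""
    records = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        parts = raw.split(" ", 2)
        level = int(parts[0])
        if level == 0:
            # "0 @I1@ INDI" carries an ID; "0 HEAD" and "0 TRLR" do not.
            if len(parts) == 3 and parts[1].startswith("@"):
                current = records.setdefault(parts[1], [(0, parts[2], "")])
            else:
                current = None
        elif current is not None:
            value = parts[2] if len(parts) > 2 else ""
            current.append((level, parts[1], value))
    return records


def first_value(record, tag):
    """Value of the first level-1 line with the given tag, or None."""
    for level, line_tag, value in record:
        if level == 1 and line_tag == tag:
            return value
    return None


def describe_family(records, fam_id):
    """Resolve the pointers of one FAM record to plain names."""
    def name(xref):
        # The slashes in "Bob /Cox/" mark the surname; drop them for display.
        return first_value(records[xref], "NAME").replace("/", "")

    fam = records[fam_id]
    return {
        "husband": name(first_value(fam, "HUSB")),
        "wife": name(first_value(fam, "WIFE")),
        "children": [name(v) for lvl, t, v in fam if lvl == 1 and t == "CHIL"],
    }
```

Calling `describe_family(read_records(SAMPLE), "@F1@")` follows `@I1@`, `@I2@`, and `@I3@` back to their INDI records and returns the three names.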

Versions

The current version of the specification is GEDCOM 5.5, which was released on 12 January 1996. A subsequent draft [http://phpgedview.sourceforge.net/ged551-5.pdf GEDCOM 5.5.1] specification was issued in 1999, introducing nine new tags, including WWW, EMAIL and FACT, and adding UTF-8 as an approved character encoding. This draft has not been formally approved, but its provisions have been adopted in part by a number of genealogy programs.

As mentioned above, there was also a version (at least a beta version) of "[http://www.saintclair.org/ftp/pub/pafutils/eged10ww.zip Event GEDCOM]", which included events as first-class (zero-level) items. However, this has not been widely adopted, and the lineage-linked GEDCOM is still the de facto common denominator.

On January 23, 2002, a beta version of GEDCOM 6.0 was released for developers to study and begin to implement in their software. [http://www.familysearch.org/GEDCOM/GedXML60.pdf GEDCOM 6.0 specification] GEDCOM 6.0 was to be the first version to store data in XML format, and was to change the preferred character set from ANSEL to Unicode. (Uniform use of Unicode would allow for the usage of international character sets. An example is the storage of East Asian names in their original CJK characters, without which they could be ambiguous and of little use for genealogical or historical research.)

As of 2007, five years after the publication of the beta version of GEDCOM 6.0, no genealogical software supplier supports it, despite the inherent advantages of an extensible and portable language like XML and its multilingual Unicode support. The external links list several proposals for other XML-based genealogical interchange formats.

Limitations

One weakness of GEDCOM is that events cannot be shared among multiple people, except for a few predefined family events such as marriage (MARR). This means that if an event should be associated with multiple people, the data must be duplicated in each record. For example, if several people were listed on the same census, the census record (CENS) would have to be duplicated on each person referenced there. In practice, most genealogy programs would in any case require the user to enter such data separately for each person through their user interface.

The way GEDCOM handles places is also considered a weakness of the specification. Places are encoded as strings on the events. An example would look like the following:

1 BIRT
2 PLAC New York City, , New York, USA

Additional references to New York City are represented by additional strings, so changes (for example, to add the county or change the spelling) require changing every occurrence throughout the file. It also leads to duplication of information if geo-coding or other subrecords are added to the place.
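The consequence of storing places as plain strings can be sketched in Python: correcting one place means rewriting every PLAC line that carries the old string. The function `rename_place` is a hypothetical helper written for this illustration, and the county name in the usage note is only an example value.

```python
def rename_place(gedcom_text, old, new):
    """Replace the value of every PLAC line equal to `old`.

    Returns the rewritten text and the number of lines changed. Because
    GEDCOM has no shared place record, each occurrence is edited separately.
    """
    out = []
    changed = 0
    for line in gedcom_text.splitlines():
        parts = line.split(" ", 2)  # level, tag, value
        if len(parts) == 3 and parts[1] == "PLAC" and parts[2] == old:
            line = f"{parts[0]} PLAC {new}"
            changed += 1
        out.append(line)
    return "\n".join(out), changed
```

Applied to a file in which two people were born in `New York City, , New York, USA`, a call such as `rename_place(text, "New York City, , New York, USA", "New York City, New York County, New York, USA")` reports two changed lines, one per event.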

Sometimes the GEDCOM specification has been made purposefully flexible to support many ways of encoding data, particularly in the area of sources. This flexibility has led to a great deal of ambiguity and has produced the side effect that some genealogy programs which import GEDCOM do not import all of the data.

Myths

A common myth about GEDCOM is that it doesn't support multimedia such as photos, which are often attached to people. GEDCOM actually does support the linking of multimedia items as references to files on the file system. An example would look like this:

0 @I1@ INDI
1 NAME John /Doe/
1 OBJE @O1@
0 @O1@ OBJE
1 FILE johndoe.jpg
2 TITL John Doe at age 50

If this were encoded in a GEDCOM file named "doe.ged", then the file "johndoe.jpg" would need to be in the same directory as "doe.ged". The disadvantage of this media encoding appears when you want to transfer the GEDCOM file together with the attached multimedia, because the files have to be transferred separately. A possible solution is to use a ZIP program or other tool to archive the GEDCOM and the media into the same file.
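The archiving workaround can be sketched with Python's standard `zipfile` module. The function name `bundle` is hypothetical, and the sketch assumes the GEDCOM file is readable as UTF-8 (or plain ASCII) and that FILE values are paths relative to the GEDCOM file's directory; a file in ANSEL or another encoding would need different handling.

```python
import zipfile
from pathlib import Path


def bundle(gedcom_path, archive_path):
    """Write the GEDCOM file and the media files it references into one ZIP.

    Returns the names stored in the archive. FILE values that do not exist
    on disk (for example the header's own FILE line) are skipped.
    """
    gedcom_path = Path(gedcom_path)
    base = gedcom_path.parent

    # Collect the value of every FILE line, whatever its level.
    media = []
    for line in gedcom_path.read_text(encoding="utf-8").splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[1] == "FILE":
            media.append(parts[2])

    written = [gedcom_path.name]
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(gedcom_path, gedcom_path.name)
        for name in media:
            source = base / name
            if name not in written and source.is_file():
                zf.write(source, name)
                written.append(name)
    return written
```

With "doe.ged" and "johndoe.jpg" in the same directory, `bundle("doe.ged", "doe.zip")` produces a single archive that can be transferred in one step and unpacked so that the relative file references still resolve.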

Another myth is that GEDCOM does not support multiple opinions or conflicting data. Sometimes in genealogy, it is desirable to record conflicting information. An example of such conflicting information would be a birth certificate listing a birth date as 10 January 1800 and a death certificate listing the birth date as 11 January 1800. Such mistakes are common, whether from transcription errors or because the relative giving the information misremembered. Good practice in genealogy research requires that the researcher record both birth records along with the source citation information. GEDCOM does not prevent you from recording two birth (BIRT) records for a single person. When the same kind of record occurs more than once, the preferred one should be listed first. This example encoded in GEDCOM might look like this:

0 @I1@ INDI
1 NAME John /Doe/
1 BIRT
2 DATE 10 JAN 1800
2 SOUR @S1@
3 DATA
4 TEXT Transcription from birth certificate would go here
3 NOTE This birth record is preferred because it comes from the birth certificate
3 QUAY 2
1 BIRT
2 DATE 11 JAN 1800
2 SOUR @S2@
3 DATA
4 TEXT Transcription from death certificate would go here
3 QUAY 2
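A reader of such a record can apply the "preferred record comes first" convention with very little code. The Python sketch below uses a shortened copy of the record; the function `birth_dates` is hypothetical and assumes well-formed lines separated by single spaces.

```python
RECORD = """\
0 @I1@ INDI
1 NAME John /Doe/
1 BIRT
2 DATE 10 JAN 1800
2 SOUR @S1@
3 QUAY 2
1 BIRT
2 DATE 11 JAN 1800
2 SOUR @S2@
3 QUAY 2
"""


def birth_dates(record_text):
    """Return the DATE of every BIRT event, in file order."""
    dates = []
    in_birth = False
    for line in record_text.splitlines():
        parts = line.split(" ", 2)
        level, tag = int(parts[0]), parts[1]
        if level <= 1:
            # A new level-0 or level-1 line ends the previous event.
            in_birth = level == 1 and tag == "BIRT"
        elif in_birth and level == 2 and tag == "DATE":
            dates.append(parts[2])
    return dates
```

Here `birth_dates(RECORD)` returns both dates, and by the convention described above the first element, `10 JAN 1800`, is the preferred one; the second remains available as the conflicting alternative.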

There is room for improvement in the area of source citations which could be used to make this clearer. The quality of assessment field (QUAY) for example could be extended to use something besides just a numerical level 0-3.

Another myth about GEDCOM is that it does not support internationalized names. In the same way that you can have multiple events on a person, GEDCOM allows you to have multiple names for a person. If the GEDCOM file is encoded in Unicode or UTF-8, then you may also record the name in Cyrillic, Hebrew, or any other alphabet. GEDCOM's NAME field also supports a phonetic variation (FONE) and a romanized variation (ROMN) of the name.

See also

*FamilySearch
**Ancestral File Number
**International Genealogical Index
*Genealogical numbering systems
*Genealogy software

External links

* [http://www.familysearch.org/Eng/Home/FAQ/faq_gedcom.asp Specifications] (from the FamilySearch website)
** [http://www.familysearch.org/GEDCOM/GEDCOM55.EXE GEDCOM 5.5 Standard] (Executable file in Envoy format)
** [http://www.familysearch.org/GEDCOM/GedXML60.pdf Draft Specification for GEDCOM XML 6.0] (PDF)
** [http://homepages.rootsweb.com/~pmcbride/gedcom/55gctoc.htm GEDCOM 5.5 specification] (Paul McBride's HTML version)
** [http://www.saintclair.org/ftp/pub/pafutils/eged10ww.zip Event GEDCOM] (Microsoft Word document in a ZIP file)
*Commentary
** [http://www.eogen.com/GEDCOM Overview of GEDCOM and its uses] on Encyclopedia of Genealogy
** [http://www.cyndislist.com/gedcom.htm Cyndi's List — GEDCOM]
** [http://msdn.microsoft.com/msdnmag/issues/04/05/XMLFiles/ Mapping GEDCOM to XML] , an article with example software in the C# programming language.
** [https://www.ngsgenealogy.org/ngsgentech/projects/TestBook2001/index.cfm GEDCOM TestBook Project]
** [https://www.ngsgenealogy.org/ngsgentech/projects/Gdm/Gdm.cfm The GENTECH Genealogical Data Model]
** [http://www.ancestry.com/learn/library/article.aspx?article=3438 on LDS Church's Adoption of the XML Standard]
** Tim Forsythe's [http://www.rumblefische.com/util/validator/tgv.html The Windows GEDCOM Validator 2.0]
** [http://www.legacyfamilytree.com/tipsGEDCOMfiles.asp How to export GEDCOM]
*XML-based genealogy interchange proposals
**Cosoft's [http://cosoft.org/genxml/ GenXML]
**Cover Page [http://xml.coverpages.org/genealogy.html survey] of XML-based genealogy formats
**GRAMPS genealogy software offers the .gramps XML format for wider use
**Michael Kay's [http://users.breathe.com/mhkay/gedml/index.html GedML]
**Tim Forsythe's [http://www.rumblefische.com/util/grendl/grendl11.html Genealogical Record Exchange and Description Language (GREnDL 1.1)]


Wikimedia Foundation. 2010.
