Data Format Description Language

Data Format Description Language (DFDL, often pronounced daff-o-dil) is a modeling language from the Open Grid Forum for describing general text and binary data. A DFDL model or schema allows any text or binary data to be read (or "parsed") from its native format and to be presented as an instance of an information set. The same DFDL schema also allows data to be taken from an instance of an information set and written out (or "serialized") to its native format.

DFDL achieves this by building upon the facilities of W3C XML Schema 1.0. A subset of XML Schema is used, enough to enable the modeling of non-XML data. One result is that it is very easy to use DFDL to convert general text and binary data, via a DFDL information set, into a corresponding XML document.

DFDL is descriptive and not prescriptive. DFDL is not a data format, nor does it impose the use of any particular data format. DFDL allows an application to design an appropriate data representation according to its requirements, and for that format to be described in a standard way so that multiple programs can directly interchange the data.

History

DFDL was created in response to a need for grid APIs to be able to understand data regardless of source. This required a language capable of modeling a wide variety of existing text and binary data formats. A working group was established at the Global Grid Forum (which later became the Open Grid Forum) in 2003 to create a specification for such a language.

A decision was made early on to base the language on a subset of W3C XML Schema, using <xs:appinfo> annotations to carry the extra information necessary to describe non-XML physical representations. This is an established approach that is already being used today in commercial systems. DFDL takes this approach and evolves it into an open standard capable of describing many text or binary data formats.

Work continued on the specification, culminating in the publication of DFDL 1.0 [1] as an OGF Proposed Recommendation in January 2011. A summary of DFDL and its features is available at the OGF site.

Implementations of DFDL processors that can parse and serialize data using DFDL schemas are in progress. Any issues with the specification that are encountered during implementation work are being tracked in an errata document.

Example

Take as an example the following text data stream which gives the name, age and location of a person:

Joe Bloggs,46,Hampshire,England


The logical model for this data can be described by the following fragment of an XML Schema document. The order, names, types and cardinality of the fields are expressed by the XML schema model.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" ...>
 
<xs:complexType name="person_type">
  <xs:sequence>
    <xs:element name="name" type="xs:string"/>
    <xs:element name="age" type="xs:short"/>
    <xs:element name="county" type="xs:string"/>
    <xs:element name="country" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
 
</xs:schema>


To additionally model the physical representation of the data stream, DFDL augments the XML schema fragment with annotations on the xs:element and xs:sequence objects, as follows:

<xs:schema xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/" xmlns:xs="http://www.w3.org/2001/XMLSchema" ...>
 
<xs:complexType name="person_type">
  <xs:sequence>
    <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:sequence encoding="ASCII" sequenceKind="ordered" 
                       separator="," separatorPosition="infix" separatorPolicy="required"/>
    </xs:appinfo></xs:annotation>
    <xs:element name="name" type="xs:string">
      <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:element lengthKind="delimited" encoding="ASCII"/>                   
      </xs:appinfo></xs:annotation>
    </xs:element>
    <xs:element name="age" type="xs:short">
      <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:element representation="text" lengthKind="delimited" encoding="ASCII"
                      textNumberRep="standard" textNumberPattern="#0" textStandardBase="10"/>
      </xs:appinfo></xs:annotation>
    </xs:element>
    <xs:element name="county" type="xs:string">
      <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:element lengthKind="delimited" encoding="ASCII"/>                   
      </xs:appinfo></xs:annotation>
    </xs:element>
    <xs:element name="country" type="xs:string">
      <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:element lengthKind="delimited" encoding="ASCII"/>                   
      </xs:appinfo></xs:annotation>
    </xs:element>
  </xs:sequence>
</xs:complexType>
 
</xs:schema>


The property attributes on these DFDL annotations express that the data are represented in an ASCII text format with fields being of variable length and delimited by commas.
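
As a rough illustration, parsing the sample stream with this schema would yield an information set that could be written out as the XML document below. The root element name `person` is an assumption: the fragment above only defines `person_type`, so a global element of that type would have to be declared to give a processor a starting point. Indentation is added for readability.

```xml
<!-- Hypothetical result of parsing "Joe Bloggs,46,Hampshire,England".
     The root name "person" is assumed; it is not part of the original fragment. -->
<person>
  <name>Joe Bloggs</name>
  <age>46</age>
  <county>Hampshire</county>
  <country>England</country>
</person>
```

Serializing this document with the same schema would be expected to write the original comma-separated line back out.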

An alternative, more compact syntax is also provided, where DFDL properties are carried as non-native attributes on the XML Schema objects themselves:

<xs:schema xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/" xmlns:xs="http://www.w3.org/2001/XMLSchema" ...>
 
<xs:complexType name="person_type">
  <xs:sequence dfdl:encoding="ASCII" dfdl:sequenceKind="ordered" 
               dfdl:separator="," dfdl:separatorPosition="infix" dfdl:separatorPolicy="required">
    <xs:element name="name" type="xs:string"
                dfdl:lengthKind="delimited" dfdl:encoding="ASCII"/>                   
    <xs:element name="age" type="xs:short"
                dfdl:representation="text" dfdl:lengthKind="delimited" dfdl:encoding="ASCII"
                dfdl:textNumberRep="standard" dfdl:textNumberPattern="#0" dfdl:textStandardBase="10"/>
    <xs:element name="county" type="xs:string"
                dfdl:lengthKind="delimited" dfdl:encoding="ASCII"/>                   
    <xs:element name="country" type="xs:string"
                dfdl:lengthKind="delimited" dfdl:encoding="ASCII"/>                   
  </xs:sequence>
</xs:complexType>
 
</xs:schema>
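
Because the logical model and the physical representation are described separately, the same four fields can be given a different layout by changing only the DFDL properties. The sketch below is a hypothetical variant and not part of the original example: it assumes fixed-width, space-padded fields with no separators, and the type name `person_fixed_type` and the widths (20, 3, 15 and 15 characters) are invented for illustration. It is not a complete schema; a real processor would likely require further properties to be set.

```xml
<xs:schema xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<!-- Hypothetical fixed-width variant of person_type: no separators,
     every field padded with spaces to an assumed width. -->
<xs:complexType name="person_fixed_type">
  <xs:sequence dfdl:sequenceKind="ordered">
    <!-- Strings are left-justified and padded on the right. -->
    <xs:element name="name" type="xs:string"
                dfdl:lengthKind="explicit" dfdl:length="20" dfdl:lengthUnits="characters"
                dfdl:textStringJustification="left" dfdl:textStringPadCharacter="%SP;"
                dfdl:textPadKind="padChar" dfdl:textTrimKind="padChar"
                dfdl:encoding="ASCII"/>
    <!-- The number is right-justified and padded on the left. -->
    <xs:element name="age" type="xs:short"
                dfdl:representation="text"
                dfdl:lengthKind="explicit" dfdl:length="3" dfdl:lengthUnits="characters"
                dfdl:textNumberJustification="right" dfdl:textNumberPadCharacter="%SP;"
                dfdl:textPadKind="padChar" dfdl:textTrimKind="padChar"
                dfdl:textNumberRep="standard" dfdl:textNumberPattern="#0"
                dfdl:encoding="ASCII"/>
    <xs:element name="county" type="xs:string"
                dfdl:lengthKind="explicit" dfdl:length="15" dfdl:lengthUnits="characters"
                dfdl:textStringJustification="left" dfdl:textStringPadCharacter="%SP;"
                dfdl:textPadKind="padChar" dfdl:textTrimKind="padChar"
                dfdl:encoding="ASCII"/>
    <xs:element name="country" type="xs:string"
                dfdl:lengthKind="explicit" dfdl:length="15" dfdl:lengthUnits="characters"
                dfdl:textStringJustification="left" dfdl:textStringPadCharacter="%SP;"
                dfdl:textPadKind="padChar" dfdl:textTrimKind="padChar"
                dfdl:encoding="ASCII"/>
  </xs:sequence>
</xs:complexType>

</xs:schema>
```

Under these assumptions each record would occupy 53 characters, and the information set produced by parsing would have the same shape as for the comma-separated form.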

Features

The goal of DFDL is to provide a rich modeling language capable of representing any text or binary data format. The 1.0 release is a major step towards this goal. The capability includes support for:

  • Data structures from languages such as COBOL, C and PL/I
  • Industry standards such as CSV, SWIFT, FIX, HL7, X12, HIPAA, EDIFACT, ISO 8583
  • Data delimited by text or binary markup
  • Physical data types including text strings, text numbers, binary two's complement integers, BCD, mainframe zoned and packed decimals, IEEE and mainframe floats, text and binary calendars, text and binary Booleans
  • Any encoding and endianness
  • Bi-directional text
  • Bit data of arbitrary length
  • Pattern languages for text numbers and calendars
  • Ordered and unordered content
  • Default values on parsing and serializing
  • Nil values capability for handling out-of-band data
  • A built-in expression language including variables to model dynamic data
  • Mechanisms to resolve choices and optionality
  • Fixed and variable arrays
  • Hiding elements in the data from the information set
  • Calculating element values for the information set
  • Validation to XML Schema 1.0 rules
  • A scoping mechanism that allows common property values to be applied at multiple annotation points
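
A few of these features can be seen together in a small sketch. The hypothetical schema below (the element names, the `NIL` marker and the record layout are all assumptions made for illustration) uses a schema-level `dfdl:format` annotation so that common properties are stated once and apply throughout the schema, marks one element as nillable with a literal nil value, and uses an expression to take the number of array occurrences from an earlier field. It is not a complete, processor-ready schema; properties that a real processor would probably require are omitted for brevity.

```xml
<xs:schema xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<!-- Scoping: properties set here apply to every annotation point in this
     schema unless they are overridden locally. -->
<xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
  <dfdl:format representation="text" encoding="ASCII" lengthKind="delimited"
               textNumberRep="standard" textNumberPattern="#0"/>
</xs:appinfo></xs:annotation>

<!-- Assumed sample data: Joe Bloggs,NIL,2,cat,dog -->
<xs:element name="owner">
  <xs:complexType>
    <xs:sequence dfdl:sequenceKind="ordered"
                 dfdl:separator="," dfdl:separatorPosition="infix">
      <xs:element name="name" type="xs:string"/>
      <!-- Nil value: the literal text NIL stands for "age not known". -->
      <xs:element name="age" type="xs:short" nillable="true"
                  dfdl:nilKind="literalValue" dfdl:nilValue="NIL"/>
      <xs:element name="petCount" type="xs:unsignedInt"/>
      <!-- Variable array: the number of occurrences is read from petCount. -->
      <xs:element name="pet" type="xs:string" minOccurs="0" maxOccurs="unbounded"
                  dfdl:occursCountKind="expression"
                  dfdl:occursCount="{ ../petCount }"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

</xs:schema>
```

Stating the shared properties once in `dfdl:format` keeps the element declarations close to plain XML Schema, with local properties appearing only where an element differs from the common defaults.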

Future releases are expected to add support for:

  • Direct access by offset
  • True multi-dimensional arrays
  • Embedded comments
  • Custom language extensions


Wikimedia Foundation. 2010.
