What Can You Do With XML Today?

Experimentation with generalized markup began in the late 1960s and 1970s at places like IBM and within the secondary publishing industry. For a brief period in the 1980s, the file loading format of Dialog Information Services, Dialog format b, became a de facto markup standard for secondary publishers. As generalized markup developments moved from the labs to the standards arena and started to become meta languages, three basic parts of a generalized markup language (GML) emerged. One, a GML needs a declaration at the start of a document. In both XML examples above, the opening line denotes the markup language to be invoked, XML version 1.0. There are many GMLs in use, and within any given GML there are many implementations. Providing a GML declaration gives the document user, either machine or human, the information needed to begin decoding the markup.

Two, a GML relies on a document type definition (DTD) or schema. Written using the rules and syntax of the meta language, the DTD is the set of instructions needed to mark up an actual document. The DTD describes the tags, their meaning, and how to use them. It is the DTD that says the publication date in the document will be encoded using <pubdate> and </pubdate>. Question: Are the angle brackets, “<” and “>”, required? (The answer is posted to www.accessinn.com and www.dataharmony.com.) The DTD is necessary because the labels, or tags, are arbitrary. Publication date can just as well be defined and marked up as <publicationdate> or <PD>. Many design issues and processing instructions are articulated in a DTD or schema. The two examples above are much easier to process, by machine or human, if accompanied by a DTD.
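
To make the point about arbitrary tags concrete, here is a minimal sketch in Python. It assumes two hypothetical vocabularies that encode the same publication date as <pubdate> and as <PD>. The sample documents, the date value, the vocabulary names, and the lookup table are all invented for illustration; the table only stands in for the knowledge a real DTD or schema would supply.

```python
import xml.etree.ElementTree as ET

# Two hypothetical encodings of the same article; only the tag name differs.
# The date value is an invented sample, not taken from a real document.
doc_a = "<article><pubdate>2004-04-05</pubdate></article>"
doc_b = "<article><PD>2004-04-05</PD></article>"

# Stand-in for what a DTD or schema tells the reader:
# which tag carries the publication date in each vocabulary.
PUBDATE_TAG = {"vocabulary_a": "pubdate", "vocabulary_b": "PD"}


def publication_date(xml_text, vocabulary):
    """Return the publication date text, or None if the expected tag is absent."""
    root = ET.fromstring(xml_text)
    element = root.find(PUBDATE_TAG[vocabulary])
    return element.text if element is not None else None


date_a = publication_date(doc_a, "vocabulary_a")
date_b = publication_date(doc_b, "vocabulary_b")
wrong_guess = publication_date(doc_a, "vocabulary_b")
```

Both documents yield the same date once the reader knows which tag to look for. Guessing the wrong vocabulary finds nothing, which is the practical reason a document should travel with its DTD or schema.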

Three, a GML requires a “document instance.” A GML can be used to encode any digital object, but text-oriented digital files (documents) are currently very common, so “document” is used in conjunction with “instance” to refer to all GML objects. The document instance is your newspaper article about MLB together with the markup encoding it. When you have a document instance, you have an article ready for use by a GML tool such as a Web browser.
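
The three parts can be seen together in one small file. The following is a sketch only: the element names (article, headline, pubdate, body) and all of the sample text are assumptions made up for illustration, and the DTD is written inline as an internal subset to keep the example in one piece. Python's standard xml.etree.ElementTree module reads the document but does not validate it against the DTD, so a validating parser would be needed to enforce the rules.

```python
import xml.etree.ElementTree as ET

# A hypothetical document showing all three parts:
#   1. the declaration (first line),
#   2. the DTD (the <!DOCTYPE ...> block),
#   3. the document instance (the <article> element and its content).
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE article [
  <!ELEMENT article (headline, pubdate, body)>
  <!ELEMENT headline (#PCDATA)>
  <!ELEMENT pubdate (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<article>
  <headline>Sample MLB headline</headline>
  <pubdate>2004-04-05</pubdate>
  <body>Sample article text.</body>
</article>
"""

# Parse the instance; ElementTree accepts the DTD but does not check against it.
root = ET.fromstring(DOCUMENT)

# Collect each child element's tag and text in document order.
fields = {child.tag: child.text for child in root}

for tag, text in fields.items():
    print(f"{tag}: {text}")
```

Once parsed, the instance is ordinary structured data, and a tool can pull out the headline or the publication date by name rather than by position on the page.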

Continuing with the history of GML, the publishing industry pushed strongly for the development of a standard GML. Standard Generalized Markup Language (SGML) resulted. It is an ISO standard, ISO 8879:1986 (www.iso.org). SGML helped publishers and others wrench their data from proprietary editorial and photocomposition systems. The push came about as publishers recognized new revenue streams in digital content. Extracting their data from proprietary markup, markup used only for formatting, was nearly impossible. As publishers implemented SGML, photocomposition vendors developed import and export routines to handle it.
