Encoded Archival Description (EAD) is an XML-compliant DTD (Document Type Definition) for markup of archival finding aids—documents that describe archival and manuscript collections.
EAD documents are generally broken into two sections: a “top-level” set of elements that give a broad description of the collection as a whole, and a “subordinate” set of elements that describe the collection’s arrangement and contents more closely. Each element is surrounded by descriptive tagging. Specific tagged elements, such as titles, dates, scope-and-content notes, and physical descriptions, may occur at more than one level in the document’s XML structure, and some may recur within themselves. In other words, the whole range of parent-child-sibling relationships allowed for in XML is used in EAD.
Taken as a whole, EAD elements present both the intellectual arrangement—the organization of the collection into groups of related items—and the physical arrangement—the box numbers, folder numbers, call numbers, or shelf numbers—of the materials in the collection. The subordinate level of description may be broken into component levels, with each succeeding component being nested within the previous one. These are expressed as <c01>, <c02>, <c03>, etc., with <c01> tagging the broadest level of description, and each successive numbered component representing a narrower level.
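The nesting of numbered components can be illustrated with a short Python sketch. The fragment below, its titles, and the helper name outline are invented for illustration, and the fragment is not validated against the EAD DTD. The sketch uses only Python's standard xml.etree.ElementTree module and simply walks the components from broadest to narrowest.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a series containing a subseries containing a file.
SAMPLE = """
<dsc>
  <c01 level="series">
    <did><unittitle>Correspondence</unittitle></did>
    <c02 level="subseries">
      <did><unittitle>Business Letters</unittitle></did>
      <c03 level="file">
        <did><unittitle>Letters to publishers</unittitle></did>
      </c03>
    </c02>
  </c01>
</dsc>
"""


def is_numbered_component(tag):
    """True for tags shaped like c01, c02, c03, and so on."""
    return len(tag) == 3 and tag[0] == "c" and tag[1:].isdigit()


def outline(element, depth=0):
    """Return (depth, tag, title) tuples for every nested component."""
    rows = []
    for child in element:
        if is_numbered_component(child.tag):
            title = child.findtext("did/unittitle", default="")
            rows.append((depth, child.tag, title))
            # Narrower components are nested inside the broader one.
            rows.extend(outline(child, depth + 1))
    return rows


rows = outline(ET.fromstring(SAMPLE))
for depth, tag, title in rows:
    print("  " * depth + f"{tag}: {title}")
```

Each step down in the printed indentation corresponds to one step down in the component numbering, which is the same broad-to-narrow relationship described above.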
EAD finding aids can be very complicated documents. The EAD DTD allows for many different styles of organization and markup. When EAD documents are to share a common searching and delivery environment, standardization of markup becomes essential. At the same time, accommodating finding aids from various types of institutions requires some flexibility.
In EAD, elements of archival description are tagged using a standardized syntax. In keeping with XML standards, tags consist of an opening and closing pair, for example:
<unittitle>Guide to the Will Eisner Collection</unittitle>
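As a rough illustration of why the opening and closing pair matters, the following Python sketch parses the tagged title with the standard xml.etree.ElementTree module. The helper name is_well_formed is an assumption made for this example. It checks only XML well-formedness, not validity against the EAD DTD.

```python
import xml.etree.ElementTree as ET

TAGGED_TITLE = "<unittitle>Guide to the Will Eisner Collection</unittitle>"

# The parser exposes the element name and the text between the tag pair.
element = ET.fromstring(TAGGED_TITLE)
print(element.tag)
print(element.text)


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# A complete pair parses; a missing closing tag does not.
complete = is_well_formed(TAGGED_TITLE)
unclosed = is_well_formed("<unittitle>Guide to the Will Eisner Collection")
print(complete, unclosed)
```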
Commonly used elements include series titles, folder titles, dates, personal names, corporate names, geographical names, subjects, types of materials (formats), and numbers or codes identifying the physical locations of materials. Here are some examples:
<unittitle> The name given to an item or group of items in a finding aid. This could be, for example, a collection title, a series title, a folder title, or the name of an item.
<unitdate> The date or range of dates of an item or group of items (up to and including the whole collection). Can appear at different places in the finding aid, but most often follows <unittitle>.
<scopecontent> The scope and content of the collection. Can appear at any level of description, providing information about the collection as a whole, the scope of a particular series, or the contents of a single folder, for instance.
<bioghist> A biography of the individual, or history of the corporate body, that created the collection. Usually found at the top level, occasionally found at the series level.
<physloc> Physical location of the collection or part of the collection. Can be used at the top level, to designate the building or room number where the entire collection is stored, or at any other level, down to the shelf location of a single file of materials.
<container> The number of the container holding a file or item, such as a box or box-and-folder, map case, or microfilm reel.
<physdesc> The physical description of materials, such as their dimensions, quantity, format, medium, color, etc. Often contains additional tags such as <dimensions> or <extent>. Used most frequently in collections of audio and visual materials or realia. For other collections, generally used at the top level to record the extent of the collection in number of boxes, or linear or cubic feet.
<controlaccess> Controlled access terms are similar to subject headings in a library catalog record. Used at the top level, they present the names, subjects, and formats significant to the collection. They may also be used at other levels, though this is less common. The following elements are often nested within the <controlaccess> element:
<persname> Names of individuals.
<corpname> Names of corporate bodies.
<geogname> Names of geographical locations.
<genreform> Genre or format of materials. Examples: photographs, original art, diaries.
<subject> Subject headings taken from a controlled vocabulary, such as Library of Congress Subject Headings or the Art & Architecture Thesaurus.
(These elements may occur in other parts of the finding aid, as well.)
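To show how the controlled access elements listed above sit inside <controlaccess>, here is a minimal Python sketch that groups the terms by element name. The sample markup, the names and headings in it, and the helper name access_terms are all hypothetical, and the fragment is not validated against the EAD DTD.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical top-level description with invented access terms.
SAMPLE = """
<archdesc level="collection">
  <did>
    <unittitle>Example Family Papers</unittitle>
    <unitdate>1900-1950</unitdate>
    <physdesc><extent>12 boxes</extent></physdesc>
  </did>
  <controlaccess>
    <persname>Example, Jane</persname>
    <corpname>Example Publishing Company</corpname>
    <geogname>Columbus (Ohio)</geogname>
    <genreform>Photographs</genreform>
    <genreform>Diaries</genreform>
    <subject>Cartoonists</subject>
  </controlaccess>
</archdesc>
"""

# Only the five access elements described in the text are collected.
TERM_TAGS = {"persname", "corpname", "geogname", "genreform", "subject"}


def access_terms(archdesc):
    """Group controlled access terms by the element that tags them."""
    terms = defaultdict(list)
    for controlaccess in archdesc.iter("controlaccess"):
        for term in controlaccess:
            if term.tag in TERM_TAGS:
                terms[term.tag].append((term.text or "").strip())
    return dict(terms)


terms = access_terms(ET.fromstring(SAMPLE))
for tag, values in terms.items():
    print(f"{tag}: {'; '.join(values)}")
```

Grouping by element name mirrors how a search interface might offer separate browse lists for names, places, formats, and subjects.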
Below is a simplified version of the structure of a finding aid. It shows the various levels at which tagged elements may occur in the EAD DTD. In a real finding aid, there would likely be two or more <c01> tags, each containing multiple <c02> tags, each of them containing multiple <c03> tags.
<eadheader>
<filedesc> (Information about the finding aid: title, date of publication, name and address of the repository that created it)
<profiledesc> (Information about the creation of the finding aid: name of encoder, date of encoding, language of the finding aid, and any revisions made to it)
</eadheader> (information about the finding aid)
<archdesc level="collection"> (a wrapper element for information about the collection)
<did> (descriptive identification, contains core information at <archdesc> and <c> levels)
(At this point, the general description of the collection as a whole ends, and the detailed description of its parts begins. The detailed description is broken into components, tagged as <c01>, <c02>, etc., with each succeeding component representing a more granular level of description.)
<dsc> (description of subordinate components)
<container type="Box"></container><container type="Folder"></container>
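The overall shape outlined above can be sketched as a small, deliberately incomplete document and read with Python's standard xml.etree.ElementTree module. Everything in the sample is invented for illustration (the <ead> wrapper contents, titles, dates, and box and folder numbers), the helper name container_list is an assumption, and the markup is a skeleton that would not validate against the EAD DTD as written.

```python
import xml.etree.ElementTree as ET

# Hypothetical skeleton: header, collection-level description, one series, one file.
SAMPLE = """
<ead>
  <eadheader>
    <filedesc></filedesc>
    <profiledesc></profiledesc>
  </eadheader>
  <archdesc level="collection">
    <did>
      <unittitle>Example Collection</unittitle>
    </did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c02 level="file">
          <did>
            <container type="Box">1</container>
            <container type="Folder">3</container>
            <unittitle>Letters</unittitle>
            <unitdate>1942</unitdate>
          </did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def container_list(root):
    """List each described unit that has containers, with its box and folder."""
    rows = []
    for did in root.iter("did"):
        containers = {
            c.get("type"): (c.text or "").strip() for c in did.findall("container")
        }
        if containers:
            row = {"title": did.findtext("unittitle", default="")}
            row.update(containers)
            rows.append(row)
    return rows


root = ET.fromstring(SAMPLE)
level = root.find("archdesc").get("level")
rows = container_list(root)
print(level)
print(rows)
```

The <did> elements at the collection and series levels carry no <container> tags in this sample, so only the file-level entry appears in the resulting list.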