Below is an outline of the guidelines for encoding cookbooks in MSU's digitization project "Feeding America: The Historic American Cookbook Project."
Also available is the instruction manual written to train our cookbook coders. It explains the coding process in more depth than this outline does, and defines XML-related terms you may be unfamiliar with.
- Feeding America Coder Instruction Manual
The DTD for Feeding America is also available.
- Feeding America Cookbook.dtd
You may select a section for faster browsing:
- Formal Public Identifier for cookbook.dtd
- Preparing to encode
- The Wrapper Element <cookbook>
- Metadata
- Frontmatter and Backmatter
- The Body: Structural Elements
- The Body: Recipes and Formulas
- Low-level Elements for Recipes and Formulas
- "Class=" Values for Food Topics
- "Class=" Values for Nonfood Topics
- General Formatting Elements
- Target and ID Pairs
- External References
- Illustrations
- Editorial Notes
Formal Public Identifier for cookbook.dtd
A customized tagset has been devised for this project:
System identifier: "cookbook.dtd"
Public identifier: "-//Michigan State University, Digital & Multimedia Center//DTD -//cookbook 1.0//EN"
The DTD may be examined here. However, this is probably not the final version of the DTD. Contact Ruth Ann Jones, MSU Digital & Multimedia Center, for more information.
PREPARING TO ENCODE
The first set of cookbooks was typed using the basic instructions for Sunday School books, that is, using the TEI tagset instead of the customized Cookbook tagset. Because of this, all italicized text is tagged as <emph rend="italic">.
The cookbook DTD uses a slightly different model for indicating font shifts. <emph> still exists as an element, but should be used only for italics that indicate linguistic emphasis (as in this sentence). All low-level elements have attributes to indicate font shift, in addition to the descriptive content conveyed by the tag name:
| Attribute | Allowable values |
| align | center, right, indent1, indent2 |
| rend | bold, italic, ornate |
| size | larger, smaller (than surrounding text) |
| placement | heading (on a line by itself) or inline |
Without these values, it might be necessary to use two sets of tags on the same word or phrase, for example:
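A hypothetical illustration of the double tagging, assuming an italicized ingredient name (the phrase itself is invented for this sketch):

```xml
<!-- One element for the font shift, a second for the descriptive content -->
<emph rend="italic"><ingredient>madder root</ingredient></emph>
```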
With these values, the tagging is a bit more streamlined:
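A hypothetical illustration of the streamlined form, assuming an italicized ingredient name (the phrase itself is invented for this sketch):

```xml
<!-- The rend attribute carries the font shift; the tag name carries the meaning -->
<ingredient rend="italic">madder root</ingredient>
```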
The inherited <emph> tagging can be fixed with a global change if you determine that all instances of <emph rend="ital"> occur for the same reason: that is, to mark ingredients. Be cautious with global operations! If you make an error, correcting it can take longer than tagging the sections manually would have taken. Before you experiment with a global search-and-replace, always make a backup copy of the document.
THE WRAPPER ELEMENT <cookbook>
A <cookbook> consists of four sections:
| <meta> | metadata, expressed as Dublin Core elements: required |
| <front> | frontmatter: required |
| <body> | main body of book: required |
| <back> | backmatter: not required, but normally it will be used at least minimally to hold the <pb> reference for the back cover image |
ATTRIBUTES ESTABLISHED FOR THE WRAPPER ELEMENT <cookbook>
| type= | Required. This contains general categories which can characterize an entire cookbook, a chapter or section, or (infrequently) individual recipes or formulas. Allowed values are general, charity, famous, frugal, restaurant, invalid, histperiod, and encyclopedia. See Table 3 for definitions. |
| chefschool= | Optional, but should be used if a cookbook is identified as type=famous. Fill in the name of the chef or cooking school. |
| histperiod= | Optional, but should be used if a cookbook is identified as type=histperiod. Fill in the name of the historical period (such as "Temperance movement") as given in the cookbook. |
| class1= | Required. These are categories for foods and other types of activities described in the cookbooks. Allowed values are fruitveg, meatfishgame, eggscheesedairy, breadsweets, soups, seasoningmisc, beverages, generalfood, menus, medhealth, household, farmgarden, childrear, etiquette, restaurant, servants, generalnonfood, foodandnonfood. The values shown in italics will probably be the ones used most often at this level, when you are describing an entire cookbook. However, some books will focus on certain types of foods, and one of the other values will be appropriate. The same values are used for individual recipes. See below for definitions. |
| class2= | Optional. Same allowed values as class1; use this if necessary to represent a secondary focus of a particular cookbook. |
| region= | Required. If a recipe is identified with a specific place or region in the U.S. or with a particular ethnic group, use this attribute. Allowed values are northeast, south, midwest, west, ethnic, and general. Use the U.S. Census map to decide which region a place is in. |
| subregion= | Optional but should be used if the region= attribute is used. Fill in the more specific region as identified by the cookbook. |
| ethnicgroup= | Optional, but should be used if a cookbook or portion of a cookbook is identified as <element region="ethnic">. Fill in the name of the group as identified in the cookbook. |
| occasion= | Optional. If a cookbook is identified with a special occasion, use this attribute. Allowed values are Thanksgiving, Christmas, wedding, birthday, patriotic, spring, summer, fall, winter, other. |
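As a rough sketch of how these attributes combine on the wrapper element, a charity cookbook from Michigan might open as follows. The book and its attribute values are invented for illustration; the cookbook.dtd remains the authority on which attributes are required and how they are spelled.

```xml
<!-- Hypothetical example: a general-purpose charity cookbook from Michigan.
     "midwest" follows the U.S. Census region for Michigan; subregion is
     free text as identified by the cookbook. -->
<cookbook type="charity"
          class1="generalfood"
          region="midwest"
          subregion="Michigan">
  <meta><!-- Dublin Core elements go here --></meta>
  <front><!-- frontmatter divisions go here --></front>
  <body><!-- chapters, sections, and recipes go here --></body>
  <back><!-- backmatter, at minimum the back cover page break --></back>
</cookbook>
```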
VALUES FOR THE TYPE= ATTRIBUTE OF THE WRAPPER ELEMENT <cookbook>
| "general" | General works that do not fall into one of the special categories listed below. |
| "famous" | Cookbooks by a famous chef (Julia Child, Fannie Farmer) or produced by a well-known cooking school. Put the chef's or school's name in chefschool=. |
| "charity" | Cookbooks produced by church or community groups for fundraising. |
| "frugal" | Works on cooking economically or using inexpensive ingredients. |
| "restaurant" | Works featuring large-scale recipes for restaurants. |
| "invalid" | Works on cooking for invalids or treating various conditions through diet (e.g. diabetes cookbooks). |
| "histperiod" | Works based on the cooking of a specific historical period. Put the name of the period in the attribute histperiod=. For example, a Civil War cookbook would take type="histperiod" and histperiod="Civil War". |
| "encyclopedia" | Works organized as a dictionary or encyclopedia: that is, articles arranged alphabetically by topic. |
METADATA
The metadata section makes use of unqualified Dublin Core elements. The element names are preceded with "dc:" to distinguish them from the non-metadata elements.
A template with instructions for each element is available at K:\cookery\metadata.txt. Open the document in NoteTab or WordPad, copy the entire template, and paste it into the xml document. Delete the duplicate "meta" tags if necessary. The metadata template will validate against the cookbook.dtd.
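For orientation only, a metadata section might look roughly like the sketch below. The four elements shown are standard unqualified Dublin Core names, but which elements the project template actually requires, and in what order, is an assumption here; the template at K:\cookery\metadata.txt is the authority.

```xml
<!-- Illustrative sketch, not the project template. The "dc:" prefix is
     written as part of the element name, as described above. -->
<meta>
  <dc:title>American Cookery</dc:title>
  <dc:creator>Simmons, Amelia</dc:creator>
  <dc:type>Text</dc:type>
  <dc:language>en</dc:language>
</meta>
```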
FRONTMATTER AND BACKMATTER
Divide the <front> and <back> of each book into <div> sections based on their content. Each <div> must have a type indicated. Allowable values are: advertisement, appendix, backcover, contents, copyrightstmt, dedication, frontcover, glossary, halftitlepage, illustration, introduction, index, preface, titlepage, and other.
This list is meant to be fairly comprehensive, so don't use "other" unless none of the other values fit at all. For example, an editor's note is pretty similar to a preface, so tag it as "type=preface." Use "type=contents" for lists of illustrations, figures, and other special items as well as the usual table of contents.
If necessary, a <div> can be divided into <subdiv> elements. This is similar to the <div1>, <div2> relationship in TEI. However, it shouldn't be necessary very often. An exception might be a lengthy introduction that is divided into two parts, each with its own heading. Normally, the paragraph tag <p> will be all you need within a <div>.
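A minimal sketch of a <front> section divided this way, assuming a book with a title page, an editor's note, and a two-part introduction. The text content is placeholder material, and attribute values are quoted as XML requires:

```xml
<front>
  <div type="titlepage">
    <p>Title page text.</p>
  </div>
  <!-- An editor's note is close enough to a preface to be tagged as one. -->
  <div type="preface">
    <p>Editor's note text.</p>
  </div>
  <!-- The uncommon case: a long introduction in two headed parts. -->
  <div type="introduction">
    <subdiv>
      <p>First part of the introduction.</p>
    </subdiv>
    <subdiv>
      <p>Second part of the introduction.</p>
    </subdiv>
  </div>
</front>
```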
THE BODY: STRUCTURAL ELEMENTS
The <body> structure is also limited in the number of text division levels, compared to the TEI structure that runs from <div1> to <div7>. The body can be divided into chapters, which can be divided into sections, which can be divided into subsections, which can be divided into recipes.
It is not necessary to use all of these levels; only use as many as necessary to reflect the actual structure of the book. The DTD allows the <recipe> element to be located immediately within the <body> element or within <chapter> or <section>. For example, Amelia Simmons' American Cookery has no chapter divisions at all, simply a title page and a series of recipes, so the <recipe> elements would be placed immediately within the <body> element.
This means the following structures are possible, from simplest to most layered:
| <cookbook> <meta></meta> <front></front> <body> <recipe></recipe> </body> </cookbook> |
| <cookbook> <meta></meta> <front></front> <body> <chapter> <recipe></recipe> </chapter> <chapter> <recipe></recipe> </chapter> </body> </cookbook> |
| <cookbook> <meta></meta> <front></front> <body> <chapter> <section> <recipe></recipe> </section> <section> <recipe></recipe> </section> </chapter> </body> </cookbook> |
| <cookbook> <meta></meta> <front></front> <body> <chapter> <section> <subsection> <recipe></recipe> </subsection> <subsection> <recipe></recipe> </subsection> </section> </chapter> </body> </cookbook> |
WHEN TO USE THE "CLASS=" ATTRIBUTE FOR <chapter>, <section>, and <subsection>
The "class=" attribute is optional for the <chapter>, <section>, and <subsection> elements. As of May 2002, "class=" should be used only for the smallest of these three elements in use in a particular book or portion of a book. Theoretically, this will avoid having the XPAT search engine produce duplicate search results.
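The rule can be sketched as follows for a chapter that is divided into sections. The attribute is written as class1= on the assumption that structural elements use the same attribute name as <cookbook> and <recipe>; check the DTD for the exact name.

```xml
<body>
  <!-- No class attribute here: the chapter is not the smallest
       structural element in use in this part of the book. -->
  <chapter>
    <!-- Sections are the smallest level in use, so they carry the class. -->
    <section class1="soups">
      <recipe><p>Recipe text.</p></recipe>
    </section>
    <section class1="meatfishgame">
      <recipe><p>Recipe text.</p></recipe>
    </section>
  </chapter>
</body>
```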
THE BODY: RECIPES AND FORMULAS
Within the <chapter> or <section> or <subsection> elements (which hold the major portions of the text) the majority of the text should be tagged as one of three types:
| <recipe> | Directions for making something edible, and intended as a food or beverage. This category does not include medicines taken internally. |
| <formula> | Directions for making something non-edible (or not intended as a food or beverage), such as laundry starch, fabric dyes, or medicines. |
| <p> | General commentary that is not part of a recipe or formula, such as advice on how to choose foods in the marketplace, foods that go together well, table manners, etc. Some of the books will also have extensive sections on other domestic matters such as childrearing, care of invalids, advice on household management, etc. |
Some recipes and formulas will contain more than one paragraph, and (as one might expect) these are indicated with <p> tags. However, although many recipes are complete in a single paragraph, these must also have a set of <p> tags immediately within the <recipe> tags.
Although this is somewhat repetitive, it serves two purposes. In order for the style sheet to produce a consistent screen display, the coding also needs to be consistent, whether a recipe has one paragraph or two. This means we must either use the extra <p> tags, or have no <p> tags at all within recipes and use <lb> or some other construction to indicate a second or third paragraph. Since the typists are already inserting <p> tags (following the SSB typing conventions) the first choice makes more sense.
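A short sketch of the convention, using placeholder text: the one-paragraph recipe and the two-paragraph recipe are coded the same way, each paragraph in its own <p>.

```xml
<!-- A recipe that is complete in a single paragraph still gets <p> tags. -->
<recipe>
  <p>Entire text of a one-paragraph recipe.</p>
</recipe>

<!-- A longer recipe simply has more <p> elements. -->
<recipe>
  <p>First paragraph of the recipe.</p>
  <p>Second paragraph of the recipe.</p>
</recipe>
```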
LOW-LEVEL ELEMENTS FOR RECIPES AND FORMULAS
| purpose | This is the title of the recipe or the statement of what the directions will produce. In older cookbooks, this is often located in the first sentence: "To make a bread pudding..." In later cookbooks this is usually a heading located before a list of ingredients. When the <purpose> appears as a heading on the page, use <purpose placement="heading">. |
| process | Use this for verbs: braise, boil, etc. Use it very sparingly, for actions that are uncommon in 20th-century cooking. Don't tag words like "stir" or "roast." Do tag words or phrases like "let the batter sweat overnight." |
| ingredient | Use this for ingredients in a recipe or the items used to make a formula: things like madder root or walnut hulls would be ingredients for a fabric dye formula. Use only for uncommon ingredients. |
| implement | Objects used to perform some action in a recipe or formula. Ignore common items like spoons, bowls, pots and pans. |
| measurement | Use this to flag unusual measurement terms such as gill or teacup-full. |
| contributor | Use in church and charity cookbooks when contributors of individual recipes are listed. |
| attribution | Use when a recipe is attributed to someone other than the editor or author of the book being tagged. For example: <attribution>"This is based on Julia Child's recipe for boiled turnips."</attribution> |
| variation | Use for variations on a recipe. Usually this means an instruction to follow the same cooking directions and set of ingredients but with one or two substitutions. |
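To show how these elements might sit together, here is a hypothetical recipe in the older style, with the purpose stated in the first sentence. The recipe text is invented, and the placement of <variation> directly inside <recipe> is an assumption; the DTD determines where each element is actually allowed.

```xml
<recipe class1="eggscheesedairy">
  <p><purpose>To make a bread pudding</purpose>, pour a
    <measurement>gill</measurement> of cream and a spoonful of
    <ingredient>rosewater</ingredient> over the crumbs,
    <process>let the batter sweat overnight</process>, and bake it in a
    <implement>Dutch oven</implement>.</p>
  <!-- Common words such as "pour" and "bake" are deliberately left untagged. -->
  <variation>For a plainer pudding, use milk in place of the cream.</variation>
</recipe>
```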
DEFINITIONS OF "CLASS=" VALUES FOR FOOD TOPICS
| "fruitvegbeans" | Preparing and preserving fruits, vegetables, beans, and legumes of all kinds; selecting these foods at market; proper storage conditions; nutritional value of these foods. |
| "meatfishgame" | Preparing or preserving beef, lamb, mutton, poultry, seafood, and wild game such as venison, squirrel, buffalo, etc. Include organ meats such as kidney, brains, tripe, etc. Also, selecting and storing these foods; nutritional value of these foods. |
| "eggscheesedairy" | Making cheese or other dairy products (e.g. yogurt) and recipes which have eggs, cheese, or dairy products as the major ingredients (e.g. puddings, custards, quiche). Also, selecting and storing these foods; nutritional value of these foods. |
| "breadsweets" | Breads and baked goods: crackers, muffins, tarts, pies, cakes, pancakes, etc. Also, sweets or desserts even if they are not baked, such as fudge, boiled sugar candies, icings for cakes, etc. Also, selecting and storing these foods; nutritional value of these foods. |
| "soups" | Soup recipes. This category takes precedence over "fruitveg" and "meatfishgame": for example, asparagus soup goes here, not in "fruitveg"; beef broth goes here, not in "meatfishgame". Also, selecting and storing these foods; nutritional value of these foods. |
| "accompaniments" | This category encompasses foods meant to season or flavor other foods, rather than being eaten alone. This includes recipes for sauces, jams and preserves, and condiments such as mustard or pesto, as well as directions for using or preparing herbs or spices. Also, selecting and storing these foods; nutritional value of these foods. |
| "beverages" | Anything meant to be drunk instead of eaten. Milk or eggnog goes here, not in "eggscheesedairy." Fruit juice goes here, not in "fruitveg." Also, selecting and storing these foods; nutritional value of these foods. |
| "generalfood" | Applies only to <cookbook>, <chapter>, and <section>. This is to be used when two or more categories are covered by the material. Most cookbooks will be class="generalfood" because they cover all types of food. |
| "menus" | Restaurant menus, "bills of fare" and other portions of text that list foods that go together well. This can only be used with <cookbook>, <chapter>, <section>, or <passage>. |
Each <cookbook> and each <recipe> may have two of these terms applied: one as the value of the class1= attribute and one as the value of the class2= attribute. Even so, some recipes will be hard to classify. If in doubt:
- a) Classify according to the nouns in the recipe title, not the adjectives. For example, should "bread pudding" be classed as "breadsweets" or "eggscheesedairy"? The noun in this case is "pudding," which is basically the eggs-and-milk part of the recipe. Therefore, classify this in "eggscheesedairy."
- b) If that doesn't work, look at the ingredients. Classify according to what seems to be the "biggest" ingredient in the recipe.
In any kind of classification system, there always will be differences of opinion about what category a certain thing should go in; this is normal and to be expected. Try to be consistent, but don't agonize over individual recipe classifications.
DEFINITIONS OF "CLASS=" VALUES FOR NON-FOOD TOPICS
| "medhealth" | Information about health, nutrition, hygiene, or care of the sick. Examples might include: "a tincture for mouth and gums" (i.e. toothpaste), "tonics" or nutritional supplements like cod liver oil, or poultices for dressing wounds. |
| "household" | Information about household management. Examples of <formula> under this category would include directions for preparing things like laundry starch or fabric dyes. A <passage> under this category might discuss ways to heat a home more efficiently. |
| "farmgarden" | Anything related to the raising of food or livestock. Examples might include advice on caring for an orphaned calf or making a spray to ward off potato beetles. |
| "childrear" | Advice on raising children. |
| "etiquette" | Advice on good manners, how to behave in social situations, etc. |
| "restaurant" | Advice on managing a restaurant or hotel, or (in the case of Tunis Campbell) training employees of a restaurant or hotel. |
| "servants" | Use this for servants in private homes. Hotel employees should be listed under "restaurant" (because that is actually shorthand for "restaurants and hotels"). |
| "generalnonfood" | Anything that doesn't fit in one of the categories above. |
And, a very general category, probably only applicable at the <cookbook> level:
| "foodandnonfood" | For those encyclopedic works that address many sorts of foods and many sorts of noncooking topics such as gardening, nursing the sick, organizing household work, etc. |
GENERAL FORMATTING ELEMENTS
Use these low-level elements to format the text as necessary.
| <p> | Paragraph: to subdivide <recipe> or <formula> or <passage>, as needed. |
| <pb> | Page break. Follow the same rules as for other typing projects, i.e. <pb n="pagenumber" id="book079.jpg"> |
| <lb> | Line break. Can be used any time there are special line breaks, as on the title page. |
| <emph> | Use only for linguistic emphasis (see top of instructions for explanation). |
| <alt> | Use to give the 20th-century equivalents of archaic terms. |
| <list> | Use to indicate a list of items. A <list> contains a series of <item>s. |
| <item> | The individual lines or sections of a list. Can contain <term>, <definition>, and <ref>. |
| <term> | Use only when <list> is being used to encode a glossary-type section. |
| <definition> | Use only when <list> is being used to encode a glossary-type section. |
| <ref> | Use for footnotes, or for page numbers in a table of contents. More explanation below. |
| align= | Allowed values are center, right, indent1, and indent2. |
| rend= | Allowed values are bold, italic, and ornate. |
| size= | Allowed values are larger and smaller. (This means larger or smaller than the text immediately surrounding the tagged text.) |
| placement= | Allowed values are heading and inline. Heading means on a line by itself. Inline means not on a line by itself, like the text in a paragraph. |
| height= | This is an attribute for <ref>, for marking up footnotes. Allowed values are subscript (below the line of text) and superscript (above the line of text). |
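As an illustration of the list elements, a glossary-type section might be coded along these lines. The two entries are invented for the sketch, and wrapping the list in a <div type="glossary"> assumes the section appears in the frontmatter or backmatter.

```xml
<div type="glossary">
  <list>
    <!-- <term> and <definition> are used only in glossary-type lists. -->
    <item>
      <term>Gill</term>
      <definition>A quarter of a pint.</definition>
    </item>
    <item>
      <term>Pearlash</term>
      <definition>An early chemical leavening.</definition>
    </item>
  </list>
</div>
```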
TARGET AND ID PAIRS
Target and ID "pairs" are used to refer a reader from one part of a document to another. They're referred to as pairs because whenever you use <ref target="example"> in one part of a book, there has to be another portion tagged as <element id="example">. The "id=example" can be used in many different elements in the cookbook DTD. It is required in <pb>, so that is the most frequent use. It can also be used in lists, chapter headings, illustrations, etc.
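A sketch of one such pair, assuming a table-of-contents entry that points at a page break. The page number and image name are hypothetical, and the <pb> is closed with "/>" here so that the fragment is well-formed XML.

```xml
<!-- In the table of contents: the reference half of the pair -->
<div type="contents">
  <list>
    <item>Puddings <ref target="book079.jpg">67</ref></item>
  </list>
</div>

<!-- Later in the book: the element carrying the matching id -->
<pb n="67" id="book079.jpg"/>
```

Every target value should match exactly one id value somewhere in the same document.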
EXTERNAL REFERENCES
The cookbook collection will be accompanied by several groups of supplementary material: author biographies, essays on individual books and cooking genres, a glossary of cooking terms, and a gallery of museum objects. We will need to create links to this material from within the cookbook texts. The attributes xref= and item= will be used to do that. These attributes can be used with most elements.
The allowed values for xref= are authors, essays, objects, and glossary (the same four categories named above).
The value for item= is a code referring to the particular author, essay, object, or glossary entry. We'll need to create standardized lists for these.
Example: one of the museum objects being photographed is a Dutch oven. When a recipe mentions using this item, it would be tagged like this:
"Let the stew simmer for three hours in a <implement xref="objects" item="dutchoven">Dutch oven</implement> placed among the coals."
ILLUSTRATIONS
Follow the same guidelines as in the Sunday school books to decide if something is an illustration. If you see an abstract design used to fill a little space at the end of a chapter, don't tag it at all. If you see something decorative that is an identifiable object (a little row of spoons, perhaps) tag it as an illustration even if it's only there to fill space.
| <illustration> | Contains <caption> and <description>. The caption is optional, since there might not be one. The description is required. |
| <caption> | Wrap this tag around the picture's caption. |
| <description> | Write a brief description of the picture. See the Sunday school books website for examples of good descriptions. |
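Two hypothetical illustrations, one captioned and one purely decorative, might be coded as follows. The caption and description wording is invented for the sketch.

```xml
<!-- A captioned picture: the caption is transcribed, the description is written by the coder. -->
<illustration>
  <caption>A Dutch oven.</caption>
  <description>Engraving of a lidded iron pot set among coals.</description>
</illustration>

<!-- A decorative but identifiable object: no caption, description still required. -->
<illustration>
  <description>A small row of spoons at the end of the chapter.</description>
</illustration>
```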
EDITORIAL NOTES
The TEI tagset, which we used to code the Sunday school books, has numerous tags for indicating unusual characteristics of a book that cannot be clearly understood through the transcription of the text. For example, there is a tag <inscription> for marking up handwritten inscriptions, a tag for <gap> to indicate that words are missing because of damage to a book, and <unclear> for words that cannot be accurately transcribed because (for example) the ink is badly smudged.
The cookbook tagset attempts to reduce the number of tags needed for situations like these by including an element <ednote>. When you encounter something that needs a bit of explanation, go ahead and put it in, wrapped in <ednote> tags. The XSL stylesheet will be written to display <ednote> material inside square brackets with a heading [Note: ] so it will be clear to the reader that this is an addition to the original text. In general, put the <ednote> after the place that needs explanation. For example, on the title page of Amelia Simmons:
<div type="titlepage">
<p>John Hammond</p>
<ednote>Handwritten inscription.</ednote>
<docTitle>American Cookery...</docTitle>
</div>







