[% PROCESS header.tt
   pagetitle = "Guidelines for XML file uploads to Stemmaweb"
%]
<h1>Stemmaweb Help - XML upload guidelines</h1>
<div id="docco">
<h3>Guidelines for TEI parallel segmentation input</h3>
<div id="tei">
<p>The Stemmaweb uploader can accept a collated text tradition in TEI parallel-segmentation format that adheres to the following guidelines:</p>
<ul>
<li>The file must have an &lt;msDesc&gt; element in its header that lists all witnesses used in the text in a &lt;listWit&gt; element. Each &lt;witness&gt; element within &lt;listWit&gt; must have an xml:id attribute; this is taken to be the witness sigil, and is expected to appear in the "wit" attribute of the relevant apparatus readings within the text.</li>
<li>Individual reading words may be wrapped in &lt;w&gt; tags; if the tag has an xml:id, it will be preserved as the reading ID.</li>
<li>All text within the main &lt;text&gt; element of the TEI file will be taken to be part of the collation. Paragraph and line divisions are not currently preserved.</li>
<li>At the point where a witness begins in the collation, this should be signified with a &lt;witStart&gt; tag within its own apparatus entry. The first of these will therefore be the first element in any collated text, for example:
<pre>
&lt;body&gt;
  &lt;p xml:id="am_1_1"&gt;
    &lt;app xml:id="AppStart"&gt;
      &lt;rdg wit="#Jer #K #A #F #B #I #D #J #O #V #X #Y #Z #W"&gt;&lt;witStart/&gt;&lt;/rdg&gt;
    &lt;/app&gt;
    [...]
  &lt;/p&gt;
&lt;/body&gt;
</pre> Likewise, when a witness text ends, this should be noted with a &lt;witEnd&gt; tag. Lacunae within the text may be indicated through successive uses of the &lt;witEnd&gt; and &lt;witStart&gt; tags.</li>
<li>Readings as they appeared before scribal corrections may be indicated using the "type" attribute on the &lt;rdg&gt; element, as shown here:
<pre>
&lt;app xml:id="App530"&gt;
  &lt;lem wit="#Jer #K #F #B #I #A #D #J #O #V #X #Y #Z #W"&gt;
    &lt;w xml:id="L414"&gt;զոր&lt;/w&gt;
  &lt;/lem&gt;
  &lt;rdg type="a.c." wit="#B"&gt;&lt;w&gt;զո&lt;/w&gt;&lt;/rdg&gt;
&lt;/app&gt;
</pre></li>
<li>An apparatus entry may contain the &lt;lem&gt; tag as well as the &lt;rdg&gt; tag; these are treated as equivalent for the purpose of creating the graph.</li>
</ul>
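<p>As a minimal sketch of how the witness list, the "wit" attribute, and the &lt;witEnd&gt;/&lt;witStart&gt; tags described above fit together, a lacuna in one witness might be encoded roughly as follows. The sigla A and B, the witness descriptions, and the xml:id values are hypothetical, and the surrounding header and text are omitted:</p>
<pre>
&lt;listWit&gt;
  &lt;witness xml:id="A"&gt;First manuscript&lt;/witness&gt;
  &lt;witness xml:id="B"&gt;Second manuscript&lt;/witness&gt;
&lt;/listWit&gt;
[...]
&lt;app xml:id="AppLacunaStart"&gt;
  &lt;rdg wit="#B"&gt;&lt;witEnd/&gt;&lt;/rdg&gt;
&lt;/app&gt;
[... text attested in A only ...]
&lt;app xml:id="AppLacunaEnd"&gt;
  &lt;rdg wit="#B"&gt;&lt;witStart/&gt;&lt;/rdg&gt;
&lt;/app&gt;
</pre>
<p>Each value in the "wit" attribute here is a witness xml:id prefixed with "#", following the pattern used in the examples above.</p>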
</div>

<h3>Guidelines for Classical Text Editor export</h3>
<div id="cte">
<p>The Stemmaweb uploader can accept a collated text tradition exported from Classical Text Editor (CTE), provided that certain guidelines are observed:</p>
<ul>
<li>Common abbreviations (e.g. 'om.', 'add.', and the like) should be set correctly via the Format -> Document... menu under the General tab. Any other abbreviations or notes are likely to be interpreted as literal readings by the uploader.</li>
<li>Avoid notations such as "<em>tr. post</em> word"; these are difficult to parse even for a human and almost impossible for a computer.</li>
<li>Be extremely careful with overlapping apparatus entries; if, for example, a pair of entries reads:
<pre>
dominus deus] tr. A
deus] deo A
</pre>
then you are simultaneously telling CTE that A reads "deus dominus" and "deo dominus", and the export will become confused. Ensure that no two entries make conflicting claims about the same words in the same witness.</li>
<li>Ensure that the critical apparatus (and the apparatus siglorum, if any) is marked as such via the Format -> Notes/Apparatus settings menu.</li>
<li>Ensure (via the menu item Options -> Preferences, under the 'XML' tab) that the 'Apparatus export' option is set to "&lt;app&gt; tags".</li>
</ul>
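<p>One possible way to remove the overlap in the example above is to fold both claims into a single entry. This is only a sketch: it assumes that A in fact reads "deo dominus", which the original pair of entries leaves ambiguous.</p>
<pre>
dominus deus] deo dominus A
</pre>
<p>Because the variant is given as a literal reading over the whole lemma, no second entry is needed for "deus" in witness A.</p>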
</div>
</div>

[% PROCESS footer.tt %]