[scpubgit/stemmaweb.git] / root / src / input.tt

[% PROCESS header.tt
	pagetitle = "Guidelines for XML file uploads to Stemmaweb"
%]
<h1>Stemmaweb Help - XML upload guidelines</h2>
<div id="docco">
  <h3>Guidelines for TEI parallel segmentation input</h3>
  <div id="tei">
    <p>The Stemmaweb uploader can accept a collated text tradition in TEI parallel-segmentation format that adheres to the following guidelines:</p>
    <ul>
    	<li>The file must have an &lt;msDesc&gt; element in its header which contains a list of all witnesses used in the text in the &lt;listWit&gt; element. Each &lt;witness&gt; element within &lt;listWit&gt; must have an xml:id attribute; this is taken to be the witness sigil, and is expected to appear in the "wit" attribute of the relevant apparatus readings within the text.</li>
    	<li>Individual reading words may be wrapped in &lt;w&gt; tags; if the tag has an xml:id it will be preserved as the reading ID.</li>
    	<li>All text within the main &lt;text&gt; element of the TEI file will be taken to be part of the collation. Paragraph and line divisions are not currently preserved.</li>
    	<li>At the point where any witness appears in the collation, this should be signified with a &lt;witStart&gt; tag within its own apparatus. The first of these will therefore be the first element in any collated text, for example:
    	<pre>
    	&lt;body&gt;
          &lt;p xml:id="am_1_1"&gt;
            &lt;app xml:id="AppStart"&gt;
              &lt;rdg wit="#Jer #K #A #F #B #I #D #J #O #V #X #Y #Z #W"&gt;&lt;witStart/&gt;&lt;/rdg&gt;
            &lt;/app&gt;
          [...]
          &lt;/p&gt;
    	&lt;/body&gt;
    	</pre> Likewise, when a witness text ends this should be noted with a &lt;witEnd&gt; tag. Lacunae within the text may be indicated through successive uses of the &lt;witEnd&gt; and &lt;witStart&gt; tags.</li>
    	<li>Readings as they appeared before scribal corrections may be indicated using the "type" attribute on the &lt;rdg&gt; element, as shown here:
    		<pre>
            &lt;app xml:id="App530"&gt;
              &lt;lem wit="#Jer #K #F #B #I #A #D #J #O #V #X #Y #Z #W"&gt;
                &lt;w xml:id="L414"&gt;զոր&lt;/w&gt;
              &lt;/lem&gt;
              &lt;rdg type="a.c." wit="#B"&gt;&lt;w&gt;զո&lt;/w&gt;&lt;/rdg&gt;
            &lt;/app&gt;
            </pre></li>
        <li>An apparatus entry may contain the &lt;lem&gt; tag as well as the &lt;rdg&gt; tag; these are treated as equivalent for the purpose of creating the graph.</li>
    </ul>
  </div>
  
  <h3>Guidelines for Classical Text Editor export</h3>
  <div id="cte">
    <p>The Stemmaweb uploader can accept a collated text tradition exported from CTE, provided that certain guidelines are observed:</p>
    <ul>
    	<li>Common abbreviations (e.g. 'om.', 'add.', and the like) should be set correctly via the Format -&gt; Document... menu under the General tab. Any other abbreviations or notes are likely to be interpreted as literal readings by the uploader.</li>
    	<li>Avoid notations such as "<em>tr. post</em> word"; this is difficult to parse even for a human and almost impossible for a computer.</li>
    	<li>Be extremely careful with overlapping apparatus entries; if, for example, a pair of entries reads:
    		<pre>
    		dominus deus] tr. A
    		deus] deo A
    		</pre>
    		then you are simultaneously telling CTE that A reads "deus dominus" and "deo dominus", and the export will become confused. Ensure that there is no scope for confusion.</li>
    	<li>Ensure that the critical apparatus (and the apparatus siglorum, if any) is marked as such via the Format -&gt; Notes/Apparatus settings menu.</li>
    	<li>Ensure (via the menu item Options -&gt; Preferences, under the 'XML' tab) that the 'Apparatus export' option is set to "&lt;app&gt; tags".</li>
    </ul>
  </div>
</div>
  
[% PROCESS footer.tt %]
Commit	Line	Data
4a6b658f	1	[% PROCESS header.tt
	2	pagetitle = "Guidelines for XML file uploads to Stemmaweb"
	3	%]
	4	<h1>Stemmaweb Help - XML upload guidelines</h2>
	5	<div id="docco">
	6	<h3>Guidelines for TEI parallel segmentation input</h3>
	7	<div id="tei">
	8	<p>The Stemmaweb uploader can accept a collated text tradition in TEI parallel-segmentation format that adheres to the following guidelines:</p>
	9	<ul>
	10	<li>The file must have an <msDesc> element in its header which contains a list of all witnesses used in the text in the <listWit> element. Each <witness> element within <listWit> must have an xml:id attribute; this is taken to be the witness sigil, and is expected to appear in the "wit" attribute of the relevant apparatus readings within the text.</li>
	11	<li>Individual reading words may be wrapped in <w> tags; if the tag has an xml:id it will be preserved as the reading ID.</li>
	12	<li>All text within the main <text> element of the TEI file will be taken to be part of the collation. Paragraph and line divisions are not currently preserved.</li>
	13	<li>At the point where any witness appears in the collation, this should be signified with a <witStart> tag within its own apparatus. The first of these will therefore be the first element in any collated text, for example:
	14	<pre>
	15	<body>
	16	<p xml:id="am_1_1">
	17	<app xml:id="AppStart">
	18	<rdg wit="#Jer #K #A #F #B #I #D #J #O #V #X #Y #Z #W"><witStart/></rdg>
	19	</app>
	20	[...]
	21	</p>
	22	</body>
	23	</pre> Likewise, when a witness text ends this should be noted with a <witEnd> tag. Lacunae within the text may be indicated through successive uses of the <witEnd> and <witStart> tags.</li>
	24	<li>Readings as they appeared before scribal corrections may be indicated using the "type" attribute on the <rdg> element, as shown here:
	25	<pre>
	26	<app xml:id="App530">
	27	<lem wit="#Jer #K #F #B #I #A #D #J #O #V #X #Y #Z #W">
	28	<w xml:id="L414">զոր</w>
	29	</lem>
	30	<rdg type="a.c." wit="#B"><w>զո</w></rdg>
	31	</app>
	32	</pre></li>
	33	<li>An apparatus entry may contain the <lem> tag as well as the <rdg> tag; these are treated as equivalent for the purpose of creating the graph.</li>
	34	</ul>
	35	</div>
	36
	37	<h3>Guidelines for Classical Text Editor export</h3>
	38	<div id="cte">
	39	<p>The Stemmaweb uploader can accept a collated text tradition exported from CTE, provided that certain guidelines are observed:</p>
	40	<ul>
	41	<li>Common abbreviations (e.g. 'om.', 'add.', and the like) should be set correctly via the Format -> Document... menu under the General tab. Any other abbreviations or notes are likely to be interpreted as literal readings by the uploader.</li>
	42	<li>Avoid notations such as "<em>tr. post</em> word"; this is difficult to parse even for a human and almost impossible for a computer.</li>
	43	<li>Be extremely careful with overlapping apparatus entries; if, for example, a pair of entries reads:
	44	<pre>
	45	dominus deus] tr. A
	46	deus] deo A
	47	</pre>
	48	then you are simultaneously telling CTE that A reads "deus dominus" and "deo dominus", and the export will become confused. Ensure that there is no scope for confusion.</li>
	49	<li>Ensure that the critical apparatus (and the apparatus siglorum, if any) is marked as such via the Format -> Notes/Apparatus settings menu.</li>
	50	<li>Ensure (via the menu item Options -> Preferences, under the 'XML' tab) that the 'Apparatus export' option is set to "<app> tags".</li>
	51	</ul>
	52	</div>
	53	</div>
	54
	55	[% PROCESS footer.tt %]