TEI corpus export

TEI corpus export walks an aggregation tree and produces a single teiCorpus document from all nested TEI documents. Aggregated objects that are not text/xml are skipped, and a comment noting the omission is added to the output.

The <teiCorpus> elements contain an artificial <teiHeader> that is generated from the metadata of the respective object and its parents. This may lead to redundancies, e.g., in the <titleStmt>.

By default, the delivered corpus is nested: if you run the export on an aggregation that itself aggregates an aggregation, you’ll get a teiCorpus that contains a teiCorpus. If you do not want this, pass the flat=true query parameter. You’ll then get a single root-level teiCorpus document that directly contains all leaf TEI elements, at the cost of losing the hierarchical structure.
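To make the difference between the nested and the flat result concrete, here is a minimal sketch that inspects a teiCorpus document with Python's standard library. The sample document is made up for illustration (headers and text content are omitted, and the n values are hypothetical); only the TEI namespace URI is taken from the TEI standard. It does not reproduce what the service does, it only shows how the nesting depth and the leaf TEI elements relate.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Made-up sample of a nested export; a real export also contains teiHeader elements.
NESTED = f"""
<teiCorpus xmlns="{TEI_NS}">
  <teiCorpus>
    <TEI n="a"/>
    <TEI n="b"/>
  </teiCorpus>
  <TEI n="c"/>
</teiCorpus>
"""


def corpus_depth(element):
    """Return how many teiCorpus levels are nested at or below this element."""
    if element.tag != f"{{{TEI_NS}}}teiCorpus":
        return 0
    return 1 + max((corpus_depth(child) for child in element), default=0)


def leaf_documents(root):
    """Return all TEI elements in document order, ignoring the hierarchy."""
    return list(root.iter(f"{{{TEI_NS}}}TEI"))


root = ET.fromstring(NESTED)
print(corpus_depth(root))                          # nesting levels in the sample
print([tei.get("n") for tei in leaf_documents(root)])  # leaves a flat corpus would hold
```

In the sample, the three TEI elements sit on two different levels. With flat=true, the same three documents would all be direct children of the single root teiCorpus.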

Corpus generation typically streams, i.e. you’ll get the first results quite fast. When you request a large corpus, the export may pause briefly partway through while it processes a large sub-aggregation.
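Because the response is streamed and may stall briefly, a client is usually better off copying the body to disk chunk by chunk than reading it into memory at once. The following is a minimal sketch using only the Python standard library. The function names, the chunk size, and the 300-second timeout are assumptions chosen for illustration, not values prescribed by the service.

```python
import urllib.request


def copy_stream(source, target, chunk_size=64 * 1024):
    """Copy a binary stream chunk by chunk and return the number of bytes written."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read signals the end of the stream
            break
        target.write(chunk)
        total += len(chunk)
    return total


def save_corpus(url, out_path, timeout=300):
    """Stream a corpus export to out_path.

    The timeout is an assumed, generous value so that a pause while the
    server processes a large sub-aggregation does not abort the download.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(out_path, "wb") as out:
            return copy_stream(response, out)
```

Calling save_corpus with a /teicorpus/ URL and a target file name would write the export to that file and return its size in bytes.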

Synopsis::
/teicorpus/{uris}?attach&flat&title&sid

resource-wide template parameters

parameter  value   description
uris       string  TextGrid URIs of the root objects, separated by commas

request query parameters

parameter  value                     description
attach     boolean (default: true)   Whether to generate a Content-Disposition: attachment header
flat       boolean (default: false)  If true, no intermediate TEI corpus documents will be generated for intermediate aggregations; the hierarchical structure will be lost
title      string                    Title for the container if multiple root objects are given
sid        string                    Session ID for accessing restricted resources
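The synopsis and parameters above can be combined into a request URL as in the following sketch. The base URL is a placeholder and the URIs in the usage line are hypothetical; substitute the address of your own service and real TextGrid URIs. The helper name teicorpus_url is likewise an assumption for illustration.

```python
from urllib.parse import quote, urlencode

# Assumption: placeholder base URL, not the address of a real service.
BASE_URL = "https://example.org/aggregator"


def teicorpus_url(uris, attach=None, flat=None, title=None, sid=None,
                  base_url=BASE_URL):
    """Build a /teicorpus/{uris} request URL from the documented parameters.

    Parameters left as None are omitted, so the server-side defaults apply.
    """
    # Root URIs are separated by commas; keep ':' and ',' unescaped in the path.
    path = quote(",".join(uris), safe=":,")
    params = {}
    if attach is not None:
        params["attach"] = "true" if attach else "false"
    if flat is not None:
        params["flat"] = "true" if flat else "false"
    if title is not None:
        params["title"] = title
    if sid is not None:
        params["sid"] = sid
    query = urlencode(params)
    return f"{base_url}/teicorpus/{path}" + (f"?{query}" if query else "")


# Hypothetical URIs, used only to show the shape of the result.
print(teicorpus_url(["textgrid:abc.0", "textgrid:def.0"], flat=True, title="My corpus"))
```

Passing title only makes sense when more than one root URI is given, since it names the container that wraps them.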

available response representations: