{% extends "base.html" %} {% block title %}Annotator User Guide{% endblock %} {% block head %} {{ super() }} {% endblock %} {% block content %}
The data is stored in a relational database. Relational databases are exceptionally efficient at looking up and querying specific attributes and subsets of data, and at complex joins across attributes and tables, at the cost of the readability and intuitiveness of the storage format. In particular, relationships such as an optional relationship or a one-to-many relationship cannot be modeled in a straightforward way, but require intermediate tables. The database structure is, therefore, a bit more complex and fragmented than the mental data model it represents. Still, it is valuable to understand the data model, because the way evidence is generated from annotations and groups of annotations is based very closely on that model.
Fig. 1 shows a simplified entity-relationship diagram of the tables in the database involved in evidence creation and annotations. One piece of evidence must contain a place and religion, and may contain one person and one or more time spans. In the database, this is represented by instance tuples, where each instance also stores meta-information specific to that tuple, such as a comment and confidence. The instance tuples then point to the base entities (places, religions, persons), which only exist once. The one or more time spans (time instances) are grouped by a singular time group. The instances may be linked to one annotation, such that an annotation can either be for a place, religion, person, or set of time spans. Each annotation is attributed clearly to one document, and each document to one source. An evidence tuple can be derived from zero or more sources, which is coded via the source instance table. In cases where an evidence was created using the annotator, there is only one source instance, whose source is the same as that of the document the annotations belong to.
An evidence created from the annotator, therefore, is a group of two to four annotations, where one belongs to a place instance, one to a religion instance, zero or one to a person instance, and zero or one to a time group. Evidences created, for example, using the GeoDB-Editor look exactly the same, except that their instances do not have a connected annotation. For more information on the database structure, please reference the database structure PDF.
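The data model described above can be summarized in a few Python dataclasses. This is only an illustrative sketch of the mental model, not the actual database schema: all class and field names are assumptions chosen for readability, and the convention that annotation end offsets are exclusive is assumed as well.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    document_id: int
    start: int  # offset of the first annotated character
    end: int    # offset just past the last annotated character (assumed convention)


@dataclass
class Instance:
    entity_id: int                           # the place, religion, or person referred to
    confidence: Optional[str] = None
    comment: Optional[str] = None
    annotation: Optional[Annotation] = None  # None for, e.g., GeoDB-Editor evidence


@dataclass
class TimeInstance:
    start: int
    end: int
    confidence: Optional[str] = None
    comment: Optional[str] = None


@dataclass
class TimeGroup:
    time_instances: List[TimeInstance] = field(default_factory=list)
    annotation: Optional[Annotation] = None


@dataclass
class Evidence:
    place_instance: Instance                    # mandatory
    religion_instance: Instance                 # mandatory
    person_instance: Optional[Instance] = None  # optional
    time_group: Optional[TimeGroup] = None      # optional

    def annotations(self) -> List[Annotation]:
        """Collect the annotations attached to the parts of this evidence."""
        parts = [self.place_instance, self.religion_instance,
                 self.person_instance, self.time_group]
        return [p.annotation for p in parts
                if p is not None and p.annotation is not None]
```

In this sketch, an evidence created with the annotator yields between two and four annotations from `annotations()`, while an evidence created with the GeoDB-Editor yields an empty list.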
When opening the annotator, the first screen shows a list of all available documents, shown in Fig. 2. Each document is represented by a card, which lists some metadata about the document and its source:
OCN),
The goal of all that metadata is to make perfectly clear which document this is; for example, there could be multiple versions of one source, each being its own document. These versions could differ in spelling, whitespace, and other aspects, which in turn affect the positioning of existing annotations. Therefore, creating a new version of an existing document when the textual content changes is preferable to updating the existing document, as an update might create issues with the existing annotations.
To open a document in the annotator, simply click on the card in question. Besides one card for each existing document, there is also one card at the very end of the list to create a new document. This will link to the document creation form, described below.
Creating a new document is fairly simple:
This will show a page with a form, the contents of which will then populate a new record in the document
table.
The form has five fields, all of them mandatory, described below:
comment does not really fit the purpose of the field: providing a short and ideally unique and recognizable label for a document. On that note, the comment field is currently not enforced to be unique, but it is advisable to give each document a unique comment in any case. I will at some point clean this up.
After filling out all fields properly, a new document is created by clicking on the green Submit button. This will put the document in the database and navigate back to the document selection page. Before clicking the button, no data will be sent to the server, and you can clear all form fields by clicking on the gray Reset form button.
{%- include "docs/code-listing-virtual-text.html" -%}
In theory, all HTML files should be fine to use for annotation purposes. For security purposes, many tag types and tag attributes are stripped from the document before upload (see footnote 1). The resulting document in the database will consist of semantically relevant, but unstyled, HTML tags, and text. However, there might be special considerations to be made about how you want the text to be represented. Some questions to think about before creating an HTML document for annotation are:
not really part of the text? That means: those sections should not be selectable or annotatable. Some information on how to incorporate virtual text is given in the paragraph below.
To add virtual text to a document, we use CSS pseudo-elements, which are generated by the browser and are not part of the textual content of the document.
Note that virtual text can, therefore, only be added when using HTML, not plain text.
Virtual text has the advantage that it is skipped by the offset calculations and cannot be selected, so it never becomes part of a text selection or of the annotation content.
To add virtual text, add the respective HTML element (e.g., <h2> or <span>) without any text between the tags.
The virtual text itself needs to be passed as an attribute of that element.
The attribute name used is data-virtual-text.
The appearance of the virtual text depends on which type of HTML element you use to represent it.
Block-level elements, like headings (<h1>, <h2>, etc.), will be displayed as separate blocks, and inline elements, like <span>, <strong>, or <em>, will be placed inline, which might be useful if the virtual text should be within a paragraph (e.g., for sentence numbering).
Fig. 3a shows an example document with two instances of virtual text: one inline <strong> element, and one block-level <h2> element.
The HTML content of the document to produce this effect looks as listed in Fig. 3b.
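To illustrate why virtual text does not affect annotation offsets, the following sketch parses a small document with Python's standard `html.parser` module. The document content is made up for this example and is not the one from Fig. 3b. Because the virtual text lives in an attribute value rather than in a text node, it never shows up in the collected text.

```python
from html.parser import HTMLParser

# Hypothetical example document with one block-level and one inline virtual text.
DOCUMENT = (
    '<h2 data-virtual-text="Chapter 1"></h2>'
    '<p><strong data-virtual-text="(1) "></strong>'
    'In that year the bishop died.</p>'
)


class TextCollector(HTMLParser):
    """Collects real text nodes and, separately, any virtual text attributes."""

    def __init__(self):
        super().__init__()
        self.chunks = []   # actual text content of the document
        self.virtual = []  # (tag, virtual text) pairs found in attributes

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-virtual-text":
                self.virtual.append((tag, value))

    def handle_data(self, data):
        self.chunks.append(data)


collector = TextCollector()
collector.feed(DOCUMENT)
collector.close()

text = "".join(collector.chunks)
print(repr(text))         # only the real text, without "Chapter 1" or "(1) "
print(collector.virtual)  # the virtual text, kept apart from the content
```

Any offset computed on `text` is therefore independent of how much virtual text the document contains.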
Footnote 1: The entirety of <script>, <style>, and <head> tags is discarded, both their content and the DOM nodes themselves.
Disallowed tags are removed, but their contents are kept as plain text and child nodes.
Allowed tags are: <a>, <abbr>, <address>, <article>, <aside>, <b>, <bdi>, <bdo>, <blockquote>, <br>, <caption>, <cite>, <code>, <col>, <colgroup>, <dd>, <del>, <details>, <dfn>, <div>, <dl>, <dt>, <em>, <figcaption>, <figure>, <footer>, <h1>, <h2>, <h3>, <h4>, <h5>, <h6>, <header>, <hr>, <i>, <img>, <ins>, <kbd>, <li>, <main>, <mark>, <nav>, <ol>, <p>, <picture>, <pre>, <q>, <rp>, <rt>, <ruby>, <s>, <samp>, <section>, <small>, <span>, <strong>, <sub>, <summary>, <sup>, <table>, <tbody>, <td>, <tfoot>, <th>, <thead>, <time>, <title>, <tr>, <u>, <ul>, and <wbr>.
All attributes are removed from tags, except for href, title, and hreflang for <a> tags; and src, width, height, alt, and title for <img> tags.
In addition, the data-virtual-text attribute can be placed on every tag.
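The stripping rules of this footnote can be sketched with Python's standard `html.parser` module. This is a minimal illustration under stated assumptions, not the sanitizer the annotator actually uses: the class and function names are hypothetical, and only a small subset of the allowed tags is included to keep the example short.

```python
from html import escape
from html.parser import HTMLParser

# Subset of the allowed tags, kept short for the example.
ALLOWED_TAGS = {"a", "em", "h2", "img", "p", "span", "strong"}
DISCARDED_TAGS = {"script", "style", "head"}  # dropped together with their content
VOID_TAGS = {"img"}                           # tags without a closing tag
ALLOWED_ATTRS = {
    "a": {"href", "title", "hreflang"},
    "img": {"src", "width", "height", "alt", "title"},
}
GLOBAL_ATTRS = {"data-virtual-text"}          # allowed on every tag


class Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.discard_depth = 0  # > 0 while inside a discarded tag

    def handle_starttag(self, tag, attrs):
        if tag in DISCARDED_TAGS:
            self.discard_depth += 1
            return
        if self.discard_depth or tag not in ALLOWED_TAGS:
            return  # disallowed tag: drop the tag itself, keep its children
        allowed = ALLOWED_ATTRS.get(tag, set()) | GLOBAL_ATTRS
        kept = "".join(
            f' {name}="{escape(value or "")}"'
            for name, value in attrs if name in allowed
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DISCARDED_TAGS:
            self.discard_depth = max(0, self.discard_depth - 1)
        elif (not self.discard_depth and tag in ALLOWED_TAGS
              and tag not in VOID_TAGS):
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.discard_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text: str) -> str:
    """Return html_text with disallowed tags and attributes stripped."""
    parser = Sanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For example, `sanitize('<a href="u" onclick="x()">link</a>')` keeps the `href` attribute but drops `onclick`.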
An annotation, in general, consists of a start and end position within a document, as well as an instance (see Database Schema for more details).
Annotations are represented in the text by a colored background behind the annotated text passage.
In the following, the user controls of the annotator are described, as well as how to create, edit, and delete annotations.
After having selected a document in the document selection screen, it and its annotations and evidences are displayed in the annotator. The annotator interface, shown also in Fig. 4, consists of two main views. The document area on the left shows the document, annotations, and evidences; and the editor pane on the right shows annotation and evidence editors, when open.
The document area is scrollable, and the document text is displayed here with a large line height to accommodate links between the rows. A scrollbar on the left of the document area shows the current position in the document, and the positions of annotations are also indicated here. Annotations within the text are indicated by a colored background, where different colors signify different types of annotations (place, person, religion, time group).
Evidences are groups of one place annotation, one religion annotation, zero or one person annotations, and zero or one time group annotations (see Database Schema).
In the annotator, they are represented by a line connecting all annotations that are part of the evidence.
If the annotations are in different lines of the text, the line takes a detour via the left margin of the document to avoid crossing text.
In the margin, the evidence links are horizontally distributed into swimlanes to avoid overdrawing.
Where multiple evidence links go into the same line of text, they are also vertically distributed in the same fashion.
The editor pane is where the annotation or evidence editor is displayed when creating or editing an annotation. This is described in more detail below. Initially, the editor pane is empty, as no annotation or evidence is being edited.
The process for creating a new annotation is shown in detail in Fig. 5. Initially, the text passage that is to be annotated needs to be selected using regular text selection; that is, going over the start of the text passage with the mouse, push and hold down the left mouse button, drag the mouse until the end of the text passage, and release the left mouse button. The selection should become highlighted while doing this.
As soon as you release the left mouse button, the selection of the text passage is finished.
The selected text is what will become the annotation.
Because there are four types of annotations, depending on which type of instance is represented (see Database Schema), next a pop-up window appears, where the type of annotation needs to be selected.
The window also shows the content of the annotation again, and there are four buttons, one for each type of annotation.
By clicking one of the buttons, that type is selected, the pop-up window is closed, and the annotation editor is opened.
If you want to re-select the text passage or abort the creation of the annotation for any other reason, you can either click on the red cross in the top right, or anywhere outside of the pop-up window.
Next, in the editor pane, an annotation editor appears, where you can fill out the data for that annotation and the connected instance.
This editor window is described in more detail in Editing an Annotation, as the process for creating and editing existing annotations is quite similar.
The only difference is that:
Cancel creation and serves the same function,
Create and clicking it will persist the new annotation and instance to the database, and
Finally, after clicking Create in the editor, the editor will close, and the new annotation will appear in the document.
Annotations may overlap, and may even cover exactly the same text passage.
It is completely fine to do a text selection over an existing annotation in the text.
When annotations overlap, the parts where they do are highlighted in gray instead of the normal annotation colors.
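Which parts of the text end up gray can be worked out from the start and end offsets of the annotations. The following sweep over the annotation boundaries is a sketch of one way to compute this; it is not taken from the annotator's code, the function name is hypothetical, and end offsets are assumed to be exclusive.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) offsets, end exclusive (assumed convention)


def overlap_segments(spans: List[Span]) -> List[Span]:
    """Return the merged ranges that are covered by at least two annotations."""
    events = []
    for start, end in spans:
        events.append((start, 1))   # an annotation opens
        events.append((end, -1))    # an annotation closes
    # At equal positions, closings (-1) sort before openings (+1), so that
    # annotations that merely touch are not reported as overlapping.
    events.sort()

    segments = []
    depth = 0          # number of annotations covering the current position
    seg_start = None
    for pos, delta in events:
        previous = depth
        depth += delta
        if previous < 2 <= depth:
            seg_start = pos                      # an overlap begins here
        elif depth < 2 <= previous:
            if seg_start != pos:
                segments.append((seg_start, pos))  # the overlap ends here
            seg_start = None
    return segments
```

For two annotations covering offsets 0 to 10 and 5 to 15, only the range from 5 to 10 would be drawn in gray.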
The server will also generate suggestions for annotations.
These are based on known names (place names, alternative names from other languages, religion and person names) from the database, as well as on existing annotations in the current document.
Fig. 6 shows an example:
The person Athanasius I bar Gamala has been annotated in the text previously (under the name Athanasius).
Based on that, a person annotation is suggested at this position.
Clicking on the annotation suggestion, which is indicated by a yellow curly underline, will open an annotation editor of the respective type, with the suggested entity (place, person, or religion) already selected.
The editor now also indicates where the suggestion stems from.
Clicking Create will commit the annotation to the database, replacing the suggestion.
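How the server finds suggestions is not specified here. As a rough illustration of name-based matching, the sketch below searches a text for known names and returns candidate spans. The function name, the shape of the name dictionary, and the rule that longer names take precedence are all assumptions made for this example.

```python
import re
from typing import Dict, List, Tuple

# (start, end, entity type, entity ID); end offset exclusive (assumed convention)
Suggestion = Tuple[int, int, str, int]


def suggest_annotations(text: str,
                        known_names: Dict[str, Tuple[str, int]]) -> List[Suggestion]:
    """Find whole-word occurrences of known names in the text."""
    suggestions: List[Suggestion] = []
    # Try longer names first, so that a longer name wins over a shorter
    # name contained in it.
    for name in sorted(known_names, key=len, reverse=True):
        entity_type, entity_id = known_names[name]
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            start, end = match.start(), match.end()
            # Skip matches that overlap an already accepted suggestion.
            if any(s < end and start < e for s, e, _, _ in suggestions):
                continue
            suggestions.append((start, end, entity_type, entity_id))
    return sorted(suggestions)
```

The names could come from the database as well as from annotations that already exist in the current document, as described above.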
Selecting an annotation is as simple as clicking on it in the document area. When hovering over the annotation with the mouse, it is already outlined. Especially in cases where annotations are very long, or there are overlaps with other annotations, this outline can be helpful for understanding which annotation is currently under the mouse cursor. When annotations overlap and you click on the gray section (i.e., the overlapping part), it is not immediately clear which annotation you want to select. In that case, a pop-up window appears, listing the different annotation candidates with their type and content (see Fig. 7). By clicking on the intended annotation in the list here, it is selected. The selection process can be cancelled by clicking on the red cross, or anywhere outside of the pop-up window.
In the annotation editor, you can edit the annotation comment, which is the comment field of the record in the annotation table.
This might be useful if there is something special about the placement of the annotation in the text, or some useful context.
You can also see and edit the textual extent of the annotation (more details in the next paragraph).
Further, you can edit the instance data.
The editors for the different types of annotations, therefore, look slightly different.
Fig. 8 shows an editor for a place annotation:
Here, the annotation comment field is empty.
For the place instance, the place itself, the location confidence, and the comment in the place_instance table can be edited.
The place is selected via a drop-down menu, as is the confidence.
The comments are entered using text fields.
For person and religion annotations, the editors look quite similar, but the first drop-down menu lists persons or religions, respectively.
An annotation has a start and end position in the text, which are stored in the annotation table of the database.
As the placement of the annotation might need to be changed, the editors provide a way to see the current extent, and to edit it.
Under the title Annotation extent, three elements are visible: a representation of the start and end position of the annotation, a button to start editing, and a text area where the textual content of the annotation is shown.
To change the extent, click on the button, which is initially labeled Reselect annotation (see Fig. 8).
The button now turns red and its label changes to Cancel (see Fig. 9); clicking it again will go back to the previous state.
By now selecting a text passage in the document area, the textual extent of the annotation will be updated.
As with all other attributes of the annotation and instance, the changes will only be put into the database when clicking on the save button.
The annotation's textual extent can only be edited for existing annotations, and therefore this facility is not displayed when creating a new annotation.
Cancel.
For the time group annotations, the editor looks a bit different because of the way time groups work:
One time group can have zero or more time instances, and all of them would be attributed to the annotation.
Fig. 9 shows an editor for a time group annotation.
Besides the annotation comment and textual extent, time instances are shown as separate items, where the comment, confidence, start time, and end time can be edited.
In this case, start and end time must be numbers, and the end time must be greater than or equal to the start time.
Each time instance can be removed separately by clicking on the Delete instance button in the respective box, and new time instances can be added via the large New time instance button.
All changes, additions and deletions are only persisted to the database when the entire time group annotation is saved with the save button in the lower right, and are not persisted if the editor is closed, discarding the edits.
Clicking on the reset button in the lower left of the editor will revert the values to their initial state, as if the editor was freshly opened.
The editor can be closed by clicking on the cross in the upper right.
If there are unsaved changes, a prompt will appear to confirm that those changes should be discarded.
The green save button in the lower right will persist all changes to the database.
The button will be greyed out and disabled if there are no changes to the annotation yet.
The red delete button will delete the instance and annotation, see Deleting an Annotation.
All form data in the editor is validated. If a field is empty and mandatory (e.g., no place is selected for a new place annotation), the field will be outlined in red to signify that. Similarly, if the content of an input field is invalid (e.g., end time before start time for a time instance), it will be outlined in red as well. In both cases, the save or create button will be disabled and greyed out.
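The rule for time instances (both values must be numbers, and the end time must not lie before the start time) could be checked as sketched below. The function name and the error messages are hypothetical, and treating the times as whole numbers is an assumption; the actual editor may validate differently.

```python
from typing import List, Optional


def validate_time_instance(start: Optional[str], end: Optional[str]) -> List[str]:
    """Return a list of validation errors; an empty list means valid input."""
    errors = []
    values = {}
    for name, raw in (("start", start), ("end", end)):
        try:
            values[name] = int(raw)  # assumption: times are whole numbers
        except (TypeError, ValueError):
            errors.append(f"{name} time must be a number")
    # Only compare the two values if both could be parsed.
    if not errors and values["end"] < values["start"]:
        errors.append("end time must be greater than or equal to start time")
    return errors
```

A form would then disable its save or create button whenever the returned list is not empty.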
The red delete button in the bottom left of the annotation editor will delete the instance and annotation. Deleting an instance is only possible if the instance is not part of an evidence, and therefore this button is greyed out and disabled if that is the case. When clicking on the button, a confirmation dialog will first appear to make sure that this is the intended action, see Fig. 10. When clicking cancel, the deletion is not performed. When clicking delete, the annotation and the instance will be removed from the database, and the annotation will disappear from the document area.
An evidence, in general, is a grouping of a place instance, a religion instance, and optionally a person instance and a time group (see Database Schema).
In addition, the evidence has a comment field, the confidence of interpretation, which specifies how confident you are in the interpretation of the source when creating the evidence, and a visibility flag that controls whether the evidence will appear in the visualization or not.
Each evidence also has a source instance, which for annotator-generated evidences is created automatically based on the document's source. Here, the source confidence, which specifies the source's trustworthiness for that specific evidence, can also be set. Last, evidences can be tagged with zero or more tags.
When selecting an evidence, it is opened in the evidence editor. When creating a new evidence, that new evidence is automatically selected for as long as it is edited. An existing evidence is selected by clicking anywhere on its link. The links are laid out so that they overlap as little as possible. To further distinguish which evidence is currently under the cursor, the link gets bolder when the mouse hovers over it. Clicking on a link will select that evidence. It is then opened in the evidence editor. Further, the evidence link and the connected annotations are highlighted differently, with blue color and animation, as shown in Fig. 11. Selecting a different evidence while the editor is open will switch to editing that evidence; however, if there are unsaved changes, a confirmation prompt is shown first.
To create an evidence, simply click on the New evidence button in the top right of the document area (see Fig. 4).
This will open the evidence editor with a new evidence (see Fig. 12).
While the evidence editor is opened, the button is greyed out and disabled.
As with annotations, creating a new evidence and editing an existing one are very similar, and so the editor itself is described below, in Editing an Evidence.
And again, there are slight differences in the three buttons (compare also Figs. 12 and 13).
Cancel creation and serves the same function, and
Create and clicking it will persist the new evidence to the database.
The evidence editor, shown in Figs. 12 and 13, contains four traditional form fields, which are all optional. These form fields can be edited in a straightforward manner:
comment field of the evidence,
The next four rows represent the place instance, religion instance, person instance, and time group that are part of the evidence. Here, as the place and religion instance are mandatory, these will show up as red when empty (see Fig. 12). These four fields cannot be directly edited (i.e., by clicking or typing in them), but instead are controlled via the document area. While the editor is opened, the annotations that are part of the evidence are outlined in blue and animated. Changing membership of annotations works as follows (see also Fig. 14): To add an annotation and its instance to the evidence, click on the annotation. To remove an annotation and its instance that are already part of the evidence, also click on the annotation (clicking toggles membership). If the evidence already contains an instance and annotation of a certain type, clicking on a different annotation of that type replaces the previous instance and annotation with the new ones; for example, if there are two place annotations in the text, one for Edessa and one for Damascus, with Edessa being part of the evidence, clicking on the Damascus annotation would replace the place instance in the evidence, and the evidence would now be related to the place Damascus. An instance and annotation can be associated with multiple evidences.
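The toggle and replace behavior described above can be sketched as a small function. The dictionary representation of an evidence and all names used here are assumptions made for illustration; the editor's internal state may look quite different.

```python
from typing import Dict, Optional

ANNOTATION_TYPES = ("place", "religion", "person", "time_group")


def toggle_membership(evidence: Dict[str, Optional[int]],
                      annotation_type: str,
                      annotation_id: int) -> Dict[str, Optional[int]]:
    """Return a copy of the evidence after a click on the given annotation."""
    if annotation_type not in ANNOTATION_TYPES:
        raise ValueError(f"unknown annotation type: {annotation_type}")
    updated = dict(evidence)
    if updated.get(annotation_type) == annotation_id:
        # Clicking a member annotation removes it from the evidence.
        updated[annotation_type] = None
    else:
        # Clicking any other annotation adds it, replacing a previous
        # member of the same type if there is one.
        updated[annotation_type] = annotation_id
    return updated


# Hypothetical annotation IDs for the Edessa and Damascus example.
EDESSA, DAMASCUS = 11, 12
evidence = {"place": EDESSA, "religion": 5, "person": None, "time_group": None}
evidence = toggle_membership(evidence, "place", DAMASCUS)  # replaces Edessa
```

Because the function returns a copy, the state loaded from the database stays untouched until the changes are saved, which matches the local editing behavior of the editor.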
The fields in the editor displaying the instances cannot be interacted with directly. They show more information on the instances themselves and update automatically. In particular, they show the instance ID, the name and ID of the entity the instance refers to (place, religion, or person), and the respective instance confidence. For the time group, a comma-separated list of all time instances, with start time, end time, and confidence is shown instead. The data about the instances must be fetched from the database when it changes, so directly after opening the editor, or when toggling membership of an annotation and instance, the data is not available for a short while, and a loading indicator is shown instead:
Loading...
If no instance of that type is connected, this is indicated instead, as shown in Fig. 12. As the place and religion instances are mandatory, the absence of those is highlighted in red with more urgency.
The last part of the evidence editor is the evidence tags. Evidence can be tagged to create specific groups of evidences; for example, evidences that refer to Bishopric residences, or evidences that have been thoroughly reviewed. An evidence can have zero or more tags. Tags that are associated with the evidence are displayed at the top in green, tags that are not are displayed below in grey, with a plus symbol instead of the tag symbol. To remove an associated tag, click on it, and it moves to the bottom. Similarly, to add a tag, click on it, and it moves up and becomes green.
As with the annotation editor, all changes made are only local until you click the save button at the bottom (or, for new evidences, the create button).
Using the reset button, the initial state from the database can be restored for all fields, discarding all changes.
Clicking on the cross in the top right closes the editor if there are no unsaved changes, otherwise a confirmation prompt is shown first.
The same prompt is also shown when trying to open a different evidence by clicking on the respective link in the document area.
The delete button deletes the evidence, see Deleting an Evidence.
The save button will persist all changes to the database.
This button will only be enabled if it is currently possible to save:
If there are no changes, it is disabled and greyed out.
Further, if there is invalid input (i.e., no place or religion instance selected), saving is also not possible.
For evidence that is already saved in the database (i.e., when editing evidence, but not during creation), the evidence editor also shows a link at the top labeled View this evidence in the GeoDB-Editor.
Clicking this link will open the GeoDB-Editor, and there select the place of the evidence, scroll down to the evidence table, and select the evidence there as well.
For evidence created using the annotator, a similar link exists in the evidence table of the GeoDB-Editor, which opens the appropriate document in the annotator and opens the respective evidence in the evidence editor.
Footnote 2: Evidence tuples that do not have the visible flag set are never loaded from the database.
While most other filters in the visualization will hide or show evidence dynamically, evidences that are not visible will never show up.
Of course, the visibility can be changed later.
Clicking on the red delete button in the evidence editor will delete the evidence from the database.
A confirmation dialog (see Fig. 15) will appear first to avoid accidental deletions.
In the case of new evidences that have not been saved yet, the button will instead be labeled Cancel creation.
Deleting an evidence will not delete the connected annotations and instances. Those will remain in the database and the document area. It will only delete the evidence itself, the source instance, and all tag associations. The deletion will be reflected at once in the document area, where the respective link will also disappear. After deletion, the evidence editor is closed.