{% extends "base.html" %} {% block title %}Create Annotator Document{% endblock %} {% block head %} {{ super() }} {% endblock %} {% block content %}
Please fill out the form below and select a file to create a new document entry.
The document can then be used for digital annotation, creating pieces of evidence that are attributed to the source of the document.
A source for the document must therefore be selected below, and that source entry must already be in the database.
At the moment, we only support annotation of plain text (text/plain;charset=UTF-8) and HTML documents (text/html;charset=UTF-8).
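Before uploading, it can help to check locally that a file really is UTF-8 and falls into one of the two supported types. The following is only a minimal sketch: the function name `content_type_for` and the mapping from file extensions to content types are assumptions made for illustration, not something the upload form is known to use.

```python
from pathlib import PurePath

# Assumption: file extensions are used here only as a convenient local hint.
SUPPORTED = {
    ".txt": "text/plain;charset=UTF-8",
    ".html": "text/html;charset=UTF-8",
    ".htm": "text/html;charset=UTF-8",
}


def content_type_for(filename: str, raw: bytes) -> str:
    """Return the supported content type, or raise ValueError."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in SUPPORTED:
        raise ValueError(f"unsupported file type: {suffix or filename}")
    try:
        raw.decode("utf-8")  # only checks that the bytes are valid UTF-8
    except UnicodeDecodeError as err:
        raise ValueError(f"{filename} is not valid UTF-8: {err}") from err
    return SUPPORTED[suffix]
```

A file saved in a legacy encoding such as Latin-1 would be rejected by this check and should be re-saved as UTF-8 first.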
HTML documents should be prepared carefully before being submitted here, because later changes to their content will break existing annotations. If you need to make larger changes to an existing document, consider uploading a new version of the document instead. The main criteria and guidelines for HTML documents are detailed below:
The document does not need to contain a DOCTYPE, an <html> tag, a <head> tag, or a <body> tag.
The content of the document can just be the content of the <body> tag.
An example of a complete, valid document for submission:
<h1>Heading</h1>
<p>
This is the first paragraph. It has two sentences.
</p>
The document is sanitized, which removes the following:
<script> tags, both with inline JavaScript code and with external links.
<style> tags, both for inline styles and with external links.
The <head> element.
The <html> and <body> tags.
The sanitization will also remove the DOCTYPE, comments, and CDATA blocks.
Finally, most attributes will be removed from tags, except for the following ones:
The id attribute will always be preserved.
The data-virtual-text attribute will always be preserved.
For a (anchor) elements, the attributes href, title, and hreflang will be preserved.
For img elements, the src, width, height, alt, and title attributes will be preserved.
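The sanitization rules described above can be illustrated with a short Python sketch built on the standard library's `html.parser`. This is not the sanitizer the server actually runs. The class name `SketchSanitizer`, the helper `sanitize`, and the decision to drop the content of `<script>`, `<style>`, and `<head>` along with the tags are assumptions made for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: these sets mirror the rules listed on this page only.
DROP_WITH_CONTENT = {"script", "style", "head"}  # element and content removed
UNWRAP = {"html", "body"}                        # tag removed, content kept
GLOBAL_ATTRS = {"id", "data-virtual-text"}       # kept on every element
TAG_ATTRS = {
    "a": {"href", "title", "hreflang"},
    "img": {"src", "width", "height", "alt", "title"},
}
VOID = {"img", "br", "hr"}                       # elements without end tag


class SketchSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag in UNWRAP:
            return
        allowed = GLOBAL_ATTRS | TAG_ATTRS.get(tag, set())
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in allowed
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in UNWRAP or tag in VOID:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

    # Comments and the DOCTYPE are dropped implicitly: the default
    # handle_comment and handle_decl methods of HTMLParser do nothing.


def sanitize(html_text: str) -> str:
    parser = SketchSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Under these assumptions, `sanitize('<h1 id="top" class="big">Heading</h1>')` returns `<h1 id="top">Heading</h1>`: the `id` survives, while `class` is dropped because it is not on the list of preserved attributes.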
The sanitization preserves neither <style> tags nor inline style attributes on elements.
Therefore, it is important to use the right HTML elements for their specific purpose:
<h1> through <h6> for headings.
<section> and <article> if you want to logically structure the document into smaller pieces. This is optional.
<p> (paragraph) elements for paragraphs.
<em> for emphasized text.
<strong> for strong text.
Virtual text is specified with the attribute data-virtual-text, which is added to an appropriate element.
Within that attribute value, put the virtual text that should appear, and leave the rest of the element empty.
Use the type of HTML element that the virtual text should appear as: a block-level element such as <h1> for block-level text, and an inline element such as <span> or <strong> for inline text.
Example with a virtual heading and three virtual paragraph numbers:
<p>
<strong data-virtual-text="¶23 "></strong>This is a paragraph.
It has the number 23.
This paragraph is slightly shorter than the next one.
</p>
<h3 data-virtual-text="Page 16"></h3>
<p>
<strong data-virtual-text="¶24 "></strong>This is the next paragraph.
Amet ullam commodi cum quam aut Illum veritatis error voluptas provident
fugit perspiciatis? Tenetur dicta itaque dolore veniam quo? Tenetur animi
quam odit a aspernatur Excepturi elit eaque fuga id ipsam provident.
<strong data-virtual-text="¶25 "></strong>Omnis reprehenderit animi
repellendus provident pariatur, magni. Voluptas voluptates ut aspernatur
harum optio! Assumenda quae excepturi explicabo expedita.
</p>
This is the result of the document above.
Notice that you cannot select the virtual text at all, but that you can make a selection that spans across it:
This is a paragraph. It has the number 23. This paragraph is slightly shorter than the next one.
This is the next paragraph. Amet ullam commodi cum quam aut Illum veritatis error voluptas provident fugit perspiciatis? Tenetur dicta itaque dolore veniam quo? Tenetur animi quam odit a aspernatur Excepturi elit eaque fuga id ipsam provident. Omnis reprehenderit animi repellendus provident pariatur, magni. Voluptas voluptates ut aspernatur harum optio! Assumenda quae excepturi explicabo expedita.
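One way to see why virtual text cannot be selected is that it lives in an attribute value rather than in a text node, so extracting the text content of the document skips it. The Python sketch below illustrates this with the standard library's `html.parser`. How the annotator itself computes text positions is not described here, so the class `TextAndVirtualText`, the helper `split_text`, and the whitespace normalization are assumptions for illustration only.

```python
from html.parser import HTMLParser


class TextAndVirtualText(HTMLParser):
    """Collect real text nodes and data-virtual-text values separately."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text_parts = []
        self.virtual_texts = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-virtual-text" and value is not None:
                self.virtual_texts.append((tag, value))

    def handle_data(self, data):
        self.text_parts.append(data)


def split_text(html_text: str):
    parser = TextAndVirtualText()
    parser.feed(html_text)
    parser.close()
    # Assumption: collapse runs of whitespace purely for readability.
    text = " ".join("".join(parser.text_parts).split())
    return text, parser.virtual_texts
```

Applied to the first paragraph and the virtual heading of the example above, `split_text` yields the text `This is a paragraph. It has the number 23. This paragraph is slightly shorter than the next one.` and reports `¶23 ` and `Page 16` separately as virtual text.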