Starting out with Textual Communities
Creating an account in Textual Communities #
First, you need to be a registered user of Textual Communities before you can do anything. At the top right of the Textual Communities home page at http://www.textualcommunities.usask.ca/ you will see a "Sign In" link:
Click on this. You will then have a choice of ways to create an account:
The easiest way to start an account is to use your Facebook account. Just click the Facebook icon, log in to it, and you are done.
Alternatively, click on "Create Account". Follow the instructions given there, and you will be taken through a series of steps to set up your account, using your email address as your user name. You will be assigned a temporary password, which you can use to sign in to your new account (you will be prompted to change it).
Creating a new community #
After you have signed in to the Textual Communities site, you will see the following (or something similar). Click on the Profile button at the right.
You will see your name and a 'create community' button:
Click on the 'Create Community' button:
Fill in the boxes as follows:
· Name: short name for your community; usually one or two words only
· Abbreviation: abbreviated name -- four letters only!
· Long name: longer name (say, up to six words)
· Font: the name of the font you would like to use (optional)
· Description: brief account of what the community is (say, up to 500 characters)
When you are finished, click on the "create" button at the bottom right. Note that the screen changes, and now you will see at the bottom right:
And at the top, you now have your Profile menu:
You can close this administration screen at any time by pressing 'Close'. You can return to it any time by clicking 'Profile', on the Textual Communities menu bar:
Now that you have a community, your profile has changed, to show the community you just made:
Click on 'Admin', and you will return to the administrative screen.
Loading your first document and images #
You are now ready to load your first documents and images! Here is a very simple file for a document, ready for you to upload:
<?xml version="1.0" ?>
<!DOCTYPE TEI SYSTEM "../common/chaucerTC.dtd">
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:det="http://textualcommunities.usask.ca/">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Bodley</title></titleStmt>
      <publicationStmt><p>Draft for Textual Communities site</p></publicationStmt>
      <sourceDesc><bibl det:document="Bodley"></bibl></sourceDesc>
    </fileDesc>
    <encodingDesc>
      <refsDecl det:documentRefsDecl="Manuscript" det:entityRefsDecl="Simple Poetry">
        <p>Textual Communities declarations</p>
      </refsDecl>
    </encodingDesc>
  </teiHeader>
  <text>
    <body>
      <pb n="110v" facs="BD110V.JPG"/>
      <lb/><div n="Book of the Duchess">
        <head n="Title">The Boke of the Duchesse</head>
        <lb/><l n="1">I haue grete wondir be this light</l>
        <lb/><l n="2">how that I leue for day ne nyght</l>
        <lb/><l n="3">I may not slepe wel nygh nought</l>
        <lb/><l n="4">I haue so many an ydell thought </l>
        <lb/><l n="5">Purely for defaulte of slepe</l>
        <lb/><l n="6">That bi my trouth I take no kepe</l>
        <lb/><l n="7">Of no thinge how hit comyth or goth</l>
        <lb/><l n="8">Ne me nys' no thinge leue nor loth</l>
        <lb/><l n="9">Alis I lich good' to me </l>
        <lb/><l n="10">Ioye or sorwe wherso it be</l>
        <lb/><l n="11">ffor I haue felynge yn no thynge</l>
        <lb/><l n="12">But as it were a mased' thynge </l>
        <lb/><l n="13">Alway yn poynte to falle a doun'</l>
        <lb/><l n="14">ffor sorwefull ymagynatioun'</l>
        <lb/><l n="15">Is alwey holely yn my mynde</l>
        <lb/><l n="16">And well ye wote a geyns kynde</l>
        <lb/><l n="17">hit were to lyuen yn this' wyse</l>
        <lb/><l n="18">ffor Nature wolde nat suffyse</l>
        <lb/><l n="19">To non' erthly creature </l>
        <lb/><l n="20">Nat longe tyme to endure</l>
        <lb/><l n="21">Without slepe &amp; be yn sorwe</l>
        <lb/><l n="22">And I ne may ne nyght ne morwe</l>
        <lb/><l n="23">Slepe &amp; this Melancolye</l>
      </div>
    </body>
  </text>
</TEI>
You can copy this text into a file and save it on your computer (warning: if you are using Word, choose "Plain text" format from the save options). Or you can load the file from www.sd-editions.com/TC/Bodley.xml: save it to your computer, remembering where you put it. Better still: download the zip file www.sd-editions.com/TC/GettingStarted.zip, and extract the files from it to a convenient place on your computer. This will give you all the files you need to get started.
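If you copy the text by hand, it can be worth checking that the file is still well-formed XML before you upload it. The following is only a minimal sketch of such a check in Python, not part of Textual Communities: the function name `check_well_formed` and the shortened sample are assumptions made for illustration, and the check says nothing about whether the file is valid TEI.

```python
import xml.etree.ElementTree as ET

def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses as XML, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)

# Shortened, hypothetical stand-in for Bodley.xml (not the full file).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <pb n="110v" facs="BD110V.JPG"/>
    <div n="Book of the Duchess">
      <l n="21">Without slepe &amp; be yn sorwe</l>
    </div>
  </body></text>
</TEI>"""

ok, error = check_well_formed(SAMPLE)
print("well-formed" if ok else "not well-formed: " + error)

# A bare ampersand is an easy slip when text is retyped or pasted.
bad_ok, bad_error = check_well_formed(SAMPLE.replace("&amp;", "&"))
print("well-formed" if bad_ok else "not well-formed: " + bad_error)
```

To check a file on disk, read it first (for example with `open("Bodley.xml", encoding="utf-8").read()`) and pass the result to the function.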
Select 'Add text file' from the Documents menu:
Select the file Bodley.xml, from wherever you placed it (usually, the 'downloads' or 'desktop' folder):
Click upload, and in an instant, the screen will change (signalling that the file has uploaded fine):
Actually, the "success" message indicates only that the upload has worked, and the file has been received by the server. To check that it was loaded, choose "Management" from your Profile menu. You will see something like this, showing that the file has now been processed within the system:
Now load a second text file, at www.sd-editions.com/TC/Fairfax.xml, following the same procedure.
We have two documents, with text. Now, let's also load two sets of image files, one for each document you have just loaded. We have the images in zip files named Bodleyimages.zip and Fairfaximages.zip: you can download these files to your computer from the folder at http://www.sd-editions.com/TC/ (if you do not already have them from the zip file at www.sd-editions.com/TC/GettingStarted.zip).
To upload the zip file containing images: select 'Add Image Zip' from the Documents menu on the Admin panel:
Then, select the document for which you are adding the zip file (either 'Fairfax' or 'Bodley'):
Select the file FairfaxImages.zip (which needs to be on your computer for this):
Press "Upload". A progress bar will appear, and then the message "Success!":
Do the same for the other images zip file, BodleyImages.zip, making sure you select the document 'Bodley' this time.
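Before uploading an image zip, you might want to confirm that it holds a file for every page of the transcript. The sketch below assumes, and this is only an assumption, that pages and images are paired through the `facs` attribute on each `<pb>` element (as in `<pb n="110v" facs="BD110V.JPG"/>`). The function names, the second page and the in-memory zip are hypothetical, made up so that the example runs without any files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

def facs_names(xml_text):
    """Return the facs value of every <pb> that has one, in document order."""
    root = ET.fromstring(xml_text)
    return [pb.get("facs") for pb in root.iter(TEI + "pb") if pb.get("facs")]

def missing_images(xml_text, zip_source):
    """List facs names with no same-named file (ignoring case and folders) in the zip."""
    with zipfile.ZipFile(zip_source) as archive:
        present = {name.rsplit("/", 1)[-1].lower() for name in archive.namelist()}
    return [name for name in facs_names(xml_text) if name.lower() not in present]

# Hypothetical two-page transcript; only BD110V.JPG appears in the tutorial file.
transcript = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<pb n="110v" facs="BD110V.JPG"/>
<pb n="111r" facs="BD111R.JPG"/>
</body></text></TEI>"""

# Build a small zip in memory so the sketch runs without files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("BD110V.JPG", b"placeholder image bytes")
buffer.seek(0)

missing = missing_images(transcript, buffer)
print(missing)
```

With real files, pass the path of the zip file (for example the one you downloaded) instead of the in-memory buffer.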
You now have two documents, each with images, loaded into your Textual Community. Time to look at them.
Your documents in your community: transcribing #
You now can look at these documents in your community. Go to the viewer for your new community. To get there: click on 'Textual Community' in the bar at the top; you will see a link to your new community appear under 'My Communities'. Click on this link and then on 'Viewer'. And now you should see your documents:
Or, click on the community name in your Profile view, and you will be taken directly to the viewer for that community:
Here are your two documents! You can view the pages in either document by clicking the + against its name:
Now, click on the page number. You will see the image for that page appear on the left, and the text of the page appear below it:
To see how this might look in a full publication system, without all the XML coding, click the Preview button:
You can edit the text in the lower window, and click the Preview button to see the result. Note that the editor checks that you have 'well-formed' XML as you edit -- try deleting a tag (e.g. a </l> at the end of a line) and you should see the text in the editor turn red, signalling something is missing. Note too that the Preview function parses the document against a Text Encoding Initiative (TEI) schema: so if you try to use a tag the TEI does not recognize, or put a tag in the wrong place, you will get a (hopefully) helpful error message. For example: put the tag <chaucer> in the first line:
You will see that the </head> tag has turned red, as the editor recognizes that the element <chaucer> is still open. You could turn that green again by placing </chaucer>, to close the <chaucer> element, before the closing </head> tag:
However, the <chaucer> element is not part of the TEI system. So if you press Preview you will now get this message:
That is: in line 3 (":3:0") the element <chaucer> is not declared as a possible child of the <head> element. Later, you will learn how you can change the schema to accept different elements, and to change the way your text appears in 'Preview'.
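The difference between the two checks (well-formed while editing, valid on Preview) can be imitated in a few lines of Python. This is a toy illustration only: the set `ALLOWED_IN_HEAD` is a hypothetical, drastically reduced stand-in for the TEI schema, and the messages are not those produced by Textual Communities.

```python
import xml.etree.ElementTree as ET

# Hypothetical, much reduced list of elements allowed inside <head>;
# the real TEI schema is far larger.
ALLOWED_IN_HEAD = {"hi", "lb", "note"}

def diagnose(fragment):
    """Classify a <head> fragment as not well-formed, not valid, or fine."""
    try:
        head = ET.fromstring(fragment)
    except ET.ParseError as err:
        return "not well-formed: " + str(err)
    unknown = [child.tag for child in head if child.tag not in ALLOWED_IN_HEAD]
    if unknown:
        return "well-formed but not valid: unexpected " + ", ".join(unknown)
    return "well-formed and valid"

unclosed = '<head n="Title"><chaucer>The Boke of the Duchesse</head>'
closed = '<head n="Title"><chaucer>The Boke of the Duchesse</chaucer></head>'
plain = '<head n="Title">The Boke of the Duchesse</head>'

for fragment in (unclosed, closed, plain):
    print(diagnose(fragment))
```

The first fragment fails at the parsing stage (the editor's red text); the second parses but contains an element the toy schema does not allow (the Preview error); the third passes both checks.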
You can also save any changes you make in the editor. To do this, click on the Save button. The screen changes to show you have a saved transcription, with the date and time when you did it:
Note too that there is a 'Compare' function here: change the dropdown menu on the right to 'Version in database' and you will see the two selected versions appear side-by-side, with the differences highlighted:
You can click on the icon beside a line where there are differences to move the change from one version to another, so creating a single corrected version which you can then save by clicking the 'Save' button below.
Look at the two pages of the Fairfax document. Spend a few minutes playing with the text editor, preview, compare and save functions. You will learn about the uses of the other buttons ('Commit', 'Publish') later.
Your documents in your community: default transcription #
Textual Communities supports many of the transcription features enabled by the Text Encoding Initiative Guidelines. You can see these in action as follows: load in the xml file DemoTranscript.xml (at http://www.sd-editions.com/TC/, if you do not already have it from the zip file at www.sd-editions.com/TC/GettingStarted.zip), using Add Text File from the Documents menu in the Admin panel for your community. Here is the text of this file:
<?xml version="1.0" ?>
<!DOCTYPE TEI SYSTEM "../common/chaucerTC.dtd">
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:det="http://textualcommunities.usask.ca/">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Demonstration of transcription features</title></titleStmt>
      <publicationStmt><p>Designed to show transcription features -- quite fake as a document</p></publicationStmt>
      <sourceDesc><bibl det:document="Demo"></bibl></sourceDesc>
    </fileDesc>
    <encodingDesc>
      <refsDecl det:documentRefsDecl="Manuscript" det:entityRefsDecl="Simple Poetry">
        <p>Textual Communities declarations</p>
      </refsDecl>
    </encodingDesc>
  </teiHeader>
  <text><body>
    <pb n="1" />
    <lb/><div n="Book of the Duchess">
      <lb/><l n="55">This is a misuse of this line of verse to show various transcription features.
        <lb/>Text with various features:
        <lb/> <hi rend="ital">italic</hi> <hi rend="bold">bold</hi> <hi rend="ital bold">bold italic</hi> <hi rend="strike">strike through</hi> <hi rend="sup">superscript</hi>
        <lb/>Text useful for transcription of manuscripts:
        <lb/> Abbr<am rend="sup">et</am><ex>eviations</ex> <choice><sic>wrong</sic><corr>right</corr></choice> <choice><orig>olde</orig><reg>old</reg></choice>
        <lb/>Text useful for recording alterations within the manuscript:
        <lb/> <app><rdg type="orig">Original</rdg><rdg type="corrector 1">Altered</rdg><rdg type="lit"><hi rend="strike">Original</hi><hi rend="il">\Altered/</hi></rdg></app>
        <lb/>Note: we deprecate the use of the TEI elements add and del, because they confuse representation with interpretation
        <lb/>The app system here suggested cleanly separates representation from interpretation.
        <lb/>Representation of page numbers and catch words:
        <lb/><fw place="tr" type="pageNum">1</fw>-- a page number, top right
        <lb/><fw place="br" type="catch">Catchword</fw>-- a catchword, bottom right
        <lb/><fw place="bm" type="sig">Signature</fw>-- a signature, in the bottom margin, centre
        <lb/>Marginalia:
        <lb/><note place="margin-left">Marginalia</note> -- left margin
        <lb/><note place="bl">Marginalia</note> -- bottom margin, left
        <lb/>Editorial notes:
        <lb/>Something to annotate<note resp="PMR" type="ed">An editorial note</note>
        <lb/>Tables:
        <table>
          <row><cell cols="2">occupies two columns</cell></row>
          <row><cell rend="circsmall">circle</cell><cell rend="squareborder">square</cell></row>
          <row><cell cols="2" rend="circlarge" rows="2">large circle</cell></row>
        </table>
        <lb/>Unreadable, unclear, supplied or damaged text:
        <lb/><gap quantity="4" reason="illegible" unit="chars"/> -- you cannot read the text at all: (four characters unreadable)
        <lb/>You can read the <unclear reason="damage">damaged text</unclear> but with difficulty
        <lb/><damage agent="water"><gap quantity="4" unit="chars"/></damage> -- The document is damaged, and you cannot read it
        <lb/><space quantity="1" unit="chars"/>-- empty space in the source text
        <lb/>See the Wikipedia entry on <hi rend="b">Default Transcription Guidelines</hi> for more details.
      </l>
    </div>
  </body>
  </text>
</TEI>
Go to the viewer for your community, and look at page 1 of this file, and then choose Preview from the editor panel. You should see this:
You will see that this shows how to represent bold, italic, superscript and other styles of text (using the <hi rend="xx"> encoding), and how they appear in the preview window. Note too how you can simultaneously encode both 'diplomatic' (showing abbreviation and uncorrected text) and 'edited' text views: change the menu item at the top left from 'Diplomatic' to 'Edited' and you will see the fifth line change from "Abbret wrong olde" to "Abbreviations right old". Note also how this permits representation of scribal alterations within the text, using the <app> mechanism recommended by the Textual Communities project: if you change the menu at the top from "Readings: lit" to "Readings: orig" you will see the text of the seventh line change to "Original"; change it again to "Readings: corrector 1" and you will see it change to "Altered".
You can see too how Textual Communities supports Text Encoding Initiative representation of page numbers and catchwords using the <fw> element; editorial notes and marginalia using the <note> element; tables using <table>; missing, damaged and unclear text using <gap>, <damage> and <unclear>. See the Wiki entry on Default transcription guidelines for more details.
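To see how one encoding can yield both the 'Diplomatic' and the 'Edited' view, here is a small sketch in Python. It is not how Textual Communities renders previews; it simply assumes that the diplomatic view hides `<ex>`, `<corr>` and `<reg>`, while the edited view hides `<am>`, `<sic>` and `<orig>`, which is consistent with the two readings quoted above. The names `DROP` and `render` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Elements hidden in each view (an assumption consistent with the two
# readings of the demonstration line, not taken from the TC source code).
DROP = {
    "diplomatic": {"ex", "corr", "reg"},
    "edited": {"am", "sic", "orig"},
}

def render(element, view):
    """Collect the text of element, skipping elements hidden in this view."""
    parts = [element.text or ""]
    for child in element:
        if child.tag not in DROP[view]:
            parts.append(render(child, view))
        # Text following a child belongs to the parent and is always kept.
        parts.append(child.tail or "")
    return "".join(parts)

LINE = ('<l>Abbr<am rend="sup">et</am><ex>eviations</ex> '
        '<choice><sic>wrong</sic><corr>right</corr></choice> '
        '<choice><orig>olde</orig><reg>old</reg></choice></l>')

line = ET.fromstring(LINE)
diplomatic = render(line, "diplomatic")
edited = render(line, "edited")
print(diplomatic)  # Abbret wrong olde
print(edited)      # Abbreviations right old
```

The same idea extends to `<app>`: choosing a reading type amounts to keeping one `<rdg>` and hiding the others.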
Your documents in your community: entities #
So, Textual Communities shows you the text page by page, and allows you to edit it, page by page. Actually, it does far more than that. In the Table of Contents panel on the left, click on the 'By Item' tab. The display will change slightly:
Now, click on the + icon beside each document. Instead of a list of pages, you will see the 'Items' contained in the document:
Click on the + icon beside 'Book of the Duchess' and it will show the lines in the Book of the Duchess:
So, Textual Communities understands that your document does not just contain pages with text on each page. It also understands that your document contains the text of something called "Book of the Duchess", and that the Book of the Duchess contains a title, and lines numbered 1 onward. If you click on one of the lines, you will be taken to the page of the document containing that line. In Textual Communities language, we describe the "something called Book of the Duchess" as an "entity": that is, a single act of communication. Entities may contain entities, just as documents may contain pages. Thus, the entity "Book of the Duchess" contains further entities: a title, and lines labelled 1 on.
Spend a few moments playing with the 'By Item' view. Look, for example, at the By Item view of the Fairfax manuscript, containing also an instance of the entity "Book of the Duchess", and instances of the same lines of the poem as Bodley.
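The 'By Item' outline can be approximated from the XML itself. The sketch below assumes that every element carrying an `n` attribute (other than a page break) names an entity, and that nesting of elements gives nesting of entities; this is a simplification for illustration, not a description of how Textual Communities builds its index. The function name `entity_tree` and the shortened sample are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Shortened stand-in for the Bodley transcript.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<pb n="110v" facs="BD110V.JPG"/>
<div n="Book of the Duchess">
<head n="Title">The Boke of the Duchesse</head>
<l n="1">I haue grete wondir be this light</l>
<l n="2">how that I leue for day ne nyght</l>
</div>
</body></text></TEI>"""

def entity_tree(element, depth=0):
    """Yield (depth, label) for each descendant with an n attribute,
    skipping page breaks, which belong to the document, not the entity."""
    for child in element:
        if child.tag == TEI + "pb":
            continue
        label = child.get("n")
        if label is not None:
            yield depth, label
            yield from entity_tree(child, depth + 1)
        else:
            yield from entity_tree(child, depth)

body = ET.fromstring(SAMPLE).find(f"{TEI}text/{TEI}body")
outline = list(entity_tree(body))
for depth, label in outline:
    print("  " * depth + label)
```

Running it prints "Book of the Duchess" with "Title", "1" and "2" indented beneath it, mirroring the tree you see in the Table of Contents panel.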
Collating texts #
Because Textual Communities understands that a single entity (for example, line 1 of the Book of the Duchess) can appear in multiple documents, it can collate the different forms of that entity in the documents.
The document names have disappeared, and in their place we see the entity 'Book of the Duchess'. Click on the + icon beside 'Book of the Duchess':
You will see the numbers of the title (the first 1) and the first lines (the second 1 on) appearing below. Now click on one of these, say the '2', and you will see the collation interface appear (it might take a few minutes for the server to move the information into this window).
Spend a few moments playing with this:
- Use the bar to move through the collation of the verse, or "Next Entity" to move to the next line
- Try out the regularization: single-click on a word to move it into the "regularize this" box, double-click on another word to move it into the "to this word" box (e.g. "how that I" to "How that I"), and then press return (or "Add rule")
- Click on a manuscript sigil in the output box or in the View Witnesses panel to see an image of the manuscript
- Click on "alignment" to see the collation for the whole verse, and to re-align the variation
Coming soon: the ability to send collation output to other systems for analysis (e.g. phylogenetic software).
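For readers curious about what collation with regularization involves, here is a deliberately simple word-level sketch using Python's standard `difflib`. It is not the collation engine used by Textual Communities. The second witness reading is hypothetical (invented for the example), and the function names are assumptions.

```python
from difflib import SequenceMatcher

def regularize(words, rules):
    """Replace each word that has a regularization rule; leave the rest alone."""
    return [rules.get(word, word) for word in words]

def collate(base, witness, rules=None):
    """Return (base reading, witness reading) pairs where the two texts differ."""
    rules = rules or {}
    a = regularize(base.split(), rules)
    b = regularize(witness.split(), rules)
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            variants.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return variants

bodley = "how that I leue for day ne nyght"
other = "How that I lyve for day ne nyght"  # hypothetical second witness

raw = collate(bodley, other)
regularized = collate(bodley, other, rules={"how": "How"})
print(raw)          # [('how', 'How'), ('leue', 'lyve')]
print(regularized)  # [('leue', 'lyve')]
```

Adding the rule "how" to "How" removes the purely orthographic difference, leaving only the variant you might consider significant.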
Text flowing from one page to the next #
Perhaps the most distinctive feature of Textual Communities is that it understands not only pages and entities, but also that an entity (a line of verse, a prose paragraph) can continue from one page onto the next. To see this at work, go to page 130v of Fairfax:
The last line of 130r is 38 (<l n="38">). Now, let's imagine that line 38 actually continued on this page, as follows:
So, add this new continuation of line 38 to the transcription of 130v, as shown. Now, press the Link Pages button at the base of the transcription:
A box will appear, as follows:
Textual Communities has found that the <div> labelled "Book of the Duchess" and the line (<l>) labelled 38 are found on both the previous page and this page. Click OK to confirm this is indeed the case.
Now, you will see the transcription change:
A new "prev" attribute has been added to the <l n="38"> element. This points to the instance of line 38 on the previous page, and this is how Textual Communities knows that this is a continuation of that line on this page. (Question: why does the <div> element have a "prev" attribute too?)
Now, we can confirm that this line element really does flow from page to page:
- Save the transcription with the additional line and "prev" attribute, by pressing the Save button. You don't need to do it, but it is a good idea!
- Now, write this page to the database by pressing the Commit button. You will be told the document validates, and then you will see the display of the transcription menu change:
- View the XML for this document now by clicking on the XML icon next to the document name:
Now, you will see the altered document XML, including these lines:
Here, the altered line 38 continues around the <pb> element, without interruption. (A question -- there is no <lb/> element before the second half of line 38. There could be! Where would it need to be in the encoding?)
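A line that runs around a `<pb>` can still be read as one line, or split into the part on each page. The sketch below illustrates this in Python; the wording of the line is made up (only the shape, an `<l>` containing a `<pb/>`, follows the description above), and `split_by_page` is a hypothetical helper, not a Textual Communities function.

```python
import xml.etree.ElementTree as ET

# Hypothetical wording; only the shape (an <l> containing a <pb/>) matters here.
FRAGMENT = ('<div n="Book of the Duchess">'
            '<l n="38">first half of the line <pb n="130v"/>second half of the line</l>'
            '</div>')

def split_by_page(line, starting_page):
    """Return [(page, text)] for one line, starting a new part at each <pb/>."""
    parts = [(starting_page, line.text or "")]
    for child in line:
        if child.tag == "pb":
            # Text after the page break belongs to the new page.
            parts.append((child.get("n"), child.tail or ""))
        else:
            # Any other child stays on the current page, with its trailing text.
            page, text = parts[-1]
            parts[-1] = (page, text + "".join(child.itertext()) + (child.tail or ""))
    return parts

line = ET.fromstring(FRAGMENT).find("l")
parts = split_by_page(line, "130r")
whole = "".join(text for _, text in parts)
print(parts)
print(whole)
```

The list `parts` gives the page-by-page view of the line, while `whole` gives the line as a single entity, which is the form a collation would use.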
What Textual Communities does that others don't do #
From this brief demonstration, you should understand the key features of Textual Communities (TC):
1. TC understands that text is on pages, in documents. All transcription systems do this. TC goes further than most in understanding that the text on the pages might be in lines within the page, or in lines within columns in the page, but again other systems do this.
2. TC understands that text is not just on pages, in documents. It is text "of" something: text of what we call "entities", or "communicative acts". A communicative act might be the Canterbury Tales, the Mahabharata, the Gospel of John, Shakespeare's Hamlet. Further, TC understands that these entities might have complex structures (verses within stanzas within poems; sentences within paragraphs within chapters within books). More than this, TC knows which documents have which entities (in this case, Bodley and Fairfax both have the Book of the Duchess), and knows also which parts of which entity (what verses of the Book of the Duchess) are in what pages of what documents, and even in what lines on those pages those verses occur. TC also knows the reverse: it knows exactly what verses of the Book of the Duchess are on any page, and for each line of the page, exactly what verse of the Book of the Duchess it contains.
At the time of writing, Textual Communities is the only transcription system which does (2) as well as (1). Along the way, TC solves the notorious "overlapping hierarchy" problem which has bedevilled the making of scholarly editions in digital form for more than two decades: that is, the problem of representing the text of a book both as composed of discrete pages, each with text on it, and as made up of (say) chapters which run across pages, starting half-way down one page and finishing half-way down another, perhaps many pages on.
Textual Communities does (or will do) much more than this demonstration shows:
- It has an open API (application programming interface) so that anyone with reasonable web skills can make web pages accessing and showing texts held in Textual Communities. There is an example of this at http://www.sd-editions.com/FromThePage/SimpleAllAPI.html, which gives access to all the communities, documents, images and entities (including those you have just made) in 144 lines of code (written in about one hour). With a little more work, a programmer could make his or her own editor, allowing you to pull out your own texts and edit them as you please.
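The two-way knowledge described in (2) can be pictured as a pair of indexes built from one file: which lines are on each page, and which pages hold each line. The following is only a sketch of that idea in Python, with invented sample text and a hypothetical function name `build_indexes`; it makes no claim about how Textual Communities stores this information internally.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample: line 38 starts on 130r and continues on 130v.
DOC = """<body>
<pb n="130r"/>
<div n="Book of the Duchess">
<l n="37">text of line 37</l>
<l n="38">first half <pb n="130v"/>second half</l>
<l n="39">text of line 39</l>
</div>
</body>"""

def build_indexes(root):
    """Return (lines_on_page, pages_of_line) from a single pass over the XML."""
    lines_on_page = defaultdict(list)
    pages_of_line = defaultdict(list)
    page = None

    def record(line):
        if line not in lines_on_page[page]:
            lines_on_page[page].append(line)
        if page not in pages_of_line[line]:
            pages_of_line[line].append(page)

    def walk(element, line):
        nonlocal page
        for child in element:
            if child.tag == "pb":
                page = child.get("n")
                if line is not None:
                    record(line)  # the open line continues on the new page
            elif child.tag == "l":
                record(child.get("n"))
                walk(child, child.get("n"))
            else:
                walk(child, line)

    walk(root, None)
    return dict(lines_on_page), dict(pages_of_line)

lines_on_page, pages_of_line = build_indexes(ET.fromstring(DOC))
print(lines_on_page)   # {'130r': ['37', '38'], '130v': ['38', '39']}
print(pages_of_line)   # {'37': ['130r'], '38': ['130r', '130v'], '39': ['130v']}
```

Line 38 appears under both pages, and both pages appear under line 38: the page hierarchy and the entity hierarchy overlap, yet both can be recovered from the same encoding.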
- It can cope with highly complex texts: see the example at http://www.textualcommunities.usask.ca/web/donne-s-prose/viewer (select the 'By Item' tab and look at document LXXX, to see Sermons, front and back matter, tables, embedded poems, and much more).
- Textual Communities offers a naming system, applicable to all texts and all documents, which can be used for attachment of commentary, additional images, and much more (see on this naming system).
- It includes a full collation system, with regularization and alignment adjustment. It will include the facility to send collations to other software (including phylogenetic methods) for further analysis.
- It is designed for easy integration with an institutional repository/digital library, to guarantee data maintenance in perpetuity.
- Each community has its own wiki, blog and message board, for handling information for each community, with links to Facebook, Twitter and other social media tools: see http://www.textualcommunities.usask.ca/web/recipes/wiki and http://www.textualcommunities.usask.ca/web/recipes/blog for examples of how one community has used these tools.
You can either use Textual Communities from its existing Saskatchewan base, or install it all on your own system: it is designed so a computer programmer can do this expeditiously. We are also looking at ways of bundling Textual Communities with other systems (for example, OMEKA) to make it even easier to run it from your own machine.
Our boast is that Textual Communities can deal with any text of any work in any document. We do have a slight qualification: the document and the textual entities it contains must be capable of being represented in XML, preferably (though not necessarily) in the implementation of the Text Encoding Initiative. As we have yet to meet a document and texts that cannot be so encoded, we feel our claim will stand.
How does Textual Communities do all this? And how can you adapt it, to deal with your texts in your documents? See the Wiki entry 'How Textual Communities Works' for answers to these questions.