Becoming a transcriber

This page is for people who have just joined the project and want to start transcribing some pages.  Welcome!  We have 30,000 pages to transcribe, and we really need your help.

Of the 30,000 pages, we have transcripts, in various stages of completion, of some 18,000 pages.  However, all of these need to be checked.  The other 12,000 pages have not been transcribed at all.  You can easily distinguish these: the pages that have not been transcribed at all have "(base)" at the beginning of each line (in verse) or of each block of text (in prose).
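
For anyone who wants to check a batch of pages automatically, the "(base)" rule above can be sketched in a few lines of Python.  This is only an illustration: it assumes you already have the text of a page as a plain string, with one verse line or prose block per line, and the function name is made up for this example.  The real page files may wrap each line in XML tags, in which case the check would need adjusting.

```python
def is_untranscribed(page_text: str) -> bool:
    """Return True if every non-empty line starts with "(base)".

    Assumption (not from the project documentation): the page text is a
    plain string with one verse line or prose block per line.
    """
    # Ignore blank lines and surrounding whitespace.
    lines = [line.strip() for line in page_text.splitlines() if line.strip()]
    # An empty page is not counted as "untranscribed base text".
    return bool(lines) and all(line.startswith("(base)") for line in lines)
```

A page where even one line lacks the "(base)" prefix is treated here as having been at least partly transcribed, so it belongs with the pages that need checking.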

Here is how to get started:

  1. First, get the Junicode font and install it on your computer.  This is very easy: expand the zip file by double-clicking on it, open up the Fonts folder, double-click on Junicode.ttf, and you are done.
  2. Look at the "Quick Start" transcription guide, at http://www.textualcommunities.usask.ca/web/canterbury-tales/wiki/-/wiki/Main/Quick+Start+Guide. Don't try to master it all at once! Notice the broad categories of phenomena we deal with: abbreviations (everywhere), unreadable text, line breaks, capitalization, rubrics, etc.
  3. Look at some examples of our transcription.  For example, in manuscript Cp, folios 5r, 19r, and 154r-156v all give examples of the treatment of abbreviation.  Look at these, and see how the rules in the Quick Start guide have been applied.
  4. Look at the "Preview" view of any of these pages.  Notice the difference between the "diplomatic" and "edited" views in their treatment of abbreviations.
  5. Look at the introduction to our XML encoding at http://www.textualcommunities.usask.ca/web/canterbury-tales/wiki/-/wiki/Main/Introducing+the+XML+encoding, to get a sense of what all those < and > are doing on the page.
  6. You should by now have been assigned your own page to transcribe!  (If not, contact either Peter Robinson or Barbara Bordalejo.)  Now try transcribing it.  Make sure you press "Save" during and after each transcription session.  When you think you have finished the page, press "Submit".  This will send a message to one of the project leaders, or to a "Master Transcriber" who has been assigned to check your work, and they will respond with feedback on your first transcription.
  7. You will then be assigned a few more pages (typically, ten or twenty at a time). You are on your way!
  8. When you think you know what you are doing, try the transcription tests at https://www.goconqr.com/en-US/p/1593135 and https://www.goconqr.com/en-US/p/1617512.  Good luck!

In theory, transcription is relatively straightforward most of the time: just identify the abbreviations and represent them correctly.  But as you will see from the Quick Start guide, and as you look at the actual manuscript pages, deciding just what is an abbreviation can be really tricky.  There are many marks in the manuscripts which might be abbreviations, but which we believe (based on many years' experience now) are not abbreviations at all, but simply decorative or calligraphic flourishes.  Examples include (almost) all the little tails at the ends of words, cases of crossed h and ll, and many others.  A later project might decide to record all these, as indeed this project tried to do when it started out in the early 1990s.  But we have now decided to record only those marks of abbreviation which (we think) actually do abbreviate something.  This is sometimes a difficult choice!  Notice particularly the complicated, and apparently bizarre, rules for abbreviated m/n in the Quick Start guide.

