Moving back to this blog

Just a quick note to say that I’m moving back to this blog because I’m leaving the University of Oxford (where I had hosted my blog for a while) for a post at Newcastle University. Some posts have been re-imported, which may have mangled some code fragments, lost images, etc. This is partly because WordPress’s default code syntax highlighting makes elements which share a name with HTML elements simply vanish. Sigh. This is why I’ve generally moved to using images of code.




Using Code Templates in oXygen XML Editor

Recently a project asked me how they might easily re-enter similar content in a file again and again while using oXygen XML Editor. They thought they might keep a second file with bits of template XML in it and copy and paste from it when and where they needed to.

“Aha!”, I said, “What you need are ‘Code Templates’ in oXygen XML Editor”.

Code templates are small fragments of code that can be inserted at the current editing position as and when you want them. They are reusable, and oXygen comes loaded with a bunch of them for various editor modes. For example, when using oXygen XML Editor to write XSLT I often type ‘ct’ and press Control-Space (Command-Space on Macs, I believe).


This displays a list of matching code templates, and the tooltip gives you a preview of each. The Copy Template is a template which, by default, copies all of the input to the output (which you can then override for particular elements you wish to change). In this case pressing Enter inserts the following into my XSLT stylesheet:
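The screenshot of the inserted fragment hasn’t survived the re-import, but the Copy Template is essentially the standard XSLT identity template, something like:

```xml
<!-- Copy all attributes and nodes through to the output unchanged;
     add more specific templates to override behaviour per element. -->
<xsl:template match="@* | node()">
    <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
</xsl:template>
```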


However, this project is editing texts in TEI XML rather than writing XSLT, so how will this help them? While I won’t go into specifics about this project in particular, let’s say they have to enter metadata about people they find referenced in the text. To do this they wrap a <persName> element around the name and point to a <person> element in another file by its @xml:id.

Code templates probably won’t help us with the wrapping itself; fair enough, since it is easy enough to wrap an element around some highlighted text with ‘Surround-With-Element’ (Control-E). However, when it comes to inserting the <person> element to which this refers in their personography.xml file, code templates can really help. Let’s say that for each <person> they must provide an @xml:id attribute (here SMI123) and a standardised form of the name. If they have the data available they’d also like to provide a birth date, death date, institutional affiliation, occupation, nationality, and information about the person’s education. Although these are optional, they’d rather have them there and delete the ones they don’t have information for than add them by hand each time. How do we do this? First we come up with a template in our chosen document, testing that when we fill in attribute values and element content properly it is valid. For example they might create a template such as:
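The original screenshot was lost in the re-import, but a template along these lines would do (a sketch: the element names are TEI, the exact set of fields is of course the project’s own choice, and the empty values are deliberate placeholders):

```xml
<person xml:id="SMI123">
   <persName>
      <forename/>
      <surname/>
   </persName>
   <birth when=""/>
   <death when=""/>
   <affiliation/>
   <occupation/>
   <nationality/>
   <education/>
</person>
```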


This will help them create a template person. The red underlining of the values for @xml:id and @when is oXygen warning that these must be completed with valid values.

After having created this template, they need to turn it into a Code Template. To do this they go in the menus to Options -> Preferences -> Editor -> Templates -> Code Templates. This will already be filled with default ‘Global Options’ templates for things like the XSL editor.



The image above shows the Code Template for the copy template I inserted earlier. Notice that it has an additional feature: a ${caret} at the bottom. This tells oXygen where to put the caret (cursor) after it has inserted the Code Template.

To create a Person Code Template they would click on ‘New’ and fill in the short name it will be known by, a description reminding them what it is for, and associate it with a particular editor (in this case the XML editor). In more recent versions of oXygen you can also assign it a shortcut key (and, if you want, make these platform-independent if you are sharing them between people using different operating systems).



Here we’ve given the Code Template the name ‘person’, which means that if we type the first part of that and press Control-Space the template should be inserted. (Or we can press Control-Space and scroll down to ‘person’ if we have lots of other templates.) We’ve given it a description, associated it with the XML editor, and given it a shortcut key of Control-Alt-P. Care must be taken not to use shortcut keys your operating system is already using for other things. The ${caret} here means that when the ‘person’ Code Template is activated the oXygen XML Editor will place the cursor ready to add the @xml:id. There are other variables that can be added, including various file names, paths, timestamps, dates, and other information the oXygen XML Editor has available to it. However, the most useful to those inputting fragments of XML are probably:

  • ${caret} – Cursor position after insert
  • ${selection} – Current selected text
  • ${ask('Message', input_type, 'default_value')} – Interactively ask for values
  • ${timeStamp} – Timestamp
  • ${pn} – Project name

The ${selection} variable inserts, at that point, the text you already had selected before triggering the shortcut. The ${ask()} variable is a function that takes a message, an input_type (such as 'generic', 'url', 'combobox', or 'radio'), and a default value. These can end up being quite complicated, and more than one variable can be used in a single Code Template. For example, we could use ${ask()} to prompt these users to enter the @xml:id, forename, surname, birth and death dates, etc. However, since it is quite common for them not to have all of these details, it is probably easiest just to put in the XML and let them edit it. More information on these and other variables can be found in the oXygen XML Editor documentation.
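For instance, a hypothetical Code Template (not the one above; the ‘SMI000’ default is just an example) that asks for the identifier on insertion, wraps any selected text as the name, and then leaves the cursor ready for further editing might read:

```xml
<person xml:id="${ask('xml:id for this person?', generic, 'SMI000')}">
   <persName>${selection}</persName>${caret}
</person>
```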

Although not all projects will need to insert <person> elements, it is hoped that, if you are repetitively entering the same bit of code again and again, Code Templates in oXygen XML Editor might be a useful shortcut for you.


Metamark Wrapping: putting brackets around spanning markup

A project PI recently asked me the following XSLT question:

In my TEI I have the following markup:

A bunch of text and other various markup

and then a

<metamark rend="red" spanTo="#meta-111"/>more text
here<anchor xml:id="meta-111"/> and then

the text continues, etc.

How do I wrap a bracket around the <metamark> to <anchor> pairing, and moreover how do I make it red?

That is, I want: … [more text here] … in my HTML output. Help!


This is actually fairly easy to do in XSLT if the encoding is consistent and you are only using metamark in this way (otherwise you should use the other attributes on metamark to delineate its function as well).

In this case the secret is not to think about wrapping the metamark/anchor pairing in square brackets, but about providing a starting square bracket and an ending square bracket through two separate templates. We treat them as separate actions rather than trying to link them in any complicated way. (That is possible, but much more difficult.)

<xsl:template match="metamark
    [contains(@rend, 'red')]
    [substring-after(@spanTo, '#') = following::anchor/@xml:id]">
  <span class="metamark makeRed">[</span>
</xsl:template>

<xsl:template match="anchor
    [contains(preceding::metamark/@spanTo, @xml:id)]">
  <span class="metamark makeRed">]</span>
</xsl:template>

In the first XSLT template we only match those metamarks where:

  • the @rend attribute contains ‘red’ and
  • the @spanTo attribute, once we have removed the ‘#’ on the front, equals an @xml:id on an anchor element following it. (This means there is such an anchor somewhere following it; it doesn’t necessarily have to be the next one.)

Then on the second template we match any anchor where:

  •  there is an @xml:id and
  • the @xml:id attribute on this anchor is pointed at by a metamark/@spanTo attribute that precedes it somewhere

We don’t need to have any real correspondence or connection between the two templates, and indeed if any of these accidentally fired on other metamarks or anchors, we could put in additional testing.

In both cases we output an HTML span element with a bracket in it and give this span two classes, ‘metamark’ and ‘makeRed’, to enable the project to control the styling of the metamark display and to colour things red. i.e.

.metamark{font-size:80%; font-weight:bold;}
.makeRed{color:red;}


This is fairly straightforward; the only conceptual leap is that those used to XML structures often think about wrapping an element around something, rather than just emitting its starting and ending points in the output.
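As a quick sanity check, the two templates can be exercised outside a full TEI stylesheet with Python’s lxml (a sketch: XSLT 1.0, namespaces omitted for brevity, and the input reduced to the fragment quoted above; a real TEI stylesheet would declare the namespace via @xpath-default-namespace):

```python
from lxml import etree

# A minimal XSLT 1.0 sketch of the two-template approach.
xslt = etree.XSLT(etree.XML("""
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="metamark
      [contains(@rend, 'red')]
      [substring-after(@spanTo, '#') = following::anchor/@xml:id]">
    <span class="metamark makeRed">[</span>
  </xsl:template>
  <xsl:template match="anchor
      [contains(preceding::metamark/@spanTo, @xml:id)]">
    <span class="metamark makeRed">]</span>
  </xsl:template>
</xsl:stylesheet>
"""))

doc = etree.XML('<p>and then a <metamark rend="red" spanTo="#meta-111"/>'
                'more text here<anchor xml:id="meta-111"/> and then</p>')

# The built-in rules copy the text through; our two templates supply
# the opening and closing bracket spans around "more text here".
result = str(xslt(doc))
```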

[UPDATE: As Torsten notes below, the reason I don’t need to provide the namespace for these TEI elements is that I have, sensibly, used @xpath-default-namespace and an @xmlns on the <xsl:stylesheet> element, which you don’t see in these extracts.]



Sebastian Patrick Quintus Rahtz,
13 February 1955 – 15 March 2016


I have been helped by, worked with, learned from, and been friends with Sebastian Rahtz since before I even came to work at OUCS (now IT Services). During my years working with him there were only a few times that I ever showed him something new, or a better way to do the thing he was doing; more often than not it was the reverse. I learned a lot from Sebastian, not only about the way to approach technical problems, management, workflow, etc. but in many ways how to be a better human being. I won’t attempt to list the projects, services, people, and organisations that Sebastian has made better by his existence; I would only leave out many crucial and important ones. I will miss him.

sit tibi terra levis


Teaching for DEMM: Digital Editing of Medieval Manuscripts


This is the second year that, as part of my commitment to DiXiT, I have also taught on the Erasmus+ Digital Editing of Medieval Manuscripts network. Digital Editing of Medieval Manuscripts (DEMM) is a joint training programme between Charles University in Prague, Queen Mary University of London, the Ecole des Hautes Etudes en Sciences Sociales, the University of Siena, and the library of the Klosterneuburg Monastery. It equips advanced MA and PhD students in medieval studies with the necessary skills to edit medieval texts and work in a digital environment. This is done through a year-long programme on editing medieval manuscripts and their online publication: a rigorous introduction to medieval manuscripts and their analysis is accompanied by formal training in ICT and project management. The end of each one-year programme sees the students initiated into practical work experience alongside developers, as they work on their own digital editions, leading to their online publication.

Funded by the Strategic Partnership strand of the European Union’s Erasmus+ Programme, DEMM will run for three consecutive years, always with a new group of students. It will lead to the publication, in print and online, of teaching materials, as well as a sandbox of editions.

My institution is not directly involved in it (but there is overlap with DiXiT); last year I taught and assisted at both the workshop in Lyon and the hackathon in London. This year the students had a week’s introduction to Palaeography, Codicology and Philology at Stift Klosterneuburg in the autumn, and then in March a week’s workshop on encoding, tagging and publishing in Lyon.

Needless to say I was providing tuition on the Text Encoding Initiative; a full schedule, with links to my presentations (some of the others are behind a password-protected site), is available at:

This follows a fairly predictable pattern of introducing people to the concept of markup, the formal syntax of XML, and the vocabulary of the TEI. It then expands this with an introduction to the core elements and named entities and, the following morning, TEI metadata. Here of course we also single out the elements for manuscript description and transcription, since these are key for those undertaking to build digital editions of medieval manuscripts. The course continued on to critical apparatus, genetic editing, and the publication and interrogation of your results.

Report on the Digital Humanities at Oxford Summer School 2015



The Digital Humanities at Oxford Summer School (DHOxSS) is an annual training event at the University of Oxford which took place this year on 20 – 24 July 2015. This year it took place primarily at St Anne’s College, IT Services, and the Oxford e-Research Centre. The DHOxSS offers training to anyone with an interest in the Digital Humanities, including academics at all career stages, students, project managers, and people who work in IT, libraries, and cultural heritage. Delegates follow one of our week-long workshops, supplementing their training with expert guest lectures. Delegates can also join in events each evening. This year the DHOxSS grew significantly. It swelled from 5 workshops in 2014 to 8 workshops in 2015 and this meant the number of delegates and speakers also grew from 107 delegates + 54 speakers in 2014 to 163 delegates + 83 speakers in 2015.

The DHOxSS runs primarily on the goodwill of various units of the University of Oxford donating their time as DHOxSS Directors, Organisational Committee, Workshop Organisers, Speakers, and in the work of the IT Services Events Team. Organisers and Speakers are not financially remunerated for their participation, though travel and accommodation expenses for visiting speakers are covered by the DHOxSS. Speakers and Workshop Organisers are rewarded for their labours through attendance at the DHOxSS welcome reception and sometimes other DHOxSS events. The enterprise as a whole is financially underwritten by IT Services, which also donates multiple FTE worth of staff time spread across part of the time of one of the Directors and staff commitment from those in the IT Services Events Team.

DHOxSS Directors

For the last few years James Cummings (IT Services) has been the overall director of the DHOxSS. However, it has grown to such a size that this year Pip Willcox (Bodleian Libraries) joined him as a co-director. In the planning for DHOxSS 2016 the responsibilities of individual directors are already more distinct as a result of this first year of experience: they oversee discrete areas of the summer school in collaboration with the events team and the DHOxSS Organisational Committee.

DHOxSS Organisational Committee

The year-long organisation of the DHOxSS is overseen by an organisational committee consisting of stakeholders from across the collegiate university. After DHOxSS 2014 this committee was intentionally re-structured to give broader representation from more stakeholders and the planning of DHOxSS 2015 bears the fruit of this. The committee for DHOxSS 2015 consisted of:

  • Jacqueline Baker, Oxford University Press
  • James Cummings, Co-Director of DHOxSS, IT Services
  • David De Roure, Wolfson College Digital Cluster
  • Kathryn Eccles, TORCH Digital Humanities Champion
  • Andrew Fairweather-Tall, Humanities Division
  • Ruth Kirkham, The Oxford Research Centre in the Humanities
  • Eric Meyer, Oxford Internet Institute
  • Kevin Page, Oxford e-Research Centre
  • Pamela Stanworth, IT Services
  • Tara Stubbs, Continuing Education
  • Jessica Suess, Museums & Collections
  • Kathryn Wenczek, IT Services Events Team
  • Pip Willcox, Co-Director of DHOxSS, Bodleian Libraries


Structure of the DHOxSS

The DHOxSS mostly follows a regular daily structure:

  • 9:30-10:30 Additional Plenary Keynotes or Parallel Lectures
  • 10:30-11:00 Break
  • 11:00-12:30 Individual Workshops
  • 12:30-14:00 Lunch and travel time
  • 14:00-16:00 Workshops Continue
  • 16:00-16:30 Break
  • 16:30-17:30 Workshops Continue
  • Evening Events

However, some individual workshops varied the times of breaks slightly. Indeed, the TEI workshop was asked to extend its teaching until 13:00 each day after overcrowding in the OeRC atrium became evident at lunchtime. For DHOxSS 2016 this schedule will need to be revised to include more travel time because of the distances between some of the chosen venues.

Additional Plenary or Parallel Lectures

The DHOxSS structure provides an opening and closing plenary keynote on the Monday and Friday of the week. Tuesday through Thursday provides an opportunity for parallel sessions in smaller venues. The DHOxSS 2015 had 3 parallel sessions on these days.

Monday 20 July 2015, 09:30-10:30

Tuesday 21 July 2015, 09:30-10:30

Wednesday 22 July 2015, 09:30-10:30

Thursday 23 July 2015, 09:30-10:30

Friday 24 July 2015, 09:30-10:30


This year the DHOxSS grew from 5 parallel workshops to 8, each running in parallel over the course of the week. This sudden growth, and the corresponding need for additional venues, posed an additional administrative burden and a greater degree of logistical complexity.

All workshops at DHOxSS run for the full 5 days. Delegates chose a single workshop and stayed with it for the entire week. Workshop organisers were responsible for designing and running the programme of the workshop, providing the necessary information about it, liaising with the speakers, and ensuring it ran smoothly. Organisers were also often speakers on their workshop. A call for workshops issued in 2014 resulted in the committee approving the following workshops for DHOxSS 2015:

An Introduction to Digital Humanities
Crowdsourcing for Academic, Library and Museum Environments
Digital Approaches in Medieval and Renaissance Studies
Digital Musicology
From Text to Tech
Humanities Data: Curation, Analysis, Access, and Reuse
Leveraging the Text Encoding Initiative
Linked Data for the Humanities

Each workshop is given its own colour, which carries through on the website, in the printed booklet, and in the lanyard that delegates on that workshop are given. This makes it blindingly obvious if delegates are trying to switch from one workshop to another, which is not allowed for both pedagogical and administrative reasons: switching incurs an administration fee and needs the express approval of the workshop organiser.


The Introduction to Digital Humanities workshop was organised by Pip Willcox (Bodleian Libraries) and was our most popular workshop strand. It is a mostly lecture-based survey of a large number of Digital Humanities topics and those speaking on it are often appearing in other workshops as well.  This year speakers included: Alfie Abdul-Rahman (Oxford e-Research Centre, University of Oxford), James Cummings (IT Services, University of Oxford), David De Roure (Oxford e-Research Centre, University of Oxford), J. Stephen Downie (Graduate School of Library and Information Science, University of Illinois, Urbana-Champaign), Kathryn Eccles (Oxford Internet Institute and TORCH, University of Oxford), Alexandra Franklin (Bodleian Libraries, University of Oxford), Christopher Green (Institute of Archaeology, University of Oxford), David Howell (Bodleian Libraries, University of Oxford), Matthew Kimberley (Bodleian Libraries, University of Oxford), Ruth Kirkham (Oxford e-Research Centre, University of Oxford), James Loxley (University of Edinburgh), Eric Meyer (Oxford Internet Institute, University of Oxford), Kevin Page (Oxford e-Research Centre, University of Oxford), Meriel Patrick (IT Services, University of Oxford), Megan Senseney (Graduate School of Library and Information Science, University of Illinois, Urbana-Champaign), Judith Siefring (Bodleian Libraries, University of Oxford), Ségolène Tarte (Oxford e-Research Centre, University of Oxford), Andrea K. Thomer (Graduate School of Library and Information Science, University of Illinois, Urbana-Champaign), Pip Willcox (Bodleian Libraries, University of Oxford), and James Wilson (IT Services, University of Oxford)

The Crowdsourcing for Academic, Library and Museum Environments workshop was organised by Victoria Van Hyning (Zooniverse, University of Oxford) and Sarah De Haas (Google). It gave participants an in-depth exposure to the full workflow of crowdsourcing and making use of the aggregate data. Speakers on this workshop included: Philip Brohan (Met Office Hadley Centre), Sarah De Haas (Google), Shreenath Regunathan (Google),  and Victoria Van Hyning (Zooniverse, University of Oxford).

The Digital Approaches in Medieval and Renaissance Studies workshop was organised by Judith Siefring (Bodleian Libraries). This workshop explored various innovative approaches in the field in use at Oxford. This included both image and text-based materials, and delegates had the opportunity to view original artifacts from the age of manuscripts and early print. Speakers on this workshop included: James Cummings (IT Services, University of Oxford), Geri Della Rocca De Candal (Faculty of Medieval and Modern Languages, University of Oxford), David De Roure (Oxford e-Research Centre, University of Oxford), Cristina Dondi (Faculty of History, University of Oxford), Iain Emsley (Oxford e-Research Centre, University of Oxford), Alexandra Franklin (Bodleian Libraries, University of Oxford), Matthew Holford (Bodleian Libraries, University of Oxford), David Howell (Bodleian Libraries, University of Oxford), Eleanor Lowe (Department of English and Modern Languages, Oxford Brookes University), Matilde Malaspina (Faculty of History, University of Oxford), Liz McCarthy (Bodleian Libraries, University of Oxford), Matthew McGrattan (Bodleian Libraries, University of Oxford), Monica Messaggi Kaya (Bodleian Libraries, University of Oxford), Kevin Page (Oxford e-Research Centre, University of Oxford), Alessandra Panzanelli (The British Library),  Judith Siefring (Bodleian Libraries, University of Oxford), Daniel Wakelin (Faculty of English, University of Oxford),  and Pip Willcox (Bodleian Libraries, University of Oxford).

The Digital Musicology workshop was organised by Kevin Page (Oxford e-Research Centre). This workshop provided an introduction to computational and informatics methods that can be, and have been, successfully applied to musicology. It brought together a well-rounded programme balancing lectures with practical sessions. Speakers on this workshop included: Chris Cannam (Centre for Digital Music, Queen Mary University London), Rachel Cowgill (Music & Drama, University of Huddersfield), Julia Craig-McFeely (Faculty of Music, University of Oxford), Tim Crawford (Computing Department, Goldsmiths, University of London), David De Roure (Oxford e-Research Centre, University of Oxford), J. Stephen Downie (Graduate School of Library and Information Science, University of Illinois, Urbana-Champaign), Ben Fields (Computing Department, Goldsmiths, University of London), Ichiro Fujinaga (Schulich School of Music, McGill University), David Lewis (Computing Department, Goldsmiths, University of London), Richard Lewis (Computing Department, Goldsmiths, University of London), Kevin Page (Oxford e-Research Centre, University of Oxford), Christophe Rhodes (Computing Department, Goldsmiths, University of London), Carolin Rindfleisch (Faculty of Music, University of Oxford), Stephen Rose (Department of Music, Royal Holloway, University of London), David M. Weigl (Oxford e-Research Centre, University of Oxford), and Tillman Weyde (Department of Computer Science, City University London)

The From Text to Tech workshop was organised by Gard B. Jenset (TORCH) and Kerri Russell (Faculty of Oriental Studies). This workshop, run through the HiCor research network, taught delegates the skills and understanding required to work computationally and quantitatively with corpora of historical texts. Speakers on this workshop included: Gard B. Jenset (The Oxford Research Centre in the Humanities, University of Oxford), Barbara McGillivray (The Oxford Research Centre in the Humanities, University of Oxford), Kerri Russell (Faculty of Oriental Studies, University of Oxford), Gabor M. Toth (University of Passau / The Oxford Research Centre in the Humanities, University of Oxford), and Alessandro Vatri (Faculty of Classics and Faculty of Linguistics, Philology & Phonetics, University of Oxford).

The Humanities Data: Curation, Analysis, Access, and Reuse workshop was organised by Megan Senseney (Graduate School of Library and Information Science, University of Illinois Urbana-Champaign) and Kevin Page, (Oxford e-Research Centre). This workshop provided a clear introductory grounding in data concepts and practices with an emphasis on humanities data curation. Sessions covered a wide range of topics, including data organization, data modeling, big data and data analysis, and workflows and research objects. Case studies included examples from the HathiTrust, EEBO-TCP, and BUDDAH. Speakers on this workshop included: Laird Barrett (Taylor & Francis / Oxford Internet Institute, University of Oxford), Josh Cowls (Oxford Internet Institute, University of Oxford), David De Roure (Oxford e-Research Centre, University of Oxford), J. Stephen Downie (Graduate School of Library and Information Science, University of Illinois, Urbana-Champaign), Tanya Gray Jones (Bodleian Libraries, University of Oxford), Scott Hale (Oxford Internet Institute, University of Oxford), Neil Jefferies (Bodleian Libraries, University of Oxford), Terhi Nurmikko-Fuller (Oxford e-Research Centre, University of Oxford), Kevin Page (Oxford e-Research Centre, University of Oxford), Allen Renear (Graduate School of Library and Information Science, University of Illinois, Urbana-Champaign), and Sally Rumsey (Bodleian Libraries, University of Oxford)

The Leveraging the Text Encoding Initiative workshop was organised by Magdalena Turska (DiXiT Project / IT Services, University of Oxford) and Lou Burnard (Lou Burnard Consulting). This workshop tried to balance an introduction to the TEI with more technical investigations of software to publish and interrogate TEI XML files. Speakers on this workshop included: Misha Broughton (DiXiT Project, University of Cologne), Lou Burnard (Lou Burnard Consulting), Emmanuel Château (École Nationale des Chartes), Elena Spadini (DiXiT Project, Huygens ING (KNAW)), and Magdalena Turska (DiXiT Project / IT Services, University of Oxford).

The Linked Data for the Humanities workshop was organised by Kevin Page (Oxford e-Research Centre). This workshop introduced the concepts and technologies behind Linked Open Data and the Semantic Web. It taught attendees how they could publish their research so that it is available in these forms for reuse by other humanities scholars, and how to access and manipulate Linked Open Data resources provided by others. Speakers on this workshop included: David De Roure (Oxford e-Research Centre, University of Oxford), Alex Dutton (IT Services, University of Oxford), Barry Norton (British Museum), Terhi Nurmikko-Fuller (Oxford e-Research Centre, University of Oxford), Dominic Oldman (British Museum), Kevin Page (Oxford e-Research Centre, University of Oxford), and John Pybus (Oxford e-Research Centre, University of Oxford).

Poster Session

Each year DHOxSS has a peer-reviewed poster session, often held in conjunction with the welcome drinks reception. This gives delegates, speakers, and members of the University of Oxford, a chance to get to know each other and display their digital humanities work to each other. This year posters were presented by:

Evening Events

An important part of DHOxSS is the social events. This year these consisted of:

As mentioned above, the Welcome Drinks Reception and Poster Session is an important networking event, not only for those attending and speaking at DHOxSS but also for other invited guests. In this case it was also used as a book launch event for one of the DHOxSS’s major sponsors, the AHRC Digital Transformation Theme. The guided walking tour gave visitors to Oxford a chance to explore the historic city.

Teaching Venues

The DHOxSS has reached a size where it can occasionally face venue capacity problems. There are only so many lecture theatres in Oxford which hold 163 delegates (plus additional speakers), and many of these are booked out well in advance. The DHOxSS events team is working on securing locations several years in advance; however, the unprecedented growth from DHOxSS 2014 to DHOxSS 2015 in the number of workshops meant that additional venues needed to be found. The venues used were:

  • The Mathematics Institute: The opening and closing keynotes of DHOxSS 2015 were held in the Mathematics Institute.

  • St Anne’s College: The additional lectures on the mornings of Tuesday – Thursday were held in the Tsuzuki Lecture Theatre, Seminar Room 9, and the Danson Room.  These rooms were also used for three of the DHOxSS workshops. There were some problems with using the Danson room for presenting, but other spaces worked well.

  • IT Services: Three workshops were held in the IT Services Thames Suite of teaching rooms.

  • Oxford e-Research Centre: Two workshops were held in the Oxford e-Research Centre.

  • The Weston Library Lecture Theatre: This was used for a joint session of two workshops.

Podcasts, Photos, and Social Media

The DHOxSS has always engaged with social media: the #DHOxSS hashtag was well used by delegates, and the @DHOxSS Twitter account was a source of information and advice. Although the DHOxSS photo group on Flickr was mentioned to delegates, it did not prove as popular as more instant open forums such as Twitter and Instagram. Podcasts of the opening and closing keynotes, as well as most of the additional lectures, were made freely and openly available. (The only lecture that wasn’t was held back by technical difficulties with the footage.) These are made available in the DHOxSS podcast series. Individually these are:

  • Uneasy Dreams: the Becoming of Digital Scholarship – James Loxley (University of Edinburgh), the closing keynote
  • The Online Corpus of Inscriptions from Ancient North Arabia – Daniel Burt (Khalili Research Centre, University of Oxford)
  • If a Picture is Worth 1000 Words, What’s a Medium Quality Scan Worth? – David Zeitlyn (Institute of Social and Cultural Anthropology, University of Oxford)
  • Crowdsourced Text Transcription – Victoria Van Hyning (Zooniverse, University of Oxford)
  • Let Your Projects Shine: Lightweight Usability Testing for Digital Humanities Projects – Mia Ridge (Digital Humanities, Open University)
  • Networking⁴: Reassembling the Republic of Letters, 1500-1800 – Howard Hotson (Faculty of History, University of Oxford)
  • Mapping Digital Pathways to Enhance Visitor Experience – Jessica Suess (University of Oxford Museums) and Anjanesh Babu (Ashmolean Museum, University of Oxford)
  • Digital Image Corruption – Where It Comes From and How to Detect It – Chris Powell (Ashmolean Museum, University of Oxford)
  • Digital Transformations – a panel discussion with David De Roure, Lucie Burgess, Tim Crawford, and Jane Winters
  • How I Learned to Stop Worrying and Love the Digital – Jane Winters (Institute of Historical Research, University of London), the opening keynote

This continues a DHOxSS tradition of recording and making openly available the keynotes and additional lectures.

DHOxSS Statistics


There were 83 speakers for DHOxSS 2015, 54 of whom were from the University of Oxford. These were contributed by the following departments:

  • Bodleian Libraries: 13 Speakers
  • Oxford e-Research Centre: 9 Speakers
  • IT Services: 7 Speakers
  • Oxford Internet Institute: 6 Speakers
  • Faculty of History: 3 Speakers
  • The Oxford Research Centre for the Humanities: 3 Speakers
  • Oxford University Museums: 3 Speakers
  • Faculty of Music: 2 Speakers
  • Faculty of Classics: 1 Speaker
  • Faculty of English: 1 Speaker
  • Faculty of Medieval and Modern Languages: 1 Speaker
  • Faculty of Oriental Studies: 1 Speaker
  • School of Archaeology: 1 Speaker
  • Institute of Social and Cultural Anthropology: 1 Speaker
  • Khalili Research Centre: 1 Speaker
  • Zooniverse: 1 Speaker


There were 163 registrations for DHOxSS 2015, which break down as follows:

  • Academic/Standard/NFP: 92
  • Student: 53
  • Oxford: 16
  • Commercial: 2


The registration charges were:

  • Full Commercial Rate (you work for a commercial or corporate organisation): £695
  • Academic/Education/NFP (you work for an educational institution, library, charity or not-for-profit organisation in any capacity): £590 (15% discount)
  • Student, any institution/level (you are enrolled as a full-time or part-time student at any educational institution at any level): £485 (30% discount)
  • Staff or Student of the University of Oxford (you work or are a student at the collegiate University of Oxford): £485 (30% discount)

This covered the costs of venues, lunches, evening events, speaker travel and accommodation as well as any costs in running the workshops.

As part of the registration process delegates were optionally able to indicate the source of funding they were using to pay for their registration. While 33% chose not to answer, 31% had institutional funding, 22% were self-funding, 8% had project funding, 6% had a bursary/grant of some sort, and 1% indicated a different reason.


The reasons for attending, when chosen from a list, were mostly career development (38%), a specific project (20%), and general interest (10%), while 33% chose not to answer:


Delegate Origin

Delegates came from all levels of professional standing, and from over 100 separate institutions. In aggregate the countries of origin can be totalled as:

  • UK: 80
  • Other Europe: 50
  • North America: 26
  • Far East: 3
  • Middle East: 1
  • Russia: 1
  • South America: 1
  • Australia: 1


Delegate Age


How Delegates Heard About DHOxSS 2015

DHOxSS 2015 was advertised through various media, and while registering, delegates were able to indicate where they had heard about it. Most had heard about DHOxSS from colleagues (some of whom were previous attendees); others indicated that they had used online searches or found the website through one route or another. Fewer indicated that they had heard about it through social media, though the effectiveness of this channel is hard to determine, since social media and mailing lists may well be how those colleagues heard about it in the first place. Similarly, flyers were distributed at conferences and sent to various UK humanities departments, which might have resulted in some of the institutional recommendations.



DHOxSS strives to be a welcoming place for all participants. One of the statistics we have examined over the years is gender. In previous years participants were not asked their gender, but it was tracked informally based on apparent gender identity. This has shown that DHOxSS normally attracts approximately 69% female delegates. For the first time, in the registration for DHOxSS 2015, delegates were asked to declare their gender. The ratio of female to male delegates generally held but was slightly lower, because many of those choosing not to answer the question (for whatever reason) appear to be women. The chart below looks at gender not only of delegates but of all participants: 31% Delegate Female, 16% Delegate Male, 20% Speaker Male, 13% Speaker Female, and 19% Delegates who didn't answer. This indicates room for improvement: increasing the number of female speakers would make the programme more representative of the DH community which attends DHOxSS.



The feedback from delegates and speakers was generally positive. There were a number of problems with workshops where abstracts didn't entirely match the workshop content, or where too many topics were being covered. There was very positive feedback for the organisation and administration of DHOxSS 2015. The feedback was summarised for the organisational committee and has formed part of the planning for DHOxSS 2016.

Plans for DHOxSS 2016

DHOxSS 2016 will be held from 4 – 8 July 2016 at St Hugh's College, IT Services, the Oxford e-Research Centre, and other venues. The planning for this is already underway (and locations for 2017 and 2018 are being booked), and a call for workshops and additional lectures has already gone out. If you want to subscribe to our DHOxSS announcements mailing list, email: and confirm by replying to the confirmation email that gets sent to you. We will notify this mailing list when registration opens.

childish toys

I count religion but a childish toy, and hold there is no sin but ignorance.
The Jew of Malta, Christopher Marlowe

Occasionally, indeed almost cyclically, on some of the mailing lists I’m on a big theoretical war erupts where someone declares “XML is DEAD: We should all move to using $Thing“. Though to be honest, it could be any format or technology, not just XML.

Sometimes these are well-meaning hunters of the new and shiny: Someone has heard about this brand new shiny $Thing technology and heard that it is the replacement for XML technologies (or whatever existing technologies) and that we should all start using it. With little or no critical examination of their sources, perhaps a shiny YouTube promotional video, this then starts a long and usually fruitless discussion. One of the reasons that $Thing technology is quicker, shinier, and much more fun, is that it has dropped lots of the baggage of the old technology — eventually people will realise that baggage was there for a reason and slowly add it back, but this time to a framework not designed to incorporate it. People chip in from both sides but the status quo remains.

Sometimes it is naively theoretically-based: Someone notices, or reads about, the inherent problems in XML (or whatever existing technologies) and sees that using $Thing technology doesn’t have those problems (and either doesn’t notice the other problems it does have, or they don’t apply to their narrow use-case). The poster in this case wants to know “Is this really the next big thing?” but is, or should be at least, open to the reasons why it isn’t. This usually brings up discussion by posters on both sides picking flaws in one technology or the other, or recycling long-dead myths. (“XML has a problem with overlapping hierarchies, $Thing doesn’t! Ha!“, “There are lots of solutions to overlapping hierarchies in XML which enable you to use all these nice tools.”, “Ah, but you can’t do stand-off markup in XML or represent a graph!“, “Erm, yes, you can. Honestly, URI-based pointing, Out of line markup, Linking multiple disparate resources by various taxonomies, all common in XML“, etc.) This sniping back and forth is hardly productive and just makes people think there is a problem where there isn’t. People chip in from both sides and the status quo remains.

Sometimes it is sophisticatedly theoretically-based: Some philosophical guru has been studying the various technologies for quite some time and expresses that the problems inherent in one, from their point of view, are dealt with more elegantly in $Thing technology. This is probably true, but it is mostly done as a theoretical exercise in trying to perfect the ideal technology and express it in a form that is elegant, beautiful, and rational. More often than not this results in a particular instance of $Thing technology that solves problems most people didn’t really care about; although it may be elegant, it is not human-readable, and the only thing that reads it is the guru’s personal implementation, which works for their use-case.  While potentially useful, it is not pragmatic for the majority of people to care about it until it has reached mass adoption.  It will never reach mass adoption because this guru, let’s say, isn’t interested in community building. People will gently comfort the technological genius who doesn’t understand why we persist with the well-supported but suboptimal, and the status quo remains.

Sometimes it is religiously-based: A devotee of $Thing technology, or a die-hard opponent of XML (or whatever existing technology), finds some news article or development which they can use to claim the superiority and mass adoption of $Thing technology.  The use of $Thing technology in this instance is then cast as a slow but measurable demise of XML (or whatever existing technology).  The increase in use of one technology is not necessarily related to the demise of another, and this may be misleading for people viewing the exchange. In my opinion it is usually intellectually dishonest to present such a news article or development as the death knell for another technology, especially when both can happily co-exist, and especially when it is done consciously as a technique by the devotee to discredit the existing technology. Disliking a particular technology because of its flaws is reasonable, but doing so blindly is not a sound basis for anyone's technological decisions.  Users of the existing technology defend their conscious decision not to be trendy, while inexperienced users choose $Thing technology because of the hype and then contribute to that hype. People chip in from both sides and try to patiently convert the masses or correct the fallacies of the devotee, but the status quo remains.

Sometimes it is implementation-based: A programmer needing to process lots of XML (or whatever existing technology) runs into a problem, often a limitation of the poor implementation of the libraries they are using, and either bemoans this or is advised that $Thing technology doesn’t have these problems and, look, it comes with a wonderful library of tools. People counter by showing how, if the programmer had been using the appropriate tools, the problem would be easier to solve. Others point to the growing code base for $Thing technology and get shown the huge amount of tools for the existing technology.  The code base might be growing because people have seen that $Thing technology is missing support for all their special cases, and thus it agglomerates bits and pieces of new areas of support. People chip in from both sides with examples of how their chosen technology does one thing better, or how they are all bad, but the status quo remains.

There are of course other ways this arises and plays out, and different actors playing many parts. In my case I find almost any of these discussions pathetically juvenile. How many times do we have to say it:


Instead, let’s help each other do good and useful things rather than needlessly wasting spare cycles proclaiming the death or triumph of one useful format or technology over another. To do otherwise is tiring, pathetic, and just a waste of everyone’s time. Sure, any new project needs good and sensible advice on what formats, technologies, and methodologies are suitable for it. These are rarely determined by abstract considerations of the inherent properties of the format, technology, or methodology, however; instead they are determined by what the staff already know, what the local infrastructure will support, and what will give the most useful answers to the research questions with the least amount of investment. The childish toys alluded to by my appropriation of Marlowe here aren’t the formats themselves, but the arguments people have about them.  Sure, geek out and enjoy the intricacies of your chosen technologies, but if you find yourself posting to a mailing list about how your $Thing technology is better than some other technology, please have a long hard look in the mirror and go do something more useful with your life.

Although I spend a lot of my time immersed in the world of one particular technology, XML, that doesn’t mean I need to believe it is the right and true answer for all situations.  If I were designing a mobile phone app, at the time of writing I’d almost certainly be using JSON or an SQLite DB for data storage. If I were constructing an ontology then RDF would be the way to go. If I want to structurally query a large number of documents I’d use a NoSQL document database like eXist-db.  If I’m encoding dearly held and deeply nested semantics in the text of a medieval manuscript … I would have to be a complete lunatic to sit down and hand-encode this in JSON or RDF.  In that case I’d use TEI XML, because of the power of schema constraints and validation to enforce consistency, its human-readable nature, and its resilience for long-term preservation.  I’d do this knowing that I could convert my work to any format I needed, based on the granularity of the markup I provided. They are all appropriate at different times and places; which base storage format to use depends a lot on your project’s needs, the sources of information, and the technology stack you have available to you.

The growth in one of these or other technologies doesn’t ipso facto indicate the ‘death’ of any other technology. Technology will always change; things will always move on. But we should never celebrate even the perception of the marginalisation of widely adopted formats — migrating existing legacy resources, no matter what the format, takes time and effort. Some technologies will eventually become less supported and the mainstream will be using one new $Thing technology or other. This has happened before and will happen again.

I’m all for pointing out the technologies chosen by good and interesting projects, and learning from their successes and, even more importantly, their failures; but this should be done honestly with a desire for education, not blindly with trolling attempts to start a war where there really isn’t one.

More people are using $Thing technology? This well-known project has adopted $Thing technology as one of their outputs? Great! Isn’t it good that people are using all these wonderful technologies… what is even more important is what they are doing with them! Maybe we should ask them why they chose to do that rather than making assumptions about the lifecycle of technologies? In fact, one thing that contributes to the strength and power of modern information systems design is the ability to work between multiple formats simultaneously, and sometimes even automatically. For example, to store something as XML, but auto-generate a subset of that as JSON metadata for a web frontend that links to PDFs and EPUBs generated from the same XML. To say that “if you want to use JSON you shouldn’t be using XML” is like saying “if you want to play with a Princess Elsa Doll, then you shouldn’t play with a Batman Action Figure”. It is nonsensical.  Anyone who thinks you can’t play with both just doesn’t deserve the oxygen of being listened to.
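To make that XML-to-JSON point concrete, here is a minimal sketch in Python. The element names and the toy TEI-flavoured header below are invented for illustration, not taken from any real project or schema; the idea is just that a small, flat metadata subset can be pulled out of a richer XML source automatically:

```python
import json
import xml.etree.ElementTree as ET

# A toy, TEI-flavoured header; these element names are illustrative only.
xml_source = """<teiHeader>
  <titleStmt>
    <title>An Example Manuscript</title>
    <author>A. N. Author</author>
  </titleStmt>
  <publicationStmt>
    <date>2015</date>
  </publicationStmt>
</teiHeader>"""

def xml_to_json_metadata(xml_text):
    """Extract a small, flat subset of the XML as JSON-friendly metadata."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("titleStmt/title"),
        "author": root.findtext("titleStmt/author"),
        "date": root.findtext("publicationStmt/date"),
    }

# The same XML source keeps its full richness; the JSON is a derived view.
print(json.dumps(xml_to_json_metadata(xml_source), indent=2))
```

The same source could equally feed an XSLT pipeline producing PDFs or EPUBs; the formats complement each other rather than exclude one another.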