digital humanities – Oxford Medieval Studies

Stephen Pink and Anthony John Lappin

This article adapts the Introduction to Dark Archives Volume I: Voyages into the Medieval Unread and Unreadable. Medium Ævum Monographs N.S. 43 (Oxford, 2022). Available in print and digitally at https://aevum.space/NS43

‘Nel suo profondo vidi che s’interna,
legato con amore in un volume,
ciò che per l’universo si squaderna’

‘In its depth I saw contained,
bound with love in one volume,
what is scattered as scraps through the universe.’[1]

Dante, Paradiso, XXXIII.85-88

AS WE ORGANISED THE FIRST DARK ARCHIVES CONFERENCE IN 2019 on the praxis of digitisation and its impact on medieval studies worldwide, little did we think that we would be arranging its sequels during a worldwide pandemic, with medievalists struggling for access to archives and libraries, even those which had previously been anything but dark. And so this volume, born of the pre-coronal world, in gathering together articles from papers delivered at the first event, forms a composite with those that followed, which were celebrated virtually and have been published as an on-line record of papers delivered, discussions round-tabled, and blogs subsequently posted.[2] The development of Dark Archives into a hybrid, inseparably digital and physical, reflects the broader transformation of medieval studies and indeed our whole world: the digital substitutes which became necessary to living during the lockdowns of 2020 and 2021 have not only persisted afterwards but begun, in often unsettling ways, to blend with the old existence into something new (as in our part inhabitation of the now-omnipresent Zoom).[3] Clearly, we now dwell in a ‘Metaverse’ (as Neal Stephenson first termed it, and in the full intended sense of its latest proponents)[4] – an inseparably digital and physical life with novel and still emergent properties, often as exotic as those of Jorge Luis Borges’ Orbis Tertius or ‘Third World’.[5]

As one journey therefore halted – the archives became inaccessible (literally dark, in most cases) in ways unknown since the birth of medieval studies – another began. Yet on reflection, this journey has been less one of actual praxis than of acknowledging an existing fact: a vast area of medieval studies has predominately been conducted within a Metaverse for more than a decade, the beneficiary (or victim, some would argue) of inexorable and massive increases in the digitised representations of physical sources, primary and secondary. The present time, in annis coronae, has therefore sharpened our awareness of the issues involved in the first Dark Archives conference rather than supplanted them. Our primary concerns, which structured the conference and the present volume, centred around our knowledge of the written heritage (subsumed under the heading of the ‘Graphosphere’); its digital records (‘metadata’) alongside the huge challenge of harvesting, structuring and curating them; and the nature of the future scholarship that may resultantly emerge.

Mapping the Medieval Graphosphere

The medieval ‘Graphosphere’, as we define it, is itself one such emergent Metaverse object – the totality of what was inked, traced, daubed, carved, and scratched in the medieval Old World, from (somewhat arbitrarily) the end of antiquity in the West to its gradual adoption of movable-type printing in the fifteenth century; and, further, the infinitesimal survival of those scripta into the present; (other names suggest themselves, such as Michael G. Sargent’s Pleroma (πλήρωμα or ‘Fullness’), of the medieval written tradition).[6] Barely grazed by scholarship, to grasp this totality has for centuries been the province of ecstatic vision, theory, fantasy, and horror, but only in the last decade or two, of scientific quest.[7] Hugely lagging the parallel process for printed books, itself largely unaccomplished,[8] we feel ourselves at the equivalent stage of the Age of Discoveries, of multiple missions into the previously unknown, that broadly capped what we ourselves term the Medieval. The reference to the Portuguese expansion is not simply mad self-aggrandisement (brought on by Zoom over-exposure). It captures on the one hand how soaringly the Graphosphere dwarfs our existing working map in extent, and whose proper charting will, we suspect, marginalise the latter as far as the circumnavigators did the Mappa Mundi; on the other, the great energies we witnessed at Dark Archives being marshalled to this end. Examples included: the unprecedentedly large Polonsky Foundation-funded scanning projects to digitally re-unite bodies of manuscripts dispersed since the medieval period, represented for us by the Polonsky Greek Manuscripts Project;[9] Sarah Savant’s presentation on the KITAB digitisation project,[10] which had by around 2020 produced a database of 1.5 billion words of eighth- to fifteenth-century C.E. written Arabic; and the project of the Hill Museum and Manuscript Library to digitally preserve handwritten artefacts from across the globe.[11] Quantifying what is still extant in France’s incredibly rich libraries and archives is the topic of Anastasia Shapovalova’s paper, which describes the Biblissima project in which she is herself involved, as a tool for exploring this rich cultural reserve.

However, in seeking to even grasp the Graphosphere’s vastness our terrestrial analogy falters (while cosmological ones beckon), for it must also encompass what has been lost – a body of ‘dark matter’, literally unreadable, itself in turn dwarfing the extant (read or unread).[12] The ambition to sketch and eventually restore this lacuna was highlighted at Dark Archives by Beyond2022, with its aim to reconstruct as fully as possible the centuries of material destroyed in the 1922 fire at Ireland’s Public Record Office; Krista Murchison’s similar efforts for manuscripts destroyed in the Second World War;[13] Joanna Tucker’s presentation, ‘Survival and Loss: working with documents from medieval Scotland’, where monastic cartularies are excavated for information of lost documents, but disappeared monasteries are also queried for their lost cartularies; and our extended Dark Archives 20 round-table debate on ‘Loss and Dispersal’, chaired by Elizabeth Solopova.[14] Nor can one speak of the ‘lost’ as a constant, since it grew unevenly throughout the medieval period and continues to do so, if not at the past’s calamitous rates.[15]

If one had to identify an inaugural journey of the Graphosphere era, it would be Eltjo Buringh’s Medieval Manuscript Production in the Latin West: Explorations with a Global Database (2011).[16] By applying statistics to a small database of manuscript records, Buringh inferred outline numbers, with more detailed breakdowns, for the Latin West’s total production from the sixth to the fifteenth centuries – c. 11mn whole manuscripts of which c. 0.75mn remain (albeit with major caveats to the definition of ‘manuscript’), part of a more loosely estimated c. 3mn surviving manuscripts, produced as far afield as Ethiopia and India, from the first to nineteenth centuries.[17] This was a marked development upon previous estimations[18] in its combined method, scale, and sheer ambition – an Eratosthenes, Buringh longed to calculate the entirety of Old World medieval manuscript production, but was hampered by the time’s limited techniques and (above all) data. Yet both the need for and the practicality of an interrogable, navigable model of the Graphosphere along these lines have become clearer with each annual flood of fresh data. Therefore we were delighted that Eltjo Buringh contributed the opening Keynote to the first Dark Archives conference, and the first chapter of this Proceedings, with a re-consideration of his methods in the context of lost codices in England and Scotland. It was remarkable to see the influence of his work in a range of other research presented at Dark Archives, including the flowering science of manuscript statistics.[19]

What has also become clearer is that any credible Graphosphere model must embrace not only all geographic areas of production, but all kinds of written artefact – from manuscript fragments (whose enormous scope for reconstructing the medieval was the subject of Lisa Fagin Davis’ Dark Archives 20 keynote, and other presentations),[20] and writings neither on parchment nor paper such as graffiti,[21] to artefacts generally ignored as being ‘written’ at all (despite clearly possessing a laden semantic freight for their original users). Two articles therefore explore the cast and the carved: Rosário Morujão describes the progress made in cataloguing, describing, analysing (from pictographic and chemical points of view) and preserving medieval Portuguese seals (‘Dark Seals in Portuguese Archives’);[22] and John Hines offers a discussion of the origin and importance of runic inscriptions throughout northern Europe, ending with a particularly illuminating case-study of a runic fragment and its attached object (‘The Dark Sides of the Runes’).[23] Materiality is here crucially important in the study of the written object, or the object with writing upon or within it.[24] The evident thing-ness of the wax seal, or bridle-bit runically inscribed, encourages us to consider it ‘in the round’, and so both description and photographic representation have been spurred to capture its 3D accents – such three-dimensional representations are already arriving for manuscripts, providing a depth to the otherwise flattened page and the physical volume of the codex. At the same time, excessive pursuit of the perfect simulacrum (in the manner of the facsimiles produced with remarkable exactitude by Ediciones Siloé)[25] can draw us away from the inherent properties and possibilities of digitisation itself, not least that of simply preserving the physical aspects of manuscripts whose very existence, like both the libraries and the archivists that preserve them, is threatened.

We concluded our Mapping the Medieval Graphosphere session by turning to a third indispensable element of its dark matter, neither completely unknown nor destroyed: those things about which we know but which remain unread or unrated, dark in the archives because they remain unopened. Clearly, some of this neglect is due to difficulties of access, a point brought out by Paul Dryburgh – and sometimes that difficulty is purposeful (see the frustrations of Roger Martínez Davila in certain religious repositories in Spain, and Anna Dorofeeva’s presentation on medieval ciphers);[26] but another aspect, as Monika Opalińska’s article shows in its unpicking of vernacular English translations of the Pater noster, is due to unquestioning reliance on the assumptions of previous scholarship, and in the West a nineteenth- and twentieth-century system of values for the evaluation of its texts – religious texts have suffered particularly from this tendency to marginalize cultural production.[27] To that inheritance of distortions in western materials we must add its working archives of non-European writings, often the outcome of entirely arbitrary choices in the colonial era as to what should be sent home – a distortion which the Arcadia fund is correcting through its drive to digitally scan and preserve texts situated in areas from sub-Saharan Africa to East Asia.[28] We were also honoured to welcome the literary historian Yating Zhang, via Zoom, from Shaanxi, for an eye-opening history of the reception of medieval English texts in China, a perspective completely new to most scholars of medieval Europe who dwell on the continent itself or in North America.[29]

Liberation from the assumptions of our education (and that of our supervisors) may be the first great and necessary outcome of mapping the Graphosphere. By opening ever more doors and windows into the archive’s darkness, allowing an ever fuller picture to be drawn, we expect so much of what went before (previously taken as the totality of the archive), to be confirmed as a somewhat arbitrary wandering through a fraction. Then we can truly grapple with what has survived and been lost, and fundamentally redraw mental maps of the Middle Ages whose shaky outlines were laid down in the late fifteenth century, or the Victorian age, or the period between the World Wars. Thus in one way, we stand like Henry the Navigator, the recipients of ever-increasing snippets of information that will supplement the metaphorical significance of the Orbis terrarum maps beloved of the illuminators of the Beatus manuscripts or the fillers-in of the mappae mundi; and in another, we peer like the seventeenth-century scientist Nicolaus Steno at a new historical geology, with the hope of now understanding its sediments and how they were laid down, in place of former explanations of self-serving etiologies.

**Detail from *The Vision of St. Benedict* (Giovanni del Biondo, 14th century).[30]** This depiction of Benedict’s vision of ‘the whole world … as if gathered together’ stands out in the tradition for accentuating the spherical aspect of the *Orbis terrarum*, and thus (somewhat contradictorily) that a portion is obscured from Benedict’s as well as our direct view.[31]

Endless deserts, oceans & mountains: the Metadata Crisis

At both Dark Archives 19 and 20 we necessarily turned from the theoretical survey of the Graphosphere to the central practical challenge we must solve before we can even begin to own its territory – the ‘Metadata Crisis’, as our second keynote of Dark Archives 19, Will Noel, put it.[32] This crisis has been acutely one of scarcity of digital information, and the variable quality of much of what there is. Our physical written heritage remains overwhelmingly unscanned in a usable fashion, let alone described, most of all because of the prohibitive expense of doing so, limiting even the best-funded scanning initiatives to strategic selections of a few thousand folio pages.[33] We were pleased to welcome some of the major funders of these initiatives for an insight into their motivations, represented by Marc Polonsky of the Polonsky Foundation, Maja Kominko and Simon Chaplin of Arcadia Fund, and Daniel Reid of the Whiting Foundation.[34] The 2019 Notre Dame fire reminded us of the pricelessness for their own sake of digital records of our vulnerable medieval heritage, quite besides that of data extraction – and until recently, indeed, one would have to question the latter motivation. By even an optimistic guess of numbers of people currently capable of reading a handwritten medieval text (and the rosiest forecasts for training more) it might take millennia to transcribe them all.[35]

To our initial rescue, ex machina, may come automated Optical Character Recognition (OCR), or for medieval manuscripts more precisely, Optical Handwriting Recognition (OHR). Such techniques previously achieved useful (if far from total) accuracy only with uniform post-Gutenberg printed type; Achim Rabus’ article demonstrates the huge progress, as well as the limits, of the Transkribus project in machine reading the vastly greater complexity and variability of medieval handwriting.[36] Dark Archives was also privileged to hear Verónica Romero, from the Universidad Politécnica of Valencia, speaking on their own OHR successes; Vincent Christlein, who presented his own work on algorithm-driven identification of scribes, dating of hands and the recognition of document types; and Estelle Guéville and David Joseph Wrisley on advances in machine-reading manuscript abbreviations.[37] Roger Martínez Davila’s article in this volume approaches the same problem with a truly impressive alternative: the harnessing of the general public, and its interest in its own heritage, to transcribe documents via Massive Open Online Courses (MOOCs) with high accuracy – in this case, the archives of multi-confessional medieval Iberia;[38] a similar approach but with more specific goals was the La Sfera transcription competition described for us by two of its organisers, Laura Morreale and Ben Albritton.[39] Effective not only in transcribing texts beyond the competence of current OHR, the results of such crowd-sourced endeavours can also be used in turn to train yet more accurate OHR models. Indeed, with the advent of ever more powerful forms of machine-learning, the latest of which teach themselves without human re-training, it seems only a matter of time before machines deliver a huge new archive of materials that medieval studies will then be obliged to incorporate within itself.[40]

However, exactly such progress in automation is hastening what we believe to be the crux of the metadata crisis: not the scarcity but the potential endlessness of information about a physical written artefact that might be digitally captured and represented. Throughout Dark Archives, the related debates on what the digital can, cannot or can only capture of the physical have often seemed at root metaphysical (and emotively so, amplified by the unique stresses of the pandemic). At the Dark Archives 20 panel, ‘The Whole Book?’, chaired by Lisa Fagin Davis, and its associated papers there emerged on the one hand a palpable excitement that we now possessed a new object of study, inseparably material artefact and digital representation, generated by their constant interplay (see, for example, the presentation of Lena Vosding, Natascha Domeisen, Luise Morawetz, and Carolin Gluchowski).[41] On the other there was great discomfort at the huge potential damage of equating digitised information, no matter how plentiful, with the ipsissima res of each unique medieval manuscript. Indeed, it was argued, the futile quest for digital verisimilitude of the physical should be abandoned, so that the digital may be re-evaluated on its own terms.[42] Yet, before our eyes, such debates are fast being sidelined by the onrush of data now being generated, with manuscript folio images alone now numbering in the millions. Its sheer range and quantity was on display at Dark Archives, from Vincent Christlein and Daniel Stromer’s digital unwrapping of fragile rolls of text using tomography, and Alexander J. Zawacki and Helen Davies’ related recovery of palimpsested text via spectrography, to Sarah Fiddyment’s capturing of the DNA and other biological markers left on codices – the very ‘writing of life’, of huge significance to a range of historical enquiries beyond codicology itself.[43]

It is this tsunami of unprocessed information that threatens to define our Metadata Crisis as one of ‘superabundance’, as Elaine Treharne termed it in her Dark Archives 20/20 keynote. In fact, this superabundance is welcomed by Treharne and others as a transforming catalyst to scholarship, premised upon automated machine-categorisation evolving to carve out navigable pathways for human scholarly explorers. The power of such algorithms to classify manuscript images was already on display in her collaborator Ben Albritton’s presentation (in this case, by isolating illuminated initials); techniques promising to knit our digital records, regardless of the fragmentation of metadata and physical sources, into a massive, open and online ‘Future Archive’ (these issues and more explored in the eponymous panel chaired by Suzanne Paul).[44] We also saw how other medieval data scientists are working to lend such images at least a metadata skeleton, as witnessed by Andrew Hankinson’s presentation on the crucial International Image Interoperability Framework (IIIF) protocols in which major scanning initiatives are now encoded.[45] Likewise, Debra Cashion’s article here presented (‘Selva Oscura: in and out of a dark archive’) demonstrated the great use to researchers of the attachment of provisional metadata to digitised images.[46] Yet without rapid advances, ex machina, of the kind anticipated by Treharne to structure, interrogate, and interpret the data – a recurrent demand of our contributors – we are faced with what Zawacki and Davies term a ‘new kind of dark archive … a “digital palimpsest”’.[47]

Moreover, as William Mattingly’s Dark Archives 21 presentation at UnEdition soberingly brings home, the very likelihood that independent self-teaching AI will complete the scanning of our archive without human input threatens us not only with a vast further body of data, but one which we may not immediately, fully (or ever) comprehend, or trust.[48] As of 2022, such a scenario seems closer than ever with the astonishing progress and apparent creativity shown by machine interpretation of humanity’s cultural heritage, along with indifference to our distinctions between ‘truth’ and ‘fiction’, as demonstrated by services such as Dall-E and ChatGPT.[49]

Thus, the lesson of our current struggle with metadata may be that setting out to know the medieval Graphosphere in any exhaustive, enumerative sense will achieve the very opposite, for its emerging territories and cruxes have the endlessness of a Mandelbrot fractal; as one kind of Terra Incognita disappears, a vaster one takes its place. We have (perhaps comfortingly) come full circle. Yet, what should our goals be, if that of complete discovery is futile?

New worlds of medieval scholarship

Among major grounds for optimism is that medievalists are already constructing the worlds of scholarship that a realised Graphosphere might make possible – moreover these are evolutions, not supersessions, of existing scholarly techniques. One such field was demonstrated by Mark Faulkner, whose ‘Corpus Philology, Big Dating and Bottom-Up Periodisation’ brings that most traditional of disciplines, philology, into fruitful commerce with the developments in corpus linguistics over the last decades. As the title suggests, he imagines the scope of a fully realised digital corpus of medieval textual materials to uncover vernacular linguistic features previously un-systematised, or even simply ignored, in older surveys on which we have relied. We may thereby transform, from ‘the bottom up’, our placement of ‘the composition of texts in time and space’.[50] This approach indicates how medieval ‘Big Data’ may rebuild the entire foundation of assumptions upon which current medieval scholarship rests, as was on display throughout Dark Archives and specifically debated at our Dark Archives 20 Round-Table debate, ‘The Future of Scholarship’, chaired by Peter Frankopan.[51]

Perhaps the most often articulated ambition in the Dark Archives events was to liberate the scholarly presentations of texts from the constraints of the static two-dimensional page and dominant single-manuscript edition. Thus Michael G. Sargent’s article invokes William Gibson’s three-dimensional ‘Cyberspace’ (an inspiration for the ‘Metaverse’): a realm of free mental movement to be contrasted with the crabbed world of our physical existence. In Cyberspace, Sargent suggests, we might finally experience the fullness of manuscript traditions – each represented as an independent ‘arcology’ with its dizzyingly complex networks of variances, distributions, sequence of recensions, and links to other such arcologies. Thereby we might dispel the ‘obfuscation’ of fixed print snapshots.[52] We were able to follow up this vision of the future Edition – or of the ‘UnEdition’, as Laura Morreale and Ben Albritton termed it, at an eponymous Dark Archives 20/21 event chaired by Paolo Trovato.[53] Presentations ranged from that of Wouter Haverals and Mike Kestemont on ‘UnEditing the Herne Corpus’, via a massively ‘hyperdiplomatic’, rapidly updateable and interactive digital edition of that monastery’s entire library, through to Anthony Bale’s evocation of the breathtaking permutations of John Mandeville’s Travels as its own manuscripts voyaged through Europe’s vernaculars – its true tale (inaccessible to recensionist quests for an originary exemplar) one of constant re-fashioning in its medieval audiences’ imaginations.[54]

However, UnEdition also made clear that a truly useful representation of this complexity still belongs to a more advanced ludic future age (except, that is, via the royal road of narrative description demonstrated by Bale himself). One route ahead was signalled by the Digital Editions Live workshop co-hosted by Dark Archives in 2021 (with Oxford Medieval Studies, and OCTET, the Oxford Centre for Textual Editing and Theory), reflecting on the digital editions recently crafted by Oxford medievalist students, based upon the protocols of the Text Encoding Initiative (TEI).[55] Perhaps the event’s greatest lesson in this regard was that scholarly discernment, including traditional recensionist editing skills, will be more important than ever in crafting useful scholarship from the vast amounts of data now available. Another lesson was the pressing need (as in humanities tout court) for a proactive digital pedagogy to gradually incorporate these new skills, even as digital technology itself constantly evolves. Perhaps the most novel aspect of Digital Editions Live was the augmentation of each presentation with a ‘live-feed’ consultation, from the Bodleian, of the physical original, an art brought to perfection by Andrew Dunning (so dramatically present did the three-dimensional physical artefact feel that one convenor nearly shouted when a student’s desktop cup of coffee appeared side-by-side with ‘her’ manuscript on the screen)!

Digital Editions Live is also the latest of our learning experiences in crafting the Dark Archives series itself, which now ranges from the workshops of 2019 (which covered skills from spectrography, and the scanning of seals on a budget, to crowdsourcing transcriptions) to the organising of subsequent events online in and after lockdown.[56] Freed from the constraints of physical space and (in many ways) time, and to involve a truly global audience, we arranged for Dark Archives 20 presentations to be entirely pre-recorded, pre-captioned (by computer, sometimes amusingly) and released several days ahead of the scheduled live panels, with all participants encouraged to digest them beforehand. This front-loaded approach allowed us to concentrate the live events themselves (also computer live-captioned) in the early afternoon to early evening GMT, maximising the active attendance of many hundreds from as far afield as the US West Coast and China. Alongside the Zoom events we ran a separate online text forum (on ‘Discord’) allowing discussion of themes at any time. Behind the scenes the event was kept going by shifts of unseen but vital online moderators, from Oxford Medieval Studies, the University of Fribourg, and the University of Colorado (Colorado Springs). This born-digital approach also greatly facilitated the creation of a comprehensive digital archive of the event’s metadata (to fuel future discussion, events and scholarship; see https://darkarchiv.es). Yet among the most impressive achievements were those ‘outreach’ events that took place between the main sessions: the ‘Blogging with Manuscripts’ Presentations and Prize (also awarded via Zoom) associated with the #PolonskyGerman Project;[57] and, finally, ‘Singing Together. Apart’, an extraordinary Zoom Compline in the evening (GMT) of the second day, which united in perfect synchrony singers physically dispersed across many locations (from St Edmund Hall’s crypt to the Church of St Barnabas in Jericho) together with all of the people who digitally attended from around the world.[58]

Throughout our discussions at Dark Archives has run a quandary, explicitly or in the background – what truly is a digital representation of a material thing; what truly are the two taken together? Far from being esoteric, in the last few years we have recognised it to be an existential issue, for it has convulsed all our lives, and as yet we have no answers. To explore it more broadly, we invited Luciano Floridi to present a Dark Archives keynote, to which he very graciously agreed.[59] However, his planned article became another casualty of the times, as he became wholly involved in advising on various privacy issues regarding the UK Government’s ‘world-beating’ COVID-19 app that would potentially allow an efficient track-and-trace operation to be launched, thereby saving countless lives. Professor Floridi’s contribution to the philosophy of information has been so important that we sought another philosopher who might be able to give an overview of Floridi’s thought and its implications for digital humanities – in particular Floridi’s situating the historical archive at the heart of human life via the digital, as encapsulated in his conception of hyperhistory (our dependence upon the digital, and our incessant creation of digital traces).[60]

Whatever our future digital representation of the medieval world, already clear is that it will not be the nightmare of Borges’ Tlön. Rather, it is the medieval world in ways that we have never before experienced it, part of its physical existence as inseparably and magically as Dante’s vision in Paradiso of the pages scattered throughout the universe, beheld re-bound ‘in one simple light’.[61] Our manner of marvelling at this has taken the form of articles – such as those here – and blogs and presentations – such as those found on our website – followed by questions and the search for answers, the discussions of roundtables, all of which have deepened our knowledge of the written universe beyond us. We hope that the volume you hold in your hands, or your eyes scan on a screen, will mark the beginning of numerous exploratory paths for you into this newly revealed world.

Acknowledgements

We must thank everyone who has made the Dark Archives series thus far possible, including our presenters, panelists and chairs, and all those who kept things running behind the scenes: Pablo Acosta-García, Tuija Ainonen, Benjamin L. Albritton, Anthony Bale, Graham Barrett, Zoe Bartliff, Josephine Bewerunge, Elizabeth Biggs, Mary Boyle, Stewart J. Brookes, Scott Bruce, Eltjo Buringh, Toby Burrows, Daron Burrows, Debra Cashion, Matthew Champion, Simon Chaplin, Vincent Christlein, Sophie Clayton, Ralph Cleminson, Julia Craig-McFeely, Robin Darwall-Smith, Helen Davies, Karen Demond, Matteo di Franco, Maria do Rosário Morujão, Natascha Domeisen, Anna Dorofeeva, Sebastian Dows-Miller, Paul Dryburgh, Andrew Dunning, Sara Elis-Nilsson, Lisa Fagin Davis, Mark Faulkner, Gustavo Fernández Riva, Sarah Fiddyment, Chris Fletcher, Molly Ford, Alex Franklin, Peter Frankopan, Carolin Gluchowski, Emma Goodwin, Estelle Guéville, Andrew Hankinson, Wouter Haverals, Carrie Heusinkveld, Sam Heywood, John Hines, Matthew Holford, Kyle Ann Huskin, Folgert Karsdorp, Martin Kauffmann, Mike Kestemont, Ben Kiessling, Lynn Killgallon, David King, Maja Kominko, Pavlina Kulagina, Henrike Lähnemann, Franziska Lallinger, Andres Laubinger, Caroline Lehnert, Molly Lewis, James Louis Smith, Roger Louis Martinez-Davila, William Mattingly, John McEwan, Genevieve McNutt, Luise Morawetz, Laura Morreale, Krista Murchison, Eva Neufeind, Mary Newman, Will Noel, Monika Opalińska, Richard Ovenden, Nigel F. Palmer, Suzanne Paul, Luca Polidoro, Marc Polonsky, Dot Porter, Ellie Pridgeon, Adrien Quéret-Podesta, Achim Rabus, Henry Ravenhall, Daniel Reid, Tom Revell, Shannon Ritchey, Jane Roberts, Natasha Romanova, Verónica Romero, Anastasija Ropa, Edgar Rops, Miri Rubin, David Rundle, Rebeca Sanmartin Bastida, Michael G. Sargent, Sarah Savant, Daniel Sawyer, Marlene Schilling, Carolin Schreiber, Anastasia Shapovalova, Elizabeth Solopova, Lesley Smith, Emma Stanford, Alyssa Steiner, Columba Stewart, Jo Story, Justin Stover, Daniel Stromer, Jane H.M. Taylor, Keri Thomas, Samuel Thrope, Elaine Treharne, Paolo Trovato, Joanna Tucker, Cornelis van Lit, Stacie Vos, Lena Vosding, Julia Walworth, Michelle R. Warren, Teresa Webber, Thomas White, Pip Willcox, Lois Williams, Damon Wischik, Christopher Wright, David Joseph Wrisley, Ulrike Wuttke, Alexander Zawacki, and Yating Zhang. Finally, we must thank our sponsors, sine qua non: Medium Ævum, Oxford Medieval Studies, the Bodleian Library, and the Oxford English Faculty which freely and graciously provided our venue for the first physical conference in 2019.

Stephen Pink
Anthony John Lappin

[1] English translation indebted to many others, most recently Dante Alighieri, The Divine Comedy 3: Paradiso, trans. Robin Kirkpatrick (London, 2007).

[2] See https://darkarchiv.es for the details of the successive events of 2019-21.

[3] As Elaine Treharne pointed out in her wide-ranging keynote on the relation between the material and the digital at Dark Archives 20/20 (DA20), ‘Seeing and Being Seen: manuscripts and their digital viewers’, one reason that prolonged Zoom use has felt so draining to many is that ‘your eyes and ears take on … the entire responsibility of the in-person meeting’.

[4] Neal Stephenson, Snow Crash (New York, 1992), passim. Although the term is clearly a conflation of ‘universe’ and ‘meta’, the latter is susceptible to a range of interpretation: in OED, as ‘beyond, above, at a higher level’, certainly, but most relevant to the IT industry’s current ambitions to create an indispensable hybrid reality for humanity, as ‘denoting change, transformation, permutation, or substitution’. In 2021, reflecting such ambitions, Facebook Inc. renamed itself ‘Meta’.

[5] The ‘third world’ is a new existence forged, in Borges’ ficción, from the leakage into ours of the impossibly fantastic qualities of the world of Tlön; Jorge Luis Borges, ‘Tlön, Uqbar, Orbis Tertius’, Sur 68 (1940), 36-46.

[6] On πλήρωμα, see ‘Birth of the UnEdition’, part of the Dark Archives 20/21 (DA20-21) series of events; on its theological connotations, see for example Jn. 1.16. We have drawn the general idea of a ‘Graphosphere’ from Simon Franklin’s The Russian Graphosphere, 1450-1850 (Cambridge, 2019), and less directly from Régis Debray’s division of human signage into the ‘logosphere’, ‘graphosphere’ and ‘videosphere’ eras (see Régis Debray, ‘Three Ages of Looking’, trans. Eric Rauth, Critical Inquiry 21.3 (1995), 529-55). Our consideration of the medieval Graphosphere broadly ends where Franklin’s begins, chronologically at least, at the rise of movable-type printing in Europe; however, all boundary definitions commonly attaching to ‘the medieval’, itself hugely problematic, await reconsideration through a proper survey of the Graphosphere itself.

[7] For example Dante, Paradiso, XXXIII.85-88, quoted above; Karl Popper’s ‘Three Worlds’ classification (e.g. Karl Popper, ‘Three Worlds: The Tanner Lecture on Human Values Delivered at the University of Michigan, April 7, 1978’, 144, 162-63); and Borges’ ‘Del rigor en la ciencia’ (Los anales de Buenos Aires 1.3 (1946), 53), following Lewis Carroll (Sylvie and Bruno Concluded, London/New York, 1893, 169), in which human hubris creates a one-to-one scale map of the world, overlaid upon the world itself, with Babelian outcomes.

[8] In October 2019, Google Books reported that it had scanned more than 40 million printed volumes, in 400 languages, out of its earlier estimated total of c. 130mn (Lee Haimin, ‘15 years of Google Books’ (blog post, 2019); Leonid Taycher, ‘Books of the World, Stand up and be Counted! All 129,864,880 of you’ (blog post, 2010)).

[9] Christopher Wright and Matteo di Franco spoke on the Polonsky Foundation Greek Manuscripts project, ‘From isolation to integration: making Greek manuscripts readable’ (DA19). One might also point to the ambitious projects to digitise the manuscript holdings of the Herzog August Bibliothek Wolfenbüttel (Marenliese Holscher and Katharina Mähler, ‘Ready for the Big Show: how manuscripts are prepared for digitization’, covering the Polonsky Foundation’s project, ‘Manuscripts from German-Speaking Lands’), which have had knock-on effects such as the digitisation of the 127 manuscripts in the Staats- und Universitätsbibliothek Bremen between 2020 and 2021, funded by the Deutsche Forschungsgemeinschaft.

[10] Sarah Savant, ‘Finding Meaning in 1.5 Billion Words of Arabic: the KITAB project and its aims’ (DA19).

[11] Columba Stewart, ‘Showing the Medieval and Early Modern World as it Actually Was: the expansion of the work of HMML (the Hill Museum & Library) beyond monastic libraries in Europe to global preservation of handwritten heritage’ (DA20).

[12] On ‘dark matter’, cosmic and written, see further Michael G. Sargent, ‘Hidden in Plain Sight: the obfuscation of manuscript evidence in the modern critical edition’, Dark Archives Vol. I, 315-35 (315).

[13] Krista Murchison’s ‘Righting and Rewriting History: recovering and analyzing manuscript archives destroyed during World War II’ (NWO Project Database) will reach its completion in 2023. For her paper to Dark Archives 20/20, see Murchison 2020b.

[14] Solopova 2020.

[15] Further DA19 conference papers were given by Jo Story (‘Insular Manuscripts: how many and what next?’); Ralph Cleminson (‘Non leguntur: shedding light on Slavonic sources’); Adrien Quéret-Podesta (‘Textual Ghosts in the Oldest Central European historiography’); Daniel Sawyer (‘At Knowledge’s Edge: lost materials’); and Gustavo Fernández Riva (‘Network Analysis of Manuscripts’).

[16] Eltjo Buringh, Medieval Manuscript Production in the Latin West: Explorations with a Global Database (Leiden, 2011).

[17] Buringh, Medieval Manuscript Production, esp. 16-17, 99, 232, 259-63. See, for example, Anastasija Ropa and Edgar Rops’ DA20 presentation on ‘The Elusive Archives of Medieval Livonia’, whose independent existence ceased relatively early.

[18] See, for example, Iter italicum (Kristeller 1967-92; 2006) and the Medieval Libraries of Great Britain database.

[19] See also Mike Kestemont & Folgert Karsdorp, ‘Estimating the Loss of Medieval Literature with an Unseen Species Model from Ecodiversity’, DA20 Presentation, which adopts an ‘unseen species model’ used in calculating eco-diversity.

[20] All at DA20: Lisa Fagin Davis (chair), ‘The Whole Book?’; Karen Desmond, ‘Fragments and Reconstructions: the written traces of polyphonic liturgical music in medieval Worcester and beyond’; Sara Elis-Nilsson, ‘Using Manuscript Fragments to Map Lived Religion: the case of the cults of saints in medieval Sweden’.

[21] See Matthew J. Champion’s DA20 presentation: ‘A Sea of Lost Words: the medieval graffiti inscriptions of England’s parish churches’.

[22] See Dark Archives Vol. I, 125-44. Seals were also the topic approached at DA19 by John McEwan, ‘Reflectance Transformation Imaging and Medieval Seals’.

[23] See Dark Archives Vol. I, 97-124.

[24] Further engagement with materiality was found through the DA19 contributions of Henrike Lähnemann (‘Nun’s Dust’), David King (‘The Corpus vitrearum medii aevi’), Ellie Pridgeon (‘The Writing on the Wall: medieval painted inscriptions’), and Sarah Fiddyment (‘Manuscript Palaeo-proteomics’).

[25] http://siloe.es. Most recently engaged by the Beinecke Library to produce a facsimile edition of the Voynich manuscript, which retails at around eight thousand euro.

[26] Paul Dryburgh, ‘Peering into an Impenetrable Gloom and the “Tyranny” of Digital by Design: the future of medieval collections at The National Archives (UK)?’ (DA19); Lisa Fagin Davis (chair), ‘The Whole Book?’ (DA20); Anna Dorofeeva, ‘Book Ciphers and the Medieval Unreadable’ (DA20).

[27] See below, 145-67. Further DA19 papers on this theme were offered by Matthew Holford (‘The Least Studied Manuscripts in the Bodleian’) and by David Rundle, in a characteristically provocative think-piece (‘The Unbearable Lightness of the Archive’).

[28] At DA20, Miri Rubin, Columba Stewart, Cornelis van Lit, and Maja Kominko engaged in an extended debate on ‘Inaccessibility and Bias’, chaired by Michael G. Sargent. See also the debate chaired by Suzanne Paul on ‘The Future Archive’; Stacie Vos, ‘The Dark Archive and the Silent Book: histories of access’; and Genevieve McNutt, ‘Inaccessible and Inconvenient Archives at the Turn of the Century’.

[29] Yating Zhang, ‘Digitalization and Practicalities of Medieval English Studies in China’ (DA20).

[30] The original is in the Art Gallery of Ontario.

[31] Gregory the Great, Dialogi, II, 35.

[32] Will Noel, ‘Through a Screen Darkly: the Metadata Crisis and the authority of the digital image’. Further, at DA19, Toby Burrows (‘Aggregating Provenance Metadata to Reveal the Histories of Medieval Manuscripts’) showed how metadata can be used to good effect.

[33] Marc Polonsky discussed the various strategies adopted by the Polonsky Foundation in his address, ‘Digitisation of Cultural Heritage: a funder’s perspective’ (DA19). Ben Kiessling, ‘The Limits to Digitization’ (DA19), sounded a warning note over some of these processes.

[34] ‘Discussion: Funders’ Perspectives’, DA20 Round-table debate, chaired by Peter Frankopan.

[35] Samuel Thrope, ‘The Curator in the Machine’ (DA20) discussed the difficulties of balancing accessibility with the reading experience in making public the digitized Arabic manuscripts of the National Library of Israel.

[36] Achim Rabus, ‘Training Generic Models of Handwritten Text Recognition using Transkribus: opportunities & pitfalls’, Dark Archives Vol. I, 183-208.

[37] Verónica Romero, ‘Interactive-Predictive Transcription and Probabilistic Text Indexing for Handwritten Image Collections’ (DA19); Vincent Christlein, ‘Scribal Identification and Document Classification’ (DA19); Estelle Guéville & David Joseph Wrisley, ‘Rethinking the Abbreviation: questions and challenges of machine reading medieval scripta’ (DA20).

[38] Roger L. Martínez-Dávila, ‘The Space Between: Jews, Christians, and Muslims in Medieval Spain. MOOCS, citizen science, and digital manuscript collections’, Dark Archives Vol. I, 209-51.

[39] Laura Morreale and Ben Albritton, ‘Community, Collaboration, and the UnEdition’ (DA20/21).

[40] See Demis Hassabis’ 2019 presentation at MIT on the ground-breaking AlphaGo Zero and ‘The Power of Self-Learning Systems’; Mattingly, ‘Leveraging the UnEdition’ (DA20/21).

[41] Fagin Davis (chair), ‘The Whole Book?’ (DA20); Luise Morawetz, Natascha Domeisen, Carolin Gluchowski & Lena Vosding, ‘Blast from the Past and Back to the Future: manuscripts and digitisation’ (DA20). See further discussion of this phenomenon in Lapo Lappin, ‘The Beautiful Glitch: human and machine in Luciano Floridi’s philosophy of information’, Dark Archives Vol. I, 337-55 (esp. 345-46).

[42] Stewart Brookes, ‘The Book, the Whole Book, and Nothing But the… Digital Surrogate’; Treharne, ‘Seeing and Being Seen’ (DA20).

[43] Sarah Fiddyment, ‘Reading the Invisible: can biocodicology help interpret the history of a manuscript?’ (DA20).

[44] Treharne, ‘Seeing and Being Seen’ (DA20); Benjamin L. Albritton, ‘Found Within: discovery and complex objects’ (DA20); Paul (chair), ‘The Future Archive’ (DA20).

[45] Andrew Hankinson, ‘Discovery through Data: how IIIF shines a light into the dark archive’ (DA19); Albritton, ‘Found Within’ (DA20).

[46] Dark Archives Vol. I, 265-78.

[47] Alexander J. Zawacki and Helen Davies, ‘Digital Archives and Damaged Texts: capturing, processing, and sharing multispectral image data’, Dark Archives Vol. I, 253-67 (267).

[48] Mattingly, ‘Leveraging the UnEdition’ (DA20/21).

[49] DALL·E 2 (https://openai.com/dall-e-2/); ChatGPT (https://chat.Openai.com/).

[50] See Dark Archives Vol. I, 280-308. See also Scott Bruce’s DA20 presentation, ‘The Lost Patriarchs Project: discovering Greek patristics in the medieval Latin tradition’.

[51] Peter Frankopan (chair), ‘The Future of Scholarship’ (DA20).

[52] See Sargent, ‘Hidden in Plain Sight’, Dark Archives Vol. I, 315-35, quoting William Gibson’s Neuromancer (New York, 1984); Count Zero (New York, 1987); Mona Lisa Overdrive (New York, 1989).

[53] Paolo Trovato (chair), ‘Birth of the UnEdition’ (DA20/21).

[54] Mike Kestemont & Wouter Haverals, ‘UnEditing the Unspoken: hyperdiplomatic digital editions of the remarkable vernacular manuscript collection of the Herne Charterhouse (ca. 1350-1400)’ (DA20); Anthony Bale, ‘Towards an Un-edition of Sir John Mandeville’ (DA20/21).

[55] Digital Editions Live (DA20/21).

[56] All at DA19: Verónica Romero, ‘Hands-on Workshop on Assistive Technologies to Access the contents of handwritten text manuscripts’; John McEwan, ‘Imaging Seals on a Budget’; Roger Louis Martinez-Davila, ‘Crowdsourcing Manuscript Transcriptions: opportunities and challenges using MOOCs, social media, and emerging platforms’; Alexander Zawacki and Helen Davies, ‘Multispectral Imaging: technologies, techniques, and teaching’.

[57] Henrike Lähnemann et al., ‘#PolonskyGerman #BloggingMSS Presentations’ (DA20).

[58] St. Edmund Hall Choir & friends, ‘Compline from the Crypt’ (DA20).

[59] Luciano Floridi, ‘Semantic Capital: its nature and value’ (DA19).

[60] Lappin, ‘The Beautiful Glitch’, Dark Archives Vol. I, 331-48.

[61] Dante, Paradiso, XXXIII.85-90.
