DH BeNeLux

DH BeNeLux, or Digital Humanities Belgium-Netherlands-Luxembourg in longform, is happening today in The Hague, in the same building where the research institute I work at occupies the fifth floor.  Follow along with the hashtag and official Twitter account if you can.  The program is full of top digital literary and history scholars working on cool projects, so I can’t wait to hear what they have to say.

As a heads up, if you follow my own Twitter account, I’ll be livetweeting this schedule (all times in UTC):

Thursday, June 12

13:00-14:00 Keynote: Melissa Terras

14:00-15:00 About DH

 – Hieke Huistra, Bram Mellink

 – Niels-Oliver Walkowski

15:30-16:50 Crowdsourcing

 – Lars Wieneke and Marten Düring

 – Marie-Charlotte Le Bailly

 – Cissie Fu

 – René Voorburg

Friday, June 13

09:15-10:15 Copyright

 – Alastair Dunning

 – Renske van Nie

 – Tjeerd Schiphof and Karina van Dalen-Oskam

10:45-11:45 Linked Data

 – Wouter Beek, Rinke Hoekstra, Fernie Maas, Albert Meroño-Peñuela and Inger Leemans

 – Alina Saenko

 – Max Kemman and Astrid van Aggelen

12:00-12:45 Panel Session

12:45-13:00 Closing remarks

Digital Preservation 101: Undertaking a File-Level Inventory for Preservation Planning

This entry was originally posted on January 13, 2014.  It outlines the institution-wide file format inventory that I undertook at Dumbarton Oaks as part of my work as a Library of Congress National Digital Stewardship Residency fellow.

In order to develop a better understanding of the holdings at Dumbarton Oaks as part of my NDSR project, I have been working on a file-level inventory that can hopefully be embedded in a digital preservation workflow process at DO in the future.

The benefits of an inventory are manifold, but these are a few that I highlighted in a recent presentation (all originally adapted from ):


The inventory basically tells us what we have, how much we have, where we have it, and most importantly, what user behaviors surround the creation and management of digital assets.  Keep these goals in mind as you are working, because undertaking a file-level inventory won’t be easy.  There really aren’t a lot of tools out there, and the ones that are there require a pretty solid base of technical knowledge.
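To make the "what we have, how much, where" questions concrete, here is a minimal sketch of a first-pass tally in Python.  It is not the method I used at Dumbarton Oaks, and the function name is purely illustrative.  It groups files by extension only, which is a rough proxy and not real format identification (that is what tools like JHOVE2 and DROID are for).

```python
import os
from collections import Counter


def inventory_by_extension(root):
    """Walk `root` and tally file count and total bytes per extension.

    Hypothetical helper: extensions are only a rough stand-in for
    file formats and say nothing about validity.
    """
    counts = Counter()
    sizes = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Normalise case so ".TIF" and ".tif" are counted together.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file; skip it
            counts[ext] += 1
            sizes[ext] += size
    return counts, sizes
```

Calling `inventory_by_extension` on the top folder of a drive returns two counters, so `counts.most_common(10)` would list the ten most frequent extensions on that drive.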

The two tools I decided to try out were JHOVE2 and DROID.

The first was my main focus.  JHOVE2 also includes validation of files, which is an added bonus when compared to DROID.

Drawbacks of JHOVE2, however, were pretty insurmountable in my project implementation.  They included the need to run now-outdated Java 6, and the lack of a GUI.


Command line, anyone?

The main problem that I ran up against with JHOVE2, however, wasn’t the actual implementation (all of the basic commands needed are outlined in the handbook, so even a relative novice can run it), but rather the reporting.  After going through all of the steps, the tool spat out a massive jumble of text that I couldn’t make sense of.  I consulted the forums and our in-house IT specialist at Dumbarton Oaks, but I still couldn’t process the inventory reports.  Having already committed too much time to JHOVE2, I decided, for the sake of moving the project forward, to go with DROID instead.

The most recent version is a lot more accurate than older versions.  The install is incredibly easy (for Windows: download ZIP file, unzip, run BAT file, done).

The interface is also a whole lot prettier than JHOVE2:


But of course, there were still (mysterious) problems.


Beyond the occasional crash, the tool’s output is fairly readable, especially if you pre-read the user guide referenced above.

Here’s a small example of what the final reports look like:


DROID is helpful for identifying preservation issues.  The report also provides information like MIME type (to get a top-level idea of general types of media), date last modified (I found this really helpful for determining whether a drive was full of archival assets or everyday files), and file format and size.
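If you export a DROID profile to CSV, a short script can roll those columns up into a summary.  The sketch below is an illustration, not part of my original workflow.  It assumes the export has `TYPE`, `SIZE`, `PUID` and `MIME_TYPE` columns and that folders are marked with a `TYPE` of `Folder`; check the header row of your own export before relying on it.  The function name is hypothetical.

```python
import csv
from collections import Counter


def summarize_droid_csv(path):
    """Summarise a DROID CSV export by MIME type.

    Assumed column names (verify against your export):
    TYPE, SIZE, PUID, MIME_TYPE.
    """
    files_by_mime = Counter()
    bytes_by_mime = Counter()
    unidentified = 0
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            if row.get("TYPE") == "Folder":
                continue  # only count files, not directories
            mime = row.get("MIME_TYPE") or "(no MIME type)"
            files_by_mime[mime] += 1
            bytes_by_mime[mime] += int(row.get("SIZE") or 0)
            if not row.get("PUID"):
                # No PRONOM identifier: DROID could not identify the format.
                unidentified += 1
    return {
        "files_by_mime": files_by_mime,
        "bytes_by_mime": bytes_by_mime,
        "unidentified": unidentified,
    }
```

The `unidentified` count is a quick way to see how many files on a drive may need a closer look.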

While I tried out only these two tools, there are other possibilities to check out.  Some all-in-one preservation tools also integrate file inventorying, sometimes referred to as preservation planning.