EXCLUSIVE – Managing the preservation and accessibility of public records from the past into the digital future
OpenGov speaks to Ms. Justine Heazlewood, Director and Keeper of Public Records, Victorian Public Record Office
Could you tell us more about your role at Public Record Office Victoria?
The Public Record Office is the State Archives of the Victorian Government. We have records that date back to the 1830s. There’s about 100km of paper records and obviously now there are a lot of born-digital records that government has created as well, so we have a hybrid paper and digital set of archives. For accessibility purposes, paper records are digitised so that we can provide access to them online.
The other thing that the Public Record Office does is set standards for government agencies across Victoria in the area of records management. In the digital environment, there is not a lot of difference between records and information so that means we set standards in the area of information management as well.
Those standards cover access and security, the creation of records and the disposal of records because most of the information that government creates can be destroyed after a certain period of time. That’s the job that we do and my job is to manage that organisation.
What are the key initiatives at your department right now?
One of our areas of focus is access to records. As you can imagine, there’s a whole lot of paper records, and it’s pretty hard to access them unless you come in to where the records are stored, which is in Melbourne, and access them manually. It’s not very accessible for people. We have published our catalogue information online so people can find out that we have the information, but they can’t necessarily get access to it.
As we move into a more digitally focused age, we’re trying to provide equity of access to all Victorians. We have to work out how we can provide access to people who can’t necessarily come into our repository in Melbourne because they might work or live very far away. One of those ways is to digitise the records so that they can be accessed online. That sounds very simple because all you’re doing is taking an image of a document, but that’s actually not providing much access to people because an image does not tell you a lot about what the text in that document says.
For something that’s type-written or a computer print-out, it’s not that hard to work out what the text says, but when you’re talking about 19th century documents that were written by hand, it’s a lot harder for something like OCR to translate that handwriting into a text-readable format. That’s where we have to use humans. So we have a lot of volunteers who work to transcribe those records so that they are text searchable.
There’s a lot more to access than ‘just digitise something’; there’s a lot of work to be done in addition to digitising the records. There’s capturing metadata about the digital images because, again, if I just take photographs of a whole lot of things and send you the files, that’s not very accessible: you have to open each one and look at it in order to find the one that you want. So we have to capture metadata about each of the images in order to make them accessible, and a lot of thinking and work goes into that.
The other aspect of that is that governments are very large and complicated environments, and it can be hard for individuals to navigate government and work out what records they actually want. They know the information they want, but they don’t necessarily know which records have that information or which part of government created those records.
At the moment our information is organised by government function, so unless you know which part or department of government created those records, it’s quite difficult to find them. So someone might come in and they might want their grandfather’s employment records because they knew their grandfather worked for government but they don’t know which part of government. So how do you find that information? You have to look at each government department and at their employment files; it’s quite difficult to find.
So that’s another one of our initiatives: to work out how we take the metadata and the information we have about the records and put that in a form that people can easily search to find what they want.
For these initiatives, digitising records and making it easy for individuals to access records, how long have these projects been in progress?
It’s something we have been working on for a long time, probably over a decade. It’s not work that will finish, it’s just a project we can deploy more technology on as technology improves. So if the tools for data analytics become more user-friendly, then we can use those as a way of slicing and dicing our collection in a different way that provides a different form of access for people.
So for example, one thing we have done recently is we have taken all our metadata about our records, which is available in a catalogue-type form, and we used a visualiser to display the linkages between different departments and functions. We did that because government departments change over time: they change their names and their functions move around. So if you are interested in, say, education, you’re probably looking at something in the order of 30 or 40 agencies that have a role or have had a role in the past in education. So how do you display that to people without giving them a long list?
You can do it with visualisation and when someone types in education, they can see a map of all of the different agencies that were ever involved in education in the past and how they relate to each other. The visualisation tool is available for public access. (Screenshot below)
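As a rough illustration of the idea behind such a visualisation, the sketch below groups catalogue rows by function and collects the links between agencies. It is a minimal sketch only: the agency names, the row layout and the function `agencies_for_function` are invented for this example and do not describe the Public Record Office's actual catalogue or tool.

```python
# Hypothetical catalogue rows: (agency, function, successor agency or None).
# These names are made up for illustration; they are not real catalogue data.
CATALOGUE = [
    ("Board of Education", "education", "Education Department"),
    ("Education Department", "education", "Ministry of Education"),
    ("Ministry of Education", "education", None),
    ("Lands Department", "land management", None),
]


def agencies_for_function(rows, function):
    """Return (agencies, links) for every agency that ever carried out a function.

    `agencies` is a sorted list of names (the nodes of the map) and
    `links` is a list of (predecessor, successor) pairs (the edges).
    """
    agencies = set()
    links = []
    for agency, func, successor in rows:
        if func != function:
            continue
        agencies.add(agency)
        if successor is not None:
            agencies.add(successor)
            links.append((agency, successor))
    return sorted(agencies), links


nodes, edges = agencies_for_function(CATALOGUE, "education")
for predecessor, successor in edges:
    print(f"{predecessor} -> {successor}")
```

The nodes and edges produced this way are the kind of input a graph-drawing tool could render as a map instead of a long list.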
Something we’re working on which isn’t available yet is providing a tool for people who wish to volunteer their time to us to make records more accessible, to be able to do that in the comfort of their own home. This will be a tool where they can say, “I am interested in transcribing some of the records you digitised”, so we can provide them with the images and they can have a tool that they can type into from home.
Then we can use the data that they create as a way of making the collection more accessible. Or to give you another example, we have lots and lots of early historic maps of Victoria and these are not geo-tagged in any way, but if someone at home wanted to take those images and geo-locate them, that is, add tags that provide geo-location information, we could then take that information back and make the maps more accessible. So instead of having to know an old place name or something like that, people can put in a current location name and then that map will come up.
Those are the things we are looking at but we haven’t implemented them yet, so that’s something we are working on.
The projects you are working on right now, they are basically all concurrent? Are there any specific goals or timelines to meet?
We do have goals for when we’re going to implement the technology. For instance, we’ve just implemented a new website that provides some of the groundwork for that user-generated content tool that I was talking about. That’s the first step in that. We launched that website as a beta version so that we can get feedback from our users to improve the system.
So that’s something that we have been doing for the last 12-18 months, and we’ve just launched that in the last couple of months. Another thing we’re doing is building a new digital archive that is a repository for born-digital records: records that were generated in digital form and never printed. They should remain in digital form and we need to archive those, house them, look after them and provide access to them. We have an existing digital archive but the technology is around 10-15 years old, so we’re currently updating that. We’re building that new digital archive and that’s going to be finished in about 2 years’ time.
So we do have technology goals but the overarching objectives are ongoing and longer than just the technology builds. The technology is just the tool to help us get there.
How do you manage the process of balancing existing legacy infrastructure and bringing in new technology?
There are two aspects to that. The first aspect is our own internal technology and systems that we own and operate or that we buy in from somewhere else. Our principles there have evolved over the years. Generally speaking, we look mainly at systems that are based around open standards, that might possibly be open source, and that have a lot of APIs so that they are accessible to other systems. And most importantly, the systems are designed so that it’s easy to extract the data from them, so when we transition to a different system, we can get the data out without any difficulty. These are our principles, but we haven’t always met them, particularly with some of our very old systems because they were built at a time before open standards and open software were really a thing.
The other issue is around the records, particularly the born-digital records, and the different formats that are used to create records in government. As you can imagine, there’s a whole bunch of different technologies that have been used to create born-digital records, and so our issue then is managing those records so that they continue to remain accessible into the far future. So in 50 years, 100 years, 200 years, they still need to be accessible because those records are of enduring value to the people of Victoria. So that’s a format issue and a format obsolescence issue.
And the way we manage that is to set standards for the kinds of formats that we will accept into our archive and the kinds of metadata that we need in order to manage those digital objects and formats going forward. We have a set of criteria to determine what formats are going to have a longer lifespan and then we only accept born digital records in those formats, so that we can continue to manage them and provide access to them over the years.
We wouldn’t want to be stuck with a whole bunch of records in legacy formats that aren’t accessible.
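The approach described above, accepting only approved formats with the required metadata, can be sketched as a simple check at transfer time. This is an illustrative sketch under stated assumptions: the extension list, the metadata field names and the function `check_transfer` are hypothetical and are not the actual criteria or standards set by the Public Record Office.

```python
from pathlib import PurePath

# Assumed examples only; the real list of accepted formats is not given in the interview.
ACCEPTED_EXTENSIONS = {".pdf", ".tif", ".txt", ".csv"}
# Assumed metadata fields, chosen for illustration.
REQUIRED_METADATA = {"title", "creating_agency", "date_created"}


def check_transfer(filename, metadata):
    """Return a list of problems that would block accepting a record (empty if none)."""
    problems = []
    extension = PurePath(filename).suffix.lower()
    if extension not in ACCEPTED_EXTENSIONS:
        problems.append(f"format {extension or '(none)'} is not on the accepted list")
    missing = sorted(REQUIRED_METADATA - set(metadata))
    if missing:
        problems.append("missing metadata: " + ", ".join(missing))
    return problems


# A record in an old spreadsheet format with incomplete metadata is flagged twice.
print(check_transfer("budget.wk1", {"title": "Budget papers"}))
```

Checking at the point of transfer means problems surface while the creating agency can still convert the record or supply the missing metadata.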
Related to working with other government departments – do you work with every single agency? What does the process entail?
For the creation of standards, we would use a reference or stakeholder group and also we put standards out in draft form for comment so there’s plenty of opportunities for agencies to talk about whether they feel they can comply with those.
In terms of compliance, there’s a requirement under our legislation that the heads of all the agencies are responsible for complying with those standards. If agencies are having trouble, we would then come in and provide them with support where we can. That might be in terms of providing them with specific advice around issues they may have around legacy databases, that’s an issue that comes up regularly. So we would then provide advice about that.
Sometimes we provide general advice. We’ve issued advice around the use of cloud, for example and the record keeping issues you need to think about when you want to use the cloud.
How does your department manage the security and privacy of data?
We have different classes of information in terms of the records we manage, the collection that we manage. Most records are openly available so they’re available to anybody who wants to access them. So our focus there is actually on access, more than security. Obviously we have to make sure there’s no compromise of those records, that the integrity of those records is maintained, but we have mechanisms in place to ensure that.
And then another part of the collection is closed to public access for periods of time. They may be closed because they may have sensitive information about individuals. For example, health records, prison records, children in care records, obviously we wouldn’t make those available to the public. So those records are kept securely, we don’t provide access to those records.
Those are the two broad classes and we have different approaches depending on the status of that information. Records get assessed in terms of privacy as they come into the organisation, so we make that determination upfront and early on.
What are some of the milestones and challenges in your 13-year experience at the Department of Public Records?
One area that I think we’ve done very well in is the management of digital records. We were very early, in Australia and in fact in the world, to develop a strategy for managing digital records into the future. That’s called the Victorian Electronic Records Strategy (VERS), and it’s been around for quite some time, since the 1990s. That guides our approach to the management of digital information, the digital records of government.
Developing out of that has been the establishment of a digital archive. We were one of the first in Australia, in fact, we were the first, to develop an end-to-end digital archive that could accept digital records, securely store them and provide access to those records. I think that’s been a great achievement.
One of the challenges is the fact that technology is changing frequently and government agencies have often tended to adopt technology without thinking about the information management facets that are important. So we end up with a lot of legacy technology which contains important government information and no real way of extracting or managing that information.