Three Synthesis Statements on Open Government and Archives and Records Management
Nov 30, 2017
As the InterPARES Trust project starts to wrap up, it has provided an occasion to reflect on the five years of work (!) that was completed. I was privileged to work as a research assistant and later research consultant (i.e. pro bono post-graduation) on project NA08: “The Implications of Open Government, Open Data, and Big Data on the Management of Digital Records in an Online Environment.” NA08 included some fantastic people and inspirational professionals: Valerie Léveillé, John McDonald, Kelly Rovegno, Jim Suderman, and Kat Timms, who produced an impressive body of papers and insights from this project. The project investigated open government along a number of lines, but was united by an interest in how open government relates to the theory and practice of archives and records management and where the gaps and challenges between the two lie. InterPARES members are currently preparing a book on the project’s outputs. Offering some feedback on this work inspired me to write a short piece summarizing some of the key issues I experienced in working on NA08 and some of the things that continue to interest me in this area. Open government has a lot of promise and can include a lot of different types of activities and ideas under its umbrella. All of them have potential impacts on and interactions with archives and records management. I am posting this little review piece here: three short(ish) synthesis statements that draw from our work and some of the insights we gained. It’s far from perfect, but I hope it will provide an entry point for the curious into this complex and evolving area.

“Crowd of Women” [ca. 1920] by William James, City of Toronto Archives, Fonds 1244, Item 674.

The Relationship Between Open Government and Archives/Records Management

If a key goal of open government is to encourage citizens to access and use government information, then public archives are arguably the oldest form of open government. Public archives have long been the central place where citizens interested in acquiring information about their governments (and themselves) could turn. For access to records that have yet to be transferred to the archives, governments the world over have established freedom of information request systems that attempt to balance the need for security and privacy with rights to access. The relatively recent embrace of open government has bypassed archives and freedom of information processes somewhat, with an emphasis on rapid access to government information through the proactive disclosure of records and on making machine-readable and sharable government data readily available. In Canada, open government programs are often organizationally separate from their archives, records and information management counterparts. But both areas clearly have much in common: an interest in providing accurate, authentic, and trustworthy information to citizens. The key differences between the two areas at present relate to the timeline of access and the scale of aggregation. Open government information and data both reduce and lengthen access timelines. Government data created in one week might be made available the next, and that same dataset may still be available much further into the future than if it were managed in a traditional “retain and dispose” records environment. The scale of data aggregation means that information derived from records and transformed into machine-readable datasets can be easily queried, analyzed and re-used in ways that access to original individual records could not have supported, and that their creators may never have intended.
But the processes of aggregation (sometimes called “mashups”) could obscure the data’s links back to original records, leading to doubts about its accuracy and authenticity. These conditions surrounding access and aggregation open up a whole host of challenges and questions. What information should be made available? How long should it be made available? How can its quality and accuracy be verified? What releases of information may impact the privacy of individuals, and how can the balance between privacy and openness be struck? And how can open data be made accessible and usable from a legal standpoint and a human standpoint?

For questions surrounding access, the approach to teasing them apart should follow the trail of the business process, as John McDonald and Valerie Léveillé argue in their article “Whither the Retention Schedule in the Era of Big Data and Open Data?” “From a recordkeeping perspective,” they write, “big data and open data concepts and issues are like a tree where the roots represent the business processes of an organization and the branches represent the extension and augmentation of these processes to form the processes that underpin big data and open data initiatives” (p. 117). Among the potential approaches discussed in their paper is that “public-use datasets should be removed from [a] portal on a regular basis when they became outdated, but that the master dataset used to generate the public use datasets, together with their supporting documentation, should be retained for evidentiary purposes” (p. 112).

In terms of the aggregation of information, key issues are determining the authenticity and accuracy of the information in the dataset and ensuring an appropriate balance between privacy and access. In their paper “Through a Records Management Lens: Creating a Framework for Trust in Open Government and Open Government Information,” Valerie Léveillé and Katherine Timms also discuss these issues in the context of open government as a business process in three stages: initiation; identification and distribution; and promotion and evaluation. For example, they recommend using records classification schemes as an initial inventory to identify candidate records for publication in open information and open data initiatives (p. 173). They emphasize the importance of preserving the context in which records were created: the “separate business functions and processes, information concerning the work processes, the system and the environment that guided the creation, capture, and control of the source material” that, together with the processes that lead to the publication of open data or information, “provide a complete and accurate picture of the information’s context and meaning and, therefore, a basis for trust in the information” (p. 173). Data generated from an internal source and aggregated for public use must be traceable back to its originating record for authenticity purposes. McDonald and Léveillé suggest that in the case of “mashed up” data from various sources, documentation on the processes used to perform these aggregations be retained as evidence (p. 113). For another case, in which confidential data has to be deleted due to disposition rules, they describe an audit trail approach that could be used to verify anonymized data that is kept much longer (p. 113). Distribution involves determining the conditions for access: how data may be licensed to users and what rights they have over the reproduction and dissemination of that data.
Open data licenses are becoming a common response to this issue. At the stage of evaluation, the question relates to the accessibility of the data itself: can the majority of users access and use the data? Or are specialized tools and training required? Education, outreach and training for specialized and non-specialized users regarding systems and tools may be the necessary outcome after the initial push to put data online. For open data to fulfill the lofty democratic goals that are often cited to support it (trust, transparency, accountability, engagement and so on), its accessibility must also be democratic.

Data flows: From Government to Citizen and Back

One potential assumption of an archivist or records manager working to support open government is that the flow of information always runs from government to citizen. Governments create information, package it for consumption, and citizens access it. And while citizens have long been the subject of government information, it is less often the case that they actively create it. Citizen engagement processes, particularly ones that encourage uses of civic technologies, are in a position to reverse this condition. Through citizen engagement processes, citizens create records that governments are in a position to receive, analyze, store, and manage, potentially for the long term. This is not to say that aspects of this two-way relationship have not always been present: elections, town hall meetings and letters and depositions submitted to officials have all been common forms of citizen engagement with long legacies in governments. Similarly, citizens often have the opportunity to submit complaints or requests regarding government services. Rather, citizen engagement may create new forms and sources of data that touch upon a much wider array of services and functions than previously, and citizen-derived information, as in open government data, may be aggregated and analyzed at a much larger scale. Governments that use citizen information to inform decision-making must be able to demonstrate the trustworthiness and transparency of the processes taken to make those decisions. A useful method for examining how these flows of information might impact records creation is the IAP2 Public Participation Spectrum, which describes the qualities of different kinds of citizen-government interactions at different levels. In Managing Records of Citizen Engagement: A Primer, Valerie Léveillé, John McDonald, and I trace the different levels of the IAP2 Spectrum and map them to different records contexts as a method of identifying issues and strategies for records management.

Records and information professionals have to be aware of the ways in which data can flow between governments and citizens, creating records that may be unanticipated by the usual records schedules. The first is when governments directly seek feedback from citizens as part of a citizen engagement initiative. Such initiatives may seek information in various forms: e-mail submissions, social media comments, structured surveys, opportunities to edit or provide commentary on documents, or more conversational, discourse-based platforms. In consultation-based projects, these processes are intended to end up in some kind of decision or action on a particular issue. The feedback must be analyzed and packaged in such a way as to inform decisions, which are then theoretically implemented by a government. At other levels of the IAP2 Spectrum, feedback may actually constitute the decision itself, such as in a vote on an issue, or decisions may be reached through back-and-forth discourse and problem-solving. All of these events create information that must be managed, along with audit trails that attest to how records created by citizens were used and acted on. In many cases, governments are now using sophisticated data analysis methods such as natural language processing software to summarize the opinions from a large corpus of submissions. As I write in “Your Comments Here: Contextualizing Technologies, Seeking Records and Supporting Transparency for Citizen Engagement,” “transparency of the whole engagement process, and how different information technologies for input and analysis were used to get from the beginning to the end, is a crucial contributor to establishing trust in the system itself through the availability of evidence” (p. 18).

Citizens themselves may also create data independent of governments that in turn has an impact on government services. Websites such as Fix My Street or other crowdsourced data collected by third parties could be used to inform governments of a variety of issues, from infrastructure to policy. Governments could potentially take in this data, but it must be verified and packaged in such a way that a government could act upon it. The appropriate rights over the use and ownership of this data would have to be worked out between the creators and governments; if governments receive such data, it is conceivable that they may wish to use it for their own purposes and will have to manage its retention and disposition accordingly.

Open Government Sustainability Into the Future

It should not be assumed that the continuation of open government programs is guaranteed. The embrace of open government is politically motivated as a potential method of increasing trust and providing resources to business, among other reasons, but just as a current government might implement and fund open government programs, a succeeding government could abandon them entirely. Lacking statutory requirements, open government programs are far from stable, even on the part of governments that initially championed them. As Kent Aitken writes, “It’s easy for governments to be transparent and to engage citizens when they want to; the question is how to ensure that happens when it’s hard.” Citing rollbacks in commitments to open government within the province of Newfoundland and Labrador in Canada, Aitken concludes that “the lesson that can be drawn from international politics is that progress on open government is not guaranteed.” Though I suppose it cannot be taken as proof of a change in attitudes towards open government within the United States, the US’s fourth action plan for the Open Government Partnership was announced as delayed on October 31, 2017 due to the need to “fine tune a strong and quality action plan reflective of national priorities” (Graves, 2017). As institutions, public archives have a staying power that comes from a commitment to the long term. Though they can themselves be subject to all manner of politically motivated actions, they tend to stick around much longer due to their foundation in statutes. Records management processes and rights to access to information and privacy have similar support, due both to statutory requirements and to the cost efficiencies and risk management derived from records and information management activities.
Therefore, support from archives and records management for open government processes may be a lead-in towards their increased sustainability, or the two may be poised to mutually benefit from one another. As the final report of InterPARES project NA08 concluded, “the study found that record keeping policy and practice was weak or absent in relation to aspects of Open Government, and particularly in relation to citizen engagement initiatives. Few measures of success were identified for engagement initiatives nor did any criteria seem to be well established within or across jurisdictions reviewed” (p. 4). Herein lies an opportunity to strengthen both.


