OPEN Government Data Act

I missed that the OPEN Government Data Act is now law, and I’m chalking that up to the fact that the legislation was signed during the government shutdown.  It was first introduced in the House of Representatives in 2017 as the Open, Public, Electronic, and Necessary (OPEN) Government Data Act, and it became Title II of Paul Ryan’s Foundations for Evidence-Based Policymaking Act, which was signed January 14, 2019.

This bill requires:

  • open government data assets to be published as machine-readable data
  • each agency to develop and maintain a comprehensive data inventory for all data assets created by or collected by the agency
  • each agency to designate a Chief Data Officer who shall be responsible for lifecycle data management and other specified functions.

These terms are defined by the legislation (and will be codified in 44 USC 3502):

  • data asset: a collection of data elements or data sets that may be grouped together
  • machine-readable: data in a format that can be easily processed by a computer without human intervention while ensuring no semantic meaning is lost
  • metadata: structural or descriptive information about data such as content, format, source, rights, accuracy, provenance, frequency, periodicity, granularity, publisher or responsible party, contact information, method of collection, and other descriptions
  • open Government data asset: a public data asset that is
    • machine-readable
    • available (or could be made available) in an open format
    • not encumbered by restrictions, other than intellectual property rights, including under titles 17 and 35, that would impede the use or reuse of such asset
    • based on an underlying open standard that is maintained by a standards organization
  • open license: a legal guarantee that a data asset is made available
    • at no cost to the public
    • with no restrictions on copying, publishing, distributing, transmitting, citing, or adapting such asset
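
To make the definitions above a little more concrete, here is a minimal sketch of what one entry in an agency's data inventory might look like when it is stored in a machine-readable way.  The class name, the field names, and the sample values are all my own assumptions for illustration; they loosely follow the kinds of metadata the legislation lists (source, format, publisher, contact information, method of collection, periodicity) and are not taken from the statute or from any official schema.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class DataAssetRecord:
    """One hypothetical data inventory entry; every field name is an assumption."""
    title: str
    publisher: str
    source: str
    file_format: str        # an open format, written here as a media type
    license_name: str
    collection_method: str
    periodicity: str
    contact: str
    keywords: list = field(default_factory=list)


def to_machine_readable(record: DataAssetRecord) -> str:
    """Serialize the entry as JSON so software can parse it without human help."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Invented sample values, for illustration only.
record = DataAssetRecord(
    title="Example permit applications",
    publisher="Example Agency",
    source="Agency permit system",
    file_format="text/csv",
    license_name="Public domain",
    collection_method="Online application form",
    periodicity="monthly",
    contact="data@example.gov",
    keywords=["permits", "example"],
)

serialized = to_machine_readable(record)

# Round trip: the parsed JSON rebuilds an identical record, so nothing was lost.
restored = DataAssetRecord(**json.loads(serialized))
```

The round trip at the end is the point of the sketch: if a program can read the entry back and get exactly what was written, the metadata survived without anyone having to interpret it by hand.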

The bill also creates in the Office of Management and Budget (OMB) a Chief Data Officer Council for establishing “government-wide best practices for the use, protection, dissemination, and generation of data and for promoting data sharing agreements among agencies.”  The Government Accountability Office (GAO) is in charge of gauging compliance with this legislation as well as the value of the information made public.  It’ll be interesting to watch what metrics the GAO develops for this purpose.

The General Services Administration is responsible for maintaining the online public interface for sharing this data, which is currently  The OMB will work with the National Archives and Records Administration to develop and maintain a repository of tools, best practices, and standards to facilitate open data sharing.

This legislation builds on the 2013 executive order signed by President Obama, Making Open and Machine Readable the New Default for Government Information.  It will take effect in July 2019.


What Ansel Adams Could Teach Archivists

I recently had an opportunity to view an exhibit of Ansel Adams photographs at the North Carolina Museum of Art.  Although I’ve been a fan of his work for some time, I appreciated learning more about his notion of visualization.  In an interview he explained that “the creative work is the internal event that happens inside your mind when you see the photograph.”  He credited Alfred Stieglitz as a model for the idea of seeing in the mind’s eye:

“I come across something that excites me, I see the picture in my mind’s eye, and I make a photograph, and then I give it to you as the equivalent of what I saw and felt.”

Adams did a lot of his work under the auspices of the National Park Service, so many of his photographs are available at the National Archives.

I believe the process that Adams followed could be useful in archival work.  Too often it seems we dive into projects and hope for the best without having any notion of what we hope to see as the culmination.  While we can’t mandate what records cross our thresholds, we can try to be more proactive about how we handle the records we’ve already received and how we perceive our responsibilities to gather and preserve the stories of today for tomorrow’s users.  Although many would undoubtedly see this as too interventionist, I think archivists, in keeping with concepts such as the documentation strategy, should think carefully about how we reach out to people who are currently creating records and provide more guidance about how best to create and maintain records of events occurring today.  If we choose to avoid this role, I see two things happening, separately or together: the records are lost from history, and archivists lose our voice as others step in to provide advice on managing the predominantly digital records that are currently being created.

“Bringing Things Together: Aggregate Records in a Digital Age”

Since researching born-digital records in manuscript repositories to write my master’s paper, I’ve been interested in how archives are handling electronic records: their appraisal, acquisition, processing, preservation, and access.  This week’s review focuses on an article that delves into the processing and access piece, “Bringing Things Together: Aggregate Records in a Digital Age,” published in the Fall 2012 issue of Archivaria.  Its author, Geoffrey Yeo, worked as an archivist for the Corporation of the City of London, for St Bartholomew’s Hospital, London, and for the Royal College of Physicians.  He taught archives and records management at University College London (1995-97 and 2000-14).  He also served as programme director for the MA/Diploma/Certificate in Archives and Records Management and the MA/Diploma/Certificate in Records and Archives Management (International) at UCL’s School of Library, Archive and Information Studies (2002-5).

Yeo laid out an ambitious path for archivists in his abstract (44):

  • “build multiple overlapping record series to meet different needs or realize different conceptualizations of series boundaries”
  • “bring together aggregate records and present views of context when they are required”
  • develop “scalable and user-friendly systems that enable the construction of aggregate records as well as ‘non-record’ aggregations, and also preserve information about logical contexts and about physical arrangements imposed in the past”

Unfortunately, this article identified more problems than solutions to these intriguing challenges.  For background, Yeo delved into the concept of a record, collections vs. fonds, series, and files.  He then elaborated on David Weinberger’s concepts of order from his book Everything is Miscellaneous as a framework for analyzing digital records.

  1. The first order of order — physical objects cannot reside in more than one sequence at the same time
  2. The second order of order — alternative sequences are enabled by “laborious representational surrogates” (i.e., indices) (57)
  3. The third order of order — “resources can be arranged into as many sequences as may be desired and users can organize their work independently of the limitations imposed by analog systems” (58)

Yeo acknowledged that these notions of fluidity could make archivists wary, but he embraced the third order of order, encouraging multiple collections alongside the repurposing and reuse of content by users.  Contrary to the notion in the paper world that there is “this single correct arrangement” (59), he urged a different approach to digital records.  Where original order has been used to support provenance and authenticity, he asserted, “Not all users seek evidence of the occurrents that records represent, or look for groupings of records based on contextual provenance” (68).

Yeo seemed to be prioritizing the user, allowing for the possibility of new combinations, new learning, and regroupings to suit the needs of the user.  He suggested that the types of cross-boundary collections and search and discovery that are available to users in other domains will also be expected in the archival realm (e.g., creating a collection on Flickr).  The paradigm shift for records management is from controlling aggregate records in a stable physical form to “ensuring that aggregate records can be constructed when we require them” (71).

He acknowledged the literature that suggests some users do not demand contextual information, especially when their research purpose is prooftexting.  In this brave new world, “memory and identity (as perceived by those alive today) supersede history (of the world as it was, or as it might have been)” (74).  Yet Yeo seemed adamant that context matters, concluding:

“If we are to allow or encourage users to create their own collections and construct their own hierarchies, we also need to find ways of presenting larger or previous contexts and of enabling users to contextualize each item in their collections” (75).

Yeo returned to the idea of original order as it relates to digital records, acknowledging that many individuals and organizations use a hierarchical system of electronic record organization that mirrors the paper storage system and that these orderings can “tell us something about the priorities and perceptions of the people concerned” (77).  He provided two examples of repositories that combined linear descriptions with third order flexibility — however, Yeo provided no information about the time and other resources necessary for these undertakings.

Yeo concluded with a list of ingredients necessary for preserving contextual information and orderings while providing the ability to aggregate records:

  • Granularity: Yeo asserted “the item is the paradigmatic unit of control in the third order” (80).
  • Relational modelling: Yeo acknowledged that this possibility necessitates rich metadata, which depends on both manual and automatic capture.
  • System interfaces and functionalities: “Unconstrained by paper paradigms, systems and interfaces should enable archival resources to be presented in many different ways, reflecting their various ‘original’ orders, different interpretations of context, and other orders newly desired by users in the course of research and experimentation” (85).
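
Since I am asking for something more concrete, here is a small sketch of how I picture these three ingredients fitting together: each item is described once, with metadata that includes its original location, and aggregate records are constructed on demand instead of being fixed in one arrangement.  This is my own illustration, not anything Yeo proposes; the `Item` fields, the sample data, and the `aggregate` helper are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Item-level unit of control; all fields are illustrative assumptions."""
    item_id: str
    creator: str
    year: int
    original_path: str  # where the item sat in the creator's own filing system


# Invented sample items.
ITEMS = [
    Item("a1", "Finance Office", 2009, "budgets/2009/summary.xls"),
    Item("a2", "Finance Office", 2010, "budgets/2010/summary.xls"),
    Item("b1", "Director", 2009, "correspondence/board/letter-01.doc"),
    Item("b2", "Director", 2010, "correspondence/board/letter-02.doc"),
]


def aggregate(items, key):
    """Build one view of the items by grouping their ids under key(item)."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item.item_id)
    return dict(groups)


# The same items appear in several aggregations at once (the "third order").
by_creator = aggregate(ITEMS, lambda i: i.creator)
by_year = aggregate(ITEMS, lambda i: i.year)

# The original order is kept as metadata, so it can be rebuilt as one more view.
by_original_folder = aggregate(ITEMS, lambda i: i.original_path.rsplit("/", 1)[0])
```

Nothing here is moved or copied: a user-defined collection would simply be another key function, while the creator's own folder structure remains available as context for every item.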

My frustrations with the literature on born-digital records persist after reading this article.  To my mind, the goals set out in the abstract were not met in any concrete way.  I long for some middle ground between theoretical stabs in the dark and dull procedural manuals.  Perhaps the problem is simply that the people with boots on the ground do not have the time to write about their work, but we need a louder voice from accomplished practitioners.

“On the Occasion of SAA’s Diamond Jubilee: A Profession Coming of Age in the Digital Era”

Helen R. Tibbo delivered her presidential address at the 2011 annual meeting of the Society of American Archivists (SAA) held in Chicago, Illinois.  She has spent her career in education, serving on the faculty of the School of Information and Library Science (SILS) at the University of North Carolina at Chapel Hill since 1989.  I both took several classes from her and worked with her on the Closing the Digital Curation Gap grant project during my time at SILS.  Her address was published in the Spring/Summer 2012 issue of the American Archivist.

Tibbo identified the 75th anniversary of SAA as a turning point for the archival profession — a “coming of age in the digital era” (18).  She suggested three steps necessary for archivists to move forward in this digital era (19):

  1. learn about new technologies
  2. acquire new skills
  3. implement these skills

She cited research that underscored the overwhelming shift from analog to digital — in 2000 about 75% of information was in analog form, but by 2011 over 99% of information was born digital.  She identified some milestones related to electronic records:

1939: Records Disposition Act defined punch cards as records

1943: Records Disposal Act included in its description the phrase “regardless of physical form”

1965: the National Archives and Records Service (NARS) helped the Bureau of the Budget inventory punch cards and computer tapes

1968: the Data Archives Staff was formed by the Archivist of the United States

1970: NARS accessioned the first electronic records from federal agencies

1989: the National Association of Government Archives and Records Administrators and the University of Pittsburgh sponsored “Camp Pitt,” an advanced institute for government archivists focusing on archival electronic records

1993: Armstrong v. Executive Office of the President brought attention to the importance of email management and preservation

The intervening years “have witnessed extensive progress toward robust repository models and architectures, preservation tools and strategies, collaborations and community building, and trustworthy and sustainable digital curation” (23-24).  Yet archives still struggle to plant themselves firmly in the digital realm.  A 2010 report from OCLC entitled Taking Our Pulse listed these results of a survey of special collections in Association of Research Libraries institutions (24-25):

  • “Half of archival collections have no online presence;
  • User demand for digitized collections remains insatiable;
  • Management of born-digital archival materials is still in its infancy;
  • 75 percent of general library budgets have been reduced;
  • The current tough economy renders ‘business as usual’ impossible.”

Tibbo asserted that while cost is certainly a factor inhibiting electronic records management in archives, inadequate education is the more significant problem.  She suggested there should be graduate programs in digital archiving, technical courses, and systematic continuing education for archivists.  She pointed to three initiatives during her tenure as SAA president that addressed these needs:

  • The “Guidelines for a Graduate Program in Archival Studies” (GPAS) were updated.  Tibbo explained “GPAS can only provide a framework and metrics for excellence but no recognition or enforcement” (26).  However, Tibbo contended GPAS serves an important role in raising expectations for graduate programs.
  • The SAA Digital Archives Continuing Education Task Force designed the Digital Archives Specialist (DAS) Curriculum and Certificate Program.
  • Along with Cal Lee, Tibbo helped develop the DigCCurr digital curation curriculum at SILS.  This framework includes the Matrix of Digital Knowledge and Competencies and the High-Level Categories of Digital Curation Functions.

Tibbo concluded with four challenges to her listeners (33):

  1. “do something significant before next year’s SAA conference to advance your skills and knowledge”
  2. “design your digital repository or how you are going to participate in some sort of digital consortium”
  3. “go get funding support”
  4. “take some steps and do something to preserve digital content important to your collection and your users”

“Janus in Cyberspace: Archives on the Threshold of the Digital Era”


Richard Pearce-Moses delivered his presidential address at the 2006 annual meeting of the Society of American Archivists (SAA) held in Washington, D.C.  During his archival career, Pearce-Moses worked as a Local Records Management Consultant for the Texas State Library, as Documentary Collections Archivist and Automation Coordinator for the Heard Museum, and as Curator of Photographs at the Arizona State University Libraries.  He also worked for nearly twelve years at the Arizona State Library and Archives — as Director of Digital Government Information, Coordinator of the Cultural Inventory Project, and lastly Deputy Director for Technology and Information Resources.  Finally, he was the first director of the Master of Archival Studies program at Clayton State University (2010-15).  This address was published in the Spring/Summer 2007 issue of the American Archivist.

Pearce-Moses used his presidential address to challenge archivists to focus on the future.  Like Herbert Angel in his 1967 presidential address, Pearce-Moses referenced Janus, the Roman god who faces both forward and backward — suggesting he is the “perfect patron of archivists” (13).  While archivists have an established history of protecting the past, Pearce-Moses asserted the core reason for this work is to make these records available in the future.

He acknowledged some of the changes brought on by new digital technologies, such as shorter planning horizons and greater expectations for access to information.  He considered three scenarios for what could happen to archives in the future:

  1. Status Quo.  This scenario was quickly dismissed by Pearce-Moses because of its implausibility.  The rapidly changing formats of records guarantee that things cannot stay the same.
  2. Worst-Case.  This scenario will come true if archivists don’t learn the skills necessary for the digital era.  Pearce-Moses noticed that increasingly business people involved in the oversight of electronic records are highly regarded while those caring for records rarely rise to the top of the organizational hierarchy.  Therefore, if archivists don’t embrace the digital era, his view of the future is that “records of enduring value are lost and poorly organized.  Often people cannot find the records they need, and if they do, those records are hard to use, understand, or trust. We will have lost our social memory” (16).
  3. Best-Case.  The rosy picture of the future depicts archives as critical institutions that preserve key records, providing easy access to them along with expert assistance.

Pearce-Moses suggested a number of steps necessary to make the best-case scenario a reality — most importantly, archivists must learn new skills to become comfortable with digital records.

  • Archivists should be as knowledgeable about digital records as we are about paper records.
  • Archivists must “appreciate that the fundamental nature of records has changed in the digital environment” (17).  The work that has been done on this front in the theoretical world needs to be translated into practical knowledge.
  • “Work with electronic records will not be a job for specialists as the majority of records will be digital” (18).  (Based on current job postings, I’m not sure we’ve reached this point of familiarity and flexibility yet in the archival world.)
  • Archivists must develop the soft skills necessary to work with people who work in the technological world.
  • Archivists need to think strategically — “We cannot predict the future, but we can influence it and confront it in more informed ways” (19).
  • We need trend spotters who can identify changes that will impact archives.
  • We need embracers who are willing and able to incorporate new technologies into archival work.
  • We need planners and evaluators to make sure archives’ use of technology ultimately suits the needs of the patrons.

In addition to these skills, Pearce-Moses listed some attitudes that are crucial for crossing this threshold into the digital world — many of which overlap with the skills already mentioned:

  • early adopters
  • risk takers
  • problem solvers
  • creativity
  • initiative/drive
  • reality: “Let us celebrate the reality of what we can accomplish, rather than bemoan the dream we did not fully realize” (20).

Finally, Pearce-Moses defined a set of core principles and goals that should guide the work of archivists (20-21):

  • “Archivists select and keep records that have enduring value as reliable memories of the past.”
  • “We organize our collections so that the information in the records can be found and interpreted in proper context.”
  • “We help people use and understand those records.”
  • “We protect records from degradation, ensuring that they remain accessible over time.”
  • “Archivists know that ‘what is past is prologue,’ that history informs and influences the future.”
  • “We understand the importance of authenticity and trustworthiness.”
  • “We are driven by knowledge that records play a key role in holding people and organizations accountable.”

Pearce-Moses concluded with this challenge to archivists:

“Ultimately, to thrive in this world, to realize the best-case scenario, we need the spirit and attitudes of pioneers.  We need the courage and—maybe more important—the desire to step outside our comfort zones.  We need the willingness to leave what is comfortable and familiar and to pass through the doorway to the unknown.  If we learn to be comfortable taking risks, we can take a leading role on the digital frontier.  We can be pioneers—first through the door, scoping the terrain, and figuring out what to do next.  And if we are on the leading edge, we will be better positioned to fulfill our social mandate of preserving the cultural record” (22).

Personal information management

Another year is coming to a close, so I find myself turning to the idea of personal information management (PIM), trying to figure out what I do well and what I could do better in managing the information that flows through my fingers, literally or digitally.  The term itself is somewhat debated, with some emphasizing the personal rather than business, others emphasizing the information management, and others suggesting end user information management is a better term for the stuff that must be managed by people.  Certain activities define PIM:

  • search
  • find
  • encounter
  • interpret
  • decide to keep or not
  • file and organize for re-use
  • re-access
  • use

Personally, I find the “decide to keep or not” step to be the most daunting.  And being able to re-access information in a prompt manner is most important to me.

A blogger for Scientific American specified four principles that guide PIM for her:

  1. Identify the types of things you are trying to organize
  2. Keep things simple
  3. Set up automatic tools
  4. Be consistent

The Signal blog of the Library of Congress recently highlighted a curriculum for a personal digital archiving workshop that was developed in Georgia.  Such efforts point out not only that PIM is a growing field but that archivists recognize the necessity of providing leadership in the field of PIM.  As long as the archival world is embracing the concept of a continuum model of records, and many records are now being created in born-digital formats, we also must embrace the role of archivists in shepherding people to be good digital stewards who create and maintain items that can be appropriately accessioned into archival collections.

If you need more information on PIM, here are some suggestions:

  • Jan Zastrow provides a simple definition of PIM and includes a great list of other references
  • Purdue University has a good LibGuide (though it’s slanted towards citation management)
  • William Jones wrote an article entitled “Finders, Keepers?” that focuses on the questions of whether and how to keep information
  • for a much more academic approach, the Haystack project out of MIT researches information access, analysis, management, and distribution
  • if you need a term for the stuff you collect that eludes your management systems, try “information scraps” on for size
  • if in the end you decide you prefer your current state of disorganization, take heart in David Weinberger’s Google TechTalk, “Everything is Miscellaneous”

Crowdsourcing public records requests

The discussions about using body cams in the law enforcement community have continued this week.  NPR’s Morning Edition aired a piece entitled “Transparency vs. Privacy.”  The transparency comes from laying bare the interactions of police with the public.  But Martin Kaste interviewed the chief of the Los Angeles Police Department, who said that while videos would be made available for legal cases, privacy concerns would prevent their widespread distribution.  The confidentiality provisions that apply to law enforcement records are numerous.

The man in Seattle who made a public records request for all police videos has now been identified.  Timothy Clemans explained his rationale to Kaste: “If we make all these videos public and people really start watching them, that any inappropriate use of force and bias policing will eventually go away because there’ll just be so many people complaining all day long.”  But instead of fighting the request, the chief operating officer of the Seattle Police Department has taken the novel approach of enlisting Clemans and other techies to help devise a way to redact information from the police videos that should not be made public.  Clemans has suggested a method for blurring people’s identities in videos.  The Seattle PD hosted a hackathon in the hopes of generating more ideas of how to balance transparency with privacy.  While it will still take some time to parse the results, the Seattle Times reported the COO considered the event a success.  Obviously Seattle has the advantage of being located in a tech hub; it will be interesting to see whether other localities are able to coordinate similar sorts of events to harness the power of digital activists and whether the solutions proposed in Seattle get wider usage.

By the way, Clemans has withdrawn his public records request.
