Harnessing the good of the Internet

Less than two weeks ago, the Smithsonian entered the ranks of institutions that have crowdsourced transcriptions of documents in their collections.  As explained in their press release, they have digitized images of millions of documents, but those that are handwritten or otherwise cannot easily be deciphered by a computer have limited discoverability because their text cannot be searched.  So volunteers can register to participate and then choose among a diverse group of projects to transcribe, which are listed according to the individual museum (e.g., the National Museum of American History) or by theme — American Experience, Biodiverse Planet, Civil War Era, Field Book Registry, Mysteries of the Universe, and World Cultures.  In this short period of time, dozens of projects both short and long have already been completed.  Once a transcription is finished, it is reviewed by other volunteers before it is marked complete.

This work is done anonymously, which fascinates me considering that so much of American society today seems intent on calling attention to individuals.  In their call for volunteers, the Smithsonian targets “researchers, educators, citizen scientists and history buffs” — but I think it would be a great project to find out more about who joins these ranks of volunteers.  Are they retired persons? subject experts? students? technological experts?  While I support doing work for the greater good rather than for individual fame, there will remain a part of me that is very curious about the membership of these new crowdsourcing communities that are being created.


Archives*Records 2014

Library of Congress reading room

The annual meeting of the Society of American Archivists concluded yesterday in Washington, DC.  This year’s meeting was a joint meeting with the Council of State Archivists and the National Association of Government Archives and Records Administrators.  More than 2,300 attendees pre-registered, making it the largest meeting to date.

One of the sessions that I attended featured Scott Armstrong and other principals of the PROFS case, reflecting on how far we’ve come in the 25 years since that case was decided.  In addition to learning some interesting facts about the case involving emails in the Reagan White House, this session provided some food for thought about why we do the work we do.  Armstrong emphasized that information is the currency of decision-making, underscoring the importance of guaranteeing citizens access to the records of government.  Tom Blanton of the National Security Archive suggested that the National Archives and Records Administration doesn’t understand its own moral/shaming-and-naming/intervening power.  It’ll be interesting to watch whether this attitude wins out after the recent IRS records scandal.

Social media received a good deal of attention this year.  Geof Huth of the New York State Archives pointed out that records in the form of social media are especially valuable for demonstrating how the government wants to be seen.  He suggested that repositories should develop policies regarding content creation, appropriate use, and security.  Most importantly, he acknowledged that merely capturing social media records does not equate to preservation; in addition, at the end of the day, the point is to provide access to these records.

In a session about appraising government records, Sarah Koonts, the State Archivist of North Carolina, commented on one of the issues that I raised in previous posts.  She concluded that the main reason it’s hard to get people to understand why it’s important to appraise electronic records is that most people assume it’s easier to just save it all.  She also suggested that researchers, citizens, and public interest groups need to be engaged about future uses of modern government records.  This can help shape both what is saved and how it’s made accessible.


Digital hoarding

Hoarding seems to have become a fascination for Americans in recent years.  Consider that A&E squeezed 6 seasons (41 episodes) out of its show Hoarders.  In 2012, Melinda Beck wrote an article in the Wall Street Journal about digital hoarding.  She cites experts who suggest that the accumulation of digital files verges on hoarding when it is disorganized and interferes with other relationships and responsibilities.  She estimates that people use only about 20% of what they save.

While physical hoarding has recognizable signs, digital hoarding is harder to spot.  I come across a lot of people who are determined to pare down their stacks and drawers of paper in favor of scanning these same items and keeping them in digital form, FOREVER.  I’ll confess, I have taken to scanning magazine articles and saving them as tagged PDF files rather than filing the print version.  Having them as fully keyword-searchable files that can be found using the index function of Windows Explorer makes them more useful to me, and I have developed a system for filing them that makes them findable.  But I also try to weed my electronic files of items that have outlived their usefulness, just as I do my paper files.

It’s not only people; governments are getting into the digitizing craze, too.  The National Archives and Records Administration has as one of its strategic goals “Make Access Happen,” and according to a blog post by David Ferriero, one of the methods of accomplishing this is to digitize records.  The Pittsburgh Post-Gazette reports that a new law takes effect in Pennsylvania today that will allow counties to store court records electronically rather than requiring paper or microfilm record copies.  They have not yet finalized the requisite standards and procedures, but soon enough, PA courts will no longer be required to maintain human-readable court records (i.e., records that can be read without the use of a machine).  The article touts the cost savings this will bring to the counties because of decreased physical storage requirements.

I like the increased access that comes with electronic records.  But my fear is that the rush to digitize ignores the costs of digital preservation.  The Nationaal Archief of the Netherlands has a report on the Costs of Digital Preservation that breaks the costs down in this way:

  • creation of a digital repository — physical space, hardware, and software
  • personnel
  • preservation — software to guarantee the authenticity of records plus efforts required to migrate and/or emulate records
  • public services — training, etc.

In his famous 1995 article for Scientific American, Jeff Rothenberg warns, “digital information lasts forever – or five years, whichever comes first.”  So I guess my main concern is that we not put everything into our digital “file cabinets” and then think we can walk away.  There’s still a lot of work to be done to maintain these files — and there will be costs.  And just as there can be disasters that compromise paper records, electronic records are also vulnerable.  Take as an extreme example this Dropbox disaster that was reported last week.  (Spoiler alert: this story will make you want to start keeping photo albums on your coffee table rather than in the cloud!)  As with all things in life, decisions regarding how to maintain records should be made after thoughtful review and with careful analysis of the costs and benefits.

Summer reading

Summer is the perfect time to catch up on some reading.  In case you’re in need of suggestions, here’s a list assembled by the New York Public Library that runs the gamut from babies to adults.  For a list with more of a history focus, check out this one from the Library of Congress.  David Ferriero, the Archivist of the United States, keeps a running list on his blog of books that he’s reading.

Many colleges require incoming first-year students to read a book that will be discussed during orientation, as a means of exposing students to new ideas and encouraging them to become part of a community of dialogue.  Last year, Business Insider compiled a list of 19 books being read by incoming freshmen.  Here’s the list for this year’s reading at some of the top ten colleges on the U.S. News and World Report national universities list: