Friday, July 4, 2014
Preserving online information for posterity
First and foremost, there is the Internet Archive, a not-for-profit digital library dedicated to preserving the Internet’s past for the use of future historians. It runs the popular Wayback Machine for seeing what websites used to look like. (Tech-media-tainment has been saved twice in its history, once in June 2011 and again in May 2014.)
Personally, I wish I had been more diligent about saving email from the old accounts I used in the early days of the commercial Internet. All that digital correspondence with friends from the 1990s is gone forever.
The Internet Archive is one of just a handful of institutions, including parts of the British Library and the Library of Congress, trying to ensure that what is online now is saved for the future, the Financial Times wrote. (See “How to preserve the web’s past for the future,” Financial Times; April 11, 2014.)
In May, the Internet Archive announced that its Wayback Machine had archived 400 billion indexed webpages, from late 1996 to the present, The Next Web reported.
The Internet Archive has collected about 15 petabytes of information to date, Mother Jones reported. (A petabyte is about 1 million gigabytes of data.)
Wikipedia is also becoming a resource for preserving Internet history.
The National Archives and Records Administration is uploading its digital holdings to Wikimedia Commons, the media repository used by Wikipedia, to gain a wider reach for the documents, TechCrunch reported.
In April, the Digital Public Library of America celebrated its one-year anniversary. The DPLA is a platform that connects the online archives of many libraries around the nation into a single network. All the archives are searchable through the digital library’s website, Ars Technica reported.
Photo: Internet Archive servers in August 2011. (Pernilla Rydmark on Flickr)