Pollak Library

Web Archiving: What Is Web Archiving?

Internet archiving overview, tools, policies, and resources for California State University, Fullerton.

What is Web Archiving?

According to the International Internet Preservation Consortium (IIPC), "Web archiving is the process of collecting portions of the World Wide Web, preserving the collections in an archival format, and then serving the archives for access and use." California State University, Fullerton uses web archiving to capture, preserve and provide access to web-based content related to our institutional history, faculty research, and student projects.

cobWeb

With grant funding from the Institute of Museum and Library Services (IMLS) and in partnership with UCLA and Harvard University, the California Digital Library (CDL) is developing cobWeb, "an open-source platform to support collaborative collection development for web archives. cobWeb’s core functionality supports digital curators in establishing projects to collect thematically organized web archives, allows for submission of nominations to those projects, tracks claims made by collecting organizations to participate by archiving nominated web resources, and reports on holdings that have been archived."


Reasons to Archive the Web

  • Preserve a permanent, time-stamped copy of born-digital material, which is intrinsically susceptible to change or deletion
  • Create permalinks (permanent hyperlinks) that avoid link rot in web pages and citations/bibliographies
  • Protect scientific and public data from censorship
  • Use machine-automated crawling to track website changes that are difficult for humans to detect
  • Hold corporations and individuals accountable for what they publish online, such as posts on Twitter
  • Plan responsibly for the preservation of soon-to-be-deleted websites, domains, faculty and student web projects, etc.
  • Manage web server storage more efficiently by outsourcing the storage of past versions of web content
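
The point above about machine-automated change tracking can be illustrated with a small sketch. It is not the method used by any particular archiving service; it only assumes that the text of two captures of the same page has already been saved. The function names fingerprint and changed_lines are hypothetical, chosen for this illustration.

```python
import difflib
import hashlib


def fingerprint(page_text: str) -> str:
    """Return a SHA-256 digest of the page text, ignoring whitespace-only differences."""
    # Collapse runs of whitespace so reformatting alone does not count as a change.
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """List the lines removed ("- ") or added ("+ ") between two captures."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; keep only real changes.
    return [line for line in diff if line.startswith(("+ ", "- "))]


if __name__ == "__main__":
    # Hypothetical captures of the same page, taken at two different times.
    earlier = "Application deadline: March 1\nContact: Library Office"
    later = "Application deadline: March 15\nContact: Library Office"

    if fingerprint(earlier) != fingerprint(later):
        for line in changed_lines(earlier, later):
            print(line)
```

Comparing digests first is a cheap way to tell whether anything changed at all; the line-by-line comparison then shows what changed, including small edits such as a revised date that a human reader could easily miss.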

IIPC Video: Why Archive the Web?

PBS NewsHour: The Story of the Internet Archive

The Internet Archive (which includes both the free Wayback Machine and the subscription-based Archive-It) archives over one billion (!) pages every week. Learn more about this fascinating non-profit corporation at the heart of archiving the public web.