Ever encountered a “404 Not Found” page? Discover ways to retrieve lost information and keep online posts safe.

Have you ever followed a link and ended up on a blank page showing “Error 404” or “404 Not Found”? If yes, you’re not the only one. There are various reasons this can occur — the most straightforward being a typo in the web address. However, more often than not, the issue arises because the page has been removed or relocated, occasionally on purpose.

That’s why Fact check has compiled a guide to assist you in locating deleted or modified content. We also examine the most widely used tools for digital archiving, which are beneficial not only for retrieving lost information but also for safeguarding any online material that could be significant in the future.

As the significance of online fact-checking has increased, tools for preserving digital content have become crucial. These tools enable users to take ‘snapshots’ of websites or social media updates, recording their appearance at a particular point in time and keeping them available even if the original material is removed.

Internet content changes constantly — pages disappear, links become inactive, and content is altered or deleted. A study by the Pew Research Center found that 38% of webpages from 2013 are no longer accessible.

Archiving goes beyond a technical approach — it serves as a means for responsibility, openness, and safeguarding the past. Practical examples demonstrate the importance of archiving.

In January 2025, the White House closed its Spanish-language page. The Library of the US Congress removed specific sections of the U.S. Constitution from its digital repository.

In September 2022, Iran limited internet access in certain areas of Tehran and Kurdistan, with Instagram and WhatsApp restricted during demonstrations after a Kurdish woman died while in police custody.

And in China, a previously large online archive managed by Peking University, which enabled searches of over 2.5 billion historical Chinese web pages, is no longer available.

Web archives have played a significant role in offering proof during legal proceedings and public debates. Pictures, such as screen captures, can be easily altered. “Web archives, in contrast, capture the complete content of a web page, including its source HTML along with embedded images, stylesheets, or JavaScript code,” wrote Michele Weigle, a computer science professor at Old Dominion University, in her article “The Significance of Web Preservation.”

Hence, Fact Check compiled a list of four essential tools for web archiving:

The Wayback Machine

One of the most commonly utilized free archiving tools is the Wayback Machine, which was introduced in 2001 by the non-profit Internet Archive. The goal of this service is to “preserve these [digital] items and establish an Internet library for researchers, historians, and academics.”

Its first web crawls started in 1996, with the aim of resolving missing links (404 errors). A crawl refers to an automated method of gathering and duplicating web pages, resulting in ‘snapshots’ of their content. Visitors can search using a URL or specific keywords to see how a website appeared on particular dates.
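For readers who prefer to look up snapshots programmatically, the URL search described above can be sketched in Python. This is a minimal illustration, not an official client: it assumes the Internet Archive's public availability endpoint at archive.org/wayback/available and its JSON layout ("archived_snapshots", then "closest"). The function names and the sample response are made up for this example.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumed endpoint: the Internet Archive's public availability lookup.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_query(page_url, timestamp=None):
    """Build the lookup URL for a page, optionally near a YYYYMMDD date."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response_text):
    """Return the archived URL from a JSON response, or None if none exists."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def fetch_closest_snapshot(page_url, timestamp=None):
    """Query the endpoint over the network (not called automatically here)."""
    query = build_availability_query(page_url, timestamp)
    with urllib.request.urlopen(query, timeout=10) as response:
        return closest_snapshot(response.read().decode("utf-8"))


# Illustrative response only; real responses may carry different values.
SAMPLE_RESPONSE = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20130101000000/http://example.com/",
            "timestamp": "20130101000000",
            "status": "200",
        }
    }
})

if __name__ == "__main__":
    print(build_availability_query("example.com", "20130101"))
    print(closest_snapshot(SAMPLE_RESPONSE))
```

Calling fetch_closest_snapshot("example.com", "20130101") would perform a live lookup. A result of None suggests that no snapshot is available for that address.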

Pros: Thorough, free of charge, and extensively utilized. Cons: Sometimes difficult to access because of hacking; searching with keywords can be challenging.

The Wayback Machine, operated by the Internet Archive, is the earliest and largest public collection of archived web pages, but it’s not the only one. Numerous countries and national libraries also maintain their own web archives.

Archive.today

Launched in 2012, Archive.today is a community-based platform that captures web pages without interactive components or scripts. It’s ideal for preserving dynamic content such as social media updates, and it keeps links working. It is smaller than the Wayback Machine, yet more individualized and quicker to respond.

Pros: Quick, simple, and free. Cons: Depends on user effort; limited collection.

Perma.cc

Developed by Harvard University’s Library Innovation Lab in 2013, Perma.cc fights against link rot, particularly in academic and legal settings, where it is crucial that citations point to stable sources readers can access.

A website archived through Perma.cc stays interactive, with links remaining functional. However, it is available at no cost only to organizations connected with academic institutions and courts. Others must opt for a monthly subscription.

Pros: Suitable for academic purposes. Cons: Limited free access.

The Ghostarchive

Introduced in 2021, Ghostarchive focuses on preserving videos and interactive content commonly found on social media platforms, an area where many other tools face difficulties. It is highly effective with video material, although it isn’t consistently dependable.

Pros: Highly effective with video material. Cons: Not 100% reliable.

A Chrome extension called Web Archives also includes a variety of archiving tools, highlighting the increasing demand to safeguard online material as it keeps growing.

Why archiving matters

Archiving enables the monitoring of public figures and the progression of their statements over time.

“We can at least share the digital archive of our reality,” states Henk van Ess, an expert in online research and open-source intelligence. “If they (the politicians) made statements many, many years ago and later changed their stance, it definitely affects public opinion. It’s extremely important to discover what they truly said. So, it’s essentially the most effective method of re-establishing a shared understanding of reality,” he explains.

“It’s not about preserving what is true, but preserving the dialogue,” says Mark Graham, director of the Wayback Machine, in an interview with the Financial Times.

When archiving falls short

Not every page is archived in the same way, and it is impossible to archive all online content. Well-known sites such as CNN are frequently scraped, whereas smaller ones are archived less consistently. Platforms like Archive.today rely on users to start the archiving process.

“Every hour, there’s an immense amount of content generated online that makes it practically impossible to copy and paste,” states Van Ess.

Additionally, some websites prevent archiving tools from accessing their content by using configurations such as robots.txt, while others remain unlinked, rendering them invisible to crawlers.
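To illustrate how a robots.txt file can shut out a crawler, here is a small Python sketch using the standard library's urllib.robotparser module. The rules, the example.com addresses and the crawler name "ExampleArchiveBot" are hypothetical and chosen only for illustration. Real archiving services use their own crawler names and may apply robots.txt differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is blocked from the whole site,
# every other crawler is blocked only from /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, page_url):
    """Return True if the given rules let this crawler fetch the page."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


if __name__ == "__main__":
    for agent in ("ExampleArchiveBot", "OtherBot"):
        for url in ("https://example.com/news", "https://example.com/private/x"):
            print(agent, url, is_allowed(SAMPLE_ROBOTS_TXT, agent, url))
```

Under these sample rules, the hypothetical archive crawler is refused everywhere on the site, while other crawlers are refused only under /private/.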

Occasionally, technical problems such as connectivity issues or data caps may hinder successful archiving.

“One of the major difficulties in web archiving is recording today’s interactive websites on a large scale,” states Weigle.

Van Ess also cautions that legal challenges could progressively impede the process of preservation: “We live in a world, at least within Western democracies, that is heavily influenced by legal professionals. If you have a critique of an argument presented, it has become quite simple to eliminate it due to potential legal consequences.”

The key point is that the phrase, “The internet never forgets!” is indeed accurate, and we can leverage this to locate older versions of websites or even those that have been removed from the web through online archives.

Edited by: Rachel Baig

Author: Alima de Graaf, Chi-Hui Lin
