What is the Internet Archive and why is it facing a backlash from book publishers? | Explained

Sahana Venugopal | 07-07 00:11

The story so far: Internet Archive, a non-profit that aims to digitise, preserve, lend, and share multimedia content, is embroiled in a major legal challenge as it faces off against traditional publishers accusing it of copyright violations. The free digital library is currently fighting the forced removal of around half a million books from its platform, which it argues functions like a library.

What is the case against Internet Archive?

While a great number of books digitised and uploaded by Internet Archive were already in the public domain - such as historical sources, old classics, etc. - many traditional publishers have alleged that Internet Archive violated their copyrights and illegally made their books available to the public as well, by scanning physical copies and distributing the digital files.

In the case Hachette vs Internet Archive that began in 2020, traditional publishers Hachette, HarperCollins, Wiley, and Penguin Random House sued Internet Archive. On March 24, 2023, District Judge John G. Koeltl issued an order in favour of the publishers.

“IA’s Website includes millions of public domain ebooks that users can download for free and read without restrictions,” noted the order, adding, “Relevant to this action, however, the Website also includes 3.6 million books protected by valid copyrights, including 33,000 of the Publishers’ titles and all of the Works in Suit.”

In particular, traditional publishers were against IA’s temporary ‘National Emergency Library’ (NEL) initiative that it launched during the COVID-19 pandemic. This was to allow more users to access the e-books in its collection while physical libraries were locked down.

“During the NEL, IA lifted the technical controls enforcing its one-to-one owned-to-loaned ratio and allowed up to ten thousand patrons at a time to borrow each ebook on the Website,” stated the 2023 order.

In general, IA uses a system known as “controlled digital lending” to limit the number of people who can access an e-book. It ended its emergency library system after being hit with the lawsuit.
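
The one-to-one owned-to-loaned ratio described in the order can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not Internet Archive's actual software: the class name `LendingDesk`, its fields, and the `max_concurrent` override (standing in for the lifted controls during the NEL) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class LendingDesk:
    """Toy model of controlled digital lending for one title (hypothetical)."""

    owned_copies: int                      # physical copies the library owns
    max_concurrent: Optional[int] = None   # None means enforce one-to-one
    borrowers: Set[str] = field(default_factory=set)

    def limit(self) -> int:
        # Under one-to-one lending, loans can never exceed owned copies.
        if self.max_concurrent is None:
            return self.owned_copies
        return self.max_concurrent

    def borrow(self, patron: str) -> bool:
        # A patron who already holds the loan keeps it.
        if patron in self.borrowers:
            return True
        # Refuse the loan once every permitted copy is checked out.
        if len(self.borrowers) >= self.limit():
            return False
        self.borrowers.add(patron)
        return True

    def give_back(self, patron: str) -> None:
        # Returning a copy frees it for the next patron.
        self.borrowers.discard(patron)
```

With `LendingDesk(owned_copies=1)`, a second patron is turned away until the first returns the book. Setting `max_concurrent=10000` mimics the emergency-library cap of ten thousand patrons per e-book quoted in the order.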

Internet Archive used the doctrine of fair use to defend itself in the case, but this did not hold up. The organisation said it would appeal, but did so after some delay.

The case is ongoing, with the oral argument stage of the appeal taking place on June 28.

Why are books being removed from the Internet Archive?

As a result of the lawsuit, IA was forced to remove over half a million books from its database. Chris Freeland, Director of Library Services at Internet Archive, highlighted the “profoundly negative impact” on users.

According to testimonies collected by IA, the mass removal hurt students who could not access books for academic research. 

While IA identifies itself as a library, it has been compared to a shadow library or a piracy database by traditional publishers, who disagree with its “controlled digital lending” approach.

Despite the removal, however, Internet Archive is still home to a rich collection.

As of late June, the web archive said it contained 835 billion web pages, 44 million books and texts, 15 million audio recordings, 10.6 million videos, 4.8 million images, and 1 million software programs. Live concerts and television programs also make up part of this collection.

What is Wayback Machine?

While Internet Archive buys physical books, digitises them, lends them to users, or makes them available for download, it has since 1996 also focused on preserving web pages. The platform claims users can explore over 866 billion saved web pages through its own search service.

“We began in 1996 by archiving the Internet itself, a medium that was just beginning to grow in use. Like newspapers, the content published on the web was ephemeral - but unlike newspapers, no one was saving it. Today we have 28+ years of web history accessible through the Wayback Machine and we work with 1,200+ library and other partners through our Archive-It program to identify important web pages,” noted Internet Archive on its website.

Users can help IA archive parts of the internet at no cost, or they can reach out to the platform to make their own work publicly available.

How can one use Wayback Machine?

Using Wayback Machine is easy and free of cost, though results are not always guaranteed.

To begin, navigate to the Wayback Machine web page, where you will see a bar in which you can enter a URL/keywords relevant to the web page or content you are looking for. Then, hit ‘enter’ and wait for the results to be shown.

If the content was new, rarely viewed, or deleted long ago, before the archive could capture it, you may get few results or none at all.

However, you have a good chance of finding content such as old websites that no longer exist today, earlier versions of existing websites, deleted social media posts, archived versions of paywalled articles, and archived versions of content that is blocked or censored in your jurisdiction.

A graphic will show you how many times Internet Archive “crawled” the content in the past months or even years, allowing you to click on the calendar bubbles to pick out “snapshots” of the web content from different periods of time. However, the service can be patchy at times and not all content might have been perfectly saved; broken links, missing media, or pages that won’t load are often the end result. 
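
For readers who prefer a script to the search bar, the same kind of lookup can be sketched in Python. This is a rough sketch, not an official client: it assumes the public Wayback availability endpoint at `https://archive.org/wayback/available` and a JSON reply containing `archived_snapshots` and `closest` fields, neither of which is described in this article. The function names are hypothetical.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback availability service (not from the article).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build the lookup URL; timestamp is an optional YYYYMMDD-style hint."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response: dict) -> Optional[str]:
    """Return the closest snapshot URL from a parsed reply, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url: str, timestamp: Optional[str] = None) -> Optional[str]:
    """Ask the service for a snapshot; needs network access and may fail."""
    with urlopen(availability_query(page_url, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup("example.com", "20060101")` would request the capture nearest to January 1, 2006. As with the website, a `None` result simply means no usable snapshot was reported.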

While Wayback Machine is useful for personal research or to access information sources, users should be cautious about relying on the data obtained through such sources, as the saved information can sometimes be outdated or inaccurate.
