For 26 years, the Internet Archive has acted as a “backup” of the web. The site preserves software, videos, images and web pages of all kinds. The collection is so extensive that the service’s database recently reached the 100 PB (petabyte) mark. And there’s more to come: the plan now is to also archive amateur (ham) radio materials.
Internet Archive is the name of the non-profit organization behind the initiative. The service that collects and stores files from the internet is called the Wayback Machine.
As the name suggests, the site works as a kind of time machine. Much of the content archived there no longer exists at its original source, or has changed since it was captured.
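To make the "time machine" idea concrete, here is a minimal sketch of how an archived copy is commonly addressed: a capture timestamp followed by the original address. It assumes the widely used `https://web.archive.org/web/<timestamp>/<url>` pattern with a 14-digit `YYYYMMDDhhmmss` timestamp; the function name `wayback_url` is hypothetical, and the example only builds the address without contacting the service.

```python
from datetime import datetime

# Assumed URL pattern for archived copies: prefix + timestamp + original URL.
WAYBACK_PREFIX = "https://web.archive.org/web"

def wayback_url(original_url, when):
    """Build the address of an archived copy near the given date (hypothetical helper)."""
    # Captures are identified by a 14-digit timestamp: YYYYMMDDhhmmss.
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"

example = wayback_url("http://example.com/", datetime(2006, 1, 1))
print(example)
```

Whether a copy actually exists for a given date depends on what the service captured; this sketch does not check that.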
100 PB and 741 billion web pages
For web pages alone, the Wayback Machine holds more than 741 billion copies. But videos, audio recordings, images and books, for example, are also part of the platform’s collection.
To give you an idea, the Internet Archive archived 2,000 MS-DOS games in 2019. The following year, it was the turn of Flash games. In 2021, a project to digitize more than 250 thousand 78 rpm discs was put into effect.
This work has been carried out since 1996. In 1997, the service reached the mark of 2 terabytes of archived data; today, that volume fits on a single SSD.
As mentioned above, the service recently reached 100 petabytes of data, a figure equivalent to 100,000 terabytes.
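The conversion above can be checked with a few lines of Python. This is only an illustrative sketch that assumes decimal (SI) prefixes, where 1 PB = 1,000 TB; the function name `petabytes_to_terabytes` is made up for the example. It also compares the current volume with the 2 TB reached in 1997.

```python
# Decimal (SI) storage units, expressed in bytes (assumption: not binary units).
TB = 10**12
PB = 10**15

def petabytes_to_terabytes(petabytes):
    """Convert petabytes to terabytes using decimal prefixes."""
    return petabytes * PB // TB

archive_today_tb = petabytes_to_terabytes(100)  # volume reported in the article
archive_1997_tb = 2                             # volume reported for 1997

# How many times larger the archive is now than in 1997.
growth_factor = archive_today_tb // archive_1997_tb

print(archive_today_tb)
print(growth_factor)
```

Under these assumptions, the archive is 50,000 times larger than it was in 1997.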
Now it’s amateur radio’s turn
This week, the Internet Archive revealed that it is putting together content for the newly created Digital Library of Amateur Radio and Communications (DLARC).
The library was designed to preserve all kinds of digital content related to amateur radio. Some examples: digitized printed materials, specialized websites, audio files, personal collections and communications records.
To that end, the work will follow several strategies, such as:
- Scan printed materials such as newspapers, books and documents;
- Archive and organize “born digital” content, such as photos, websites, videos, newsletters and podcasts;
- Conduct audio interviews with key community members.
DLARC is financially supported by a private foundation, the ARDC, but it depends on contributions from other organizations and individuals to build its collection.
Interested parties can help by contributing magazines, books, manuals, catalogs and any other type of content related to amateur radio.
Paywalls are a problem
The focus on amateur radio is an addition, not a replacement: the work to preserve pages, videos, software and the like continues. But there are some challenges ahead. One of them is the paywall, the mechanism news sites use to block non-paying users from accessing content.
This kind of restriction has made it difficult to collect news and other content that, in a few years, will probably no longer exist at its original source.
The problem is not exactly new. Access restrictions already exist on social networks. Facebook, for example, allows its pages to be indexed by external services, but many of its posts are restricted to closed communities or paying users.
It is undeniable that the Internet Archive does an excellent job, but despite its efforts, a lot of content ends up outside its coverage because of these limitations.
With information from the Financial Times.
https://tecnoblog.net/noticias/2022/10/06/internet-archive-supera-100-petabytes-e-quer-cobrir-ate-radioamadorismo/