Could we onchain The Internet Archive in the future?

gjlite · 16 December 2024 08:27

If you don’t know, The Internet Archive has been taking snapshots of websites since 1996 (The Wayback Machine). Over time, it has expanded into a free and open digital library as well. link

During COVID lockdowns, it began an emergency library of e-books as a community service, for which it found itself under scrutiny for the unlicensed sharing of published IP. The Internet Archive has recently settled after 4-5 years of litigation against it. IP litigation is still ongoing, though, now over its archive of digitised 78 RPM recorded music. (last paragraph of linked article) link

Since we have Book.io for e-books and now Stuff.io for other digital media, and of course NEWM.io for music, why can’t we develop a means to store website data onchain? A website is basically a user-friendly app to present database information. Most, if not all, popular website generation platforms (WordPress, Wix, Shopify, OpenCart, Joomla, etc.) pull their data from MySQL or similar database platforms. So why not from blockchain? Is this something we could do with IAGON and SingularityNET and others in the future?

I would love to hear others’ thoughts on this.

gjlite · 16 December 2024 09:46

Just thinking about this a bit more, and I remembered an idea I shared with the Typhon wallet team several years ago.

Would it be a good idea to use NFTs for the different structures of a web page, for example the CSS, the index page, robots.txt, etc.? Done this way, the NFT structure is pulled offchain when the site is accessed, and cookies could direct the visitor to the default view unless the wallet included customised versions of the view. Book.io uses site cookies in this way to parse the security checks for the owner of ebooks, don’t they?
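
As a rough illustration of the idea above, here is a minimal Python sketch. Everything in it is hypothetical: the component names, the NFT identifiers and the `resolve_view` helper are illustrative assumptions, not how Typhon, Book.io or any existing wallet actually works.

```python
# Hypothetical sketch: each page component (index page, CSS, robots.txt)
# is represented by an NFT, and a wallet may hold customised replacements.

DEFAULT_VIEW = {
    "index.html": "default-index-nft",
    "style.css": "default-css-nft",
    "robots.txt": "default-robots-nft",
}


def resolve_view(default_view, wallet_overrides=None):
    """Return the component -> NFT mapping a visitor should be served.

    Components the wallet does not customise fall back to the default.
    Overrides for components the site does not define are ignored.
    """
    view = dict(default_view)
    for component, nft_id in (wallet_overrides or {}).items():
        if component in view:
            view[component] = nft_id
    return view


# A visitor without customisations gets the default view.
anonymous_view = resolve_view(DEFAULT_VIEW)

# A visitor whose wallet holds a custom stylesheet NFT gets that instead.
custom_view = resolve_view(DEFAULT_VIEW, {"style.css": "my-dark-theme-nft"})
```

The fallback is the point of the sketch: the site stays usable for everyone, and only the components a wallet explicitly overrides change.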

This would allow any enthusiast to customize their entire internet/app experience. Back in the early days of Bitcoin, I played around with Joomla to create a site to help foreign language teachers and students of a major Japanese teaching company keep in touch and perhaps continue lessons privately after the company went bankrupt.

That was using PHP and MySQL. I chose Joomla because it was open-source with thousands of plug-ins and templates to play with and learn from.
I actually contracted a Flash or Java developer on an early version of fiverr.com to create an online 4-way video conferencing app similar to one offered by the teaching company. The students had no problem finding out about the site, but unfortunately it wasn’t the same with the teachers.

These days my Hidden Disability 🌻 prevents such focused short sprints and I am more an ideas generator.

icycranberry · 16 December 2024 17:44

I do not think this is possible, as putting data on the blockchain is extremely expensive. Think of around 10-20k people running full nodes and stake pools, and just think of the number of data copies that would create. So even if a website is only 1 GB, it will create up to 20 TB of redundant copies around the world. That is why even NFT media are not stored on chain. The way it works is that we get a pointer to an off-chain data location where only a single copy is stored. The problem with this approach is that the data source is centralized and can be removed by whoever hosts it.
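
The replication arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes every node keeps one full copy and uses decimal units (1 GB = 10^9 bytes), and the `replicated_bytes` helper is a name made up for the example.

```python
GB = 10**9   # decimal gigabyte, in bytes
TB = 10**12  # decimal terabyte, in bytes


def replicated_bytes(payload_bytes: int, node_count: int) -> int:
    """Total storage used if every node keeps a full copy of the payload."""
    return payload_bytes * node_count


website_size = 1 * GB

# The 10-20k node range mentioned in the post.
for nodes in (10_000, 20_000):
    total = replicated_bytes(website_size, nodes)
    print(f"{nodes} nodes -> {total / TB:.0f} TB of copies")
```

With 20,000 nodes, a 1 GB site works out to 20 TB in total, which matches the figure in the post.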

icycranberry · 16 December 2024 17:47

This is already done across a myriad of websites which provide exclusive content / experiences to owners of certain NFTs, e.g. you can use the pro version of TapTools, access the Xerberus platform, etc. Not sure what else you mean by this proposal.

HeptaSean · 16 December 2024 18:51

What do you want (someone) to do, and why?

Book.io, Stuff.io and NEWM don’t store the media on the blockchain; they use IPFS and similar services for that. Just the NFTs required to access those data are on-chain. That also makes them a bit less decentralised than advertised. If you still want to monetise content and restrict access to those who have bought and still hold a certain NFT, you have to have some centralised component checking that ownership.
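
To make the "centralised component checking that ownership" concrete, here is a minimal Python sketch of such a gatekeeper. It is purely illustrative: `LedgerSnapshot`, `serve_content` and the in-memory dictionaries are hypothetical stand-ins, not the actual design of Book.io, Stuff.io or NEWM, and a real service would query a chain indexer and verify a wallet signature.

```python
from dataclasses import dataclass, field


@dataclass
class LedgerSnapshot:
    """Hypothetical view of on-chain state: wallet address -> held asset IDs."""
    holdings: dict = field(default_factory=dict)

    def holds(self, wallet: str, asset_id: str) -> bool:
        return asset_id in self.holdings.get(wallet, set())


def serve_content(ledger, content_store, wallet, asset_id):
    """Return the gated content only if the wallet currently holds the NFT.

    This function is the centralised piece: whoever runs it decides
    who gets the off-chain media, regardless of what is on-chain.
    """
    if not ledger.holds(wallet, asset_id):
        return None
    return content_store.get(asset_id)


ledger = LedgerSnapshot(holdings={"addr_alice": {"ebook-001"}})
content_store = {"ebook-001": b"...encrypted ebook bytes..."}

alice_result = serve_content(ledger, content_store, "addr_alice", "ebook-001")
bob_result = serve_content(ledger, content_store, "addr_bob", "ebook-001")
```

The NFT itself lives on-chain, but the media and the check in `serve_content` do not, so access still depends on one operator staying online and honest.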

The Internet Archive is very different. It wants to provide open access to content/media. Not much need for anything blockchain, let alone cryptocurrency in there.

gjlite · 16 December 2024 21:21

That’s why I included IAGON. I’m not sure how they achieve structured file storage; I have assumed that they’re using something similar to the InterPlanetary File System (IPFS). According to Wikipedia.org, the IA considered migrating TWBM to IPFS in 2018; I don’t know how that’s going. And, according to this link on IPFS’s homepage, hacktivists published the Turkish version of Wikipedia on IPFS in 2017 to circumvent Turkey’s internet censorship. (article includes this TEDx talk)
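
For anyone unfamiliar with why IPFS-style storage resists takedowns, the core idea is content addressing: data is looked up by the hash of its bytes rather than by a server location. The Python sketch below is a deliberately simplified illustration using a plain SHA-256 hex digest. Real IPFS content identifiers (CIDs) use a different encoding and chunk large files, and the `ContentStore` class is a made-up name for this example.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the address is the hash of the bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Anyone can verify the copy they received matches the address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored data does not match its address")
        return data


store = ContentStore()
page = b"<html><body>Archived page</body></html>"
address = store.put(page)

# The same bytes always give the same address, on any node.
same_address = ContentStore().put(page)
```

Because the address depends only on the content, any node holding the same bytes can serve them, and a reader can check that what they received has not been tampered with.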

IAGON may not be utilising IPFS but I’m fairly sure they have something up their sleeves.

Hope this helps @icycranberry

gjlite · 16 December 2024 21:24

Hey @HeptaSean
I got into this more answering @icycranberry’s concerns👇

Let’s work the problem🙏

HeptaSean · 16 December 2024 21:33

Which problem?

Sorry if this seems contrarian, but I just don’t see a problem here that blockchain, cryptocurrency, Cardano can help solve.

That probably is a minority view in these circles, but I feel that a lot of the nails that crypto people see, and imagine their hammer is a perfect fit for (pumping their bags in the process), just aren’t nails at all.

gjlite · 16 December 2024 21:47

The problem of centralised server infrastructure and corporate social media monopolies, for one thing.

I’m also not truly interested in the “price goes up” mentality. I see problems, whether online or IRL, and endeavour to solve them. Furthermore, I have a broad knowledge base and blockchain tech is just an in-the-middle interest to me, but I can see how our global/local IRL issues can be addressed by the technology, so I’ve delved deeper over the past decade or so.

Is that clearer @HeptaSean?

HeptaSean · 16 December 2024 23:47

For the Internet Archive, this presentation on its infrastructure is maybe interesting:
https://archive.org/details/jonah-edwards-presentation (from 2021, so maybe a bit outdated, but since they have run this for decades I wouldn’t expect fundamental changes)

Is that centralised? Well, kind of, it’s all their hardware. It’s redundant, but only across some locations, all in the US Bay Area.

But can we just replace it with something like IPFS? Hardly, I’d say. For those amounts of data and the coordination between the different sites you want to have a dedicated network between them.

Other interesting project: https://www.lockss.org/about
Libraries have thought for decades about digital preservation topics.

That has little to do with archives and preservation, has it?

I have doubts about a lot of the use cases touted for blockchain/crypto. Very often, either the problem is not really a problem, or the proposed solution doesn’t really solve it, or you can just remove the blockchain from the solution and it is just the same. Or all three. Happens all the time: digital identity, elections, …

This (from https://medium.com/@sbmeunier/when-do-you-need-blockchain-decision-models-a5c40e7c9ba1) seems to remain true:

gjlite · 17 December 2024 09:53

Interesting viewpoints @HeptaSean.
Give me some time to process this information.

I’m starting to free flow related ideas, even off-topic a little. Just a me thing.
