Locking the Web Open: A Call for a Distributed Web
By Brewster Kahle
Aug 11 2015
Over the last 25 years, millions of people have poured creativity and knowledge into the World Wide Web. New features have been added and dramatic flaws have emerged based on the original simple design. I would like to suggest we could now build a new Web on top of the existing Web that secures what we want most out of an expressive communication tool without giving up its inclusiveness. I believe we can do something quite counter-intuitive: We can lock the Web open.
One of my heroes, Larry Lessig, famously said “Code is Law.” The way we code the web will determine the way we live online. So we need to bake our values into our code. Freedom of expression needs to be baked into our code. Privacy should be baked into our code. Universal access to all knowledge. But right now, those values are not embedded in the Web.
It turns out that the World Wide Web is quite fragile. But it is huge. At the Internet Archive we collect one billion pages a week. We now know that Web pages only last about 100 days on average before they change or disappear. They blink on and off in their servers.
And the Web is massively accessible– unless you live in China. The Chinese government has blocked the Internet Archive, the New York Times, and other sites from its citizens. And other countries block their citizens’ access as well every once in a while. So the Web is not reliably accessible.
And the Web isn’t private. People, corporations, countries can spy on what you are reading. And they do. We now know, thanks to Edward Snowden, that Wikileaks readers were selected for targeting by the National Security Agency and the UK’s equivalent just because those organizations could identify those Web browsers that visited the site and identify the people likely to be using those browsers. In the library world, we know how important it is to protect reader privacy. Rounding people up for the things that they’ve read has a long and dreadful history. So we need a Web that is better than it is now in order to protect reader privacy.
But the Web is fun. The Web is so easy to use and inviting that millions of people are putting interesting things online; in many ways pouring a digital representation of their lives into the Web. New features are being invented and added into the technology because one does not need permission to create in this system. All in all, the openness of the Web has led to the participation of many.
We got one of the three things right. But we need a Web that is reliable, a Web that is private, while keeping the Web fun. I believe it is time to take that next step: I believe we can now build a Web reliable, private and fun all at the same time. To get these features, we need to build a “Distributed Web.”
Imagine “Distributed Web” sites that are as easy to setup and use as WordPress blogs, Wikimedia sites, or even Facebook pages, but have these properties. But how? First, a bit about what is meant by a “distributed system.”
Contrast the current Web to the Internet—the network of pipes on top of which the World Wide Web sits. The Internet was designed so that if any one piece goes out, it will still function. If some of the routers that sort and transmit packets are knocked out, then the system is designed to automatically reroute the packets through the working parts of the system. While it is possible to knock out so much that you create a chokepoint in the Internet fabric, for most circumstances it is designed to survive hardware faults and slowdowns. Therefore, the Internet can be described as a “distributed system” because it routes around problems and automatically rebalances loads.
The Web is not distributed in this way. While different websites are located all over the world, in most cases, any particular website has only one physical location. Therefore, if the hardware in that particular location is down then no one can see that website. In this way, the Web is centralized: if someone controls the hardware of a website or the communication line to a website, then they control all the uses of that website.
In this way, the Internet is a truly distributed system, while the Web is not.
Distributed systems are typically more difficult to design than centralized ones. At a recent talk by Vint Cerf, sponsored by the California Academy of Sciences, Cerf said that he spent much of 1974 in an office with two other engineers working on the protocols to support a distributed Internet system, to make it such that there are no central points of control.
Here’s another way of thinking about distributed systems: take the Amazon Cloud. The Amazon Cloud is made up of computers in Amazon.com datacenters all over the world. The data stored in this cloud can be copied from computer to computer in these different places, avoiding machines that are not working, as well as getting the data closer to users and replicating it as it is increasingly used. This has turned out to be a great idea. What if we could make the next generation Web work like that, but across the entire Internet, like an enormous Amazon Cloud?
In part, it would be based on peer-to-peer technology—a system that isn’t dependent on a central host or the policies of one particular country. In a peer-to-peer model, those who are using the distributed Web are also providing some of the bandwidth and storage to run it.
Instead of one Web server per website we would have many. The more people or organizations that are involved in the distributed Web, the more redundant, safe, and fast it will become.
Locking the Web Open: A Call for a Distributed Web