Tuesday, June 15

The global outage of thousands of websites is a warning about the fragility of the internet | Technology



Shortly before noon on June 8, and for almost an hour, many of the world's major websites, including Amazon, Twitch, The New York Times, HBO Max, Hulu, the United Kingdom Government website, Spotify, Reddit and even EL PAÍS, began to experience operational problems. In some cases they became inaccessible to many users. The reason: one of the secondary links in the system, a company called Fastly, suffered an error in its systems that set off a chain of outages around the world. In a world forcibly digitized and pushed into teleworking by the pandemic, every alarm went off: for a moment, the internet seemed to be collapsing.

Barely an hour later the incident was resolved and panic gave way to humor and memes. A failure at Fastly, one of the secondary links in the system, had brought down every company it served. But beyond the anecdote, the event once again exposed the weakness of a network of networks on which communications, the economy and the functioning of modern societies depend, especially at a time when a large share of companies (43% in Spain, according to the INE) rely on teleworking.

There are more than 1.8 billion websites worldwide, according to data from Internet Live Stats. These sites rely on services hosted in the cloud, that is, on expensive external servers distributed across the planet. This article, like many of the services that millions of people use every day (Gmail, Spotify, WhatsApp) and the devices we have at home, such as Alexa or Google Home, depends on the cloud. Six out of 10 websites or services worldwide depend on just three providers: Amazon Web Services, Microsoft Azure and Google Cloud. Alongside these three giants, on a second tier, are firms known as content delivery networks (CDNs). The best known are Cloudflare, Akamai and Fastly, the cause of the June 8 global failure.

A CDN is basically a network of servers in different data centers around the world that temporarily store copies of their clients' pages. The idea is to prevent the geographical distance of a service's central servers, or high user demand, from making a page slow to load or even crashing the system.
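The caching idea described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how Fastly or any real CDN is implemented: the `EdgeCache` class, its `ttl_seconds` parameter and the `fetch_from_origin` callback are hypothetical names invented for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one CDN edge server holding temporary copies of pages."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], str],  # hypothetical origin fetcher
        ttl_seconds: float,                       # how long a copy stays fresh
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}  # url -> (expiry, body)
        self.origin_requests = 0

    def get(self, url: str) -> str:
        now = self._clock()
        cached = self._store.get(url)
        if cached is not None and cached[0] > now:
            return cached[1]  # cache hit: served from the nearby copy
        body = self._fetch(url)  # miss or expired copy: ask the origin server
        self.origin_requests += 1
        self._store[url] = (now + self._ttl, body)
        return body
```

In this sketch, repeated requests for the same page within the time-to-live are answered by the edge server without contacting the distant origin, which is what shields the origin from distance and from spikes in demand.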

One of these networks caused Tuesday's widespread outage. At 11:58 a.m., Fastly posted an incident notice stating: “We are currently investigating a possible impact on the performance of our CDN services.” At 12:44 p.m. Spanish time, the company said the problem had been identified and was being fixed. Nine minutes before 3:00 p.m., it declared the incident closed. The reasons for the outage are still not entirely clear. The company explained on Tuesday that it had “identified a service configuration” that caused “interruptions in points of presence worldwide”, so it had disabled that configuration. The company did, however, quickly rule out a cyberattack.

Whatever the reason, the doubts centered on the vulnerability of the network as a whole. “This event highlights the fragility of the system on which the internet is based,” says Igor Unane, technical manager of the cybersecurity company S21sec. The founding idea of the internet is decentralization, but the problem, says this telecommunications systems engineer, “lies in the concentration of a structure in which a handful of large vendors are monopolizing control.” “The system is weak because sometimes it depends on a single point in this content cloud,” adds Jordi Serra, professor at the UOC.

The key to solving the problem would be to spread the load, so that a single link cannot cause a general failure. But that runs into one obstacle: cost. Companies like Fastly also have few competitors. To begin with, this business requires heavy investment in infrastructure, which limits competition in the sector. In addition, it is not profitable for companies to contract several suppliers. “These situations are unavoidable when we depend on a single supplier,” explains Unane. “It's like hiring two companies to provide phone and fiber at home: why pay double just in case you lose your internet connection one day?”
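As a rough illustration of what depending on more than one supplier means in practice, the sketch below tries a list of providers in order and falls back to the next when one fails. It is a minimal sketch under assumptions: `fetch_with_failover`, `AllProvidersFailed` and the provider callables are hypothetical names, and real multi-CDN setups usually switch at the DNS or load-balancer level rather than in application code.

```python
from typing import Callable, List, Sequence


class AllProvidersFailed(Exception):
    """Raised when none of the configured providers could serve the request."""


def fetch_with_failover(url: str, providers: Sequence[Callable[[str], str]]) -> str:
    """Try each provider in order and return the first successful response."""
    errors: List[Exception] = []
    for provider in providers:
        try:
            return provider(url)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(exc)    # remember the failure and try the next one
    raise AllProvidersFailed(f"{len(errors)} provider(s) failed for {url}")
```

With a single provider in the list, one failure is enough to take the page down. With two, the second absorbs the outage, at the price of paying for both, which is the trade-off Unane describes.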

This is not the first outage of its kind. In November, Amazon Web Services servers suffered a failure that ended up stopping cloud-dependent home cleaning robots from working. In 2017, the same company suffered an even bigger failure that lasted five hours, during which chaos spread across the network. Amazon apologized and explained that it had all been caused by an employee who had reportedly mistyped a command. “Unfortunately, one of the command inputs was entered incorrectly and a large number of servers went down,” the company explained.

Last December another large company suffered a major outage. Most of Google's services (Google, Gmail, Google Docs, YouTube and the cloud storage service) were down for an hour because of an internal storage problem. The failure affected millions of people around the world who had adopted its tools to work remotely.

elpais.com
