In its 58 years of operation in the Esperanza neighborhood, the Arecibo radio telescope, part of an Observatory that houses many other scientific measurement and research systems, made great discoveries and helped advance astronomy and other fields.
In little more than half a century, multiple generations of scientists, including Nobel Prize winners, amassed an exceptionally large amount of data. How large is the archive stored on the Observatory's servers? The current estimate is around eight or nine petabytes. For context, a petabyte is roughly 1,000 terabytes.
After the unfortunate partial collapse of the radio telescope on December 1, 2020, the National Science Foundation (NSF) took on the task of finding a way to send a backup of the data collected by the facility to designated servers and cloud services in the United States. This work fell to the collaborative workspace company Engine-4, T-Mobile and other companies that are working to protect the large amount of information collected at the installation.
“Before the collapse of the radio telescope, we had signed an agreement to start transferring the data accumulated by the facility to secure places such as the cloud and external servers. When the radio telescope was in operation, scientists working at the Observatory could access the data directly from the servers. But once the cables (that held the Gregorian dome) began to break and structural failures were detected, the work took on urgency, since it was not known whether a collapse could destroy the buildings that house the local servers, or whether the vibrations of a collapse would affect the machines,” Luis Armando Torres, co-founder of Engine-4, told El Nuevo Día.
Fortunately, the collapse of the Gregorian dome did not cause any damage to the building containing the storage servers. Since the federal government has not decided on (or provided funds for) the reconstruction of the radio telescope, or the construction of a new-generation model, the NSF approved sending backups of the data to cloud services and approved servers in the United States, where scientists from all over the world will have access to the information and will be able to continue their research.
“At Engine-4 we have a fiber-optic node that offers 200 gigabits per second (under ideal conditions, that translates to about 25,000 megabytes per second) for both download and upload. We formed a consortium with FiberX, Hewlett-Packard and Aruba Networks, along with other entities, and we are helping to send backups of that data,” Torres added.
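The unit conversion quoted above can be checked with a few lines of arithmetic. This is only a sanity-check sketch using decimal (SI) units, as the article does; it says nothing about the link's real-world throughput:

```python
# Convert the quoted 200-gigabit-per-second link speed to megabytes per second.
# Assumes decimal units: 1 gigabit = 1,000 megabits; 1 byte = 8 bits.
link_gbps = 200                                # quoted figure, gigabits/second
megabytes_per_second = link_gbps * 1000 / 8    # megabits -> megabytes
print(megabytes_per_second)                    # 25000.0, matching ~25,000 MB/s
```

The "under ideal conditions" caveat in the quote matters: sustained transfers rarely reach a link's nominal line rate.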
The Engine-4 executive said they send about 170 terabytes of data per week. At that weekly transfer rate, and given the estimated data size of about nine petabytes, Engine-4 estimates it would take between two and two and a half years to finish uploading the backup.
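As a back-of-envelope check on that timeline (a sketch only; it models nothing but sustained weekly volume, while the reported two-to-two-and-a-half-year estimate presumably also absorbs the tape-digitization work and other overhead the article describes):

```python
# Naive lower bound: how long 9 PB takes at a sustained 170 TB per week.
# Assumes 1 PB = 1,000 TB (decimal units) and uninterrupted transfers.
total_tb = 9 * 1000      # ~9 petabytes expressed in terabytes
weekly_tb = 170          # reported weekly transfer volume
weeks = total_tb / weekly_tb
print(round(weeks, 1))   # ~52.9 weeks, i.e. roughly a year of continuous transfer
```

That continuous-transfer figure is a floor, not a forecast; delays from reading and converting the magnetic-tape archives would stretch the real schedule considerably.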
“In the future, the idea is to create a kind of portal or dashboard that scientists, academics and students can access to use the information,” he emphasized.
Torres added that the data transfer becomes more complicated when they have to convert information stored on magnetic tape reels to digital formats usable by modern computers. In addition, specialized personnel must handle these reels with care, since magnetic tape degrades over time, risking the loss of the stored data.
“This process takes time and is very delicate. At least in Arecibo we have a person experienced in handling these storage systems, because today's technicians, as they say, grew up in the digital age and do not know these magnetic tape systems. To extract and convert the data, you have to use analog machines that can read the tapes on the reels and, from there, transfer the data to modern digital formats that, in turn, allow us to send it,” explained Torres.
Torres added that, in the future, they plan to add a second rack of servers so they can send the data from two of the radio telescope's data containers at the same time.
Valuable initiative for scientists and academics
For his part, Professor Carlos Padín Bibiloni, director of the Observatory's educational component, co-investigator, and director of the Environmental Sciences graduate program at Ana G. Méndez University, highlighted that science is still being done at the Observatory with the data collected by the radio telescope, and stressed the importance of having access to that data from anywhere in the world.
“There is a whole body of old metadata being analyzed, because it contains data and information that is being examined to determine whether recorded phenomena had already been reported or catalogued. There is a lot of science that continues to happen in this space,” emphasized Padín Bibiloni.
“The idea is to be able to investigate things that were not detected, perhaps because the algorithms we now have to search for that information did not exist then, or to corroborate whether these events had repeated 30 or 40 years ago. The goal is to have that information-access portal. The data collected by the radio telescope is public information, but the people who can take advantage of it are those involved in science and trained for it, because we are talking about an immense mass of information,” added the academic.
El Nuevo Día requested an interview with the director of the Observatory, Francisco Córdova, but he was not available at press time.
George is Digismak's reporter cum editor with 13 years of experience in journalism.