Colourful Grid Structure

Grids are Dead – or are they?

Nadine Weber

Huge datasets – not enough computing power. What to do? Don’t worry! The supercomputer concept Grid Computing is here to save you!

With the rise of cloud computing, fewer companies opt for Grid Computing – and even fewer know what it really is or how it can be used. What’s behind this? Is Grid Computing really dead? Let’s take a closer look.

Why did Superman choose Grid Computing? Because he can save the world in parallel!

What is Grid Computing?

Grid Computing describes a network of decentralized, distributed computers working together on a specific task. Instead of leaving a task that would be too demanding for a single machine to one computer, the computers combine their forces. This means that, despite being in different geographic locations, they work like a virtual supercomputer. The computers contribute resources such as processing power, memory and storage capacity across the network. [1]

Because the virtual supercomputer is generated out of a cluster of coupled computers, Grid Computing is known as a subset of distributed computing. Developed to solve computationally intensive operations, it can be considered the Kryptonite of distributed computing. [1,2]

Meme of Grid as superman, saying it's saving you from complex computational operations.
Picture 1 | Supergrid

Why is it called Grid? Well, just like with electricity, the power comes “out of the socket”. Once a computer is plugged into the grid, just as an appliance is plugged into the power grid, the computing can start. [1,3]

Foster and Kesselman postulate the following definition for Grid Computing:

A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.

Is the architecture… a grid?

The Grid Computing Network generally consists of three types of computers, the so-called Nodes [1]:

  1. Control Node: Computer (Server / Group of Servers) which administers the whole network and keeps account of the resources in the network pool
  2. Provider Node: Computer which provides its resources to the network resource pool
  3. User Node: Computer which uses the resources of the network

These nodes have designated positions in a simple hierarchy: (a) the applications, (b) the Middleware, (c) the available resources and (d) the computer itself that connects to the grid. Here’s a simple overview:

Graphic showing how Grid Computing works
Picture 2 | Grid Computing Architecture, adapted from [3]

The machines are connected via Ethernet or the Internet. Instead of using many CPU cores of one computer, the grid draws on cores that are spread across various locations in a large or even worldwide network. Controlling the network and its resources requires Middleware. This is usually an open, standardized protocol connecting the grid’s resources with high-level applications. The Control Node runs the Middleware. [3,4]

The nodes may consist of machines using the same operating system, which makes the grid a homogeneous network. If the nodes in a single grid use different operating systems, the network is called heterogeneous. The fact that nodes are allowed to use different operating systems is one of the advantages and distinguishing features of grid computing in comparison to other distributed computing architectures. The virtualization of a single-system image – also called virtual organization – which grants users and applications seamless access to IT capabilities, is a very powerful feature. [1,3]

From Grid to Grid Computing

Contributing computing power to a Grid is relatively simple. A loosely coupled computer makes a request to the Control Node, which then gives the user access to the available resources. Whenever the computer is not in direct use, it contributes its own resources to the network. The computers therefore constantly switch between the roles of user and provider, based on their needs. [1,3]
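To make the request-and-contribute cycle a bit more tangible, here is a minimal Python sketch. It is not taken from any real grid Middleware: the class `ControlNode`, its methods `offer`, `withdraw` and `request`, and the machine names are all made up for illustration, and resources are reduced to a simple count of CPU cores.

```python
from dataclasses import dataclass, field


@dataclass
class ControlNode:
    """Hypothetical control node that keeps account of the pooled resources."""
    # Maps a provider's name to the number of CPU cores it currently offers.
    pool: dict[str, int] = field(default_factory=dict)

    def offer(self, provider: str, idle_cores: int) -> None:
        """An idle machine contributes its cores to the pool."""
        self.pool[provider] = idle_cores

    def withdraw(self, provider: str) -> None:
        """A machine that is back in direct use leaves the pool."""
        self.pool.pop(provider, None)

    def request(self, user: str, cores_needed: int) -> dict[str, int]:
        """Grant cores from other machines; returns {provider: cores granted}."""
        granted: dict[str, int] = {}
        for provider, free in self.pool.items():
            if cores_needed == 0:
                break
            if provider == user or free == 0:
                continue  # never hand a user its own cores, skip empty providers
            take = min(free, cores_needed)
            granted[provider] = take
            cores_needed -= take
        # Book the granted cores so they are not handed out twice.
        for provider, take in granted.items():
            self.pool[provider] -= take
        return granted


control = ControlNode()
control.offer("laptop-a", 4)
control.offer("desktop-b", 8)
grant = control.request("laptop-a", 6)  # laptop-a acts as a user here
print(grant)
```

In this sketch the same machine ("laptop-a") is a provider when it calls `offer` and a user when it calls `request`. If the pool cannot cover the whole request, the caller simply gets fewer cores than it asked for; a real control node would need a policy for that case.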

Grid Computing shows the clustered and balanced lifestyle by living, laughing and computing in harmony.

Being the Control Node is not that easy, since it is responsible for administering the network and making sure that no provider is overloaded with tasks. The provider gives the user permission to run anything on its computer, which is a potential security risk for the whole network. This is why every process or task executed on the network must be authorized. [1,2]
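The two duties just described, authorizing tasks and not overloading providers, could look roughly like this in Python. This is only a sketch under simple assumptions: the function `pick_provider`, the limit `max_tasks` and the user and node names are hypothetical, and authorization is reduced to a plain allow-list.

```python
def pick_provider(user: str, load: dict[str, int], authorized: set[str],
                  max_tasks: int = 2) -> str:
    """Return the least-loaded provider for an authorized user's task."""
    # Authorization check: unknown users may not run anything on the grid.
    if user not in authorized:
        raise PermissionError(f"{user} is not authorized to run tasks on the grid")
    # Overload check: only consider providers that still have capacity.
    candidates = {name: tasks for name, tasks in load.items() if tasks < max_tasks}
    if not candidates:
        raise RuntimeError("all providers are at capacity")
    chosen = min(candidates, key=candidates.get)
    load[chosen] += 1  # record the new task on the chosen provider
    return chosen


load = {"node-1": 1, "node-2": 0}  # tasks currently running per provider
authorized = {"alice"}
first = pick_provider("alice", load, authorized)
second = pick_provider("alice", load, authorized)
print(first, second, load)
```

Picking the least-loaded provider is just one possible policy, chosen here because it is easy to read. Real schedulers may also weigh hardware, network distance or priorities.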

Grid Infrastructure, showing many connected computers
Picture 3 | Grid Infrastructure, adapted from [6]

When the task is assigned, it gets broken into different subtasks. They are then divided and sent to different machines of the grid. After being solved, the results are sent back to the control node. The complete infrastructure shows the grid where computers interact with each other to coordinate the solving of a complex task. [2,6]
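The split, distribute and collect steps can be sketched in a few lines of Python. Note the assumptions: worker threads from the standard library's `concurrent.futures` merely stand in for the machines of a grid, the "complex task" is a toy sum of squares, and the function names `split`, `solve_subtask` and `run_on_grid` are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split(numbers: list[int], parts: int) -> list[list[int]]:
    """Break one task into roughly equal subtasks."""
    size = max(1, -(-len(numbers) // parts))  # ceiling division, at least 1
    return [numbers[i:i + size] for i in range(0, len(numbers), size)]


def solve_subtask(chunk: list[int]) -> int:
    """Work done on one provider node: here, a sum of squares."""
    return sum(n * n for n in chunk)


def run_on_grid(numbers: list[int], workers: int = 4) -> int:
    subtasks = split(numbers, workers)
    # Each worker thread stands in for one machine of the grid.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(solve_subtask, subtasks))
    # The control node combines the partial results into the final answer.
    return sum(partial_results)


total = run_on_grid(list(range(1, 101)))
print(total)  # 338350, the sum of the squares from 1 to 100
```

This only works so neatly because the subtasks are independent of each other. Tasks whose parts need to talk to each other are much harder to spread across a grid.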

Depending on the use-case and the connected resource, one can differentiate between five different types of grid computing [3-5]: 

  • Computing Grids: Provide distributed computing resources to solve complex applications as a high-performance computer 
  • Data Grids: Shared usage of distributed data, provide storage capacity
  • Knowledge Grids: Scanning, connecting, collecting and analyzing big datasets
  • Resource Grids: Role-defined provider of data, software and hardware
  • Service Grids: Provide complete services and computing power to users

Grid vs. Cloud Computing

The idea of consuming computing power – like electricity is consumed from a power grid – is similar to what we nowadays know as Infrastructure-as-a-Service (IaaS) in Cloud Computing, where only those resources are used that a system currently needs. So, are Cloud and Grid Computing the same thing? [5,6]

Comparison of Grid and Cloud Computing Infrastructure
Picture 4 | Comparison of Grid vs. Cloud Computing Infrastructure, adapted from [5]

In a nutshell: No, they are not the same. They are similar in that both pool resources and make interactions between machines possible. Both build on the concept of using the power of other entities to increase efficiency.

One difference lies in the segmentation of tasks. Grid Computing maximizes the use of available resources by splitting tasks into several subtasks and computing them on different machines. No physical data center is needed. Cloud Computing, on the other hand, usually does have a physical data center, thus offering a homogeneous network.

Another difference lies in their uniformity. Grid Computing allocates resources in its network to the related subtasks through decentralized coupling of computers. The Grid Provider owns and organizes the infrastructure as a virtual supercomputer, while Cloud Providers organize the computing centrally and uniformly in the Cloud. [5,6]


History Time: Popularity of Grid Computing

Let’s take a trip down memory lane: While doing some research on this topic, I came across many recent articles which state that grid computing is allegedly dead. Just like Cloud Computing nowadays, the Grid was predicted in the early 2000s to become the state of the art in computing infrastructures. The concept of Grid Computing was first proposed by Ian Foster and Carl Kesselman in their 1999 book “The Grid: Blueprint for a new computing infrastructure”. [1,7]

Businesses in biotech, the sciences and financial services have discovered that grids are competitive necessities.

NetworkWorld, 2006 [8]

In 2006, Network World USA stated that “this is the year that grid computing will move into mainstream” [8]. The main advantage, lower costs for supercomputer-level computing power, sparked a lot of interest. But contrary to expectations, Grid Computing did not reach mainstream status. It is a fact: the Grid is getting older and we have an enormous cloud guiding us to a phenomenal future. Why would we settle for anything less? [4]

The meaning of Grid Computing has changed over the years. The former distributed computing infrastructure has morphed into a distributed collaborative network. It now has a much more scientific approach towards solving mathematical, analytical and physics problems in research dealing with Big Data. Years later, mainly research institutes and some brave businesses apply Grid Computing. [5]

But not everyone has replaced Grid Computing yet.

When Grid Computing goes off the grid

Every desktop computer can be used to add power to the grid. This is why “normal” people can contribute as well. Common applications of grid computing are to be found in the sector of financial services (risk assessment), gaming (splitting tasks), entertainment (VFX), science and engineering. Computing and Data Grids are very commonly used in scientific projects (like genome sequence analysis during the Covid-19 pandemic), which inspire and motivate people to be part of a public collaboration across companies and networks. This is why Grid Computing falls into the category of peer-to-peer computing. [2,6]

The European Grid Infrastructure (EGI), SETI@home (Search for Extraterrestrial Intelligence) by the University of California, Berkeley, the Scandinavian NorduGrid and neuGRID (research on neurodegenerative diseases) are some famous examples of scientific projects using Grid Computing. The field of e-Science projects (scientific research collaborations for analysis) is certainly gaining popularity. This is especially interesting for natural sciences, medicine, meteorology, industry and particle physics – as further exemplified in the following section. [4,5,9]

A real life example: The Worldwide LHC Computing Grid (WLCG)

CERN – the European Organization for Nuclear Research – is most commonly known for physicists doing fundamental research on what the universe is made of. But it is less commonly known that this research involves computer scientists working with Grid Computing for data analysis.

CERN Accelerator Complex
Picture 5 | CERN Accelerator Complex [14]

The Large Hadron Collider (LHC) is currently the largest particle accelerator in the world. With a circumference of 27 kilometers, it is used to investigate high-energy particle collisions through four main experiments: ALICE, ATLAS, CMS and LHCb. At CERN’s site in Meyrin, Switzerland, particle physicists and computer scientists from all around the world collaborate on fundamental research. [10,11]

Fun Fact: CERN is not only a huge collaboration; the research on the building blocks of the universe (quarks, antimatter and dark matter) also provides inspiration for great science-fiction movies!

Simulation of Particle Collision (proton – proton) at 13.6 TeV (teraelectronvolts), as seen in the CMS detector during Run 3 at CERN
Picture 6 | Particle Collision (proton – proton) at 13.6 TeV as seen in the CMS detector during Run 3 at CERN [15]

The CMS detector at the LHC produces about 15 petabytes (PB) of data per year. To give the average user an idea of how much data that is, here are some comparisons [11,12]:

  • 1 PB is equivalent to over 4,000 digital photos for every day of your entire life. 
  • 1 PB is equivalent to 11,000 4K movies. It would take you over 2.5 years of nonstop binge-watching to finish them.
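If you want to check the movie comparison yourself, here is a quick back-of-the-envelope calculation in Python. Two things are assumptions that do not come from the sources: a petabyte is taken as 10^15 bytes (the decimal definition) and an average movie is assumed to run for two hours.

```python
PETABYTE = 10**15        # bytes, decimal (SI) definition (assumption)
MOVIES_PER_PB = 11_000   # figure quoted in the comparison above
MOVIE_HOURS = 2.0        # assumption: average running time per movie
CMS_PB_PER_YEAR = 15     # yearly CMS data volume quoted above

# Implied size of a single 4K movie in gigabytes.
gb_per_movie = PETABYTE / MOVIES_PER_PB / 10**9

# Years of nonstop watching needed for one petabyte of movies.
years_of_watching = MOVIES_PER_PB * MOVIE_HOURS / (24 * 365)

# The yearly CMS output expressed in 4K movies.
movies_per_year_cms = CMS_PB_PER_YEAR * MOVIES_PER_PB

print(f"{gb_per_movie:.1f} GB per movie")
print(f"{years_of_watching:.2f} years of nonstop watching per PB")
print(f"{movies_per_year_cms:,} movies' worth of CMS data per year")
```

Under these assumptions a single movie weighs in at roughly 91 GB, one petabyte keeps you busy for about 2.5 years, and the yearly CMS output corresponds to 165,000 movies.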

The detector can thus be seen as a huge, high-resolution camera. You can now imagine just how enormous this dataset is. But enough background information… back to the Grid!

To handle the huge amount of data produced by the LHC, CERN uses a Grid Infrastructure called the Worldwide LHC Computing Grid. In 2007, the WLCG was one of the first Grid Computing Infrastructures – and it is still in use today. Its mission is to provide global computing resources for the storage, distribution and analysis of the data generated by the LHC. It is a global collaboration of around 170 computing centers in more than 40 countries. These centers link up national and international grid infrastructures and analyze around 200 petabytes of data every year. CERN itself provides about 20% of the resources of the WLCG. [9,11]

Video | Close-to-real-time Visualization of the WLCG Grid Activity [15]

Individuals wishing to contribute home computer resources can get involved in the LHC@home 2.0 project. Universities and organizations can contribute as well. In order to access the LHC Grid, one must be registered as a Virtual Organization. Extensive information can be found on the WLCG website. [11]

Pros & Cons

Let’s recap all the pros and cons of Grid Computing. [1,5]

Advantages of Grid Computing

  • Hardware Independence: The concept of heterogeneous machines allows different operating systems to be used in a single grid computing network. This facilitates the coordination and management of cross-device processes and tasks and makes collaboration across networks possible.
  • Increased Efficiency: Through the physically distributed computer networks, parallel processing and analysis of huge amounts of data is possible.
  • Cost-effective Scaling of Processes: Using coupled computing power and storage capacities, complex tasks can be solved faster and more effectively.
  • Lower Hardware Costs: Because the unused capacities of existing computers are used, no large investments in server infrastructure are needed. The users don’t have to pay for the resources and therefore save hardware costs.
  • Decentralized and Flexible: Because no central servers are required (except for the Control Node), the computers can be located anywhere. The grid as a whole has a low failure rate, as capacities are distributed flexibly and modularly.
  • Easy Distribution: Reliable utilization and optimal use of IT infrastructure through virtual organizations and flexible task distribution is guaranteed.

Disadvantages of Grid Computing

  • Complex administration: The grid is unique for every project and system being used. There must be an agreement on which protocols are in use, to improve security.
  • Willingness to share resources: Because of high energy costs and other circumstances, users might prefer to switch off their computer instead of leaving it running all night long. This makes it difficult to build the actual grid.
  • Non-Linearity: Computing power does not increase linearly with the number of coupled computers. At the same time there is less control over outages. This makes the behavior of the Grid unpredictable.
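The non-linearity point can be illustrated with Amdahl's law. To be clear, the sources cited here do not name a specific model; Amdahl's law is simply one common, idealized way to show why doubling the machines does not double the speed. It ignores network overhead and outages, and the 90% parallel share below is an illustrative assumption.

```python
def amdahl_speedup(machines: int, parallel_fraction: float) -> float:
    """Idealized speedup when only part of a task can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / machines)


# Assumption: 90% of the task can be parallelized (illustrative value).
for machines in (1, 10, 100, 1000):
    print(machines, round(amdahl_speedup(machines, 0.9), 2))
```

With 90% of the work running in parallel, the speedup stays below 10 no matter how many machines join the grid, because the remaining 10% still has to run on a single machine.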


Conclusion: Is Grid Computing dead?

Yes and no. The answer to this question is not as easy as it seemed at first glance. Grid Computing no longer gets the attention it used to get. But that doesn’t mean it’s dead. It is still very much affordable and adequate for usage in highly computational processes concerned with Big Data. 

Because cloud computing, with its reliable, fast and virtualized systems, is receiving more and more recognition, it might yet be the death of Grid Computing. But the existence of Grid Computing Infrastructures like the LHC Grid or other e-Science projects, with lots of people wanting to share their resources, is pretty unstoppable – for now.

Experiments being done with – for instance – particle accelerators are one-of-a-kind. They are special use-cases for which libraries or Middleware do not serve as resources. The concept of Grid Computing isn’t suitable for all software solutions, but it definitely isn’t dead to science and engineering.


Literature

[1] “Grid Computing.” GeeksforGeeks. https://www.geeksforgeeks.org/grid-computing/ (accessed Feb. 13, 2023).

[2] “What is grid computing? – Grid-Computing explained.” Amazon Web Services, Inc. https://aws.amazon.com/what-is/grid-computing/?nc1=h_ls (accessed Feb. 13, 2023).

[3] “What Is Grid Computing? Key Components, Types, and Applications,” Spiceworks. https://www.spiceworks.com/tech/cloud/articles/what-is-grid-computing/#:~:text=Grid%20computing%20supports%20various%20commercial (accessed Feb. 13, 2023).

[4] N. Litzel and S. Luber, “Was ist Grid Computing?,” www.bigdata-insider.de. https://www.bigdata-insider.de/was-ist-grid-computing-a-629099/ (accessed Feb. 21, 2023).

[5] “Was ist Grid Computing?” IONOS Digital Guide. https://www.ionos.de/digitalguide/server/knowhow/grid-computing/#:~:text=Was%20ist%20Grid%20Computing%3F,Auslastung%20der%20Infrastruktur%20zu%20optimieren (accessed Feb. 13, 2023).

[6] J. Bausch. “Cloud computing vs grid computing.” Electronic Products. https://www.electronicproducts.com/cloud-computing-vs-grid-computing/ (accessed Feb. 15, 2023).

[7] I. Foster and C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure. San Francisco, Calif.: Morgan Kaufmann, 2003.

[8] J. Bort. “Grid computing comes of age.” Network World. https://www.networkworld.com/article/2309239/grid-computing-comes-of-age.html (accessed Feb. 15, 2023).

[9] F. Bry, W. E. Nagel & M. Schroeder, “Grid-Computing,” Informatik-Spektrum, Vol. 27, No.6, pp. 542–545, Dec. 2004, doi: https://doi.org/10.1007/s00287-004-0443-4.

[10] “The Worldwide LHC Computing Grid (WLCG).” CERN. https://home.cern/science/computing/grid (accessed Feb. 20, 2023).

[11] “Welcome to the worldwide LHC computing grid.” WLCG. https://wlcg.web.cern.ch/ (accessed Feb. 15, 2023).

[12] T. Fisher. “Terabytes, Gigabytes, & Petabytes: How Big Are They?” Lifewire. https://www.lifewire.com/terabytes-gigabytes-amp-petabytes-how-big-are-they-4125169 (accessed Feb. 20, 2023).

[13] Autoren der Wikimedia-Projekte. “LHC computing grid – wikipedia.” Wikipedia – Die freie Enzyklopädie. https://de.wikipedia.org/wiki/LHC_Computing_Grid (accessed Feb. 15, 2023).

[14] “The CERN accelerator complex.” CDS Photos · CERN. https://cds.cern.ch/record/1260465 (accessed Feb. 15, 2023).

[15] “The Worldwide LHC Computing Grid activity captured live by the EGL application in August 2017.” CDS Videos · CERN. https://videos.cern.ch/record/2640380 (accessed Feb. 15, 2023).

[16] “Shifting Perspectives of Grid Structure #1, Configuration #2.” Process by Alois Kronschlaeger. https://aloiskronschlaeger.wordpress.com/2014/09/21/shifting-perspectives-of-grid-structure-1-configuration-2/ (accessed Feb. 15, 2023). 

[17] “Imgflip – Create and Share Awesome Images.” Imgflip. https://imgflip.com (accessed Feb. 15, 2023).

Pictures

  • Picture 1 | Supergrid, designed with [17]
  • Picture 2 | Grid Computing Architecture, adapted from [3]
  • Picture 3 | Grid Infrastructure, adapted from [6]
  • Picture 4 | Comparison of Grid vs. Cloud Computing Infrastructure, adapted from [5]
  • Picture 5 | CERN Accelerator Complex [14] 
  • Picture 6 | Particle Collision (proton – proton) at 13.6 TeV as seen in the CMS detector during Run 3 at CERN [15]
  • Title Picture | Colorful Grid Structure [16]
