The "cloud" is a term that has gained enormously in importance in recent years. It is frequently used to provide applications and services. Over time, various architectures have evolved for use in the cloud, each taking a different approach to handling the developers' code and the users' requests. One of them is the so-called serverless architecture. But how exactly does this approach work, what advantages and disadvantages does it offer, and is it the future of cloud development?
As already explained, serverless is an architecture for developing applications in the cloud. The name suggests that this approach is about doing without servers altogether. Of course, it does not work entirely without servers, since the application and its services still have to be reachable for users over the internet (and thus via some kind of server).
How it works
The "serverless" part refers to how the developers deal with the cloud and interact with it. In contrast to classic cloud computing, the development team does not have to worry about making the application available to users, about scaling, or about managing the deployed code. All of these tasks are handled by the cloud provider, which follows the principle of "out of sight, out of mind" to some degree: the developers only have to upload their code and afterwards need to give it little or no further thought.
A distinction is made between BaaS (Backend-as-a-Service) and FaaS (Function-as-a-Service). The two options differ in what exactly the cloud provider supplies. With BaaS, you can fall back on existing services that the cloud provider already has in its repertoire. AWS Amplify, for example, allows developers to simply configure and integrate databases, authentication, and so on according to their needs. With FaaS, this kind of third-party service is not used; the developers write their logic and their system themselves. Instead of running everything on a server of their own, the code is uploaded to the cloud provider, which then manages it. As needed, the appropriate code is invoked and executed in response to events, without the developer having to handle this manually.
Fig. 1: Services provided by a serverless cloud platform [4]
In summary, as shown in Figure 1, serverless provides access to predefined services, to data management capabilities, to monitoring of the system with metrics and dashboards, and of course to custom entry points that execute the developers' own services and functions.
But what does the code, or rather the overall system, actually have to look like for such an approach to work? In principle, the code does not have to follow any particular structure. Serverless merely refers to deploying the system via a cloud provider, which takes care of managing it and relieves the developers of this task. However, it helps to split the system into individual, smaller services or even individual functions that work independently of each other, in order to make better use of the advantages of serverless. This is where the comparison between the traditional, monolithic software architecture and the microservice architecture comes into play.
Fig. 2: Monolithic architecture and microservice architecture [3]
In a monolithic software architecture (Figure 2, left), the system is classically divided into a frontend, a backend, and the database connection. The three parts work closely together and are highly dependent on one another; they can hardly be pulled apart or separated. In contrast, the microservice architecture (Figure 2, right) uses several independent services behind the frontend (hence the name). These can also use different databases. With this approach, the backend logic is broken down into smaller parts. The independence mentioned above is particularly important: the services are able to fulfill their task on their own, can access their own database to do so, and do not have to involve other services. Examples of such services could be user authentication, searching for products, or purchasing products.
If you now take a microservice architecture and run it serverless in the cloud, the benefit of serverless becomes apparent: the individual services can be started and executed separately from one another. The developers define event-based functions (see "event-driven computing") so that when an event occurs, the appropriate service is started and executed. If, for example, a user searches for a product on our website, the search service is spun up, searches the database for matching results, and returns them to the frontend. Once the service has completed this task, it is shut down again by the cloud provider if no further requests come in.
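To make this more concrete, here is a minimal sketch of what such an event-triggered search function could look like, assuming an AWS Lambda-style Node.js runtime and a MongoDB product collection; the handler shape, environment variable and collection names are illustrative assumptions, not details from this post.

```javascript
// Hypothetical FaaS handler (AWS Lambda-style): the provider invokes it for each
// incoming search event and spins instances up and down as needed.
const { MongoClient } = require("mongodb");

let client; // kept outside the handler so warm instances can reuse the connection

exports.handler = async (event) => {
  // Assumed event shape: an HTTP request routed by the provider, e.g. ?q=keyboard
  const term = (event.queryStringParameters && event.queryStringParameters.q) || "";

  if (!client) {
    client = new MongoClient(process.env.MONGO_URL); // connection details are assumptions
    await client.connect();
  }

  // Search the product collection for matching titles and return them to the frontend.
  const products = await client
    .db("shop")
    .collection("products")
    .find({ title: { $regex: term, $options: "i" } })
    .limit(20)
    .toArray();

  return { statusCode: 200, body: JSON.stringify(products) };
};
```

Keeping the database client outside the handler lets warm instances reuse the connection, whereas a cold start (see the disadvantages section below) has to establish it first.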
To sum up how serverless works: it is an architecture, or a model, for managing and deploying the system via a cloud provider. The provider takes over tasks such as scaling services/functions up and down as needed by supplying suitable compute and storage capacity, maintains the underlying hardware, and, if desired, also offers predefined services.
Advantages
This raises the question of which advantages the serverless approach brings and when it makes sense to use it:
Scalability: One of the points already mentioned, and one of the most central, is scalability. New resources can be provisioned for the system as needed and released again. This makes it possible to handle requests to the system effectively while at the same time using resources efficiently.
Low costs: This leads directly to the second advantage, namely lower costs. Since computing power is only used when it is actually needed, the cloud provider's customer only pays for the resources that are actually consumed. This means that when there is little load, and accordingly little computing power is required, the costs for the system remain low, and there is no need to run, for example, a dedicated server that is permanently available.
Less effort: For the developers themselves, the effort is reduced because the administration and maintenance tasks disappear. No time has to be spent on maintaining, administering, or configuring servers, so more time can go into actually developing the software. When using BaaS, existing services from the provider can be reused, saving even more time.
Security and reliability: The cloud providers supply security measures to protect the system and its data. Not only are the data professionally protected, the developers also save time again, since they do not have to take care of the security and reliability of the system themselves; these are implemented by the provider.
The serverless architecture can play to its strengths best when the application actually benefits from the characteristics mentioned above. For example, it makes sense to use serverless when you have to expect highly variable traffic. Here, serverless can be particularly helpful thanks to its easy scalability and the resulting more efficient cost structure, since you pay little when traffic is low and can still handle large numbers of requests quickly.
Disadvantages
Of course, serverless also has its disadvantages and problems, which are addressed and explained in this section.
Startup time: Since the individual services/functions are only executed when they are actually needed, they first have to be started at the beginning of a request if no other, free service of the same kind is currently running. This is referred to as a "cold start". Starting such a service involves launching a container with the appropriate code, loading the associated packages, and finally executing the service itself. Applications that do not use serverless avoid this cold start, since the system is already fully running.
Less control: The biggest drawback is probably the loss of control in many respects. For one thing, you no longer have the option of configuring and managing the hardware yourself. This includes the power to decide exactly which security precautions are implemented, which software updates and versions are used, and much more. When using BaaS, you are additionally bound to the cloud provider's code and cannot make your own optimizations or adjustments. Furthermore, the data used no longer resides on the developers' own servers but has to be entrusted to the cloud provider.
Dependency: Another disadvantage is the dependency on the cloud provider. Once you commit to its system and possibly use the services and functions it provides, switching to another provider becomes difficult and complicated, since that provider may not offer the same functions or workflows.
Evaluation
Looking at the advantages and disadvantages mentioned above, it is fair to say that serverless does bring many relevant benefits that would justify calling it the cloud architecture of the future. Particularly important criteria are the simplicity enabled by automatic scaling and the fact that it makes the system cheaper and easier to deploy at the same time.
Of course, the dependency on the cloud provider and the loss of control have to be taken into account. The latter in particular rules serverless out for some companies that need full access to the servers and want to work on the system themselves. For many companies, however, serverless opens up the opportunity to concentrate entirely on developing the application.
As with many things, serverless is the offer of a trade: you gain efficiency and cost savings, but you give up control and access. In the end, every company has to decide for itself which architecture it wants to work with. Personally, I think the approach offers a good opportunity especially for small companies and startups to bring their product to market without having to worry about managing servers, allowing them to focus on the product itself. For these companies in particular, providing the required hardware and keeping it functional and secure is very laborious and, above all, expensive. If this component disappears, more developers can do what they really want to do and are good at, namely develop, and no longer have to worry about deploying their system.
Accordingly, I do consider serverless an architecture that will continue to help shape the future.
Outlook
Beyond the aspects already discussed as to why serverless will probably remain strongly represented in the future, it is also worth giving some thought to the further development that will be needed. In my view, serverless is an ideal fit especially for the growing number of IoT devices and applications, because they often need exactly what serverless provides: small functions and services that execute quickly, that have to scale up, and that thereby give developers a cost-effective way to implement these aspects.
I could imagine that many systems will be moved to the cloud because of this simplicity. As a result, many of these systems may also become interconnected in the future. What emerges is essentially a large world of interlinked services that also makes developing new systems easier, because cloud providers will continuously expand their repertoire and build an ever larger sandbox of predefined services with which developers can implement their ideas quickly and easily.
As software solutions continue to evolve and grow in size and complexity, the effort required to manage, maintain and update them increases. To address this issue, a modular and manageable approach to software development is required. Microservices architecture provides a solution by breaking down applications into smaller, independent services that can be managed and deployed individually.
Commonly used in distributed and large-scale systems, this architectural pattern is favored for its scalability, flexibility and suitability for systems that require rapid change and innovation. Continuous delivery, high scalability, agility and modularity are all shiny buzzwords associated with microservices, but they don’t tell the whole story. While microservices offer a number of benefits, it is important to remember that there are also challenges to this approach.
What are microservices, anyway?
The term "microservice" was introduced in 2005 by Peter Rogers, founder of Resource Oriented Computing, who used "micro-web-services" to describe a more flexible, more service-oriented software architecture.
The microservices architecture is an approach to the development of software as a series of small services that can be deployed independently of each other. The basic principle of microservices, the division of software components into modular units, is nothing new, but rather based on the principle of Service Oriented Architecture (SOA) which came into use in the late 1990s. The microservice architecture is commonly considered an evolution of SOA because its services are more differentiated and run independently.
In a monolithic architecture, everything is implemented as a single, tightly coupled unit, with all components in a single code base. In contrast, a microservices architecture decomposes the application into a number of small, loosely coupled services, each of which is responsible for one specific business capability.
Comparison of monolithic system architecture and Microservice architecture from [4].
Microservices are not just a technical approach; they are also an organizational one. Conway's law states that "Organizations which design systems […] are constrained to produce designs which are copies of the communication structures of these organizations." Given that, it makes sense that implementing microservices also calls for a change in the organizational structure.
Microservices are therefore, as already mentioned, a strong modularization concept. Microservices communicate with each other via an application programming interface (API) that supports loose coupling, whereas traditional monolithic structures suffer from tight coupling between their components, introducing strong dependencies between modules. Each of those separate microservices can be deployed and tested independently. Since they communicate using the same protocols, it does not matter which technology they use in their implementation; the individual microservices can, for example, be programmed in different languages.
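As a rough illustration of this API-based, technology-agnostic communication, the snippet below sketches two tiny Node.js services, a hypothetical auth service and a product service that calls it over plain HTTP; the ports, routes and field names are made up for the example.

```javascript
// Two tiny services in one sketch; in practice each would be its own deployable.
const express = require("express");

// auth service: owns user data and exposes it only through its API (port/route assumed)
const auth = express();
auth.get("/users/:id", (req, res) => {
  // A real service would read this from its own database.
  res.json({ id: req.params.id, name: "Alice", role: "teacher" });
});
auth.listen(4001);

// product service: does not know how the auth service is implemented,
// it only relies on the agreed-upon HTTP contract (requires Node 18+ for global fetch).
const products = express();
products.get("/products/:id/owner", async (req, res) => {
  const response = await fetch("http://localhost:4001/users/42");
  const user = await response.json();
  res.json({ productId: req.params.id, owner: user.name });
});
products.listen(4002);
```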
Why do we want a microservices architecture?
In an ideal world microservices help you…
…scale.
Unlike vertical scaling, also known as scaling up, where more resources are added to a single node in the system, there are no limits (from a hardware perspective) to horizontal scaling. Horizontal scaling, also known as scaling out, involves adding more nodes to the system, such as adding more servers to a cluster. An important advantage of horizontal scalability is the ability to increase capacity during operation.
…modularize.
The strong modularization makes the software easier to understand and work with. A microservice is used for a single task and is designed to perform that task in the most effective way possible. A single service is easier to maintain and can easily be replaced. The modularization also makes it easier to build in redundancy: services can be duplicated with little effort. In addition, the individual components can easily be reused and developed further.
… create loose coupling.
Since the services communicate via an API, they are ideally only loosely coupled. Loosely coupled in this context refers to a system in which the individual microservices are designed to operate independently and do not have a tight dependency on each other. Separating the application into individual services prevents undesired dependencies.
…deploy independently.
An independent deployment allows frequent releases while the rest of the application remains available. This means that services can be modified, tested and put into production independently of each other. The individual microservices can be developed and maintained independently by business-oriented, cross-functional teams. Ideally, the teams should manage their products throughout their entire lifecycle, following Amazon's guiding principle "You build it, you run it".
…be technology independent.
As mentioned above, microservices can be implemented in a technology-independent way. Thus, they can be built in a way that suits their task best. Development teams in different areas of expertise can use the language that suits their needs (e.g. AI-related parts of the application are implemented in Python, while C++ is used for critical real-time services).
…decentralize.
Ideally, each microservice has its own database, decentralizing responsibility and allowing updates to be made on an individual basis. In addition, distributing the services to independent databases avoids the problem of a Single Point of Failure (SPoF).
Are the benefits of microservices architecture overstated?
Microservices can help you scale and increase the availability of your system. One of the main challenges, however, is effectively managing and coordinating the communication between the services; handled poorly, it leads to a significant increase in complexity.
The availability of the whole system decreases as more microservices are added. To determine the availability of the whole system, the availabilities of the individual components are multiplied; so if we assume 99% availability for a monolith, a system of microservices in which each component also has 99% availability loses availability with every additional component.
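As a quick worked example of this multiplication (the service counts are arbitrary):

```javascript
// Overall availability of a request that has to touch n services,
// each of which is available 99% of the time.
const perService = 0.99;

for (const n of [1, 3, 5, 10]) {
  const system = Math.pow(perService, n);
  console.log(`${n} service(s): ${(system * 100).toFixed(1)}% overall availability`);
}
// 1 service: 99.0%, 3: 97.0%, 5: 95.1%, 10: 90.4%
```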
It is easier to debug and test a single microservice compared to a monolith because they are smaller and more manageable. However, debugging multiple microservices in a system can be challenging because it can be difficult to understand which microservice is performing a particular task. In contrast, observing the behavior of the system as a whole is relatively straightforward with a monolithic architecture. Debugging microservices can be a complex and time-consuming process because it requires a more nuanced understanding of how each component interacts.
So what is the best way to test such a complex system? Netflix, for example, implements chaos testing involving planned failures of its own services to test its systems’ ability to handle unexpected and faulty conditions. Another more conventional method would be integration testing which involves testing the interactions between microservices by creating test scenarios that simulate real-world interactions between them. The disadvantage of this method, however, is the lack of knowledge about what happens when one or more services fail. Depending on the specific requirements and characteristics of the microservices and the system as a whole, it may be helpful to combine several testing approaches.
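To give a feel for the integration-testing option, here is a minimal Jest-style sketch that exercises a made-up gateway endpoint; the URL, header and response shape are assumptions, not Netflix's or any specific project's API.

```javascript
// Integration test that simulates a real-world interaction across service boundaries:
// the gateway is expected to aggregate data owned by several microservices.
// Requires Node 18+ (or a fetch polyfill) in the test environment.
test("dashboard aggregates user and board data", async () => {
  const response = await fetch("http://localhost:4000/dashboard", {
    headers: { Authorization: "Bearer test-token" },
  });

  expect(response.status).toBe(200);

  const dashboard = await response.json();
  // Only the contract between the services is checked, not their internals.
  expect(dashboard.user).toBeDefined();
  expect(Array.isArray(dashboard.boards)).toBe(true);
});
```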
Communication between different microservices should be kept to a minimum and should only take place when one service genuinely needs functionality of another. If the communication between microservices frequently becomes a hindrance, it may indicate underlying architectural issues. A common problem with tightly coupled microservices is that changes in one microservice can have a domino effect on other microservices, leading to unexpected behavior and failures. Another issue is the over-reliance on synchronous communication between microservices, which can lead to deadlocks and slowdowns.
Managing the entire system can be complex, especially if the organization lacks technical expertise. In such cases, utilizing cloud providers like AWS or Azure can be a viable solution, though it may result in increased cost. Additionally, the implementation of a fail-safe API is crucial, but can be a complex task.
Another challenge is the independent deployment of microservices, which increases the operational overhead, testing challenges, and the need for specialized technical expertise. This can result in a higher level of complexity in the overall system. The decentralization of services can also increase the attack surface, making it more difficult to secure the system.
In the ideal microservices world, each microservice has its own database. In reality, it is difficult to keep the services' data separate when some data is needed by multiple microservices, which contradicts the approach of splitting the data into separate databases. A compromise needs to be found between splitting the data into separate databases and maintaining data consistency.
Compared to monolithic applications, using microservices requires additional skills, such as knowledge of Kubernetes, containers, logging or CI/CD pipelines. For smaller applications, a monolithic approach is more advantageous due to the lower overhead in setting up and maintaining the system, as well as simpler and easier testing and deployment processes.
Main learnings
Be clear about why you want to do microservices. Is it because everyone else is doing it or because you need it? A microservice should not be the goal in itself, it can be more of a way to get to your goal.
Consider whether your application is too small. Microservices only make sense when your application reaches a certain size. Below that, the overhead of microservices is far too big.
If it is not possible to divide the project into small parts without creating a large number of dependencies, then you should leave it.
See if your organization has the ability to break down the structures to make microservices work and has the capacity to maintain the infrastructure needed for microservices.
Think about testing the whole system – We already know from monolithic applications that testing is crucial. However, it is equally important that the interactions between microservices can be tested effectively. Automated testing gives you confidence in the reliability and functionality of your system.
Consider whether there is a need for scaling to that extent. For a website with constant traffic or no spikes, it is possible to work well with monolithic systems as there is no need to scale resources quickly “on the fly.”
Conclusion
The question of when to prefer microservices over a monolithic system is a complex one that requires an understanding of the drawbacks and benefits of both approaches. There are certain guiding rules or criteria that can help determine when it makes sense to adopt microservices, such as the size and complexity of the system, the need for increased scalability and resilience, and the skills and resources available to manage and maintain the architecture. Understanding these factors can help organizations make informed decisions about whether to adopt microservices and how to implement them effectively.
What does the future hold for microservices?
Monitoring the health and performance of microservices can be a complex task and is likely to be a central area of interest in the future. The serverless computing approach is also expected to gain traction in the microservices space, as organizations do not have to worry about the underlying infrastructure. Finally, I would like to mention the ways in which artificial intelligence could improve microservices in the future. It is conceivable that AI algorithms could be used to improve the resilience of microservices through AI-based monitoring and management. Alternatively, AI could be used to improve the communication between individual microservices. As these technologies continue to develop, it is likely that more and more new applications will emerge.
Main Sources
[1] Wolff, E. (2019). Microservices – A Practical Guide. CreateSpace Independent Publishing Platform. ISBN: 978-1-71707-590-1
[2] D. Shadija, M. Rezai and R. Hill, “Towards an understanding of microservices,” 2017 23rd International Conference on Automation and Computing (ICAC), Huddersfield, UK, 2017
Cloud gaming can be compared to remote desktops, cloud computing, and video-on-demand services. Essentially, cloud gaming means streaming video games from the cloud to the end customer. The client captures its user input (e.g. mouse, keyboard, controller) and transmits it to the server, while the server handles the computation of the entire game world as well as the evaluation of the user input. The client merely provides the capacity to receive the streamed frames in the desired quality. The main advantages arise for low-powered devices, while the obvious disadvantage is the need for a well-developed data infrastructure.
Besides game streaming, cloud gaming also includes hosting server instances for online games and providing platform services (e.g. leaderboards, chat systems, authentication, etc.). Another essential component is the provision of powerful download servers.
Platform services and the established use of the cloud
Platform or online services offer interfaces that provide game metadata. These services typically include functionality such as leaderboards, chat and group systems, and online lobbies, but also meta-services such as authentication, analytics, mapping of player identities, and matchmaking. Such services can be available both publicly on the internet and internally to other services. While these systems were still provided as individual monoliths when they first appeared, today's cloud operators offer such services as microservices. As a result, the high scalability of the cloud carries over to the platform services. This scalability has become important above all because the gaming market has grown strongly in recent years and the games themselves have become ever more resource-intensive. A good example is one of the largest gaming platforms: Steam. Steam integrates many of the games it offers into its own platform services, including, for example, a friends list, chat systems, matchmaking, and connection systems. In addition, Steam also carries the underlying infrastructure for marketing the games, which includes a web shop and download servers.
The great effort behind these services and the demand for them can be seen in Steam's worldwide data traffic. At the time of this blog post, German traffic alone amounts to over 35 petabytes within the last 7 days, and that in turn corresponds to only about 4.3% of the worldwide data volume (i.e. roughly 800 petabytes in total).
These numbers rise particularly sharply when special events take place, such as the release of a new, highly anticipated game or sale campaigns comparable to a Black Friday for games. In such cases, sometimes millions of players want to buy and download the new product at the same time.
This heavy load is felt not only by shop and download servers but of course also by the classic dedicated game servers themselves. Especially at the release of a new massively multiplayer online game (MMO) or a major content expansion, several thousand players try to log in to the game servers at the same time, whereas during normal operation it is usually only a fraction of the users. Here, too, the high scalability of the cloud helps. Since such events normally occur at times that the operator knows about and has planned for, resources can already be reserved and provisioned in advance.
Technical challenges of game streaming
While platform and online services can rely on the now widespread and well-established microservice structures and architectures, Games as a Service, or game streaming, opens up entirely new challenges. Simulating the actual game can still run without problems in an emulated environment, receiving its inputs from the outside and delivering the computed result to the outside via a video stream. The real problem, however, lies in latency. In contrast to most other media, games are an interactive medium, meaning that a user action should ideally be followed by an immediate reaction of the medium. For game streaming the demands are particularly high compared with other interactive media such as a livestream with live chat, where delays of up to one second are still acceptable. For games, on the other hand, a latency of only a few milliseconds is expected. Within this time, the input has to be registered by the client, sent to the server, processed there, and a new frame has to be sent back to the client, decoded, and displayed.
Many advantages of cloud computing and video-on-demand services carry over directly to game streaming.
Playing games from the cloud does not require expensive hardware of your own. The streaming provider supplies the hardware needed to render the desired game at the highest quality level.
The streaming provider takes care of maintaining, servicing, and modernizing the hardware systems, so the end customer does not face high one-off costs.
Games are available anytime, anywhere, on many different end devices. Even low-powered devices such as smartphones, smart TVs, or simple laptops can thus be used to play hardware-intensive titles.
Streaming makes manipulation and cheating in online and single-player games considerably more difficult. This follows directly from the limited points of interaction with the game: only the image output and the user input take place on the local device. Any further information processing, such as the position of another player, remains hidden from the user.
Besides the advantages, however, there are also some disadvantages:
Game streaming absolutely requires a continuous, high-performance internet connection, because unlike video on demand, for example, game streaming cannot use buffering, since the further course of the game depends directly on the player's input. A loss of connection therefore inevitably leads to an abrupt interruption of the gaming session.
Fluctuations in bandwidth lead to a reduction in image quality and thus diminish the gaming experience.
Transferring games you already own is not supported by many providers. This can mean that games are either unavailable or have to be purchased again on the streaming platform.
Modifying your own game files is prevented, since the game content cannot be accessed locally. This keeps players from creating modifications and customizing their gaming experience as they see fit.
The provider decides which publishers and franchises are included in its portfolio. This makes it harder, especially for small studios and indie developers, to establish themselves on the market.
In addition, games are only available as long as they are part of the streaming service's catalogue or as long as the service holds the licenses for them.
Conclusion
Cloud gaming encompasses more than just game streaming. It is already a firm part of today's infrastructure, since a large share of games rely on cloud services or are integrated into them through platforms such as Steam. Game streaming itself is admittedly only just getting started, but it would not surprise me if more and more users switched to it, or at least partially took advantage of its benefits. In my opinion, it will not become a complete replacement for all players in the foreseeable future. However, I am convinced that game streaming will become the Netflix of gamers, as the existing technologies, infrastructure, and customer base give it a lot of potential. The development of streaming technologies, however, is only at its beginning; following it further will definitely be worthwhile.
Microservices architectures seem to be the new trend in the approach to application development. However, one should always keep in mind that microservices architectures are always closely associated with a specific environment: Companies want to develop faster and faster, but resources are also becoming more limited, so they now want to use multiple languages and approaches. Therefore, it is difficult to evaluate a microservice architecture without reference to the environmental conditions. However, if you separate these external factors, you can still see that a microservice architecture offers much more isolation than other architectures – and can also handle security measures better than, for example, monolithic approaches due to today’s techniques and tools.
by Benedikt Gack (bg037), Janina Mattes (jm105), Yannick Möller (ym018)
This blog post is about getting started with your first large-scale software project using DevOps. We will discuss how to design your application, what DevOps is, why you need it, and a good approach for a collaborative workflow using git.
This post consists of two parts. Part one is this article which is all about finding out which topics you should consider thinking about when starting with DevOps for your own project as well as a basic introduction to these topics and recommendations for further reading. The second part is our “getting started” repository which contains a small microservice example project with additional readme-files in order to explain some of the topics above in practice and for you to try our workflow on your own.
This post is a summary of the knowledge and experience we collected while accompanying and supporting a larger-scale online platform project called "Schule 4.0" in DevOps-related topics over a period of six months.
Before we can talk about Software Architecture, we have to talk about the life cycle of a typical software project, because a successful software project involves a lot of planning, consists of many steps and architecture is just one of them. Furthermore, actual coding is a late step in the process and can be quite time consuming when not planned properly. Without further ado, let’s have a look at a typical Software Development Life Cycle:
The first two steps, “Ideation” and “Requirements”, basically mean thinking about your project from a business and user perspective. Technical details are not important and might even be inappropriate, as you should involve all members of your small team in the process and not all of them have the technical knowledge. Also, don’t forget to write everything down!
In the Ideation phase, after you brainstormed ideas, we recommend creating a Use Case Diagram which shows how the users can interact with your software. You can learn more about that here.
Furthermore, we also recommend creating a requirements catalog for the requirements phase. You can download our template with this link. When you’re done thinking about your project and want to start with the technical stuff, but can’t describe your platform requirements in detail, why don’t you go with a more agile project life cycle? Just make sure you thought about your goal and how to achieve it beforehand.
Planning done? Then let’s start with designing your application, especially Software Architecture.
Difference between design patterns and architectural patterns
If you are familiar with the term design patterns, like the Singleton and Iterator patterns in software development, don't confuse them with the term architectural patterns. While design patterns deal with how to build the important parts of your software, the defining components, architectural patterns are all about how these parts are organized and play together.
“[software] Architecture is about the important stuff. Whatever that is.”
So, what you’re going to think about in the design process heavily depends on what you define as important components and which architectural style you choose. Just remember that developing is mostly filling in the blanks and applying some design patterns for whatever you came up with in the design phase or, to put it short, implementing your software architecture.
Which architectural pattern to choose?
3 architectural styles – There is always a trade-off
There are many architectural patterns to begin with. In order to understand what these patterns are all about, here is an introduction to the three major architectural styles. These styles describe the overall idea behind different groups of patterns. Note that it is possible to mix these styles together for your own use case.
N-tier:
N-tier applications are typically divided into X different logical layers and N physical tiers. Each layer has a unique responsibility and may only communicate with the layer below it, but not the other way around.
N-tier vs monolithic architecture
A typical architecture for web applications is the Three-tier architecture, which is described in detail below, but you could also go with a monolithic approach and separate your one-tier application into different logical layers, as most game engines and the Windows NT platform architecture do (user and kernel mode).
This style is the most developer-friendly and easy to understand if you feel comfortable with monolithic software development. The challenge is to end up with meaningful logical and physical layers and to deploy small changes or features while the platform is already running, which can mean rebuilding and redeploying a large part of the application. If the application is split up into multiple layers, deployment becomes easier, but there is a high risk that some layers are unnecessary and just pass requests on to the next layer, decreasing overall performance and increasing complexity.
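To make the "each layer only talks to the layer below" rule tangible, here is a deliberately tiny JavaScript sketch of three logical layers; the layer objects and function names are invented for illustration.

```javascript
// Data layer: the only place that knows how data is stored (here just hard-coded).
const dataLayer = {
  findUser: (id) => ({ id, name: "Alice" }),
};

// Business/application layer: talks only to the data layer below it.
const businessLayer = {
  greetingFor: (userId) => {
    const user = dataLayer.findUser(userId);
    return `Hello, ${user.name}!`;
  },
};

// Presentation layer: talks only to the business layer below it, never to the data layer.
const presentationLayer = {
  render: (userId) => console.log(businessLayer.greetingFor(userId)),
};

presentationLayer.render(42); // -> "Hello, Alice!"
```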
Service based
In this architectural style, the application is split into multiple services, which communicate through a network protocol over a network. Each service is a black box with a well-defined interface, serves only one purpose, most likely a business activity, and can be designed specifically for that purpose, for example by using different technologies such as programming languages. These services work either by chaining – service 1 calls service 2, which calls service 3 – or by one service acting as a coordinator. A typical modern implementation is the microservice architecture, which will be covered in detail later.
Key benefits are:
maintainability: changes will only affect one specific service, not the whole application
scalability: each service can be load balanced independently
reusability: services are available system wide and in “Schule 4.0”, we often ended up reusing an already existing service as a template
The service-based style is by far the most difficult architectural style. A big challenge is specifying your services so that they stay as decoupled from each other as possible, which strongly depends on the design of the service APIs. In addition, it is much more complicated to deal with not just one but many applications.
Event-Driven
This style consists of two decoupled participants, producer and consumer. The producer creates an event and puts it in a queue, typically on behalf of an incoming request from a client, and one of the available consumers consumes the event and processes it. Pure event driven architectures are mostly used for IoT scenarios, but there are variations for the web context like the Web-Queue-Worker pattern.
Their clear benefit is high and simple scalability and short response times. A challenge is how to deal with long running tasks which demand a response, and when to favor direct shortcuts instead of a queue approach.
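A minimal sketch of the producer/consumer idea follows, using a plain in-memory array instead of a real message queue; in practice a broker or managed queue service would sit in between, so everything here is an illustrative assumption.

```javascript
// Decoupled producer and consumer; a real system would use a message broker
// or a managed queue service instead of this in-memory array.
const queue = [];

// Producer: typically triggered by an incoming client request.
function produce(event) {
  queue.push(event);
}

// Consumer: processes events whenever it has capacity, independently of the producer.
async function consume() {
  for (;;) {
    const event = queue.shift();
    if (event) {
      console.log(`processing ${event.type} for user ${event.userId}`);
      // ... potentially long-running work happens here ...
    }
    await new Promise((resolve) => setTimeout(resolve, 100)); // poll again shortly
  }
}

produce({ type: "thumbnail.requested", userId: 42 });
consume();
```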
Architectural patterns in detail
The Three Tier Architecture
This previously mentioned architectural pattern belongs to the N-tier architecture style and consists of three tiers and three layers. This architecture is very common and straightforward when developing simple web applications. The example below is for a dynamic web application.
Example of a Three-tier architecture for web applications
The topmost layer is the presentation layer, which contains the user interface that displays the important information and communicates with the application layer. Depending on whether you need a dynamic web application or just static web pages, this layer either runs solely on the client tier (desktop PC or mobile device) or additionally requires a web server to serve and render the static content for the client. The application tier contains your functional business logic, for example the REST API for receiving and adding user-specific dynamic content such as shopping cart items, and the last layer, the data tier, stores all data that needs to be persistent.
In the example graph above, the React web application is served by a small web server and runs on the client devices. Next, the NodeJS application exposes a REST API to the client and talks to the MongoDB document database. The Node app and the database are typically deployed into different tiers, most commonly a virtual machine and a dedicated database server or service.
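As a rough sketch of what such an application tier could look like, here is a minimal NodeJS/Express REST API backed by MongoDB; the routes, database name and connection string are assumptions for illustration rather than the project's actual code.

```javascript
const express = require("express");
const { MongoClient } = require("mongodb");

const app = express();
app.use(express.json());

// Data tier: a MongoDB instance on a dedicated server (address is an assumption).
const client = new MongoClient("mongodb://db-server:27017");

// Application tier: REST API for the shopping cart example mentioned above.
app.get("/cart/:userId", async (req, res) => {
  const items = await client
    .db("shop")
    .collection("cartItems")
    .find({ userId: req.params.userId })
    .toArray();
  res.json(items);
});

app.post("/cart/:userId", async (req, res) => {
  await client
    .db("shop")
    .collection("cartItems")
    .insertOne({ userId: req.params.userId, ...req.body }); // e.g. { productId, quantity }
  res.status(201).end();
});

// The presentation layer (the React app) talks to this API over HTTP.
client.connect().then(() => app.listen(3000));
```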
Microservices
A microservice architecture is by far the hardest choice. This style is most often used by massive tech enterprises with many developer teams in order to run their large-scale online platforms and allow for rapid innovation. There are many big platforms like Amazon.com, Netflix and eBay that evolved from a monolithic architecture to a microservice architecture for many good reasons, but most likely none of them really apply to your project for now. To be honest, none of the reasons applied to our project "Schule 4.0" either, which only consisted of five developers, but in the never-ending river of new technologies we were able to find technologies that made things much easier, convinced us to at least give it a try and made things turn out great in the end. Those technologies will be discussed later.
The core idea is to have multiple loosely coupled services which expose an API. The biggest difference to normal service oriented architectures is that in a microservice architecture, each microservice is responsible for its own data and there is nothing like a centralized data tier. That means that only the specific service has access to its data and other services cannot access it directly other than via the service API.
Key benefits:
Highly maintainable, rapid innovation and development
Developers can work independently on a service
Services can be deployed independently
Services are typically small applications and therefore easy to understand
Testability: small services are easy to test
Availability: Errors in one service won’t affect other services
Dynamic technology stack
You can pick whatever technology you want whenever you want
Drawbacks:
Services are products and not projects: Each developer is responsible for their code over the whole life cycle
Highly distributed system
Requires to deal with continuous integration and deployment (which will be covered below)
Nightmare without containerization or virtualization
Domain Driven Design
Many challenges can be avoided by carefully designing your microservice application. A helpful method for designing your application is the Domain Driven Design approach and its notion of bounded contexts. The idea is that the most important thing is your core business, what your application is about. Our goal is to model the core business as different contexts and designing the architecture after that. In order to do that, here are some tips:
After finishing with your requirements analysis you can start defining the domain of your application. Imagine your project as a company. What is your company about? What are its products? We will call that your domain and the products your contexts.
Let us continue with a simplified example of the “Schule 4.0” model. The platform domain is all about teachers sharing content with their students in the form of pins, boards and exercises. With this knowledge we can already distinguish three contexts:
Example of a Domain Model for “Schule 4.0”
These contexts consist of different objects which have relationships. For example, a Board can contain multiple Pins and a User can own multiple Boards. Every context can be implemented as its own microservice. The crucial point later is how the services are connected to each other. Remember, all the data is physically separated into different services but needs to be logically connected in order to query data. That is where the bounded contexts appear.
Note in the graph above that the User object can be found in multiple contexts. The object itself is most likely modelled differently in each context, but the overall idea of a user stays the same. The different User objects are explicitly linked together via the UserId attribute. Because the concept of a User is shared across multiple contexts, we can query all data related to the current user.
Communication in a Microservice architecture – GraphQL Federation
After we have modelled the domain of our application, everything is connected logically, but we still cannot request and change data, because we have not defined the service APIs yet. We considered the following options:
Specifying a RESTful API for each service using HTTP
Specifying a Graph API for each service using GraphQL
Do a mixture of both
Note: We do not recommend doing a mixture of both. Historically speaking, there are many solutions migrating from REST to Graph, but starting from scratch there is no need for that.
Regardless of which API flavor you choose, the data is still spread over multiple services. For example, if you want to display the dashboard for the current user, pin and board data, exercises and user data need to be collected from different services. That means the code displaying the dashboard needs to interact with each service. It is not a good idea to leave that task to the client application, due to various performance and security reasons. In general, you should avoid making the internals of your backend publicly available.
That is why all microservice architectures are equipped with an API Gateway, which is the only entry point to your backend. Depending on the use case, it aggregates data from different services or routes the request to the appropriate service.
Small example for the first option using http
The services expose simple http endpoints to the gateway for manipulating and retrieving data. The gateway itself only exposes an endpoint for retrieving the dashboard for the current user. The UserId will be included in the requests by the client. In order to create and to get specific boards, the gateway needs to be expanded with further endpoints, which might be routed directly to the underlying service.
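A condensed sketch of such a gateway endpoint might look like the following; the internal service URLs, the header used to pass the UserId and the dashboard shape are hypothetical and only meant to illustrate the aggregation.

```javascript
const express = require("express");
const gateway = express();

// Single entry point for the client: aggregates data from the internal services.
gateway.get("/dashboard", async (req, res) => {
  const userId = req.header("X-User-Id"); // assumed to be included in the client request

  // Internal service URLs are assumptions; requires Node 18+ for global fetch.
  const [user, boards, exercises] = await Promise.all([
    fetch(`http://user-service/users/${userId}`).then((r) => r.json()),
    fetch(`http://pinboard-service/users/${userId}/boards`).then((r) => r.json()),
    fetch(`http://exercise-service/users/${userId}/exercises`).then((r) => r.json()),
  ]);

  res.json({ user, boards, exercises });
});

// Other endpoints, e.g. creating a board, could simply be routed to the owning service.
gateway.listen(4000);
```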
In conclusion, there is nothing wrong with this approach, especially considering that there are great tools like Swagger which help with building large REST APIs, but you still have to design, manage and implement the service APIs and the Gateway API separately.
The second option is different. The idea is to design a data graph which can be queried and manipulated with a language called GraphQL. Imagine it like SQL, but instead of talking to a database you send the statement to your backend.
Representation of a data graph
The following pseudo example query, me (name) boards (id, title), returns the name of the current user and its boards as a JSON object; a written-out version is sketched below.
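Written out, such a request and a possible response could look roughly like this (the field names follow the example above; the exact shape is an assumption):

```javascript
// GraphQL query sent by the client to the backend
const DASHBOARD_QUERY = `
  query {
    me {
      name
      boards {
        id
        title
      }
    }
  }
`;

// Possible JSON response returned by the backend
const exampleResponse = {
  data: {
    me: {
      name: "Alice",
      boards: [{ id: "1", title: "Algebra" }],
    },
  },
};
```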
With GraphQL, the client can precisely ask only for the data it needs, which can reduce network traffic a lot. The best part is that you already possess a data graph if you have created a domain model beforehand. You just have to merge the different contexts and attributes of the same objects to one coherent graph.
It gets even better: By using Apollo GraphQL Federation, most of the implementation for your Graph API is done automatically. For this to work, you only have to define the data graph for each service, which are just the contexts from your domain model, and setup the GraphQL API Gateway. The implementation is straightforward:
Write down your service graph in the GraphQL Schema Definition Language (SDL)
Implement the Resolvers, which are functions for populating a single attribute/field in your graph
Note: instead of requesting every field from the database separately, you can request the whole document and Apollo GraphQL generates the resolvers automatically
Implement the Mutation Resolvers, which are functions for updating and creating data
Tell the GraphQL API Gateway where to find its services
The Gateway then automatically merges the service schemas to one coherent schema and automatically collects the requested data from the implementing services. It is even possible to reference objects in other services and the Gateway will combine the data from different services to a single object.
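A condensed sketch of such a federated setup is shown below, using the Apollo packages roughly as they looked at the time of this project (apollo-server, @apollo/federation, @apollo/gateway); the schema is a made-up excerpt of the user context, and package names and APIs have evolved since.

```javascript
// ----- users service (one subgraph / bounded context) -----
const { ApolloServer, gql } = require("apollo-server");
const { buildFederatedSchema } = require("@apollo/federation");

// The service graph, written in the GraphQL Schema Definition Language (SDL)
const typeDefs = gql`
  type Query {
    me: User
  }

  type User @key(fields: "id") {
    id: ID!
    name: String
  }
`;

// Resolvers populate the fields of the graph (here with a hard-coded user)
const resolvers = {
  Query: {
    me: () => ({ id: "1", name: "Alice" }),
  },
};

new ApolloServer({ schema: buildFederatedSchema([{ typeDefs, resolvers }]) })
  .listen({ port: 4001 });

// ----- GraphQL API gateway (normally a separate application) -----
const { ApolloGateway } = require("@apollo/gateway");

// Tell the gateway where to find its services; it merges the schemas automatically.
const gateway = new ApolloGateway({
  serviceList: [
    { name: "users", url: "http://localhost:4001/graphql" },
    { name: "pinboards", url: "http://localhost:4002/graphql" }, // assumed second subgraph
  ],
});

new ApolloServer({ gateway, subscriptions: false }).listen({ port: 4000 });
```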
GraphQL Federation – Limitation
As good as Apollo GraphQL Federation sounds on paper, it is not an all-round solution. In reality, you will always have to clear many obstacles, no matter which decisions you make.
One technical limitation you might come across is when you try to delete a user. To do so, you have to decide which service defines and implements the deleteUser mutation. It is not possible (yet?) to define the same mutation in multiple services.
Because deleting a user also involves deleting its referenced pins and boards, the PinBoards service needs to be notified through an additional API, which is only accessible internally and not exposed to the client API.
Furthermore this additional API makes testing your service more complicated. Testing will be covered in detail in our example repository.
Architecture summary
Comparison table of different architectures
Git Workflow
In a successful project, smooth collaboration is one of the central factors for success, if not the central one. What is more annoying than merge conflicts or accidentally pushed changes that make the program crash? But we can take precautions to avoid problems like this during our project – let's start:
For local development, you all need a tool to make your git commits, push them, pull the others' changes, and so on.
For Windows systems and command-line lovers we recommend the git bash from https://gitforwindows.org/. If you prefer to click and see what you are doing, you can use https://www.sourcetreeapp.com or the integrations in e.g. IntelliJ or VSCode to organize your collaboration via git.
Settings
In our git repository, we first go to the project settings and activate Merge Requests and Pipelines. Afterwards the navigation sidebar has two new entries.
The settings page
Workflow
The following step-by-step instruction will give you a good workflow to work together with the configured git-Repository.
The workflow in simple view
Locally check out the master (or develop branch) via git checkout master or in Sourcetree.
Create a new branch and give it a meaningful name. A good way to keep your repository clean is to structure branch names like directories, for example your abbreviation plus the feature or the matching bug ticket if you use the ticket system in GitLab.
Start coding, fixing bugs, developing new features, and testing your code only locally on your machine, without crashing the master.
If your local changes work as intended, commit them with a meaningful commit message. Be careful not to commit unintentionally changed files. If you do not want to use Sourcetree, you can do it with the command git commit -a -m "commit-message".
Then push your changes to your branch with Sourcetree or with git push origin <feature-branch>.
Go to the web interface and create a new merge request. You can add not only a commit message but also further text, screenshots and files to illustrate your changes for your team members.
Inform your team members so they can look at your changes, comment on them, leave suggestions and finally approve them if everything is fine.
Merge your feature branch into the master or develop branch once other team members have given their approval.
If all tests are green and the build is OK, you can delete your extra branch so as not to leave behind too much clutter.
To work successfully with this model, keep a few points in mind:
always branch off from master
do not commit or push manually to the master
before a merge request, merge the newest master into your feature branch
not too many branches at the same time
small feature branches, no monsters…
name your branches sensibly
CI & CD Pipeline
What is DevOps?
The term DevOps is derived from the idea of agile software development and aims to remove silos in order to encourage collaboration between development and operations. From this principle the term DevOps = development + operations is derived (Wikipedia, 2020).
To achieve more collaboration, DevOps promotes a mentality of shared responsibility among team members. Such a team shares responsibility for maintaining a system throughout its life cycle. At the same time, each developer takes responsibility for his or her own code, from the early development phase to deployment and maintenance. The overall goal is to shorten the time between developing new code and it going live. To achieve this goal, all steps that were previously performed manually, such as software tests, are fully automated through the integration of a CI/CD pipeline.
The full automation allows a reduction of error-prone manual tasks like testing, configuration, and deployment. This brings certain advantages. On one hand, the SDLC (Software Development Life Cycle) is more efficient, and more automation frees team resources. On the other hand, automated scripts and tests serve as a useful, always up-to-date documentation of the system itself. This supports the idea of a pipeline as code (Fowler, 2017).
Figure 3 : The DevOps cycle. Source: Akamai (2020).
The structure of the CI/CD pipeline is defined within a YML file in the project's root and formulates so-called actions, action blocks, or jobs. Pipeline jobs are structured as blocks of shell commands, which make it possible, for example, to automatically download all necessary dependencies for a job and automatically execute scripts. A pipeline contains at least a build, a test, and a deployment job. All jobs are fully or partly automated (GitLab CI/CD pipelines, 2020); a partial automation of a job includes manual activation steps. The benefits of a pipeline integration are, among others, an accelerated code cycle time, reduced human error, and a fast, automated feedback loop to the developers themselves. In addition, costly problems when integrating new code into the current code base are reduced.
A pipeline is the high-level construct of continuous integration, delivery, and deployment. The jobs executed in a pipeline, from a code commit until its deployment in production, can be divided into three different phases, depending on the methodology to which they can be assigned. These are the three continuous methodologies:
Continuous Integration (CI)
Continuous Delivery (CD)
Continuous Deployment (CD)
While Continuous Integration clearly stands for itself, Continuous Delivery or Continuous Deployment are related terms, sometimes even used synonymously. However, there is a difference, which is also shown in Figure 4 and will be further explained in the following.
Figure 4: The three continuous methodologies. Source: RedHat (2020)
What is Continuous Integration (CI)?
The term Continuous Integration can be traced back to Kent Beck’s definition of the Extreme Programming (XP) process, which in turn is derived from the mindset of agile software development, which not only allows short cycle times but also a fast feedback loop (Fowler, 2017). Within such a software practice, each member of a team merges his/her work at least daily into the main branch of a source code repository. All integrations are automatically verified by an automated build which also includes testing and code quality checks, e.g. linting, to detect and fix integration errors early. This approach allows a team to develop software faster and reduces the time spent manually searching for and identifying errors.
Further, each team member bears the same responsibility for maintaining and fixing bugs in the software infrastructure and for maintaining additional tools such as the integrated CI/CD pipelines. Furthermore, a pipeline is not static but has to be monitored, maintained, and optimized iteratively with a growing codebase. To achieve high-quality software products, everyone must work well together in a team culture free of the fear of failure (Fowler, 2017).
Figure 5: Meme. Source: Memegenerator, Yoda (2020)
Some important CI Principles
Continuous Integration is accomplished by adhering to the following principles:
Regular integration to the mainline: New code needs to be integrated at least once per day.
Maintaining a single-source repository: Keep a stable, consistent base within the mainline.
Shared responsibility: Each team member bears the same responsibility to maintain the pipeline and project over its complete lifecycle.
Fix software in < 10min: Bugs have to be fixed as fast as possible by the responsible developer.
Automate the build: Test and validate each build.
Automate testing: Write automatically executable scripts and keep the testing pyramid in mind.
Build quickly: Keep the time to run a pipeline as minimal as possible.
Test in a clone: Always test in the intended environment to avoid false results.
What belongs in a Source Code Repository?
As the source code repository forms the base for the pipeline, it is important to keep it as complete as possible in order to be able to fulfill the CI jobs. As such, a source code repository for a CI/CD pipeline should always include:
Source code
Test scripts
Property files
Database schemas
Install scripts
Third-party libraries
Why Containerize pipeline jobs?
Some of you may have come across this problem before: “Defect in production? Works on my machine.” Such costly issues caused by different environments on different machines (e.g. the CI/CD server and the production server) can be prevented by ensuring that builds and tests in the CI pipeline as well as in the CD pipeline are always executed in a clone of the same environment. Here it is advisable to use Docker containers or virtualization. For virtualization, the use of a virtual machine can be enforced by running a Vagrant file that ensures the same VM setup across different machines. In short, automating containerized jobs standardizes the execution and ensures that no errors slip through due to differing environments in which builds and tests are executed.
Further, by combining the CI/CD pipeline with Docker, an integration between CI and CD is enabled. The advantage is that the same database software, the same versions, the same operating system version, all necessary libraries, the same IP addresses and ports, as well as the same hardware setting are provided and enforced throughout build, test, and deployment.
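As an illustration, a containerized GitLab CI job could be sketched like this; the job name, image, npm scripts, and runner tag are assumptions for illustration, not our actual configuration:

frontend-build:
  stage: build
  image: node:14-alpine       # every pipeline run uses the same clean container environment
  script:
    - npm ci                  # install dependencies reproducibly inside the container
    - npm run build           # build the application
  tags:
    - gitlab-runner-docker    # a runner registered with the Docker executor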
Figure 6: Meme 2. Source: Memegenerator (2020)
How to identify CI Jobs?
To identify pipeline jobs, you should, together with the team, consider which manual steps are currently performed frequently and repetitively and are therefore good candidates for automation. Such repetitive tasks can be, for example, testing, building, and deployment, as well as the installation of shared dependencies, or even clean-up tasks to free memory space after a build was executed. The number of jobs can be expanded as desired, but each job must remain self-contained. This means, for example, that a unit test job contains only the dependencies, shell commands, and unit test scripts that are needed to fulfill the unit test job.
Further, when defining the structure of a pipeline, it is important not to integrate any logic into the pipeline itself. This means that all logic should be outsourced into scripts, which the pipeline then executes automatically. Furthermore, pipelines need to be maintained and updated over the life cycle of a project to keep them up to date and prevent errors; in other words, pipelines accompany the whole software development life cycle (SDLC).
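For illustration, such an outsourced script could look like this minimal sketch (the script name and npm commands are assumptions, not taken from our project); the corresponding pipeline job then contains nothing but the call ./scripts/run_quality_checks.sh in its script section:

#!/bin/sh
# scripts/run_quality_checks.sh: called by the pipeline's quality job; the logic lives here, not in the YAML
set -e                  # abort on the first failing command
npm ci                  # install dependencies reproducibly
npm run lint            # static analysis / linting
npm run test:unit       # quick unit test run as part of the quality gate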
How to integrate Testing?
As already mentioned above (see chapter 1, Microservices): when writing test scripts, it is important to cover the entire test pyramid, from automated unit tests up to end-to-end tests. These test scripts can then be integrated into a CI pipeline’s test jobs.
Overall, testing is very important to avoid the time and cost of bugs and downtimes that only surface in production. When testing architectures such as microservices, care must be taken to test the individual services not only independently of each other but also in their overall composition. The challenge here is that there are also dependencies between the individual services.
Furthermore, the integration with a real database must also be tested. Since database schemas are not directly mapped into source code, it makes sense to formulate them in scripts against which tests can be executed. Overall, it makes sense not only to mock the database but also to test against a real database instance.
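GitLab CI supports this via the services keyword, which starts a throwaway database container next to the test job. A sketch under the assumption of a Node.js backend and PostgreSQL (image, credentials, and scripts are placeholders) could look like this:

backend-integration-test:
  stage: test
  image: node:14-alpine
  services:
    - postgres:12-alpine              # a real database is started alongside the job
  variables:
    POSTGRES_DB: testdb               # the postgres image reads these variables on startup
    POSTGRES_USER: tester
    POSTGRES_PASSWORD: secret
  script:
    - npm ci
    - npm run test:integration        # tests connect to the database via the host name "postgres"
  tags:
    - gitlab-runner-docker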
Figure 7: The three continuous methodologies. Source: RedHat (2020)
What is Continuous Delivery (CD)?
The term Continuous Delivery is based on the methodology of Continuous Integration and describes the ability to deliver source code changes, such as new features, configuration changes, bug fixes, etc., in a continuous, secure, and fast manner to a production-like environment or staging repository. This fast delivery approach can only be achieved by ensuring that the mainline code is always kept ready for production and as such can be delivered to an end-customer at any time. From this point on an application can be quickly and easily deployed into production. With this methodology, traditional code integration, testing, and hardening phases can be eliminated (Fowler, 2019).
Continuous Delivery follows five principles, which are also valid for the other methodologies:
Build quality in
Work in small batches
Computers perform repetitive tasks, people solve problems
Relentlessly pursue continuous improvement
Everyone is responsible
(Humble, 2017)
Figure 9: The three continuous methodologies. Source: RedHat (2020)
What is Continuous Deployment (CD)?
The term Continuous Deployment presupposes Continuous Delivery as a previous step, in which production-ready builds are automatically handed over to a code repository. Consequently, Continuous Deployment also builds on the methodology of Continuous Integration.
Continuous Deployment implies that each commit should be deployed and released to production immediately. To this end, the Continuous Deployment block in the pipeline automates code deployment via scripts to enable continuous and secure deployment. This follows the fail-fast pattern: errors that slipped past the automated test blocks into production and lead to an unhealthy server cluster are detected quickly. Because such an error can easily be correlated with the latest integration, it can be quickly reverted. This pattern keeps production downtimes to a minimum while code with new features goes live as quickly as possible. In practice, this means that a successful code change, from the commit to the branch until going live, only takes a few minutes (Fitz, 2008).
Figure 10: The GitLab CI/CD basic workflow. Source: GitLab (2020)
Automated rollbacks and incremental deployments
Even if automated end-to-end tests against a build in the CI pipeline have passed successfully, unforeseen bugs might occur in production. In such a case, an important capability of pipelines is the possibility of automated rollbacks. If deployed code reaches an unhealthy state in production, a fast rollback allows returning to the last healthy working state by automatically reverting a build. Another option is incremental deployments, which deploy new software to one node at a time, gradually replacing the application to gain more control and minimise risk (GitLab, 2020).
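A hedged sketch of such a safety net is a manual rollback job that re-deploys the previously tagged image; the registry path, ports, and container name below are assumptions, not our actual setup:

rollback-production:
  stage: production
  when: manual                                        # triggered by a developer if production becomes unhealthy
  script:
    - docker pull registry.example.com/app:previous   # fetch the last known healthy build
    - docker stop app || true                         # stop and remove the broken container, if present
    - docker rm app || true
    - docker run -d --name app -p 80:3000 registry.example.com/app:previous
  tags:
    - gitlab-runner-shell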
The relationship between Continuous Integration, Delivery, and Deployment
The following overview covers the workflow from the completion of a new feature in a working branch until its reintegration into the mainline. Figure 11 depicts this workflow on a high level as a pipeline.
Figure 11: Overview of the CI/CD Pipeline workflow. Source: SolidStudio (2020)
When a new code feature is committed to a remote feature branch, the CI pipeline carries out an automated build. Within this build process, the source code of the feature branch is automatically linted and compiled, the compilation is linked into an executable, and all automated tests are run by the pipeline. If all builds and tests run without errors, the overall build is successful.
Bugs or errors encountered throughout the CI pipeline will make the build fail and halt the entire pipeline. It is then the responsibility of the developer to fix all occurring bugs and repeat the process as fast as possible to be able to commit a new feature and merge it into the mainline to trigger the deployment process. Figure 12 shows such a CI/CD workflow between Continuous Integration, Delivery, and Deployment (Wilsenach, 2015).
Figure 12: The relationship between continuous integration, delivery and deployment. Source: Whaseem, M., Peng, L. and Shahin, M. (August 2020)
Why did we pick the GitLab CI/CD Pipeline?
In total there are many products on the market, from Jenkins, GitHub Actions, Drone CI, and CircleCI to AWS CodePipeline, etc. They all offer great services for DevOps integration. After researching some free tools, we decided to stay with GitLab for the university project “Schule 4.0”, as GitLab itself is a single DevOps tool which already covers all steps from project management and source code management to CI/CD, security, and application monitoring.
This choice was also made to minimize the tech stack. Since the university already offers accounts on its own GitLab instance for source code management and issue tracking, it made the most sense to us not to spread across too many different tools. Sticking with one product to integrate a custom CI/CD pipeline into our project meant having everything in one place, which reduces the complexity of the toolchain and furthermore allows us to speed up the cycle time.
What is a GitLab Runner?
To be able to execute CI/CD pipeline jobs, a runner needs to be assigned to a project. A GitLab Runner can either be specific to a certain project (Specific Runner) or serve any project in the GitLab CI (Shared Runner). Shared Runners are a good choice if multiple jobs with similar requirements must be run. In our case, as HdM has its own GitLab instance (at https://gitlab.mi.hdm-stuttgart.de/), a Specific Runner installed on our own server instance was the way to go.
The GitLab Runner itself is a Go binary that can run on FreeBSD, GNU/Linux, macOS, and Windows. Architectures such as x86, AMD64, ARM64, ARM, and s390x are supported. Further, to keep the build and test jobs in a simple and reproducible environment it is also recommendable to use a GitLab runner with a Docker executor to run jobs on your own images. It also comes with the benefit of being able to test commands on the shell.
Before installing a runner, it is important to keep in mind that runners should be isolated machines and therefore shouldn’t be installed on the same server where GitLab itself is installed. Further, if there is a need to scale out horizontally, it can also make sense to split jobs across multiple runners installed on different server instances, which can then execute jobs in parallel. If the jobs are relatively small, an installation on a Raspberry Pi is also a possible solution. This comes with the benefit of more control, higher flexibility and, importantly, lower costs.
Figure 13: Architecture for GitLab CI/CD in Schule 4.0. Source: Own graphic
For the project “Schule 4.0” we chose the architecture in figure 13 with two independent runners on two different machines. For this purpose, two runners were installed on an AWS EC2 Ubuntu 20.04 and in a VM on the HdM server instance to run the jobs for CI and CD separately.
Challenges and Limitations
The reason for splitting the CI and CD jobs was less horizontal scaling than the fact that the HdM server used for deployment is located in a virtual private network. Due to this isolation of the network, it is not possible to deploy directly from outside the network. Since we had no permissions on the HdM server for customizations, the resulting challenges could be circumvented with a second runner installed in a VM on the target machine.
Further, it is important to isolate the runner: when it is installed directly on a server that also serves as a production environment, unwanted problems can occur, e.g. due to ports that are already in use or duplicate Docker image tag names between the current build jobs and already deployed builds, which can cause failures and halt the pipeline.
To reduce costs, which in our case amounted to 250 US$ (luckily virtual play money) on our educational AWS accounts within only two months, the CI runner was also temporarily installed on a Raspberry Pi 3 B+ with a 32 GB SD card. It is important to note that a slow home network and large jobs have an impact on the Raspberry Pi’s overall performance and can make build and test jobs slow. This is fine for trying out the pipeline but takes too much time when developing in a team. Therefore, to speed things up, the runner for the CI jobs was later installed on a free tier AWS EC2 Ubuntu instance again.
How to install a GitLab Runner?
To get a GitLab Runner up and running, two main steps need to be followed: installing the runner on a suitable machine and registering it with the GitLab project.
A GitLab Runner can be installed on any suitable server instance, e.g. on an IBM or Amazon EC2 Ubuntu instance using the AWS Free Tier test account or an educational account. The following provides further information on how to get started with AWS EC2.
Another possibility is to install the runner on your own Raspberry Pi. The Raspberry Pi 3 B+ we used has a 32 GB SD card and runs the Raspbian Buster image, which can be downloaded from the official distribution. Be aware that following the standard runner installation instructions on GitLab might cause issues, as the Raspberry Pi 3 B+ uses a Linux ARMv7 architecture. A good tutorial can be found on the blog of Sebastian Martens. After installation, the runner will keep working even after completely rebooting your Raspberry Pi.
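As a rough sketch, installing and registering a runner on a Debian-based machine could look like the following; the URL is our HdM instance, while the token, description, tags, and default image are placeholders:

# install the GitLab Runner package (Debian/Ubuntu)
curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh | sudo bash
sudo apt-get install gitlab-runner

# register the runner as a Specific Runner with a Docker executor
sudo gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.mi.hdm-stuttgart.de/" \
  --registration-token "PROJECT_REGISTRATION_TOKEN" \
  --executor "docker" \
  --docker-image "alpine:3.12" \
  --description "ci-runner" \
  --tag-list "gitlab-runner-docker"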
1. Runner installation on an AWS EC2 instance
Advantages
Better network speed than in a private home network.
Faster build times.
Easy and quick setup of an instance.
Limitations
High costs for services, even with a free tier selection.
Limitations in server configuration due to free or educational accounts, e.g. memory < 10 GB, restrictions on server location that can lead to latency and timeouts.
The server can be over-configured and become unhealthy; all configurations are lost if no backup/clone of the instance has been made.
AWS offers quite complex and ambiguous documentation
In contrast to using a paid service, a more cost-effective solution may be to install the GitLab Runner inside a virtual machine on the HdM server, on your own Raspberry Pi, or on other services.
2. Runner installation on a RaspberryPi
Advantages
Full flexibility over available software and software versions
Costs are lower compared to leased servers including root access
Access data to the target infrastructure is available in the local network
Limitations
Takes longer to execute the pipeline
Local home-network speed can slow job execution down due to images, dependencies, git repository, etc. that need to be downloaded
Not well suited for the execution of very large jobs
It is advisable to use a Raspberry Pi 3 or newer
How does the GitLab Runner work?
To better understand how a GitLab Runner picks up jobs from the CI pipeline and returns the build and test results to a coordinator, the sequence diagram in figure 13 is explained in more detail below.
Figure 13: Sequence diagram GitLab Runner interaction with GitLab CI Server. Source: Evertse, Joost (August 2019, p. 56)
A Specific Runner, as used in our project, executes all jobs in the manner of a FIFO (first-in, first-out) queue. When the GitLab Runner starts, it tries to find the corresponding coordinator (the project’s GitLab server) by contacting the GitLab URL that was provided when the runner was registered. When the runner registers with the registration token, which is also provided at registration, it receives a special token with which it connects to GitLab. After a restart, the GitLab Runner connects and waits for requests from the GitLab CI.
A runner registered to a source code repository listens for change events on a particular branch, which cause it to fetch and pull the GitLab repository. The runner then executes the CI/CD jobs defined in the .gitlab-ci.yml file for the GitLab server. The build and test results, as well as logging information, are returned to the GitLab server, which displays them for monitoring purposes. If all jobs were executed successfully, each job in the pipeline receives a green symbol and the push or merge onto a branch can be completed.
Figure 15: Pipeline jobs successfully executed. Source: Own GitLab project pipeline
Figure 16: Pipeline jobs successfully executed. Source: GitLab (2020)
How to build a GitLab CI/CD Pipeline?
A GitLab CI/CD Pipeline is configured by a YAML file which is named .gitlab-ci.yml and lies within each project’s root directory. The .gitlab-ci.yml file defines the structure and order of the pipeline jobs which are then executed sequentially or in parallel by the GitLab Runner.
Introduction to a Pipeline Structure
Each pipeline configuration begins with jobs, which can be seen as bundled blocks of command-line instructions. Pipelines contain jobs, which determine what needs to be done, and stages, which define when the jobs are executed.
stages:
  # ------- CI ------
  - build
  - quality
  - test
  # ------- CD ------
  - staging
  - production
The stages use the stage-tags to define the order in which the individual pipeline blocks/jobs are executed. Blocks with the same stage-tag are executed in parallel. Usually there are at least the following stage-tags:
build – code is executed and built.
test – code testing, as well as quality checking by means of linting, etc.
The architecture shown in figure 20 is not very efficient, but it is the easiest to maintain. It is therefore shown, in combination with the following .gitlab-ci.yml example, as a basic example to understand the construction of a pipeline’s architecture. By defining relationships between jobs, a pipeline can be sped up.
The basic structure of an individual job/block within a pipeline includes the job name, a stage keyword with the stage-tag, and a script keyword with the executable scripts/commands. Each job must also contain a tag for the selected runner (here: gitlab-runner-shell). This structure can easily be extended.
Jobs can be initiated in different ways, depending on the tags assigned to them. For example, it can make sense not to fully automate a job. In this case a job, e.g. a deploy job, gets marked as a manual job and is thus forced to wait for manual approval and release by a developer; a sketch of this follows below.
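As a basic sketch following this structure (job names, scripts, and the deploy step are illustrative; only the stage-tags correspond to the stages defined above, and the manual release uses GitLab’s when: manual keyword):

backend-build:
  stage: build
  script:
    - docker build -t backend ./backend    # the executable commands of the job
  tags:
    - gitlab-runner-shell                  # selects the registered runner

deploy-staging:
  stage: staging
  when: manual                             # partly automated: waits for manual release by a developer
  script:
    - ./scripts/deploy_staging.sh
  tags:
    - gitlab-runner-shell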
Efficiency and speed are very important when running the jobs through a CI/CD Pipeline. Therefore it is important to think not only about the architecture but also consider the following concepts to speed things up.
Host your own GitLab Runner: often the bottleneck is not necessarily the hardware but the network. While GitLab’s Shared Runners are quick to use, the network of a private cloud server is faster.
Pre-install dependencies: downloading all needed dependencies for each CI job is laborious and takes a lot of time. It makes sense to pre-install all dependencies in your own Docker image and push it to a container registry, so it can be fetched from there when needed. Another possibility is to cache dependencies locally.
Use slim Docker images: rather use a tiny Linux distribution for images that execute a CI job than a fully blown-up one with dependencies you might not even need. In our project we therefore used an Alpine Linux distribution.
Cache dynamic dependencies: if dependencies need to be installed dynamically during a job and thus can’t be pre-installed in your own Docker image, it makes sense to cache them. GitLab’s cache keyword allows caching dependencies between job runs.
Only run a job if relevant changes happened: this is very useful, especially for a project with a microservice architecture. For example, if only the front-end changed, the build and test jobs for all the other services don’t need to run as well. Such behavior can be achieved by using the only keyword in a job. The following gives a short example.
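A sketch of what such a rule can look like (the job name, script, and paths are placeholders):

frontend-test:
  stage: test
  script:
    - ./scripts/run_frontend_tests.sh
  only:
    changes:
      - frontend/**/*          # run this job only when files of the frontend service changed
  tags:
    - gitlab-runner-shell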
More detailed information can be found in the Example Repository’s README files.
Useful tools to be considered for building a GitLab CI/CD pipeline
Avoid syntax errors by using CI linting. In order to avoid syntax errors and get things right from the start, GitLab offers a web-based linting tool which checks for invalid syntax in the .gitlab-ci.yml file. To use it, simply append /-/ci/lint to your project’s URL in GitLab.
As DevOps and CI/CD are quite popular and complex topics, we also want to introduce some further possibilities to optimize your DevOps setup, for example the following:
CI with Linting & Testing – Drone CI
Deployment with Jenkins
Linting with Sonarqube
Monitoring and Logging with GitLab CI/CD
Part 2 – “getting started” repository
When you’re done getting comfortable with the topics, it is time to see an example implementation of the theory above. Head over to our “getting started” repository containing a microservice application and further explanations.
DevOps is a big buzzword for many complex tools and practices that can make your project easier to work with. It’s a long way from a simple project idea to a working infrastructure, but it will help you work more easily and efficiently on your project in the future. For all the described tools and use cases there are many alternatives you can use.
In a good software project, all the sections above are important: how to design your software, how to work together, how to make the development workflow easy to use, and how to bring your project from your local machine to a reachable server.
With this blog post and the additional repository content you have an overview of the possibilities, along with a set of instructions on how to start with DevOps in your own project.