Security and Usability: How to design secure systems people can use.

Security has become increasingly important as technology advances. Unfortunately, this creates a conflict with usability: security tends to make operations harder, whereas usability is supposed to make them easier. Many people are convinced that there is an inherent tradeoff between the two, which results either in secure systems that are not usable or in usable systems that are not secure. Although developers still struggle with this tradeoff, the view itself is outdated. There are approaches that help to design secure systems people can actually use.


Convenient internet voting using blockchain technology


The use of digital technology has probably never been as widespread or as convenient as it is today. People use the internet to access encyclopedias, look up food recipes and share pictures of their pets. It doesn’t matter whether you are at home, standing in an aisle at the grocery store or even flying on an airplane. Our devices provide nearly unlimited access to modern technology and have changed the way we do things. For instance, it now takes us minutes, sometimes even seconds, to buy products online or quickly check the balance of our bank accounts, whereas those things used to require you to at least leave the house for some time. In some cases, we have even reduced our involvement in buying products to simply pushing a button. In comparison to the older methods, this seems like a huge improvement. And it is. But maybe not in all regards.


Multiplayer TypeScript Application run on AWS Services

Benjamin Janzen

The project

CatchMe is a location-based multiplayer game for mobile devices. The idea stems from the classic board game Scotland Yard and is basically a modern version of hide & seek. You play outside in a group of up to 5 players, where one of the players is chosen to be the “hunted”. This player’s goal is to escape the other players. Through the app, the hunted player can constantly see the movement of the pursuers, while the other players can only see the hunted player at set intervals.

The backend of the game builds on Colyseus, a multiplayer game server for Node.js, which we have adjusted to our needs. There’s a lobby, from which the players can connect into a room with other players and start the game.

How does Tor work?

Written by Tim Tenckhoff – tt031 | Computer Science and Media

1. Introduction

The mysterious dark part of the internet, hidden in the depths of the world wide web, is well known as a lawless space for shady online drug deals and other criminal activities. But in times of continuous tracking on the Internet, personalized advertising and digital censorship by governments, the (almost) invisible part of the web also promises to bring back lost anonymity and privacy. This blog post aims to shed light on the dark corners of the deep web and primarily explains how Tor works.

Reference: Giphy, If Google was a person: Deep Web
  1. Introduction
  2. The Deep Web
    2.1 What is the Tor Browser?
  3. The Tor-Network
    3.1 Content
    3.2 Accessing the Network
    3.3 Onion Routing – How Does Tor Work?
  4. Conclusion – Weaknesses
  5. References

2. The Deep Web

So, what exactly is the deep web? To explain this, it makes sense to look at the overall picture. The internet as most people know it forms only a small proportion of the overall 7.9 zettabytes (1 ZB = 1000^7 bytes = 10^21 bytes = 1,000,000,000,000,000,000,000 bytes = 1 trillion gigabytes) of data available online (Hidden Internet 2018). This huge amount of data can be separated into three parts:

Separation of the worldwide web, Reference: (Search Engines 2019)

As seen in the picture above, only about 4% of this data is accessible through search engines like Google or Bing. The remaining 96% (90% + 6%) is protected by passwords, hidden behind paywalls, or can only be accessed via special tools (Hidden Internet 2018). But what, by definition, separates the hidden parts into Deep Web and Dark Web?

The Deep Web refers to data that is not indexed by standard search engines such as Google or Yahoo. This includes all web pages that search engines cannot find, such as user databases, registration-required web forums, webmail pages, and pages behind paywalls. Thus, the Deep Web can, of course, contain content that is totally legal (e.g. governmental records). The Dark Web is a small subset of the Deep Web. Its websites exist only on an encrypted network that cannot be reached with regular browsers (such as Chrome, Firefox, or Internet Explorer). This makes the area well suited to cybercrime. Accessing these dark websites requires the Tor Browser.

…hidden crime bazaars that can only be accessed through special software that obscures one’s true location online.

– Brian Krebs, Reference: (Krebs On Security 2016)

2.1 What is the Tor Browser?

The pre-alpha version of Tor was released in September 2002 (Onion Pre Alpha 2002), and the Tor Project, the organization maintaining Tor, was started in 2006. The name Tor is an abbreviation of The Onion Router. The underlying onion routing protocol was initially developed in the mid-1990s at the U.S. Naval Research Laboratory (Anonymous Connections 1990). The protocol describes a technique for anonymous communication over a public network: each message is encapsulated in several layers of encryption, and the traffic is redirected through a free, worldwide overlay network. It is called onion routing because these layers of encryption resemble the layers of an onion. Developed as free and open-source software for enabling anonymous communication, the Tor Browser still follows the intended use today: protecting personal privacy and communication by shielding internet activities from being monitored.

With the Tor Browser, nearly anyone can get access to The Onion Router (Tor) network simply by downloading and running the software. The browser does not need to be installed on the system and can be unpacked and carried as portable software on a USB stick (Tor Browser 2019). As soon as this is done, the browser is able to connect to the Tor network, a network of many servers called Tor nodes. While surfing, the traffic is wrapped in one layer of encryption per node on the route, and each node removes one layer. Only at the last server in the chain, the so-called exit node, is the data stream fully decrypted and routed normally via the Internet to the target server whose address was entered in the address bar of the Tor Browser. In concrete terms, the Tor Browser first downloads a list of all available Tor servers and then defines a random route from server to server for the data traffic, which is the onion routing mentioned before. These routes consist of a total of three Tor nodes, with the last server being the Tor exit node (Tor Browser 2019).

Connection of a Web-Client to Server via Tor Nodes, Reference: (Hidden Internet 2018)

Because traffic to an onion service runs across multiple servers of the Tor network, the traces that users usually leave while surfing with a normal Internet browser, or while exchanging data such as email and messenger messages, become blurred. Even when the payload of normal Internet traffic is encrypted, e.g. via HTTPS, the header containing source, destination, size, timing etc. can still be observed by attackers or Internet providers. Onion routing, in contrast, also obscures the IP address of Tor users and keeps their computer’s location anonymous. To continuously disguise the data route, a new route through the Tor network is chosen every ten minutes (Tor Browser 2019). The exact functionality of the underlying encryption is described later in the section Onion Routing – How Does Tor Work?.

3. The Tor-Network

For those concerned about the privacy of their digital communications in times of large-scale surveillance, the Tor network provides strong obfuscation. The following section explains which content can be found on websites hidden in the dark web, how the multi-layered encryption works in detail, and what kind of anonymity it actually offers.

3.1 Content


Most of the content in relation to the darknet involves nefarious or illegal activity. With the provided possibility of anonymity, there are many criminals trying to take advantage of it. This results in a large volume of darknet sites revolving around drugs, darknet markets (sites for the purchase and sale of services and goods), and fraud. Some examples found within minutes using the Tor browser are listed in the following:

  • Drug or other illegal substance dealers: Darknet markets (black markets) allow the anonymous purchase and sale of illegal drugs and other controlled substances such as pharmaceuticals. Almost everything can be found here, quite simply in exchange for bitcoins.
  • Hackers: Individuals or groups looking for ways to bypass and exploit security measures, for their personal benefit or out of anger at a company or action (Krebs On Security 2016), communicate and collaborate with other hackers in forums, share exploits (the use of a bug or vulnerability to gain access to software, hardware, data, etc.) and brag about attacks. Some hackers offer their services in exchange for bitcoins.
  • Terrorist organizations use the network for anonymous Internet access, recruitment, information exchange and organisation (What is the darknet?).
  • Counterfeiters offer document forgeries and currency imitations via the darknet.
  • Merchants of stolen information offer e.g. credit card numbers and other personally identifiable information, which can be ordered for theft and fraud activities.
  • Weapon dealers: Some dark markets allow the anonymous, illegal purchase and sale of weapons.
  • Gamblers play or connect in the darknet to bypass their local gambling laws.
  • Murderers/assassins: Although it is disputed whether these services are real, created by law enforcement, or just fictitious, some dark websites offer murder for hire.
  • Providers of illegal explicit material, e.g. child pornography: We will not go into detail here.

Screenshot of the infamous Silk Road (platform for selling illegal drugs, shut down by the FBI in October 2013), Reference: (Meet Darknet 2013)

But the same anonymity also has a bright side: freedom of expression. It makes it possible to speak freely without fear of persecution in countries where this is not a fundamental right. According to the Tor Project, hidden services allowed dissidents in Lebanon, in Mauritania and during the Arab Spring to host blogs in countries where the exchange of those ideas would be punished (Meet Darknet 2013). Some other use cases are:

  • To use it as a censorship circumvention tool, to reach otherwise blocked content (in countries without free access to information)
  • Socially sensitive communication: Chat rooms and web forums where rape and abuse survivors or people with illnesses can communicate freely, without being afraid of being judged.

A further example of that is the New Yorker’s Strongbox, which allows whistleblowers to upload documents and offers a way to communicate anonymously with the magazine (Meet Darknet 2013).

3.2 Accessing the Network

The hidden sites of the dark web can be accessed via special onion domains. These addresses are not part of the normal DNS, but can be interpreted by the Tor Browser if they are sent into the network through a proxy (Interaction with Tor 2018). In order to create an onion domain, a Tor daemon first creates an RSA key pair, calculates the SHA-1 hash of the generated public RSA key, truncates it to 80 bits, and encodes the result as a 16-character base32 string (e.g. expyuzz4wqqyqhjn) (Interaction with Tor 2018). Because onion domains are derived directly from their public key, they are self-certifying: if a user knows a domain, they automatically know the corresponding public key. Unfortunately, onion domains are therefore difficult to read, write, or remember. In February 2018, the Tor Project introduced the next generation of onion domains, which are 56 characters long, use a base32 encoding of the public key, and include a checksum and version number (Interaction with Tor 2018). The new onion services also use elliptic curve cryptography, so that the entire public key can now be embedded in the domain, whereas previous versions could only hold a hash of it. These changes enhanced the security of onion services, but the long and unreadable domain names hurt usability again (Interaction with Tor 2018). It is therefore common practice to repeatedly generate RSA keys until the domain happens to contain a desired string (e.g. facebook). Such vanity onion domains look like this for e.g. Facebook (facebookcorewwwi.onion) or the New York Times (nytimes3xbfgragh.onion) (Interaction with Tor 2018). In contrast to the rest of the World Wide Web, where navigation is primarily done via search engines, the darknet often contains pages with lists of these domains for further navigation. The darknet deliberately tries to hide from the eyes of the searchable web (Meet Darknet 2013).
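The two address formats described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are made up, the key bytes are placeholders rather than real RSA or elliptic curve keys, and the exact checksum construction for the new format (SHA3-256 over the string ".onion checksum", the key and the version byte) is taken from my reading of the Tor specification, not from the sources cited in this post.

```python
import base64
import hashlib


def old_onion_address(public_key_bytes: bytes) -> str:
    """Old format: SHA-1 of the public key, truncated to 80 bits, base32 encoded."""
    digest = hashlib.sha1(public_key_bytes).digest()
    truncated = digest[:10]  # 10 bytes = 80 bits -> 16 base32 characters
    return base64.b32encode(truncated).decode("ascii").lower() + ".onion"


def new_onion_address(public_key_bytes: bytes) -> str:
    """New format: base32 of public key (32 bytes) + checksum (2 bytes) + version (1 byte)."""
    if len(public_key_bytes) != 32:
        raise ValueError("expected a 32-byte public key")
    version = b"\x03"
    # Assumed checksum construction, see the lead-in above.
    checksum = hashlib.sha3_256(
        b".onion checksum" + public_key_bytes + version
    ).digest()[:2]
    encoded = base64.b32encode(public_key_bytes + checksum + version)
    return encoded.decode("ascii").lower() + ".onion"


# Placeholder key material, NOT real keys.
print(old_onion_address(b"placeholder RSA public key bytes"))
print(new_onion_address(bytes(32)))
```

Because the address is computed from the key itself, anyone who knows the address can check that a key presented by the service matches it. Generating a vanity address simply means repeating the key generation until the resulting string starts with the desired text.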

3.3 Onion Routing – How Does Tor Work?

So how exactly does the anonymizing encryption technology behind onion routing work? As said before, the Tor Browser chooses an encrypted path through the network and builds a circuit in which each onion router only knows its predecessor and its successor, but no other nodes in the circuit. Tor uses the Diffie-Hellman algorithm to generate keys between the user and the different onion routers in the network (How does Tor work 2018). The algorithm is one application of public-key cryptography, which relies on two mathematically linked keys:

  1. A public key: public and visible to others
  2. A private key: private and kept secret

The public key can be used to encrypt messages, and the private key is in turn used to decrypt the encrypted content. This implies that anyone is able to encrypt content for a specific recipient, but only this recipient can decrypt it again (How does Tor work 2018).
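To make the key generation a bit more tangible, here is a toy Diffie-Hellman exchange in Python. It is a sketch under assumptions, not Tor's actual handshake: the modulus (the Mersenne prime 2^127 - 1), the generator 3 and all function names are chosen for illustration only, and the parameters are far too small for real security. The point is that both sides end up with the same secret although only public values cross the network.

```python
import hashlib
import secrets

# Toy parameters, chosen for readability. NOT secure.
P = 2**127 - 1  # a known Mersenne prime
G = 3


def dh_keypair():
    """Return (private, public) where public = G^private mod P."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def dh_shared_key(own_private: int, other_public: int) -> bytes:
    """Combine the own private value with the other side's public value."""
    shared = pow(other_public, own_private, P)
    # Hash the shared number to get a fixed-length symmetric key.
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


client_private, client_public = dh_keypair()
relay_private, relay_public = dh_keypair()

client_key = dh_shared_key(client_private, relay_public)
relay_key = dh_shared_key(relay_private, client_public)
assert client_key == relay_key  # both sides derived the same key
```

In a circuit, the client would repeat such an exchange once per onion router, so it ends up with one separate key for each of the three nodes.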

Tor uses 3 nodes by default, so 3 layers of encryption are required to encrypt a message (How does Tor work 2018). It is important to note that every single Tor packet (called a cell) is exactly 512 bytes large. This prevents attackers from guessing which cells carry larger content, e.g. images or media (How does Tor work 2018). At every node the message reaches, one layer of encryption is removed, revealing the address of the next node in the circuit. As a result, no single node in the circuit knows both where a message originated and where its final destination is (How does Tor work 2018). A simplified visualization of this procedure can be seen in the picture below.

Removing one layer of encryption in every step to the next node, Reference (How does Tor work 2018)
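The layering shown in the picture can be imitated with a small Python sketch. Everything here is an assumption made for illustration: the node names, the "|" separator, and especially the cipher, which is a toy XOR keystream built from SHA-256 and not the encryption Tor really uses. It also ignores the fixed cell size. What it does show is that the client wraps the message once per node and that each node can remove exactly one layer and learns only the next hop.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHA-256 based keystream (NOT secure)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message: bytes, route: list, destination: str) -> bytes:
    """Client side: add one layer per node, starting with the innermost (exit) layer."""
    next_hops = [name for name, _ in route[1:]] + [destination]
    onion = message
    for (_, key), next_hop in zip(reversed(route), reversed(next_hops)):
        onion = keystream_xor(key, next_hop.encode() + b"|" + onion)
    return onion


def peel(onion: bytes, key: bytes):
    """Node side: remove one layer, revealing the next hop and the remaining onion."""
    plain = keystream_xor(key, onion)
    next_hop, _, rest = plain.partition(b"|")
    return next_hop.decode(), rest


# Hypothetical circuit with one key per node (e.g. from a key exchange).
route = [(name, secrets.token_bytes(16)) for name in ("guard", "middle", "exit")]
onion = wrap(b"GET / HTTP/1.1", route, "example.org")

for name, key in route:
    next_hop, onion = peel(onion, key)
    print(f"{name} forwards to {next_hop}")
```

After the exit node has removed the last layer, `onion` holds the original request again, which is why the traffic between the exit node and the target server is no longer protected by Tor's encryption.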

But how does the network allow different users to connect without knowing each other’s network identity? The answer is onion services, formerly known as hidden services, which rely on so-called “rendezvous points” (Onion Service Protocol 2019). The following steps are mainly extracted and summarized from the official Tor documentation on the onion service protocol (Onion Service Protocol 2019) and describe the technical details of how this is made possible:

Step 1: Before a client is able to contact an onion service in the network, the service needs to advertise its existence. To do so, the service randomly selects relays in the network and asks them to act as introduction points by sending them its public key. The picture below shows these circuit connections in the first step as green lines. It is important to mention that these lines mark Tor circuits and not direct connections. The full three-hop circuit makes it hard to associate an introduction point with the IP address of an onion server: even though the introduction point is aware of the onion server’s identity (public key), it never learns the onion server’s location (IP address) (Onion Service Protocol 2019).

Step 1: Reference: (Onion Service Protocol 2019)

Step 2: The service creates a so-called onion service descriptor that contains its public key and a summary of each introduction point (Onion Service Protocol 2019). This descriptor is signed with the private key of the service and then uploaded to a distributed hash table in the network. If a client requests an onion domain as described in the section Accessing the Network, the respective descriptor is found. If e.g. “abc.onion” is requested, “abc” is a 16- or 56-character string derived from the service’s public key, as seen in the picture below.

Step 2: Reference: (Onion Service Protocol 2019)

Step 3: When a client wants to contact an onion service, it initiates the connection by downloading the descriptor from the distributed hash table as described before. If a descriptor exists for the address abc.onion, the client receives the set of introduction points and the respective public key. This action can be seen in the picture below. At the same time, the client establishes a circuit to another randomly selected node in the network and asks it to act as a rendezvous point by submitting a one-time secret key (Onion Service Protocol 2019).

Step 3: Reference: (Onion Service Protocol 2019)

Step 4: Now the client creates a so-called introduce message (encrypted with the public key of the onion service), containing the address of the rendezvous point and the one-time secret key. This message is sent to one of the introduction points, with the request to deliver it to the onion service. Because this communication again runs over a Tor circuit, it is not possible to uncover the client’s IP address and thus its identity.

Step 4: Reference: (Onion Service Protocol 2019)

Step 5: At this point, the onion service decrypts the introduce message, which includes the address of the rendezvous point and the one-time secret key. The service is then able to establish a circuit to the now revealed rendezvous point and communicates the one-time secret to that node in a rendezvous message. In doing so, the service sticks to the same set of entry guards when creating new circuits (Onion Service Protocol 2019). Thanks to this technique, an attacker cannot run their own relay and force the onion service to create an arbitrary number of circuits in the hope that the corrupt relay is eventually selected as the entry node. This attack scenario, which is able to break the anonymity of onion services, was described by Øverlier and Syverson in their paper (Locating Hidden Servers 2006).

Step 5: Reference: (Onion Service Protocol 2019)

Step 6: As seen in the last picture below, the rendezvous point informs the client about the successfully established connection. Afterwards, both the client and the onion service are able to use their circuits to the rendezvous point to communicate. The (end-to-end encrypted) messages are forwarded through the rendezvous point from the client to the service and vice versa (Onion Service Protocol 2019). The initial introduction circuit is never used for the actual communication, mainly for one important reason: a relay should not be attributable to a particular onion service. The rendezvous point is therefore never aware of the identity of any onion service (Onion Service Protocol 2019). Altogether, the complete connection between client and onion service consists of six nodes: three selected by the client, the third of which is the rendezvous point, and three selected by the service.

Step 6: Reference: (Onion Service Protocol 2019)
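The six steps can be condensed into a small in-memory simulation. This is a strongly simplified sketch with made-up names (`Descriptor`, `RendezvousPoint`, `run_handshake`): there is no encryption, no signing and no real network, and a plain dictionary stands in for the distributed hash table. It only shows who hands which piece of information to whom.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    public_key: str
    introduction_points: list


@dataclass
class RendezvousPoint:
    waiting: dict = field(default_factory=dict)  # one-time secret -> client

    def establish(self, secret: bytes, client: str) -> None:
        """Step 3: the client registers its one-time secret."""
        self.waiting[secret] = client

    def join(self, secret: bytes, service: str):
        """Step 5/6: the service presents the secret; a match links both circuits."""
        client = self.waiting.pop(secret, None)
        return None if client is None else (client, service)


def run_handshake():
    relays = [f"relay{i}" for i in range(10)]
    hash_table = {}  # stand-in for the distributed hash table

    # Step 1: the service picks introduction points at random.
    intro_points = secrets.SystemRandom().sample(relays, 3)
    # Step 2: the service publishes its descriptor (signing omitted).
    hash_table["abc.onion"] = Descriptor("service-public-key", intro_points)

    # Step 3: the client fetches the descriptor and sets up a rendezvous point.
    descriptor = hash_table["abc.onion"]
    rendezvous = RendezvousPoint()
    one_time_secret = secrets.token_bytes(20)
    rendezvous.establish(one_time_secret, "client")

    # Step 4: the introduce message goes to one introduction point
    # (it would be encrypted with descriptor.public_key).
    used_intro_point = descriptor.introduction_points[0]
    introduce_message = {"rendezvous": rendezvous, "secret": one_time_secret}

    # Step 5: the service connects to the rendezvous point with the secret.
    link = introduce_message["rendezvous"].join(introduce_message["secret"], "abc.onion")

    # Step 6: both circuits are now joined at the rendezvous point.
    return used_intro_point, link


print(run_handshake())
```

Note that the rendezvous point only ever sees the one-time secret, never the descriptor, which mirrors the statement above that it is not aware of the identity of the onion service.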

4. Conclusion – Weaknesses

Contrary to what many people believe (How does Tor work 2018), Tor is not a completely decentralized peer-to-peer system. If it were, it would not be very useful, as the system requires a number of directory servers that continuously manage and maintain the state of the network.

Furthermore, Tor is not secured against end-to-end correlation attacks. While it does provide protection against traffic analysis, it cannot and does not attempt to protect against monitoring of traffic at the boundaries of the Tor network (the traffic entering and exiting the network), a problem that cyber security experts have not yet been able to solve (How does Tor work 2018). Researchers from the University of Michigan even developed a network scanner that identified 86% of the live Tor “bridges” worldwide with a single scan (Zmap Scan 2013). Another disadvantage of Tor is its speed: because the data packets are sent through a number of randomly chosen nodes, each of which could be anywhere in the world, using Tor is very slow. Despite its weaknesses, the Tor Browser is an effective, powerful tool for protecting the user’s privacy online, but it is good to keep in mind that a Virtual Private Network (VPN) can also provide security and anonymity without the significant speed decrease of the Tor Browser (Tor or VPN 2019). If total obfuscation and anonymity play a decisive role regardless of performance, a combination of both is recommended.

5. References

Hidden Internet [2018], Manu Mathur, Exploring the Hidden Internet – The Deep Web [Online]
Available at:
[Accessed 27 August 2019].

Search Engines [2019], Julia Sowells, Top 10 Deep Web Search Engines of 2017 [Online]
Available at:
[Accessed 24 July 2019].

Krebs On Security [2016], Brian Krebs, Krebs on Security: Rise of Darknet Stokes Fear of The Insider [Online]
Available at:
[Accessed 14 August 2019].

Anonymous Connections [1990], Michael G. Reed, Paul F. Syverson, and David M. Goldschlag, Naval Research Laboratory, Anonymous Connections and Onion Routing [Online]
Available at:
[Accessed 18 August 2019].

Onion Pre Alpha [2002], Roger Dingledine, pre-alpha: run an onion proxy now! [Online]
Available at:
[Accessed 18 August 2019].

Tor Browser [2019], Heise Download, Tor Browser 8.5.4 [Online]
Available at:
[Accessed 29 August 2019].

Interaction with Tor [2018], Philipp Winter, Anne Edmundson, Laura M. Roberts, Agnieszka Dutkowska-Zuk, Marshini Chetty, Nick Feamster, How Do Tor Users Interact With Onion Services? [Online]
Available at:
[Accessed 16 August 2019].

What is the darknet?, Darkowl, What is THE DARKNET? [Online]
Available at:
[Accessed 22 August 2019].

Meet Darknet [2013], PCWorld: Brad Chacos, Meet Darknet, the hidden, anonymous underbelly of the searchable Web [Online]
Available at:
[Accessed 23 August 2019].

Onion Service Protocol [2019], Tor Documentation, Tor: Onion Service Protocol [Online]
Available at:
[Accessed 8 July 2019].

How does Tor work [2018], Brandon Skerritt, How does Tor *really* work? [Online]
Available at:
[Accessed 8 July 2019].

Locating Hidden Servers [2006], Lasse Øverlier, Paul Syverson, Locating Hidden Servers [Online]
Available at:
[Accessed 8 August 2019].

Zmap Scan [2013], Peter Judge, Zmap’s Fast Internet Scan Tool Could Spread Zero Days In Minutes [Online]
Available at:
[Accessed 21 August 2019].

Tor or VPN [2019], Bill Man, Tor or VPN – Which is Best for Security, Privacy & Anonymity? [Online]
Available at:
[Accessed 8 August 2019].

Cloudbased Image Transformation


As part of the lecture “Software Development for Cloud Computing”, we had to come up with an idea for a cloud-related project we’d like to work on. I had just heard about Artistic Style Transfer using Deep Neural Networks in our “Artificial Intelligence” lecture, which inspired me to choose image transformation as my project. However, having no idea about the cloud environment at that time, I didn’t know where to start or what was possible. A few lectures in, I had heard about Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Function as a Service (FaaS). Out of those three I liked the idea of FaaS the most. Simply upload your code and it works. Hence, I went with Cloud Functions in IBM’s Cloud Environment. Before I present my project I’d like to explain what Cloud Functions are and how they work.

What are Cloud Functions?

Choose one of the supported programming languages. Write your code. Upload it. And it works. Serverless computing. That’s the theory behind Cloud Functions. You don’t need to bother with infrastructure. You don’t need to bother with load balancers. You don’t need to bother with Kubernetes. And you definitely do not have to wake up at 3 am and race to work because your servers are on fire. All you do is write the code. Your cloud provider manages the rest. My cloud provider of choice was IBM.

Why IBM Cloud Functions?

Unlike Google and Amazon, IBM offers FREE student accounts. No need to deposit any kind of payment option upon creation of your free student account either. Since I had no experience using any cloud environment, I didn’t want to risk accidentally accumulating a big bill. Our instructor was also very familiar with the IBM Cloud, so in case I needed support I could always have asked him as well.

What do IBM Cloud Functions offer?

IBM offers a Command Line Interface (CLI), a nice user interface on their cloud website that is accessible using the web browser of your choice, and very detailed documentation. You can check and, if you feel like it, write or edit your code using the UI. The only requirement for your function is that it has to take a JSON object as input and return a JSON object as well. You can also test the function directly inside the UI: simply change the input, declare an example JSON object you want to run it with, then invoke your function. Whether the call failed or succeeded, the activation ID, the response time, results, and logs, if enabled, are then displayed directly. You can add default input parameters or change your function’s memory limit and timeout on the fly. Each instance of your function will then use the updated values.
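To illustrate the “JSON in, JSON out” requirement, here is a minimal Python action. It assumes the usual OpenWhisk convention that the entry point is a function called `main` which receives the input as a dictionary and returns a dictionary; the parameter name `name` and the returned field are invented for this example.

```python
def main(params):
    """Entry point: receives the input JSON as a dict, returns a dict."""
    # "name" is a hypothetical input parameter with a default value.
    name = params.get("name", "world")
    return {"greeting": f"Hello, {name}!"}


# Local test run, roughly what "invoke" in the UI does with an example input.
print(main({"name": "Cloud Functions"}))
```

Default input parameters configured in the UI would simply show up as additional keys in `params`.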

Another nice feature of IBM Cloud Functions is Triggers. You can connect your function with different services and, once they fire a trigger, your function is executed, whether someone pushed new code to your GitHub repository or someone updated your Cloudant database (IBM’s database service).

You can also create a chain of Cloud Functions. The output of function 1 will then be the input of function 2.

IBM Cloud Functions uses the Apache OpenWhisk service, which packs your code into a Docker container in order to run it. If you have more than one source file, or need additional dependencies, you can pack everything into a Docker image or, for some languages such as Python or Ruby, into a zip file. In order to do that in Python, you need a virtual environment created with virtualenv, then zip the virtualenv folder together with your Python files. The resulting zip files and Docker images can only be uploaded using the CLI.

You can also enable your function as a Web Action, which allows it to handle HTTP events. Since the link automatically provided by enabling a function as a web action ends in .json, you might want to create an API definition. This can be done with just a few clicks. You can even import an OpenAPI definition in YAML or JSON format. Binding an API to a function is as simple as defining a base path for your API, giving it a name and creating an operation. For example: API name: Test, base path for API: /hello, and for the operation we define the path /world, select our action and set the response content type to application/json. Now, whenever we call <domain>/hello/world, we call our Cloud Function using our REST API. Using the built-in API Explorer we can test it directly. If someone volunteers to test the API for us, we can also share the API Portal link with them. Adding a custom domain is also easily done by entering the domain name, the certificate manager service and the certificate in the custom domain settings.
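As a sketch of what the action behind /hello/world could look like, the following Python function returns an HTTP-style result. It assumes the OpenWhisk web action convention of returning `statusCode`, `headers` and `body` fields; whether all of them are honoured depends on how the action and the API are configured, and the query parameter `name` is made up.

```python
def main(params):
    """Hypothetical web action bound to the operation /hello/world."""
    name = params.get("name", "world")  # e.g. from ?name=... in the URL
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": {"message": f"Hello, {name}!"},
    }


response = main({"name": "API"})
print(response["statusCode"], response["body"])
```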

Finally, my Project

Architecture of the Image Transformation Service

The idea was:

A user interacts with my GitHub Page, selects a filter, adds an Image, tunes some parameters, then clicks confirm. The result: They receive the transformed image.

The GitHub Page is written in HTML, CSS and JavaScript. It sends a POST request to the API I defined, which is bound to my Cloud Function, written in Python. The function receives information about the chosen filter, the set parameters and a link to the image (for the moment, only JPEG and PNG are allowed). It then processes the image and returns the resulting PNG base64 encoded. The base64 encoded data is then embedded in the HTML page, where the user can save the image.
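The return path can be sketched as follows. The helper names and the response field `image` are assumptions, not the names used in the project; the snippet only shows how binary PNG data becomes a JSON-compatible string and how that string can be embedded in a page as a data URI.

```python
import base64


def build_response(png_bytes: bytes) -> dict:
    """Wrap binary PNG data in a JSON-compatible dict (field name is hypothetical)."""
    return {"image": base64.b64encode(png_bytes).decode("ascii")}


def to_data_uri(response: dict) -> str:
    """What the page can put into the src attribute of an <img> element."""
    return "data:image/png;base64," + response["image"]


# Placeholder bytes (just the PNG file signature), not a complete image.
fake_png = b"\x89PNG\r\n\x1a\n"
response = build_response(fake_png)
print(to_data_uri(response))
```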

The function currently has three options:

You can transform an image into a greyscale representation.

Left: Original Image by Johannes Plenio, Right: black and white version

You can upscale an image by a factor of two, three or four

Left: Original Image by Johannes Plenio, Right: Upscaled by a factor of 2

and you can transform an image into a Cartoon representation.

Left: Original Image by Helena Lopes, Right: Cartoon version

Cartoon images are characterized by clear edges and homogeneous colors. The Cartoon filter first creates a grayscale image and median blurs it, then detects the edges using an adaptive threshold, which currently still has a predefined window size and threshold. It then median filters the colored image and performs a bitwise AND operation between every RGBA color channel of the median filtered color image and the detected edges.
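The pipeline can be written down in plain Python on a tiny image. This is a sketch for illustration, not the project code, which uses Pillow and numpy: images are nested lists of (r, g, b) tuples without an alpha channel, the grayscale weights, the window radius and the constant `c` are assumed values, and the adaptive threshold is a simple "local mean minus c" variant.

```python
from statistics import median


def to_gray(img):
    """Rows of (r, g, b) tuples -> rows of gray values (common luma weights)."""
    return [[int(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row] for row in img]


def window(channel, y, x, radius):
    """Values in the square window around (y, x), clipped at the image border."""
    height, width = len(channel), len(channel[0])
    return [channel[j][i]
            for j in range(max(0, y - radius), min(height, y + radius + 1))
            for i in range(max(0, x - radius), min(width, x + radius + 1))]


def median_filter(channel, radius=1):
    return [[int(median(window(channel, y, x, radius)))
             for x in range(len(channel[0]))] for y in range(len(channel))]


def adaptive_threshold(gray, radius=1, c=5):
    """255 where a pixel is brighter than (local mean - c), else 0 (an edge)."""
    result = []
    for y in range(len(gray)):
        row = []
        for x in range(len(gray[0])):
            values = window(gray, y, x, radius)
            local_mean = sum(values) / len(values)
            row.append(255 if gray[y][x] > local_mean - c else 0)
        result.append(row)
    return result


def cartoon(img, radius=1, c=5):
    edges = adaptive_threshold(median_filter(to_gray(img), radius), radius, c)
    channels = [median_filter([[px[k] for px in row] for row in img], radius)
                for k in range(3)]
    # Bitwise AND: 255 keeps the color, 0 turns the pixel into a black edge.
    return [[tuple(channels[k][y][x] & edges[y][x] for k in range(3))
             for x in range(len(img[0]))] for y in range(len(img))]
```

Exposing `radius` and `c` as parameters is what would let a user tune the window size and the threshold instead of relying on predefined values.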

Dis-/ Advantage using (IBM) Cloud Functions

Serverless infrastructure was fun to work with. No need to manually set up a server, secure it, etc. Everything is done for you; all you need is your code, which scales to 10,000+ parallel instances without issues. Function calls themselves don’t cost that much either. IBM’s base rate is currently $0.000017 per second of execution, per GB of memory allocated. 10,000,000 executions per month with 512 MB action memory and an average execution time of 1,000 ms only cost $78.20 per month, including the 400,000 GB-s free tier. Another good feature was being able to upload zip packages and Docker images.
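The $78.20 figure can be reproduced with a short calculation. The function name is made up and the rate and free tier are the values quoted above, so treat this as a back-of-the-envelope check rather than an official price calculator.

```python
RATE_PER_GB_SECOND = 0.000017   # USD, base rate quoted above
FREE_TIER_GB_SECONDS = 400_000  # free GB-s per month quoted above


def monthly_cost(executions: int, memory_gb: float, seconds_per_execution: float) -> float:
    """Cost in USD: billable GB-seconds times the base rate."""
    gb_seconds = executions * memory_gb * seconds_per_execution
    billable = max(0.0, gb_seconds - FREE_TIER_GB_SECONDS)
    return billable * RATE_PER_GB_SECOND


# 10,000,000 executions x 0.5 GB x 1 s = 5,000,000 GB-s,
# minus 400,000 GB-s free tier = 4,600,000 billable GB-s.
print(f"${monthly_cost(10_000_000, 0.5, 1.0):.2f}")
```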

Those could only be uploaded using the CLI, though, which is a bit of a hassle as a Windows user. But one day I’ll finally set up the 2nd boot image on my desktop PC. One day. Afterwards, no need for my VM anymore.

The current code size limit for IBM Cloud Functions is 48 MB. While this seems plenty, any modules you used to write your code that are not included by default in IBM’s runtime need to be packed with your source code. OpenCV was the module I used before switching over to Pillow and numpy, since OpenCV offers a bilateral filter, which would have been a better option than a median filter for creating the color image of the Cartoon filter. Sadly, it is 125 MB large, and still 45 MB packed. That was sadly still too much, given that the real limit is 36 MB after factoring in the base64 encoding of the binary files. The 550 MB VGG16 model I initially wanted to use for an artistic style transfer neural network as a possible filter option would not have fit either. I didn’t like the input and output being limited to JSON either. Initially, before using the GitHub Page, the idea was to have a second Cloud Function return the website. This was sadly not possible. The limited selection of predefined runtimes and modules is also more of a negative point. One could always pack their code with modules in a Docker image or zip, but being able to just upload a requirements.txt and have the cloud automatically download those modules would have been way more convenient. My current solution returns a base64 encoded image. Currently, if someone tries to upscale a large image and the result exceeds 5 MB, it returns an error saying “The action produced a response that exceeded the allowed length: –size in bytes– > 5242880 bytes.”
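The step from 48 MB to 36 MB follows from how base64 works: every 3 bytes of binary data become 4 characters, so only three quarters of an encoded size limit are available for the raw data. The small sketch below, with made-up helper names, checks this against the numbers mentioned above.

```python
import base64
import math

CODE_LIMIT_MB = 48
RESPONSE_LIMIT_BYTES = 5 * 1024 * 1024  # 5242880, as in the error message


def base64_size(raw_bytes: int) -> int:
    """Length of the base64 encoding (with padding) of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


def max_raw_size(encoded_limit: float) -> float:
    """Largest raw size whose base64 encoding still fits into encoded_limit."""
    return encoded_limit * 3 / 4


print(max_raw_size(CODE_LIMIT_MB))       # effective limit for binary files in MB
print(45 > max_raw_size(CODE_LIMIT_MB))  # the packed OpenCV module does not fit

# Sanity check of the formula against the standard library.
assert base64_size(10) == len(base64.b64encode(bytes(10)))
```

The same factor applies to the response: a returned image may only be about 3.75 MB in binary form before its base64 representation exceeds the 5 MB limit, ignoring the small overhead of the surrounding JSON.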

What’s the Issue?

Because GitHub Pages does not set Cross-Origin Resource Sharing (CORS) headers, calling the function from the hosted page does not work at the moment. CORS is a mechanism that allows a web application to request resources from an origin other than its own. A workaround my instructor suggested was to create a simple Node.js server that adds the missing CORS headers. The problem showed up as only GET requests being logged in the Cloud API summary, to which the API responded with a 500 Internal Server Error. I read up on it, found out that the headers need to be set by the server, and then spent what felt like ages troubleshooting: adding headers to the jQuery AJAX call, enabling cross-origin on it, trying to work around it by setting the dataType to jsonp, even uploading the Cloud Function and API again. I also created a test function and bound it to the API (which worked, by the way, both as POST and GET, with no CORS errors whatsoever… until I replaced the code). I’m still pretty happy it works with this little workaround now. Thank you again for the suggestion!
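To make concrete what “adding the missing CORS headers” means, here is a small Python sketch. It is not the Node.js workaround itself: it only shows the headers a browser looks for on a cross-origin response. The response shape, the origin and the helper name are assumptions made for illustration.

```python
# Hypothetical helper; the actual workaround was a small Node.js server.
ALLOWED_ORIGIN = "https://example.github.io"   # placeholder origin


def with_cors(response: dict, origin: str = ALLOWED_ORIGIN) -> dict:
    """Return a copy of response with the CORS headers a browser checks for."""
    headers = dict(response.get("headers", {}))
    headers.update({
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
    })
    return {**response, "headers": headers}


original = {"statusCode": 200, "body": {"image": "<base64 data>"}}
reply = with_cors(original)
print(reply["headers"]["Access-Control-Allow-Origin"])
```

Without an Access-Control-Allow-Origin header that matches the page’s origin, the browser refuses to hand the response to the calling script, even if the server processed the request.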

Other than that, I spent more time than I’m willing to admit trying to find out why I couldn’t upload my previous OpenCV code solution. Rewriting my function as a result was also a rather interesting experience.

Future Improvements?

I could give the user more options for the Cartoon filter. The adaptive threshold has a threshold limit, which could easily be exposed to the user. An option to change the window size could also be added, maybe in steps?

I could always add new filters as well. I like the resulting image of edge detection using a Sobel operator. I thought about adding one of those.

Finding a way to host the website with a provider that adds CORS headers, so that interested people can try a live demo and play around with it, would be an option as well.

What I’d really like to see would be the artistic style transfer uploaded. I might be able to create it using IBM Watson, then add it as a sequence to my service. I dropped this idea previously because I had no time left to spare trying to get it to work.

Another option would be allowing users to upload files, instead of just providing links. Similar to this, I can also include a storage bucket, linked to my function in which the transformed image is saved. It then returns the link. This would solve the max 5 MB response size issue as well.


Cloud Functions are really versatile; there’s a lot one can do with them. I enjoyed working with them and will definitely make use of them in future projects. The difference in execution time between my CPU and the CPUs in the cloud environment was already noticeable for the little code I had. Being able to just call the function from wherever is also pretty neat. I could create a cross-platform application which saves, deletes and accesses data in an IBM Cloudant database using Cloud Functions.

Having no idea about Cloud Environments in general a semester ago, I can say I learned a lot and it definitely opened an interesting, yet very complex world I would like to learn more about in the future.

Finally, all code used is provided in my GitHub repository. If you are interested, feel free to drop by and check it out. Instructions on how to set everything up are included.

About the Robustness of Machine Learning


In the past couple of years research in the field of machine learning (ML) has made huge progress which resulted in applications like automated translation, practical speech recognition for smart assistants, useful robots, self-driving cars and lots of others. But so far we only have reached the point where ML works, but may easily be broken. Therefore, this blog post concentrates on the weaknesses ML faces these days. After an overview and categorization of different flaws, we will dig a little deeper into adversarial attacks, which are the most dangerous ones.

Continue reading

2 player Connect 4 in the cloud

Play Connect 4 here

Annika Strauß – as324
Julia Grimm – jg120
Rebecca Westhäußer – rw044
Daniel Fearn – cf056


As a group of four students with little to no knowledge of cloud computing our main goal was to come up with a simple project which would allow us to learn about the basics of software development for cloud computing. We had decided a simple game would do the trick. And to make it a little more challenging it should be a two player online game. First we thought of Tic Tac Toe but that seemed too simple. Then we took a look at the Chinese game of Go, but that was too complicated. In the end we agreed on Connect4. Not too simple. Not too complicated.

Getting started / Prerequisites / Tech Choices

In our first group meeting we sat down and brainstormed on all the requirements, features and technologies we would require to realize the game. We also tried to avoid coming up with too many additional features that would be nice to have but exceed our possible workload. The main focus was to get something running in the cloud.

Our version of Connect Four therefore should have a simple user login, so that one can play a game session with other players, interrupt the game and come back later to finish where one left off, which also means game sessions need to be saved between two players. We thought about matching players via a matching algorithm, in order to ensure that players of about equal strength get matched, but then we realized that was way too much effort and our focus really should stay on getting something done. So we decided to simply make it a random match-up, or to let players add friends and connect via a friend list, since this is a study project, not a commercial game.

Of course we don’t really assume that millions of people are going to play our game at once, but we want to learn about scalability, so we are going to act as if no one has ever played it before and everyone in the world is going to act like it’s the new Pokémon Go. In reality it will probably just be the four of us and whomever we show it to.

We already knew beforehand that we would want to program the game in Python, simply because it is getting pretty popular in the web application field and we want to get some experience with it. Also, we finally want to learn something other than Java. But we also want to make sure Python actually is a good choice, so we’re going to check the pros and cons just to be sure and possibly decide on something else later.

We also agreed that we want to use a NoSQL database for saving game sessions, simply because it can store arrays and we want to get more experience with it. Again, we must check first if it’s a viable option.

The next question we needed to ask was what infrastructure, what platform we would put this on. AWS? Our own hardware? IBM Cloud? What are the options there? Also, what requirements does our game bring with it for the platform? Do we need microservices? Should we use Docker? How is the scalability going to be handled? So many questions. It was time to ask the Master for some wisdom. So we talked to lecturer Thomas Pohl, who gave us some very useful insights.

Luckily for us IBM Cloud is available for free for students. And as Thomas works for IBM and has plenty of experience with it, it’s kind of a no brainer to go with IBM Cloud. As we found out, IBM Cloud already provides a great deal of infrastructure and platform, all handled by IBM, which allows us to simply focus on deployment. The service we are looking at in particular is called Cloud Foundry. It handles all the scaling, load balancing and everything automagically for you in the background, so that we can focus on simply getting our game running. It comes with a great variety of tools for almost any technology requirements we desire.

This is definitely the sandbox we want to be playing in. With some help from Thomas we came up with this relatively simple architecture:

So what exactly do we require from the cloud foundry? To answer this we first need to ensure we have all the technology choices figured out. Meaning, programming language, database, user login authentication and so on.

First we’ll start with the programming language. Python. Is it a good choice?

To get a clear impression of whether or not to use Python, some research is necessary. After reading some online articles and watching some videos, we came to the conclusion that it is a good choice. Why? Similar to Java, it is an interpreted language and is easy to use. It is portable and has a huge library and lots of prewritten functions. The main drawback is that it is not too mobile friendly, but we’re not worried about that. And it is so simple that other languages may seem too tedious in comparison. But it will get the job done quickly and it is simple to maintain. Debugging works very well, it has built-in memory management and, most importantly, it can be deployed in Cloud Foundry. So, Python is definitely a winner. Other languages may also be a good choice, but the point is not to find the best language, but to confirm that Python is not a bad choice. Because we want to learn Python, and we just want to be sure it’s not a waste of time. And from what I’ve read, Python is a pretty good choice.

SQL or NoSQL? We want to use NoSQL, but does that make sense? To answer this question we need to have a look at what kind of data we are actually storing and what the advantages and disadvantages are of using NoSQL and see how it compares to SQL.

SQL is great if one has complex data that is very interwoven and a write would mean updating several different places, which SQL manages well by merely linking the data via relations instead of duplicating it. NoSQL stores all relevant data in one place, which makes reading really fast, but making any changes can mean having to replace all duplicates of the data. So NoSQL is efficient if changes only need to take place in one area.

Now, the data we are storing is basically just a game session. This mainly consists of two users, which will not likely change. Then the game state, which only changes in the saved game. And once the game is done, the entire game session no longer needs to be stored anyway and can probably be deleted.

The player data will be coming from a UserDataBase supplied by App ID, a user login service provided by IBM. Feeding this data into either DB type should be no problem. But the main reason for using a NoSQL database remains arrays. As we will be storing the game progress in the form of arrays, getting them translated into something a normal SQL database could handle would be way too tedious, whereas NoSQL can store arrays without a problem.
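To illustrate why arrays matter here, the following sketch shows what a game session could look like as a single JSON document. It assumes a standard 6×7 board, and the field names are made up for illustration. Only “_id” is the usual document key in Cloudant.

```python
import json

ROWS, COLS = 6, 7   # standard Connect 4 board (assumed, not stated in the post)


def new_session(game_id: str, player_one: str, player_two: str) -> dict:
    """Build a game-session document with an empty board (0 = empty cell)."""
    return {
        "_id": game_id,
        "players": [player_one, player_two],
        "next_player": player_one,
        "board": [[0] * COLS for _ in range(ROWS)],
    }


session = new_session("gameID123", "userAbc", "userXyz")
session["board"][ROWS - 1][3] = 1      # player one drops a disc into column 3

# The nested list survives the round trip through JSON unchanged.
as_json = json.dumps(session)
restored = json.loads(as_json)
print(restored["board"][ROWS - 1])     # [0, 0, 0, 1, 0, 0, 0]
```

The whole board travels as one nested list inside the document, so no extra table or mapping layer is needed to save or restore a game.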

We’ve also just touched on the topic of handling user accounts. As already mentioned, IBM provides a service called App ID. Rather than using a social media login service such as the one Facebook provides or, worse, coming up with a whole system on our own, we were very happy to discover this tool already existed in the IBM Cloud Foundry. So we gladly decided to use that.

Getting the show on the road. Sort of.

Now that we had decided on all of our technical resources it was time to actually make it all. So, we all created our IBM Cloud accounts and started setting everything up. Now we had to look around Cloud Foundry and figure out where and how we would find all the services we needed and which ones were the right ones for us. Our main services needed to be:
a place for the main python app to live – Python Cloud Foundry
a NO-SQL database – Cloudant
a user login manager – AppID

Everything was relatively simple to find and set up. Our main app would be living in the Python Web App with Flask, a basic web-serving application. For the NoSQL database we would use IBM’s Cloudant. For the login of users we set up AppID. Now, I’m not going to go into detail on how we set it up, since it was pretty simple and anyone with a basic understanding of clicking through a webpage could have done most of this.

Everything we needed was in place. Our little playground was ready. Now came the really fun part. Actually writing the code. And with it, all the problems we needed to overcome, which would help us learn the ins and outs of cloud computing.

Random Problems we encountered

Merging Cloud Foundry Python Server with our Python-Flask Server

IBM Python Cloud Foundry delivers the following

We wanted to load our Flask templates instead. How to do that? The starter code sets the folder “static” as the starting point, which contains the static index.html. But just pointing the path at the templates is not enough, because of the way Flask is initialized. Could we just replace server.js with a Python-Flask server? What in server.js do we need so that the server still runs in the cloud?
We picked out the port and added it in our app.js (Python-Flask server).

Read port selected by the cloud for our application:

Adding the port in the app start:

So we added this in the first row:

Success! It worked!

Couldn’t start the app from the Cloud anymore
The app wouldn’t start anymore after pushing it to the cloud. We got an error message: “[errno 99] cannot assign requested address in namespace”.
This meant that our app couldn’t be found under its URL in the cloud. The mistake was that we were using “localhost:8000” when running locally, which of course doesn’t work in the cloud. What was the correct address in the cloud? Adding a host argument to the app start solved the issue. Now we could run the app both from the cloud and locally.

If you bind to localhost, you can only connect to the service locally. An address that is not ours cannot be bound; we can only bind IPs which belong to our computer.
We CAN bind the address that stands for all IPs on our computer, though, so that the service can be reached on every one of them.
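The two fixes described above, reading the port assigned by the cloud and listening on all interfaces, can be sketched as follows. This is an illustration under assumptions, not the project’s original code. It assumes the platform passes the port in the PORT environment variable, as Cloud Foundry does. The wildcard address 0.0.0.0 is an assumption, since the actual address is missing from the text above, and the helper name is made up.

```python
import os
from typing import Mapping, Tuple


def bind_address(env: Mapping[str, str] = os.environ, default_port: int = 8000) -> Tuple[str, int]:
    """Return the (host, port) pair the web server should listen on."""
    # In the cloud the platform assigns the port; locally we fall back to 8000.
    port = int(env.get("PORT", default_port))
    # Assumed value: 0.0.0.0 means "all interfaces of this machine".
    host = "0.0.0.0"
    return host, port


host, port = bind_address()
# With Flask, the pair would be passed on as: app.run(host=host, port=port)
print(host, port)
```

Binding to “localhost” only accepts connections from the same machine, which is why it works on a laptop but not behind the cloud’s router.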

Templates not found!!!
While two of us were working with Visual Studio Code, the templates for the front end html stuff were working quite nicely locally. But one of us was using PyCharm. And PyCharm did not know where the templates were and kept saying, wrong path, template not found.
On our quest for answers we were victorious and found ProfHase85 in one of the threads we came across. We followed his wisdom and did as he said: “Just open the project view (View –> Tool Windows –> Project). Once there, thou shalt right-click on your templates folder. Not left-click. Not double-click. And most certainly not center-click. No. Thou shalt right-click on it. There you will find Mark as Directory and from there you will find the Template Directory. It is there that thee will find salvation. There you will set the path of your template and all shall come to life.”
So, that worked great. Thank thee, ProfHase85.

The NO-SQL Cloudant DB

When getting our Cloudant DB running we immediately ran into several problems. When we tried to create the DB, as it said in the tutorial from IBM, nothing happened. We then found that the code checks to see if the instance is a cloud instance. And we were trying to run the code locally. So we needed to push our code first and run it from the cloud.
Then we needed to enter the domain name in the manifest.yml, which took us a while to find.
Then everything decided to freeze any time the password was being prompted. Apparently we had made a mistake when configuring the authentication method while creating the DB: we had said to only use IAM. So we went back and created the database again, and during setup set the authentication method to use both legacy credentials and IAM. Now the missing entry was in our service credentials, and creating the DB and connecting to the Cloudant service finally worked.

Using the DB statically as well as dynamically
Now we were able to use Cloudant locally and in the cloud and could add static data in form of documents to our DB. The next problem we were facing was getting data like the username out of the user input into the document and making that data accessible again. Unfortunately our course on accessibility didn’t help in this case.
Unfortunately the documentation on what is possible with Cloudant didn’t seem to be very expansive.
Functions such as checking if a file exists were only possible locally, but not from the Python-Cloudant extension in the cloud. After several days of trial and error we finally had the idea that maybe it was the access control that was causing this. Maybe we needed to use IAM to access the Cloudant DB from the Python-Cloudant extension.

After a small issue with finding the right username we tried to connect using:

But we were greeted with: Error: type object ‘Cloudant’ has no attribute ‘iam’

IAM requires at least Python-Cloudant version 2.9.0, and for whatever reason the version we had in our requirements was 2.3.1. Problem solved. Connection finally established.
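A small guard like the following would have pointed at the outdated requirement straight away. It is only a sketch: the version parsing handles plain “x.y.z” strings, the helper names are made up, and the account name and API key passed to connect_with_iam would be placeholders for your own service credentials.

```python
MIN_IAM_VERSION = "2.9.0"   # minimum python-cloudant version named above


def parse_version(text: str) -> tuple:
    """Turn '2.9.0' into (2, 9, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def supports_iam(installed: str) -> bool:
    """True if the given python-cloudant version is new enough for IAM."""
    return parse_version(installed) >= parse_version(MIN_IAM_VERSION)


def connect_with_iam(account_name: str, api_key: str):
    """Connect via IAM; requires python-cloudant >= 2.9.0 to be installed."""
    from cloudant.client import Cloudant   # third-party, imported lazily
    return Cloudant.iam(account_name, api_key, connect=True)


print(supports_iam("2.3.1"))   # False
print(supports_iam("2.9.0"))   # True
```

Comparing tuples of integers rather than raw strings matters here, because as strings “2.10.0” would sort before “2.9.0”.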
And then the next problem came flying along. When updating a file:
409 Client Error: Conflict document update at
What? OK. More reading up to do. Went and read through this article:
This didn’t get us much further, but it seemed better to use document.update_field() rather than trying to write to the DB directly, in order to avoid conflicts from simultaneous calls.
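The idea behind such a conflict and the retry that avoids it can be shown without a real database. The sketch below uses an in-memory stand-in with revision numbers. FakeStore and this update_field function are illustrative only and are not the Python-Cloudant implementation.

```python
class Conflict(Exception):
    """Raised when a write is based on an outdated revision (think HTTP 409)."""


class FakeStore:
    """In-memory stand-in for a document database with revision checks."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        return rev, dict(body)

    def put(self, doc_id, body, rev=None):
        current_rev = self._docs.get(doc_id, (None, None))[0]
        if rev != current_rev:
            raise Conflict(doc_id)          # someone else wrote in between
        self._docs[doc_id] = ((current_rev or 0) + 1, dict(body))


def update_field(store, doc_id, field, value, max_tries=10):
    """Re-read the document and retry whenever the write hits a conflict."""
    for _ in range(max_tries):
        rev, body = store.get(doc_id)
        body[field] = value
        try:
            store.put(doc_id, body, rev)
            return body
        except Conflict:
            continue
    raise Conflict(f"gave up on {doc_id} after {max_tries} tries")


store = FakeStore()
store.put("games:gameID123", {"next_player": "userAbc"})
stale_rev, stale_body = store.get("games:gameID123")   # a second client reads
update_field(store, "games:gameID123", "next_player", "userXyz")
```

A write that still carries stale_rev would now be rejected, which is the situation the 409 error reports. Fetching the latest revision right before writing, and retrying on failure, keeps that window small.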

How to sort data in CloudantDB
Three SQL programmers went into a NO-SQL bar.
They came back out after five minutes because they couldn’t find a table.

In Cloudant data is stored in completely independent documents. This makes everything more flexible, but also very cluttered and difficult to differentiate when reading. Without any kind of sorting, all data needs to be searched for a specific ID.
For our project we needed the following data structures:

We had to differentiate between users and game sessions. How could we accomplish this in Cloudant?

Use views?
A view makes a query quick and easy. But anytime a document gets updated, so does the whole view, which is counterproductive with big data sets.

We found out that one can create a partitioned DB in Cloudant by naming the IDs as follows:

In our case:
games:gameID123 and users:userAbc

This way one can add the partition to the queries, resulting in much better search performance. And also the DB looks a lot tidier.
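The naming scheme can be illustrated with plain Python. The sketch below only mimics locally what a partitioned query does on the server, and the helper names are made up. The IDs follow the “partition:key” pattern from the example above.

```python
def make_id(partition: str, key: str) -> str:
    """Build a document ID of the form 'partition:key'."""
    return f"{partition}:{key}"


def partition_of(doc_id: str) -> str:
    """Return the part of the ID before the first colon."""
    return doc_id.split(":", 1)[0]


def in_partition(docs, partition: str):
    """Keep only the documents whose ID belongs to the given partition."""
    return [doc for doc in docs if partition_of(doc["_id"]) == partition]


docs = [
    {"_id": make_id("games", "gameID123"), "players": ["userAbc", "userXyz"]},
    {"_id": make_id("users", "userAbc"), "name": "Abc"},
    {"_id": make_id("users", "userXyz"), "name": "Xyz"},
]

users = in_partition(docs, "users")
print([doc["_id"] for doc in users])   # ['users:userAbc', 'users:userXyz']
```

In the real database the partition is passed along with the query, so only that slice of the data has to be searched.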

Search query example:

And that was that.

AppID – The “simple” user login service for web apps

IBM offers a nice little web app called AppID. Easy to integrate. Made for the cloud. Great security features. Easy, right? Well, they have all the code you need for Java or Node.js. But not Python. So, a few more lines of code, research and effort. How hard can it be?
AppID is based on OIDC (OpenID Connect). Since we used Python we needed to fall back on Flask-pyoidc. This module is an OIDC client for Python and the Flask framework which interacts with AppID for authentication.

Configuring the OpenID Connect Client

The metadata in “appIDinfo” serves as input for configuring the OIDC client.

Securing Web Routes
After configuration, the OpenID client can be used to secure single pages or sections (“routes”) of the web app. This is achieved by attaching a decorator to the route definition:

“@auth.oidc_auth” ensures that the code only gets executed for authenticated users.
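How a decorator can guard a route is easiest to see in a stripped-down form. The sketch below is not Flask-pyoidc’s implementation, and it involves neither Flask nor OIDC. It is a plain-Python stand-in in which a dictionary plays the role of the user session, and all names are made up.

```python
from functools import wraps

session = {}   # stand-in for the per-user web session


def login_required(view):
    """Only run the wrapped view for authenticated users."""
    @wraps(view)
    def wrapper(*args, **kwargs):
        if "user" not in session:
            # A real OIDC client would redirect to the identity provider here.
            return "redirect to login", 302
        return view(*args, **kwargs)
    return wrapper


@login_required
def profile():
    return f"Hello {session['user']}", 200


before_login = profile()        # ('redirect to login', 302)
session["user"] = "userAbc"
after_login = profile()         # ('Hello userAbc', 200)
print(before_login, after_login)
```

The view function itself stays free of any login logic. The check runs first and decides whether the view is called at all.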

The first problems with AppID already arose when establishing a connection. First we tried connecting with a direct approach via the create button, which showed a connection in the browser, but not when pushing the app to the cloud. So we created the service again directly in the project via the command line. And voilà. The next test push got a connection.

Creating an instance of the AppID service
We created it with “ibmcloud resource service-instance-create connect4AppID appid lite eu-gb”. After that, an alias of the service instance has to be created in Cloud Foundry, which we did with “ibmcloud resource service-alias-create connect4AppID --instance-name connect4AppID”.

And we had finally established a connection between our app and the AppID service. Seemed like things were coming along. But then of course we encountered the next problem. Turns out the redirect_uri doesn’t work with secured connections.

And then the next problem was that the AppID login widget was probably not going to work with our Python-Flask app either. So, we decided not to use AppID after all. Instead we created our own user login in python.

Sometimes something that looks like plug and play turns out to be plug and pray. And in this case our prayers weren’t heard. But now we know that one needs to thoroughly check the capabilities of these services before trying to implement them.

The heart of the game/Game Engine

Probably the most challenging part of creating this game was writing the main application, as the first slew of questions arose. How do you write a game for two players? How do we connect the database? How do we make it refresh when a player makes a move so the opponent can see it? How do we make taking turns work, so the opponent is blocked from making another move?

Reloading the window after a player made a move
After some online research we figured out we could use a socket server to handle the multiplayer functionality. But that seemed like way too much overhead as it meant possibly having to learn an entire new framework.
The first issue we needed to tackle was getting the web page to update/reload for both players any time one of them made a move.
With Python-Cloudant one can listen to changes in the DB, but unfortunately this loop blocks all other actions in Python. Were cloud functions maybe the answer? They are like serverless event listeners: the function gets triggered when a watched event occurs. And fortunately there was even a quickstart template available from IBM Cloud. You can create an action sequence and a trigger on the DB. We would need to call a Flask template in the cloud function, but it was unclear whether that was possible. So we tried a Python-Cloudant-only approach instead. Same as before, but this time asynchronous, so that the feed can run continuously and listen for changes in the DB.
But now the problem was that the asynchronous loop, which waits for changes in the DB, cannot be executed at the same time as the return render_template, so it blocks. It blocks on the server side, which causes the website to freeze.
According to a post on Stack Overflow, threading is a better solution: one can daemonize a thread and thereby make it run in the background.
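The threading idea can be sketched with the standard library alone. A queue stands in for the database’s change feed here, and all names are made up. This is not the approach the project ended up with.

```python
import queue
import threading

changes = queue.Queue()   # stand-in for the database's change feed
seen = []


def watch_changes():
    """Block on the feed forever; this runs in the background thread only."""
    while True:
        change = changes.get()      # blocks until a change arrives
        seen.append(change)
        changes.task_done()


# daemon=True: the thread never keeps the process alive on its own.
watcher = threading.Thread(target=watch_changes, daemon=True)
watcher.start()

# The main thread stays free, e.g. to render templates and answer requests.
changes.put({"id": "games:gameID123", "seq": 1})
changes.join()            # demo only: wait until the change has been handled
print(seen)
```

Because the blocking loop lives in its own thread, the code that renders the page is no longer held up by it.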
But, then it was time for a new approach. Getting a better understanding. What is a feed? What is a trigger? Several documentations and coffees later we finally came up with a proper solution.

The Solution to Cloud Functions
After enough reading we finally uncovered the proper way to do this.
Under select “Trigger”
1. Create new Trigger. Select a premade Cloudant Trigger. Select your DB and the actions selected in the next step will be triggered anytime something changes in the DB.
2. Create the actions, which are supposed to be triggered.
3. Create a sequence, which executes the actions in order.
4. Add the actions to the sequence.
5. Add the sequence to the trigger.
6. Call the API link from getDocument in JavaScript: wait with async/await and recursive timeout functions until the trigger fires. If the DB change occurs during the current game, the waiting player’s window gets reloaded.
As we’ve never done anything with async await, cloud functions or APIs we definitely learned a lot getting this done.
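The waiting logic from step 6 can be sketched in Python with asyncio. The project’s client code is JavaScript and uses recursive timeouts, whereas this illustration uses a plain loop. fetch_revision is a hypothetical coroutine that would call the API and return the document’s current revision.

```python
import asyncio


async def wait_for_change(fetch_revision, known_revision, interval=0.01, max_polls=100):
    """Poll until the revision differs from known_revision, then return it."""
    for _ in range(max_polls):
        current = await fetch_revision()
        if current != known_revision:
            return current              # the opponent has moved: time to reload
        await asyncio.sleep(interval)   # yield control while waiting
    return None                         # gave up without seeing a change


# Demo with a fake endpoint that reports a change on the third call.
calls = {"count": 0}


async def fake_fetch_revision():
    calls["count"] += 1
    return "rev-2" if calls["count"] >= 3 else "rev-1"


result = asyncio.run(wait_for_change(fake_fetch_revision, "rev-1"))
print(result)   # rev-2
```

Sleeping between polls hands control back to the event loop, so waiting for the opponent does not freeze everything else.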

Javascript code:

The Architecture

The final architecture

In the end, this is what our final architecture looked like. As seen in the square, our main app consists of Python-Flask, which handles all the Python code and displays the front-end view. The Python code itself is the game engine and also handles all the back-end connectivity. JavaScript supports front-end interactivity and listens to cloud functions. The cloud functions are bound to the DB and are set to trigger when changes in the DB occur. JavaScript then reloads the web page to display the current state of the game. The DB contains the user data and current game sessions and is continuously updated by the main app. And the users sit in front of the web browser and enjoy all the magically appearing fun on their screen, without a clue of how much blood and sweat it takes to make that magic happen.


What can one say? Everything always sounds so simple and easy in theory. But when it comes down to it, one often gets stuck on little things. Some choices we made in the beginning were good, some were quite challenging and some led to dead ends. When we started out, we had only developed software for local use on computers or mobile devices. The closest we had gotten to something like cloud computing was maybe getting something to run over a network. It is quite challenging getting everything to run in the cloud. It’s a whole new game. Similar, but different rules. And even with all the services provided by IBM, we still ran into many obstacles. Especially when developing locally and then trying to make it work in the cloud.
Also, getting all the different types of technology to work together is pretty tricky. Only with experience will one get good at it, because you won’t know if a service can provide the functionality you require until you try it. And often we needed additional features or functions we hadn’t thought of beforehand. Aspects we didn’t consider. The software technologies we’ve encountered may be very powerful, but with great power comes great confusion. But that is where progress happens. Not when everything is going smoothly, but when one is faced with difficult challenges. And we’ve had plenty.
I would say that this project, this course, has been one of the most beneficial in our studies at the HdM. We had the opportunity to get our hands dirty, with expert guidance in a safe environment. The experience we’ve gained is priceless. Our understanding of cloud computing and our ability to develop software for such has progressed several levels. And since this was our main goal I would have to say that our project was a complete success.

Progressive Web Apps – Who still needs a native app?

Introductory examples

Progressive Web Apps are already more widespread than one might think. Even large, innovative companies such as Twitter, Airbnb, Spotify and Tinder rely on Progressive Web Apps.

Fig. 1: A selection of Progressive Web Apps [1]

If you would like to look at a great example, I recommend trying one out (on mobile). After a while, a popup appears asking whether you want to add the app to the home screen. If you confirm, the PWA is installed on the device in the background and is from then on available like any other app on the device.

Fig. 2: Rio Run app by the Guardian [2]

More great PWAs can be found on the overview pages of:

Historical development

Shortly after the introduction of the first iPhone, Steve Jobs already had the next vision for his prized creation. For him, it was not the device itself but rather the browser that was the gateway to the new world. According to his vision, a smartphone should ideally contain nothing but a browser. Developers would provide apps over the web that integrate fully into Safari and update themselves in the background [3]. The idea was good but, typical of Jobs, ahead of its time. The success of the iPhone pushed Apple to the limits of its capacity, so it can be assumed that Apple simply did not have the time and resources to pursue the topic of PWAs any further. The mass of apps pouring into the store was more than enough to deal with. The first to pick up the topic again after Steve Jobs was Mozilla, with the release of Firefox OS in 2013, an operating system that makes it possible to run web apps as native apps on the device [3]. A lot happened otherwise in the years 2006 to 2013 as well: devices gained enormously in performance, and web development left jQuery and PHP behind. New possibilities, new frameworks and clever CSS tricks have since enabled a completely new form of user experience on the web. And the web? Who comes to mind when we think of the web? Exactly, Google! And Google has a problem: even though it operates an app store itself and has come up with tricks for indexing apps, a native app cannot be analyzed by its search engine in the same way as the content of a website. So in 2015 Alex Russell, Chrome engineer at Google, called for the introduction of Progressive Web Apps (PWA), a combination of current web technology with the most modern capabilities of the browser [3].
With clever marketing and a large user base, Google then persuaded Apple to work on the topic again. Apple caught up in 2017 with support for PWAs in Safari, and the picture was completed by Microsoft, the sleeping giant, which announced full support for PWAs in the Edge browser in 2018 [4]. With that, the big trio is fully on board and nothing stands in the way of a PWA success story!

What does Progressive Web App mean?

In terms of naming, Progressive Web App derives from web apps with progressive enhancement. But where is the innovation in that? When developing a web app today, one thinks of large frameworks such as Angular, React and Vue and of the latest browser APIs such as the Geolocation API and many more. All of that can be used in a PWA as well, because note: a PWA is not tied to any framework, nor does it exclude any framework. The only limits are those set by the browser.

Progressive enhancement describes the content-first approach. The goal of this optimization is for the first meaningful paint to happen as early as possible while a page is being built. In other words, content comes first, and further layers such as scripts, styles and multimedia files are loaded and built up bit by bit, depending on the network connection. This is not a novelty that arose with PWAs, however, but a result of recent years of web development, search engine optimization and user experience optimization.

In summary, the term Progressive Web App does not point to the real innovation. Web apps, modern browser APIs and progressive page construction were not invented with PWAs. Frances Berriman, who coined the term together with Alex Russell, even goes so far as to say: “The name isn’t for you… The name is for your boss, for your investor, for your marketeer.” [5]

Worin liegt die Innovation?

Um auf die wahre Innovation hinter PWAs zu kommen, müssen wir zunächst die aktuellen Probleme in der Bereitstellung von Web Apps und nativen Apps betrachten. Web Apps sind eine tolle Sache, aber auf der geschäftlichen Seite fragt man sich, wie man den Benutzer an den Service binden kann. Soll man den Benutzer dazu auffordern, Lesezeichen im Browser zu machen, Social Media Aktivitäten zu verfolgen oder soll man eine weitere App entwickeln, die im Store ausgeliefert wird? Das alles erfordert zusätzliche Motivation beim Benutzer, den Content oder Service zu konsumieren und, auch wenn kein Medienbruch erfolgt, liegt an jeder Stelle der Customer Journey eine kleine Abbruchrate (Bounce Rate) vor. Es gibt auch Hinweise die bestätigen, dass die Downloadzahlen aus den App Stores immer kleiner werden, wobei der Traffic im Internet weiter zunimmt [6]. Dort ist der User und dort will er auch abgeholt werden. Was ist also naheliegender, als den Content, den man sowieso schon über das Web bereitstellt, für den Kunden dauerhaft zur Verfügung zu stellen? Ohne überflüssige Zwischenschritte in der Customer Journey? Das lässt sich zwar mit Web Apps realisieren, aber wenn der Benutzer keine Internetverbindung hat, klingelt auch bei den Entwicklern nichts in der Kasse.
Auch native Apps bringen ihre Probleme auf der geschäftlichen Seite mit. Hohe Entwicklungskosten und straffe Anforderungen der Store Betreiber, sorgen regelmäßig für Frust und lange Nächte beim Release des nächsten Updates. Payment-Optionen werden vordiktiert, undurchsichtige Indexierungen der Store Einträge erlauben keine freien Produktplatzierungen und Marketing. Wie soll man da aus der Masse an Apps herausstechen und sein eigenes Produkt sinnvoll bewerben? Von den Gebühren die Apple und Google beim App-Kauf abkassieren, mal ganz abgesehen. Neben der undurchsichtigen Indexierung ist auch die Auffindbarkeit außerhalb der App Stores problematisch. Zwar lassen sich Apps inzwischen in Suchmaschinen auffinden, werden aber vom Content her nicht so von Web Crawler erfasst, wie herkömmliche Websites. Keine direkte Indexierung bedeutet eine lange Customer Journey bis ein Kunde das Produkt entdeckt und das bedeutet hohe Kosten. 

A PWA tries to counteract these points. A PWA can be consumed online and offline, without barriers. There is no need to install an app first: the customer can find and consume the content on the web and decide on impulse, at any time, to install the app.

Technology & How It Works

A PWA basically works like an ordinary web app. You can take any web app as a basis and extend it with a manifest, a service worker and an app icon. It is important to know that a PWA only works over HTTPS and can only be installed if the connection is secured via HTTPS. This constraint is a welcome advantage that brings a bit of security to the World Wide Web.

The following overview shows the schematic structure and the relationships. In the following sections we look at the individual elements in detail.

Fig. 3: How PWAs work and how they are structured [9]


The manifest provides the metadata required to install the app. It must be included in the head of the HTML document via a link element. Formatted as JSON, it provides, among other things:

  • App name
  • App description
  • App icon in various resolutions for various devices and browsers
  • Information about the author
  • App scope
  • Start mode

A complete list of all possible attributes can be found here:

App Scope

The "scope" property describes the scope of the app within the domain. The related "start_url" property defines which page is shown when the app starts.

Start Mode

The "display" property describes the start mode of the app. The following series of images shows the effects of the individual options. With the standalone option, the look of a native app can be imitated perfectly. The theme color parameter ("theme_color") additionally influences the appearance of the status bar.
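To make the properties above more concrete, here is a minimal sketch of a manifest, written as a typed TypeScript object and serialized to JSON. All values (app name, paths, colors, icon files) are placeholders for a hypothetical app, and the interface only covers the members discussed here, not the full specification. The resulting JSON would be served as a file and referenced in the head, for example with `<link rel="manifest" href="/manifest.webmanifest">`, where the file name is again just an assumption.

```typescript
// Minimal subset of the web app manifest members discussed above.
interface WebAppManifest {
  name: string;
  short_name?: string;
  description?: string;
  scope: string;
  start_url: string;
  display: "fullscreen" | "standalone" | "minimal-ui" | "browser";
  theme_color?: string;
  icons: { src: string; sizes: string; type: string }[];
}

// All values below are placeholders for a hypothetical app.
const manifest: WebAppManifest = {
  name: "Example PWA",
  short_name: "Example",
  description: "Illustrative manifest for a hypothetical app",
  scope: "/",
  start_url: "/index.html",
  display: "standalone",
  theme_color: "#336699",
  icons: [
    { src: "/icons/icon-192.png", sizes: "192x192", type: "image/png" },
    { src: "/icons/icon-512.png", sizes: "512x512", type: "image/png" },
  ],
};

// Simple sanity check: the start page should lie inside the declared scope.
// This is a plain prefix comparison, not a full URL resolution.
function startUrlInScope(m: WebAppManifest): boolean {
  return m.start_url.indexOf(m.scope) === 0;
}

// Serialize to the JSON text that would be served as the manifest file.
const manifestJson: string = JSON.stringify(manifest, null, 2);
```

Keeping the manifest as a typed object is only one possible approach; a hand-written JSON file works just as well.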

Fig. 4: Browser modes [7]

Service Worker

The service worker is the background service of a PWA. It is related to the web worker, runs in its own thread and does not allow direct DOM manipulation, only communication via a defined interface.

It can run even when the application is closed, goes to sleep and wakes up when information arrives. It is what makes a PWA offline-capable. Its job is to intercept all requests sent from its own scope to the network and to decide whether it can answer them from its own cache. Technically speaking, it acts as a kind of pseudo-proxy. To what extent requests go to the network or are answered from the cache is at the developer's discretion. For example, the service worker's install event, which fires when the worker is installed, can be used to download the complete app content, which would make the app fully offline-capable. The wake-up function enables push notifications on the device, which brings the user experience another big step closer to that of a native app.
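The decision "cache or network" can be sketched as a small cache-first function. This is an illustration under assumptions, not the one correct strategy: the `CacheLike` interface, the `cacheFirst` function and the `MemoryCache` class are names made up for this example. The cache and the network call are passed in as parameters so the logic can be tried outside a browser; `match` and `put` mirror the method names of the browser's Cache object.

```typescript
// Minimal view of a cache: the two operations used here.
interface CacheLike<Req, Res> {
  match(request: Req): Promise<Res | undefined>;
  put(request: Req, response: Res): Promise<void>;
}

// Cache-first: answer from the cache if possible, otherwise go to the
// network and store the response for the next request.
async function cacheFirst<Req, Res>(
  request: Req,
  cache: CacheLike<Req, Res>,
  fetchFromNetwork: (request: Req) => Promise<Res>
): Promise<Res> {
  const cached = await cache.match(request);
  if (cached !== undefined) {
    return cached;
  }
  const fresh = await fetchFromNetwork(request);
  await cache.put(request, fresh);
  return fresh;
}

// In-memory stand-in for a browser cache, for illustration only.
class MemoryCache implements CacheLike<string, string> {
  private entries = new Map<string, string>();

  async match(request: string): Promise<string | undefined> {
    return this.entries.get(request);
  }

  async put(request: string, response: string): Promise<void> {
    this.entries.set(request, response);
  }
}
```

In a real service worker, a function like this would be called from a fetch event handler with a cache opened via `caches.open(...)` and the global `fetch`. Since a Response body can only be read once, one would store `response.clone()` in the cache there instead of the response itself.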

Further information and an overview of the events that can be listened to can be found at:

Scandal – Are Service Workers the New Cookies?

That an app needs a background service is nothing unusual from the perspective of native app development. Async tasks, services, background tasks and the like are familiar to every Android or iOS developer. In the development of ordinary websites (not web apps), however, background services are rarely needed. Thanks to awareness campaigns, users now know that visiting a website leaves traces in their browser, be it cache, cookies and so on. BUT, and here comes the scandal: many users do not know (!) that their browser is also burdened by service workers, which creep into the browser like a tick, can be woken from sleep at any time and can impair the performance of the entire device through background activity! Having followed the article this far, one might think that service workers are only used for PWAs, but the web has always been known for exploiting every trick that is somehow possible. At this point I can only recommend that every Chrome user, whether on mobile or desktop, use the following link to check how many service workers have crept into their own browser unnoticed. Other browsers currently do not offer a comparable interface for checking registered service workers, even though they support them! Service workers can be used, for example, to run secret crypto mining at the expense of unsuspecting users.

Check:  chrome://serviceworker-internals

Push Notifications

A push notification sent via an installed PWA cannot be distinguished from a native notification. Any reference to the underlying browser is hidden as well. Notifications are triggered via the service worker, which in turn is prompted to do so by the app or the server.
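How the service worker turns an incoming push message into a notification could look roughly like the following sketch. The payload format (a JSON object with `title` and `body`), the fallback title and the helper name `toNotification` are assumptions of this example; a server is free to send anything. The browser-only wiring is shown as a comment so that the parsing logic itself stays runnable anywhere.

```typescript
// Shape of the notification to show; the fields mirror the arguments
// of showNotification(title, options).
interface NotificationRequest {
  title: string;
  options: { body: string };
}

// Turn the raw text payload of a push message into a notification request.
// Assumed payload format: { "title": "...", "body": "..." }.
function toNotification(payloadText: string): NotificationRequest {
  let title = "New message"; // hypothetical fallback title
  let body = payloadText;
  try {
    const data = JSON.parse(payloadText);
    if (data !== null && typeof data === "object") {
      if (typeof data.title === "string") {
        title = data.title;
      }
      body = typeof data.body === "string" ? data.body : "";
    }
  } catch (_err) {
    // Payload is not JSON: show the raw text as the body.
  }
  return { title, options: { body } };
}

// Wiring inside the service worker file (browser only, hence a comment):
//
// self.addEventListener("push", (event) => {
//   const text = event.data ? event.data.text() : "";
//   const n = toNotification(text);
//   event.waitUntil(self.registration.showNotification(n.title, n.options));
// });
```

The `event.waitUntil(...)` call keeps the worker awake until the notification has actually been shown.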

Fig. 5: PWA push notifications vs. native notifications [9]

Splash Screen

To give the start of an app a better user experience, it was decided that a PWA starts with a splash screen while the content is built up in the background. This cleverly suggests performance and does not bore the user with a white screen. Splash screens cannot be switched on or off, but they can be customized. By default, the app icon and the app name (Android) are set.

Fig. 6: Splash screen on Android (left) and iOS (right) [8]

Optimize & Debug

Anyone who wants to build a PWA and squeeze out the best possible performance will find the best tool in the developer tools of the Chrome browser. With the audits, the browser starts a series of tests at the push of a button that evaluate a PWA with regard to completeness (PWA requirements), performance, best practices, accessibility and semantics (SEO). A score is determined on a scale from 0 to 100, and the report lists where there is room for improvement.

Fig. 7: Audits in the Chrome developer console [9]

Excursus – Browser APIs

To get closer to the functionality and user experience of a native app, modern browsers already provide plenty of APIs that you may not even know about. Among them are some hardware-related interfaces that are rarely used so far but could help shape the future. To make sure that an API is already implemented in widely used browsers, it is advisable to check it with the Can I Use service before using it.

A selection of lesser-known APIs:

  • Sensors API
    • Ambient Light, Proximity, Accelerometer, Magnetometer, Gyroscope
  • Battery Manager API
  • Web Payments API
  • Gamepad API
  • Geolocation (GPS) API
  • Vibration API
  • Web Speech API (Text-To-Speech, Speech-To-Text, Grammar)
  • Bluetooth API
  • Push Notification API
  • USB Device API
  • WebVR API
  • Indexed Database API
  • File System API
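Besides looking up support tables in advance, an app can also check at runtime whether the current browser exposes an API. The following is a minimal sketch of such a feature detection for a few of the APIs listed above. The function name `detectFeatures` and the labels are made up for this example; the property names are those under which these APIs are exposed on the browser's `navigator` object. The function takes a navigator-like object as a parameter so it can also be tried outside a browser.

```typescript
// Pairs of [label, property name on navigator] for some of the listed APIs.
const NAVIGATOR_FEATURES: [string, string][] = [
  ["Service Worker", "serviceWorker"],
  ["Geolocation", "geolocation"],
  ["Vibration", "vibrate"],
  ["Battery", "getBattery"],
  ["Gamepad", "getGamepads"],
  ["Bluetooth", "bluetooth"],
  ["USB", "usb"],
];

// Returns, for each label, whether the navigator-like object has the property.
function detectFeatures(nav: object): { [label: string]: boolean } {
  const result: { [label: string]: boolean } = {};
  for (const [label, property] of NAVIGATOR_FEATURES) {
    result[label] = property in nav;
  }
  return result;
}

// In a browser: detectFeatures(navigator)
```

A check like this only tells you that the property exists; permissions, HTTPS requirements and missing hardware can still prevent an API from actually working.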

PWA vs. Native

As we have seen so far, modern browsers already offer more features than one might expect. Considering that PWAs have only been supported by all major browsers since 2018, one can conclude that the whole topic is still in its infancy. But even now, thanks to the outstanding performance gains of mobile devices in recent years, great applications are possible on the mobile web, so native apps will probably become less and less attractive for cost reasons, provided the product does not require native performance. The PWA standard will also evolve over the next few years and offer more and more possibilities. An exciting wave of change may be on the way that could shake up the mobile app market!

To conclude, here is an overview matrix that can help with the decision. The implementation type that performs best receives the most points.

Web App | PWA | Native App
Storage requirements on the device ◼️◼️◼️
Discoverability in search engines ◼️◼️◼️◼️◼️
Installation, updates, maintenance, versioning ◼️◼️◼️◼️◼️
Performance, loading times ◼️◼️◼️
Organic traffic ◼️◼️◼️◼️◼️
Hardware-related features ◼️◼️◼️◼️
User experience (can) (can) ◼️◼️

In 9 Steps to Your First PWA

Here is a quick guide to your first Progressive Web App:

  1. Build the web app
  2. Make sure that communication with the server runs exclusively over HTTPS
  3. Create a web manifest with at least:
    • Scope
    • Start URL
    • App icon and app name
  4. Include the web manifest in the head of the HTML document
  5. Create ServiceWorker.js (or similar) in the root folder of the PWA
  6. Register the service worker in the JS of the web app
  7. Implement ServiceWorker.js as needed
  8. Use the audits to test whether all recommended quality criteria of a PWA are met
  9. Deploy 🙂
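Step 6, registering the service worker, could look roughly like the following sketch. The function name `registerServiceWorker` and the `ServiceWorkerContainerLike` interface are made up for this example; in a browser, the real `navigator` object would be passed in and its `navigator.serviceWorker.register(...)` method would be called. The script path is taken from step 5 and should be adapted to the actual file name.

```typescript
// The part of navigator.serviceWorker this sketch relies on.
interface ServiceWorkerContainerLike {
  register(scriptUrl: string): Promise<unknown>;
}

// Registers the worker script if the browser supports service workers.
// Resolves to true on success, false if unsupported or if registration failed.
async function registerServiceWorker(
  nav: { serviceWorker?: ServiceWorkerContainerLike },
  scriptUrl: string
): Promise<boolean> {
  if (!nav.serviceWorker) {
    // No service worker support: the page keeps working as a plain web app.
    return false;
  }
  try {
    await nav.serviceWorker.register(scriptUrl);
    return true;
  } catch (_err) {
    // Registration can fail, for example when the page is not served over HTTPS.
    return false;
  }
}

// In the web app's own script (browser only):
// registerServiceWorker(navigator, "/ServiceWorker.js");
```

Returning a boolean instead of throwing is a design choice of this sketch: the web app stays usable even if the service worker cannot be registered, which fits the progressive idea behind PWAs.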


[1] Great examples of progressive web apps in one room
Accessed on 05.08.2019, 22:08

[2] Rio Run app by The Guardian
Accessed on 05.08.2019, 22:08

[3] Wikipedia
Accessed on 05.08.2019, 22:08

[4] Welcoming Progressive Web Apps to Microsoft Edge and Windows 10
Accessed on 05.08.2019, 22:08

[5] Naming Progressive Web Apps
Accessed on 05.08.2019, 22:08

[6] Why Build Progressive Web Apps
Accessed on 05.08.2019, 22:08

[7] Progressive Web App Challenges
Accessed on 05.08.2019, 22:08

[8] Progressive Web App Splash Screens
Accessed on 05.08.2019, 22:08

[9] Own work