Yet another Connected Vehicle Cloud Platform

Getting Ideas… write a User Story

When I was thinking about an idea for a project, as part of the lecture “Software Development for Cloud Computing”, I had two related use cases in mind. So I wrote down these high-level user stories:

As a user who owns a non-connected car, I want to access some information about my car on my smartphone so that I know about the position of my car and additional information like fuel level, consumption or driving statistics.


As an employee who wants to drive a pool-car, I want to know where the vehicle is parked so that I don’t have to search on different parking spots around the building.

Define a Scope

My goal for this project was to build a functional prototype that satisfies at least one of the user stories.
It wasn’t easy to define the scope for this project because there are many ways to expand it.

  • Log car telemetry data from OBD-II Interface in 200ms steps.
  • Transmit telemetry data every 5 seconds to AWS IoT Backend.
  • Telemetry data like geoposition, vehicle speed, engine speed, tank fuel level…
  • Store incoming telemetry data in S3.
  • Store the current vehicle state in DynamoDB.
  • Build an iOS App to show vehicle data
  • (Additional) Create an Alexa Skill

The Components

The project can be separated into three components: first a data logger, second the cloud backend infrastructure, and third the user interfaces. In the following sections I will go deeper into each part:

Data Logger

The data logger should record signals like latitude, longitude, speed, engine speed, consumption, odometer and tank level. For that, I adapted an existing solution based on a Raspberry Pi and got it to work with the OBD-II interface of my car and an external GPS receiver. The data logger records the data in 200 ms steps and transmits it every 5 seconds to the backend using MQTT and the AWS IoT framework.

The original solution is directly connected to the CAN bus of the car, which provides a large set of signals, including all required ones. While adapting it for OBD-II, I had to realize that OBD-II does not provide all the required signals. This restricts the implementation of the first user story slightly but still covers the requirement to show the position of the car, which is also the main feature of the second user story.

Material:

  • Raspberry Pi Model 3
  • PiCAN2 Board
  • Serial to OBD-II Adapter Cable
  • USB GPS Receiver
  • USB Power Car Adapter + Cable

Material Data Logger

How it works: The device uses the AWS IoT SDK to publish the data via MQTT to the AWS IoT endpoint. It publishes to the topic connectedVehicle/telemetry/[vin] every 5 seconds. The message structure is as follows:

{
    "UDID": "",
    "tripId": "",
    "canSnapshotObject": {
        "sensor": {
            "value": 0,
            "timestamp": 0
        }
    },
    "canFrames": {}
}
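
For illustration, here is a minimal sketch of how such a device could publish this payload with the Node.js aws-iot-device-sdk. The endpoint host, certificate paths, VIN and sample values are placeholders, not the exact code running on the Pi:

// Minimal publish sketch using the Node.js AWS IoT device SDK.
// Host, certificate paths and the VIN are placeholders.
const awsIot = require('aws-iot-device-sdk');

const vin = 'MYVIN0000000000000';
const device = awsIot.device({
    keyPath: 'certs/private.pem.key',
    certPath: 'certs/certificate.pem.crt',
    caPath: 'certs/AmazonRootCA1.pem',
    clientId: 'datalogger-' + vin,
    host: 'xxxxxxxxxxxx.iot.eu-west-1.amazonaws.com'
});

device.on('connect', () => {
    // Publish the aggregated snapshot every 5 seconds.
    setInterval(() => {
        const message = {
            UDID: vin,
            tripId: 'trip-1',
            canSnapshotObject: { speed: { value: 42, timestamp: Date.now() } },
            canFrames: {}
        };
        device.publish('connectedVehicle/telemetry/' + vin, JSON.stringify(message));
    }, 5000);
});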

Cloud Backend Infrastructure

The main challenge here was to design a lightweight but extensible architecture. At first, I searched for examples for my use case; I wanted to know how others do this. You can find a lot of examples, tutorials and ready-to-deploy solutions. Most cloud providers have success stories with car manufacturers or automotive suppliers. Two examples were especially helpful for me: the AWS Simple Beer Service (SBS) and the AWS Connected Vehicle Solution. The SBS is a simple example with all the important components for an infrastructure like this. It was a good starting point, but as I went further I found out that the infrastructure shown in the picture is a little dated. For example, the API Gateway between the Kegerator and the Lambda functions has been replaced with AWS IoT Core (see Simple Beer Service v5). Also, for the visualization part there is a better solution for mobile devices: AWS AppSync.

The Connected Vehicle Solution from AWS offers a range of features and various sample services such as location-based marketing, push notifications or driver evaluation. It’s easy to install because Amazon provides a CloudFormation template, which I deployed on my AWS account. It worked without any problems.

Unfortunately, AWS provides no documentation about how to get data into this solution. It would have been helpful to know which signals are expected, in which format and how often. So a lot of investigation was necessary.

Along the way, I found a helpful tool to simulate a car: the AWS IoT Device Simulator. It is based on the AWS cloud infrastructure, can also be deployed with a CloudFormation template and provides an interface to simulate car rides.

The source code of the simulator showed me how the data is structured. The disadvantage became obvious: for every sensor in the vehicle a separate message is published instead of aggregating the data and sending it every x seconds. Because of this and the complex infrastructure, I decided to implement an infrastructure on my own.

Why Amazon Web Services?

The main requirement for my project was an IoT cloud platform. Well-known products come from Google, Microsoft, IBM and Amazon. Each of the four has examples and success stories for its IoT platform with connected vehicles. I decided on Amazon AWS because it is very well documented, I have basic knowledge of some AWS services, and my colleagues are familiar with the product, so I can ask them if needed. It seemed to me the fastest way to get a result, but also a good way to learn a lot about cloud services like IoT platforms, serverless programming, storage & databases, Infrastructure-, Platform- & Function-as-a-Service, and in depth about AWS.

Set up IoT Infrastructure

In the following, I will explain how I set up the IoT Platform for my requirements:

  1. First, create a new IoT thing following the wizard. You’ll need to create a certificate with a public and a private key. Don’t forget to activate the certificate. The policy will be created in the next step.
  2. Create a new policy. Click on Advanced mode and insert the following JSON document:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "iot:Publish",
            "iot:Subscribe",
            "iot:Connect",
            "iot:Receive"
          ],
          "Resource": [
            "*"
          ]
        }
      ]
    }

    Assign the policy to your newly created thing.

  3. Copy the certificates to the IoT device
  4. Create a rule for the topic. The rule query statement should look like this: SELECT * FROM 'connectedVehicle/telemetry/#'
  5. Add the action Invoke a Lambda function passing the message data. Leave the function name empty if you haven’t created the function yet.

After setting up the IoT infrastructure, we can continue with storage, database, and Lambda functions.

Create S3 bucket: All incoming messages will be stored in an S3 bucket, organized by vehicle identification number and trip. Create one named connected-vehicle-data-eu-west-1.

Create DynamoDB table: The DynamoDB table will hold all vehicles with their current state. Create one named connectedVehicle-vehicle-snapshot with vin as the primary key.

Create Telemetry Lambda Function

  1. Create a Lambda Function
    • Name: ConnectedVehicle-storeTelemetryDataFunction
    • Runtime: Node.js 6.10
    • Role: Create new role from template(s)
    • Create new Role from Template: connectedVehicle-dev-eu-west-1-lambdaRole
    • Choose Policy Template Basic Edge Lambda permissions
    • Save Lambda Function
  2. Add a trigger
    • Choose AWS IoT from the left
    • Go down and configure the trigger
    • Select Custom IoT rule and choose your rule (ConnectedVehicleTelematics)
    • Click Add
  3. Edit IAM Role for using DynamoDB and S3
    • Go to IAM Console and edit policy
    • Choose Add inline policy and add a JSON policy (connectedVehicle-lambda-dynamodb-bucket-policy):
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Sid": "VisualEditor0",
                  "Effect": "Allow",
                  "Action": [
                      "dynamodb:PutItem",
                      "dynamodb:DescribeTable",
                      "dynamodb:DeleteItem",
                      "dynamodb:GetItem",
                      "dynamodb:Scan",
                      "dynamodb:Query",
                      "dynamodb:UpdateItem"
                  ],
                  "Resource": "arn:aws:dynamodb:eu-west-1:xxxxxxxxxxx:table/*"
              },
              {
                  "Sid": "VisualEditor1",
                  "Effect": "Allow",
                  "Action": "dynamodb:ListTables",
                  "Resource": "*"
              }
          ]
      }
      
    • Also add a policy for S3 (connectedVehicle-lambda-s3-bucket-policy). Note that object-level actions like s3:PutObject apply to the objects in the bucket, so the resource list needs the /* entry in addition to the bucket ARN:
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Sid": "VisualEditor0",
                  "Effect": "Allow",
                  "Action": "s3:*",
                  "Resource": [
                      "arn:aws:s3:::connected-vehicle-data-eu-west-1",
                      "arn:aws:s3:::connected-vehicle-data-eu-west-1/*"
                  ]
              }
          ]
      }
      
    • Save it.
  4. Add the following code to your Lambda function:
    // Table and bucket created in the previous steps
    var DATA_TABLE = 'connectedVehicle-vehicle-snapshot';
    var AWS = require("aws-sdk");
    var ddb = new AWS.DynamoDB();
    var s3 = new AWS.S3({params: {Bucket: 'connected-vehicle-data-eu-west-1'}});
    
    exports.handler = function(event, context) {
    
        if (event.UDID !== undefined) {
    
            var rTimestamp = new Date().getTime();
    
            // Update the current vehicle state in DynamoDB ('-' if a signal is missing)
            var ddbParams = {
                TableName: DATA_TABLE,
                Item: {
                    "vin": {"S": event.UDID},
                    "lat": {"S": event.canSnapshotObject.hasOwnProperty('lat') ? event.canSnapshotObject.lat.value.toString() : '-'},
                    "long": {"S": event.canSnapshotObject.hasOwnProperty('long') ? event.canSnapshotObject.long.value.toString() : '-'},
                    "range_in_km": {"S": event.canSnapshotObject.hasOwnProperty('range_in_km') ? event.canSnapshotObject.range_in_km.value.toString() : '-'},
                    "tank_level_percentage": {"S": event.canSnapshotObject.hasOwnProperty('tank_level_percentage') ? event.canSnapshotObject.tank_level_percentage.value.toString() : '-'},
                    "canSnapshotObject": {"S": JSON.stringify(event.canSnapshotObject)},
                    "lastUpdated": {"N": rTimestamp.toString()}
                }
            };
    
            ddb.putItem(ddbParams, function(err, result) {
                if (err)
                    console.log("Error writing to dynamodb: " + err);
                else
                    console.log("Vehicle state successfully written to dynamodb.");
            });
    
            // Archive the raw message in S3, keyed by VIN, trip and timestamp
            var s3Params = {
                Key: event.UDID + '/' + event.tripId + '/vehicle-data-' + rTimestamp,
                Body: JSON.stringify(event)
            };
    
            s3.upload(s3Params, function(err, res) {
                if (err)
                    console.log("Error uploading file to s3: " + err);
                else
                    console.log("File successfully uploaded.");
            });
    
        } else {
            context.fail("JSON input [UDID] not defined.");
        }
    
    };
  5. Now your function should work.

Room for Improvement

  1. IoT device rollout. At this point, new IoT devices have to be set up and registered manually. After that, the newly created certificates must be copied to the device. This process can be automated, which is called “just-in-time provisioning”: the first time the device connects to AWS IoT, a certificate is created and downloaded to the device. All it needs is the CA certificate.
  2. Account linking. One issue is that there is no link between a car and a user account, so every app user can access all cars. Account linking and rights management would solve this: the user of a car should only access information about that car, while the fleet owner should be able to access all cars. This also applies to the Alexa skill.
  3. Scriptable infrastructure. I set up the AWS services step by step through the web console. This is good for exploring and trying out the services, but it takes considerable effort to re-deploy or roll back the infrastructure. An approach for this is to use the AWS CLI, CloudFormation templates or a tool like Terraform.

Further Development

There are many ideas for extending the vehicle information dashboard in the app. Since we get a notification on the /vehicle/trip/[vin] channel at trip start and end, we could use it to trigger some analytics at trip end, like a driver safety score or a green score. Looking at the AWS Connected Vehicle Solution, there are more ideas for location-based services (geofencing, marketing) or predictive maintenance. In addition to the second user story, a web dashboard for the pool-car manager would be possible to monitor all cars.

User Interface

The App

There are two main user touchpoints to the vehicle cloud infrastructure. First, I built an iOS app. The app shows a map and a card slider with all vehicles. Each card displays the vehicle identification number and the owner of the car.
Adjusting the scope (see the data logger part) mainly affected the app, so I put the focus on the vehicle position. Accordingly, the main feature of the app is to show the position of the vehicle.

A good starting point for building an app with AWS AppSync is the documentation. The section on code generation for the API was a little unclear to me, so I will explain it. AWS provides a tool that generates Swift API code out of your schema file and queries. You need two files to generate the code.

  • First, the schema file (schema.json). You can download it from the AppSync web interface.
  • Second, you need to create a new file called query.graphql. Put the queries here:
    query GetCar($vin: String!) {
        getCar(vin: $vin) {
            vin
            lat
            long
        }
    }
    
    query ListCars {
        listCars {
            vin
            lat
            long
        }
    }


  • Put both files in a folder called GraphQLOperations
  • generate the API file with aws-appsync-codegen generate GraphQLOperations/*.graphql --schema GraphQLOperations/schema.json --output API.swift

App Authentication

With AWS AppSync you can choose between four different authentication methods:

  • API key
  • OIDC (OpenID Connect)
  • Amazon Cognito user pools
  • AWS IAM (Identity and Access Management)

For simplicity, I decided to use the API key for this prototype; I had no intention of rolling it out to people other than myself. If that is ever planned, I will switch to Cognito authentication. I know this is not a good approach, even for prototypes: everyone with the API key and the app can access all vehicles. But for a first demo and proof of concept it was enough for me, and I will change it in further development. Another disadvantage of the API key method is that the key expires after one week, so I have to generate a new key and rebuild the app.

With Cognito user pools it is also possible to define user groups to restrict access. That way, a normal user could only access the assigned car, while car-pool managers could access all cars.

“Alexa, ask carfinder where is my car”

The Alexa Skill

In addition to the app, I created a simple Alexa Skill. The Alexa skill should answer the following question:

What would be the easiest way for an employee to get to know about the vehicle position of a pool car?

Currently, the employee has to look it up in the driver’s log, but sometimes the recorded position is inaccurate or incorrect. Alexa can answer this question much better. In addition, if an Echo device with a display is used, it shows a map with the location.

A question to a skill is constructed as follows:

[Keyword “Alexa”] [Skill Invocation Name] [Intent, optional with Slots]

A sample question could be:

Alexa, ask carpool where is my car?

“carpool” is the skill invocation name, “my car” is a sample value for the intent slot named “car”, and “where is {car}” is a sample utterance.

These three things are the most important to set up. The Alexa skill is set up with an Amazon developer account (as opposed to an AWS account). After building the model, an endpoint has to be defined; this can be a Lambda function or an HTTPS endpoint. I decided to use a Lambda function. Using the Skillinator, I generated a code example based on my JSON skill model and modified it for my needs. I implemented the link between the skill and the vehicle-snapshot DynamoDB table so that you can ask for specific cars. With the HERE API, the geo-coordinates are translated into addresses, so when you ask for the position you get the street, number and city, as well as the name of the nearest POI. This is helpful when you don’t know the exact address but know the nearest POI, like another company or a restaurant.
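
A minimal sketch of how such an intent handler could look (the intent and slot names are assumptions; the table name is the one created earlier, and the HERE API call is omitted):

var AWS = require('aws-sdk');
var ddb = new AWS.DynamoDB.DocumentClient();

// Hypothetical handler for a "where is {car}" intent.
'WhereIsCarIntent': function () {
    var self = this;
    var car = this.event.request.intent.slots.car.value; // e.g. the VIN
    ddb.get({
        TableName: 'connectedVehicle-vehicle-snapshot',
        Key: { vin: car }
    }, function (err, data) {
        if (err || !data.Item) {
            self.emit(':tell', 'Sorry, I could not find this car.');
            return;
        }
        // The real skill reverse-geocodes lat/long with the HERE API at this point.
        self.emit(':tell', 'Your car is parked at latitude ' + data.Item.lat +
            ', longitude ' + data.Item.long + '.');
    });
},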


Autonomous War – Which dangers are associated with warfare without human intervention?

The term autonomous war has been controversial for years. But what exactly does the term mean? Autonomous war means the use of lethal autonomous weapons (short: LAWs) and machines or vehicles, which are primarily used by the military for modern warfare. Autonomous weapon systems can decide independently about life and death on the battlefield. In the media, however, autonomous weapon systems are better known as “killer robots”.

Continue reading

Building a Document Translator for a Multi-Language Blog

Motivation

Multi-Language Blog Navigation

The idea for this project occurred to me while I was listening to my sister share her vision for her recently started blog: to create a platform where writers of different ethnicities can publish texts in their native languages and exchange their stories with people from all over the world. Conquering the language barrier and making the texts available at least in the three most prominent languages – German, English and Arabic – requires the involvement of translators who are fluent in at least two of the demanded languages. Anyone who has ever attempted to make a translated text sound natural knows that this is no easy feat and can take many hours of finding the perfect balance between literal translation and understandable text.

This is where I saw room for improvement. Nowadays, machine translation tools have reached a decent level of fluency, despite not being able to capture the intricacies of different linguistic styles. Combining them with people who have a basic understanding of the source language can help speed up the process and reduce the effort considerably. Continue reading

Web server with user registration and guestbook with image upload

Overview

The users access the website, where they have the option to view the guest book, register or log in. To register, the user has to provide a username, an email address and a secure password (more than 8 characters, upper- and lowercase characters, numbers and special characters). Then an email with a verification link is sent to the provided address. Clicking this link enables the user to log in.

Upon login the user can post messages in the guest book, which are saved to the MySQL database. In addition to a text message, the user can also upload an image, which is transferred to an S3 bucket, which in turn triggers a Lambda function. There the image is resized to make it suitable for display in the guest book, transferred to another S3 bucket, and its permissions are set to make it publicly accessible. The URL of this image is also saved in the MySQL database.

Project architecture (created with Cloudcraft.co)

EC2 Instance

The OS running on the EC2 instance is Amazon Linux, which is based on Red Hat Enterprise Linux. This is because I created the EC2 instance while I was working through the AWS tutorials, and this was the recommended distribution. When I started the project, I decided to stick with it, although I had no prior experience with this distribution.

With ‘yum’, the package manager of this distribution, I was only able to install PHP 5.4, which in turn only enabled me to install version 2 of the AWS SDK for PHP, which led to major problems later on with Cognito.

After a long search for a way to install a newer version of PHP (most advice suggested adding additional repositories, which never worked), I found the following solution.

sudo amazon-linux-extras install php7.2

With this command I could finally upgrade to PHP 7.2. This in turn required an upgrade to Apache 2.4 which could be achieved with the same command.

RDS

For the database system I decided on MySQL. The setup was straightforward with the exception of the firewall: my EC2 instance wouldn’t connect to the RDS instance even after I set up the security group to allow port 3306 from the public IP address of the EC2 instance. The search for a solution led me to create another MySQL user and check various configuration files (as remote access for the root user is sometimes disabled), all to no avail. In the end it turned out I had used the wrong IP address, as the public IP address isn’t the one that connects to the RDS. The correct one can be found by running the ‘ifconfig’ command on the EC2 instance.

Lambda

For the resize Lambda function I built upon a function from the AWS examples. This function is triggered by uploading a file to an S3 bucket. The image is then downsized if necessary, transferred to another S3 bucket from which it can be publicly read, and subsequently deleted from the first bucket. After setting everything up and testing it, I found it didn’t work.

I used CloudWatch to debug, which showed an access denied error. After much testing and searching I discovered that making an object publicly available requires a separate action, called “s3:PutObjectAcl”, to be allowed in the policy.
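
The relevant statement in the Lambda role’s policy could then look roughly like this (the bucket name is a placeholder):

{
    "Effect": "Allow",
    "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl"
    ],
    "Resource": "arn:aws:s3:::my-public-images-bucket/*"
}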

Cognito & SES

Originally I had planned to implement the user registration myself and just use SES (Simple Email Service) for the verification email, but upon discovering Cognito, which has the verification system built in, I decided to use that instead.

As I had installed an old version of PHP (see above), PHP’s package manager Composer installed the only compatible version of the AWS SDK, which was 2.x. This didn’t have the necessary ‘CognitoIdentityProvider’ methods but instead the easy-to-confuse ‘CognitoIdentity’ methods, which are used for the Federated Identities feature of Cognito.

After some time I spotted my mistake, and after some more time installing a current version of PHP I could upgrade to AWS SDK v3.

Here I discovered the needed methods for user registration and login but was unable to execute them due to a required ‘SecretHash’ attribute whose generation wasn’t documented.

Eventually I found aws-cognito on GitHub. This is a PHP library which provides the Cognito functionality in simpler methods.

S3 Buckets

Working with S3 buckets was pretty straightforward with the AWS SDK for PHP. The only real problem I encountered was the constructor. The documentation shows two possible methods for creating the client.

The first and recommended one was this:

use Aws\S3\S3Client;
$client = S3Client::factory($config);

This method refused to accept my config file, no matter how I reformatted it. After I changed my code to the second method, everything worked fine.

$s3 = new Aws\S3\S3Client($config);


The source code is available here

Blockchain Risks and Chances – A 2018 Overview of Public and Private Blockchains, Smart Contracts, DAOs and ICOs

A few years ago, talking about Blockchain was largely the same as talking about the technology behind Bitcoin. In contrast, Blockchain nowadays comprises a whole technology branch, whereby the Blockchain itself can be implemented in lots of different ways. Not a year ago, on December 17, 2017, the Bitcoin hype peaked as the price climbed toward $20,000 per coin. The Bitcoin hype also further fueled the hype around Blockchain. Consequently, we now have over 1800 Blockchain platforms with cryptocurrency listed on coinmarketcap.com. In addition, there are numerous frameworks and providers for so-called private Blockchains, which are mostly used in companies and consortiums. In this blog article I’ll therefore give an overview of the current developments in Blockchain as well as its chances and risks. I’ll also deal with technologies such as Smart Contracts, DApps, DAOs and ICOs, which were made possible or have grown through Blockchain.

Continue reading

How to build an Alexa Skill to get information about your timetable 2018 Version

Imagine a student who just got up. He knows that he has lectures today, but he does not remember which ones or when they begin. So he asks his Alexa device: “Alexa, which classes do I have today?” His Alexa device is able to look into his timetable and answers: “You have five lectures today. The first lecture is Digital Media Technology and starts at 8:15 am in room 011, the second lecture is Web Development and starts at 10:00 am in room 135, the third lecture is Design Patterns and starts at 11:45 am in room 017. You can see more lectures in your Alexa app.”

This scenario is what we had in mind when we started to develop an Alexa skill which should be able to tell you information about your timetable.

General Information

Figure 1

Alexa, Amazon’s cloud-based voice service, powers voice experiences on millions of devices in the home. Alexa makes it possible to interact with devices in a more intuitive way: using your voice. By default, Alexa can give you information about the current weather, set an alarm and more. But Alexa’s abilities can be extended with so-called skills. As a developer you can easily add your own skill with the Alexa Skills Kit. Before we get started, let’s take a look at the components of an Alexa skill.
An Alexa skill consists of two main components: the Skill Service, where we will write code, and the Skill Interface, which can be configured through the Amazon Developer Portal. (Figure 1)

The Skill Service provides the functionality and business logic. It decides what actions to take in order to fulfill the user’s spoken request. You can use your own HTTPS server or AWS Lambda, Amazon’s serverless compute platform. It can handle HTTP requests, sessions, user accounts or database access, for example. For an Alexa skill we must implement event handlers; these handler methods define how the skill behaves when the user triggers the events.
The Skill Interface configuration is the second component of an Alexa skill. It is responsible for processing the user’s spoken words; to be more precise, it handles the translation between audio from the user and events the Skill Service can handle. This is done within the interaction model configuration. This is where we train the skill so that it knows how to listen to the user’s spoken words: we define the words or phrases that should map to a particular intent name in the interaction model, and we define which intents the skill service implements.
The interaction between the code on the skill service and the configuration of the skill interface yields a working skill.

Getting started: Skill Interface

When you want to create an Alexa skill, you probably start by setting it up in the Developer Portal. To develop your own Alexa skill you have to sign up for a free Amazon Developer Account. Once you are signed in, you can navigate to Alexa. The Alexa Skills Kit (ASK) is a free-to-use collection of self-service APIs, tools, documentation and code examples that you can use to quickly and easily add new capabilities, called skills. This part does not include any coding.
After clicking on Create Skill you must give your skill a name and set its language. The skill name will be the name that shows up in your Alexa app. The Amazon Developer Portal provides a simple skill builder checklist with all the steps that need to be completed.

  1. The Invocation Name is a word or phrase that tells your Alexa-enabled device which skill you want to trigger. It does not have to be the same as the skill name.

    Figure 2
  2. Intents, Samples and Slots: An intent represents an action that fulfills a user’s spoken request. Intents can optionally have arguments called slots. Amazon offers built-in slot types like AMAZON.EmailAddress, which converts words that represent an email address into a standard email address. Samples or utterances are phrases a user might say that are mapped to the intents; there should be as many representative phrases as possible (see the interaction model sketch after this list).

    Figure 3
  3. Build Model: As you create and edit your skill, save your work, then validate, save and build your interaction model. You must successfully build the model before you can test it.
  4. Endpoint: This is where you configure your AWS Lambda function as the endpoint.
    Figure 4
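
As an illustration, a trimmed-down interaction model for our skill could look roughly like this (the invocation name and sample phrase are assumptions; searchIntent and its date slot match the code shown later):

{
  "interactionModel": {
    "languageModel": {
      "invocationName": "timetable",
      "intents": [
        {
          "name": "searchIntent",
          "slots": [ { "name": "date", "type": "AMAZON.DATE" } ],
          "samples": [ "which classes do I have {date}" ]
        }
      ]
    }
  }
}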

     

Skill Service

After we have created the intents for our skill within the Alexa Skills Kit, we have to create a new Lambda function in the Amazon Web Services (AWS) console. To use AWS Lambda you must create an AWS account.

Various programming languages such as C#, Go, Java, Node.js or Python can be used for the AWS Lambda function. Amazon supports its developers and offers not only empty templates but also ready-made ones. In our example we chose the “alexa-skill-kit-sdk-factskill” template.

After we have given the function a suitable name and set the execution permissions, we receive an Amazon Resource Name (ARN) from AWS. The ARN is a unique identifier for AWS resources. This ARN must be entered as the endpoint within the skill interface.

The ARN has the following format:
arn:aws:lambda:eu-west-1:<Tenant-ID>:function:ical_startfunction

Now we need to tell the Lambda function how it is triggered. Therefore we add the “Alexa Skills Kit” as a trigger in the designer of the Lambda function. To secure the function, you must insert the skill ID of the Alexa skill into the trigger, so that the function can only be invoked by the selected skill.

Figure 5

We started coding the AWS Lambda function. In Node.js we use an Alexa handler that determines which intent is called by the Skills Kit; depending on the incoming intent, an event handler is chosen.
Since we want to handle an iCal file, our program must be able to parse it. We found and used an appropriate iCal module for Node. That’s where we encountered the first problem: if you want to use an external module in Node.js, you have to install it. But it is not possible to install modules in AWS Lambda and use them in the in-line editor. Therefore we had to implement the code locally, zip it and upload it to AWS.

Here is the function to parse the events:

'searchIntent': function () {

        // Declare variables
        let eventList = new Array();
        const slotValue = this.event.request.intent.slots.date.value;
        if (slotValue != undefined)
        {
            // Using the iCal library I pass the URL of where we want to get the data from.
            ical.fromURL(URL, {}, function (error, data) {
                // Loop through all iCal data found
                for (let k in data) {
                    if (data.hasOwnProperty(k)) {
                        let ev = data[k];
                        // Pick out the data relevant to us and create an object to hold it.
                        let eventData = {
                            summary: removeTags(ev.summary),
                            location: removeTags(ev.location),
                            description: removeTags(ev.description),
                            start: ev.start
                        };
                        // add the newly created object to an array for use later.
                        eventList.push(eventData);
                    }
                }
                // Check we have data
                if (eventList.length > 0) {
                    // Read slot data and parse out a usable date
                    const eventDate = getDateFromSlot(slotValue);
                    // Check we have both a start and end date
                    if (eventDate.startDate && eventDate.endDate) {
                        // initiate a new array, and this time fill it with events that fit between the two dates
                        relevantEvents = getEventsBeweenDates(eventDate.startDate, eventDate.endDate, eventList);

                        if (relevantEvents.length > 0) {
                            // change state to description
                            this.handler.state = states.DESCRIPTION;

                            // Create output for both Alexa and the content card
                            let cardContent = "";
                            output = oneEventMessage;
                            if (relevantEvents.length > 1) {
                                output = utils.format(multipleEventMessage, relevantEvents.length);
                            }

                            output += scheduledEventMessage;

                            if (relevantEvents.length > 1) {
                                output += utils.format(firstThreeMessage, relevantEvents.length > 3 ? 3 : relevantEvents.length);
                                relevantEvents = relevantEvents.sort(function compare(a, b) {
                                    var dateA = new Date(a.start);
                                    var dateB = new Date(b.start);
                                    return dateA - dateB;
                                  });
                            }

                            if (relevantEvents[0] != null) {
                                let date = new Date(relevantEvents[0].start);
                                var eventName = removeTags(relevantEvents[0].summary)
                                var badCharacter = eventName.indexOf("(");
                                eventName = eventName.substring(0, (badCharacter - 1));
                                output += utils.format(eventSummary, "erste ", eventName, date.toLocaleTimeString(), relevantEvents[0].location);
                            }
                            if (relevantEvents[1]) {
                                let date = new Date(relevantEvents[1].start);
                                var eventName = removeTags(relevantEvents[1].summary)
                                var badCharacter = eventName.indexOf("(");
                                eventName = eventName.substring(0, (badCharacter - 1));
                                output += utils.format(eventSummary, "zweite ", eventName, date.toLocaleTimeString(), relevantEvents[1].location);
                            }
                            if (relevantEvents[2]) {
                                let date = new Date(relevantEvents[2].start);
                                var eventName = removeTags(relevantEvents[2].summary)
                                var badCharacter = eventName.indexOf("(");
                                eventName = eventName.substring(0, (badCharacter - 1));
                                output += utils.format(eventSummary, "dritte ", eventName, date.toLocaleTimeString(), relevantEvents[2].location);
                            }

                            for (let i = 0; i < relevantEvents.length; i++) {
                                let date = new Date(relevantEvents[i].start);
                                var year = date.getUTCFullYear();
                                var month = date.getUTCMonth();
                                var day = date.getUTCDate();

                                var germanDate = day + "." + (month + 1) + "." + year
                                cardTitle = utils.format(cardTitleText, germanDate);

                                var eventName = removeTags(relevantEvents[i].summary)
                                var badCharacter = eventName.indexOf("(");
                                eventName = eventName.substring(0, (badCharacter - 1));
                                
                                var eventLocation = removeTags(relevantEvents[i].location)
                                var badCharacter = eventLocation.indexOf(",");
                                if (badCharacter >= 0){
                                    eventLocation = eventLocation.substring(0, (badCharacter)); 
                                }
                                cardContent += utils.format(cardContentSummary, date.toLocaleTimeString(), eventName, eventLocation + "\n");
                            }

                            output += eventNumberMoreInfoText;
                            this.response.cardRenderer(cardTitle, cardContent);
                            this.response.speak(output).listen(haveEventsreprompt);
                        } else {
                            output = NoDataMessage;
                            this.response.speak(output).listen(output);
                        }
                    }
                    else {
                        output = NoDataMessage;
                        this.response.speak(output).listen(output);
                    }
                } else {
                    output = NoDataMessage;
                    this.response.speak(output).listen(output);
                }
                this.emit(':responseReady');
            }.bind(this));
        }
        else{
            this.response.speak(dateNotFound).listen(dateNotFound);
            this.emit(':responseReady');
        }
    },

The result of the function is then passed back to Alexa, which converts the text into speech and reads it out to the user via an Alexa device.

Another challenge for us was the connection to DynamoDB, needed to read out the user’s link. We therefore had to change the permissions of our Lambda function and move the entire function into a single tenant.

The user ID is provided with the session start of the Alexa Skills Kit; we read it out via the Alexa handler in the Lambda function.
When the skill is opened, the launch request is executed, which normally greets the user and gives an introduction to the skill. We adjusted this function so that it queries the database before it starts and searches for the current user ID. If the ID already exists in the database and an iCal link is stored, the skill runs normally. If the current ID does not exist yet or no iCal link is associated with it, the user is asked to provide an ID and an iCal link.

Here is the code for authorization:

'LaunchRequest': function () {
        ical_url = undefined;

        // Kick off the (asynchronous) DynamoDB read for the current user
        getDataFromDynamoDB();

        console.log('userId', userId);
        console.log('Attempting to read data');

        // Crude workaround: wait 3 seconds so the database result has time to arrive
        console.log('Warte auf Datenbankergebnisse');
        sleep(3*1000);

        console.log('iCal_Link', ical_url)
        this.handler.state = states.SEARCHMODE;

        if (ical_url) {
            console.log('Hab nen Link!');
            URL = ical_url;
            this.response.speak(welcomeMessage).listen(welcomeMessage);
            this.emit(':responseReady');
        } else {
            console.log('Hat nich geklappt..');
            // No iCal link stored yet: ask the user (in German) to register one
            this.response.speak('Ich kenne dich leider nicht, bitte hinterlege einen gültigen Link in der Datenbank');
            this.emit(':responseReady');
        }

    },

At one point during development we asked ourselves how we could identify a user so that everyone can query his or her own timetable. We knew from the beginning that we would need a website with a database. We wanted to show a simple form to get the user’s iCal link and save it to the database. Therefore we created a small serverless web app using API Gateway, Lambda, DynamoDB and an S3 bucket.

Figure 6

The API Gateway creates a POST endpoint for the provided user information and forwards it to Lambda.

The Lambda function takes the parameters user ID and iCal link and saves them in the database.
DynamoDB is where all users with their iCal links are stored in a NoSQL database.
S3 hosts an HTML and a CSS file as a website.
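
A sketch of how the Lambda function behind this POST endpoint could look (the table and field names are assumptions):

// Sketch of the save-link Lambda behind the API Gateway POST endpoint.
// Table and field names are assumptions.
const AWS = require('aws-sdk');
const ddb = new AWS.DynamoDB.DocumentClient();

exports.handler = (event, context, callback) => {
    const body = JSON.parse(event.body); // { "userId": "...", "icalUrl": "..." }
    ddb.put({
        TableName: 'alexa-ical-users',
        Item: { userId: body.userId, icalUrl: body.icalUrl }
    }, (err) => {
        if (err) {
            callback(null, { statusCode: 500, body: 'could not save link' });
            return;
        }
        callback(null, { statusCode: 200, body: 'link saved' });
    });
};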

Once we finished the site with all its components, the question arose how to allow Alexa users to enter their data into the website. Our first approach was through the installed Alexa app and Amazon’s account linking. This way a user can provide his login credentials, and it is the only way to display the ‘Settings’ button we wanted. So we tried to use the Amazon account of a user. The benefit would be that every Alexa user needs an Amazon account and therefore has a unique user ID we could use for identification. To use account linking, it only needs to be activated and configured in the Amazon Developer Portal, as you can see in Figure 7. (We do not want to go into this part in detail.)

Figure 7

We tried account linking, but let’s just say we weren’t able to fully configure it. We wanted to use the Amazon account for authentication, but Amazon only allows a redirect to the Alexa app dashboard after login, not to our own website.

Our next idea was to use the cards in the Alexa app. After installing our skill, we wanted to display a card that directs the user to our website. Unfortunately, this was not possible either, since no clickable content can be inserted into a card. Furthermore, it is not yet possible to copy content from a card; the user would have had to type the very long URL of our website by hand.

Conclusion:

We had a lot of fun with the cloud project in general, even though we were confronted with problems throughout the whole development phase.

As mentioned above, we’re not happy with the way we had to handle the user information. It is frustrating that it’s not possible to copy information from the Alexa app or insert clickable links.
It’s also a pity that it’s not possible to get the ‘Settings’ button without an authentication server. We don’t want any other account information; we simply wanted the user to provide some data.

Another difficulty was the format of our iCal file. Some lectures take place in more than one room, which is why Alexa said things like “You have one lecture today: Web Development 2 (xxxxxx), Web Development 2 (xxxxxx), Web Development 2 (xxxxxx)”.
One day our code didn’t work anymore even though we had not changed anything. We spent hours looking for the problem until we realized that the personal timetable had been deleted and simply had to be reconfigured.

In the end we were able to get our project ready for the final presentation, but it could not be used in production because we cannot offer a simple way to provide the iCal link.

Parsing all Open Source Elm Code

This project was originally inspired by a talk Felipe Hoffa gave at the GitHub Universe conference last year. He talked about how we can analyse the code hosted on GitHub at a large scale to learn interesting things. I’m always excited about learning new programming languages; at the moment my favourite new language is Elm, a small functional programming language for building web applications. After watching the talk I thought it would be nice to do this kind of analysis on all the public Elm code hosted on GitHub.

Initial Idea

The big question is: Why isn’t it already possible to learn interesting things about the code hosted on GitHub? The problem is that GitHub works on the text level: all the code is just a huge collection of text files. It would be much more useful if we could operate on the structure which the text represents. Instead of searching for all files which contain the string “List.map”, we could precisely search for all source files which actually contain a reference to the List.map function.

If we want to scale this approach to all the Elm files hosted on GitHub, we need a few steps:

  1. Find all Elm repos
  2. Parse all the Elm files in each repo and extract the references
  3. Store the references and files in the db so they can be queried later

In my implementation I’ve limited myself to just storing which symbols a file defines and which symbols it references, instead of storing the whole syntax tree of each file in the db. The resulting structure is shown in the graph below.

I’m a notorious procrastinator (perfectly proven by the fact that I’m writing this blog post on the day of the deadline), therefore I decided to submit my idea as a talk proposal for Elm Europe 2018 to give myself some accountability. I guess you could call this approach “talk driven development”. I was lucky and my talk got accepted by the conference.

Prototype for the conference talk

The focus of the prototype was to quickly get some results and see for which use-cases such a graph could be useful. I wrote a Node.js script which I ran on my local machine to import the data into a Neo4j Graph Database.

Example of the resulting dependency graph of the elm-todomvc repo:


In my talk, I demonstrated some examples of how you can use the graph:

  • get code examples for any library function
  • check if a project is referencing any unsafe functions that can lead to runtime exceptions

The feedback was very positive, but the most commonly requested feature was to have a simple search interface for the graph that allows you to find code examples from other people to help you understand how to use a specific function.

Elm function search

I decided to take the lessons learned from my talk and turn the prototype into a practical application which allows people to search for code examples. I had to solve two problems to get there:

  • Move the node.js script into the cloud
  • Improve the robustness of the parser

Architecture

I built the crawler on AWS Lambda. The two steps, fetching the repos and parsing each repo, are two separate functions. The fetch-repos function pushes a message for each new repo it finds to a worker queue, which acts as a buffer for the parsing jobs. The parse-repo function is triggered for each repo in the queue. After the repo is parsed, the function writes all the references it found to the database.


I’ve decided to switch to Postgres instead of Neo4j. Because we’re just interested in finding matching code examples, we don’t need the graph. Instead we can store all the references in a simple table with one entry for each reference.

knex.schema.createTable('references', (table) => {
    table.increments('id').notNullable()
    table.string('package').notNullable()
    table.string('module').notNullable()
    table.string('symbol').notNullable()
    table.string('version').notNullable()
    table.string('file').notNullable()
    table.integer('start_col').notNullable()
    table.integer('start_line').notNullable()
    table.integer('end_col').notNullable()
    table.integer('end_line').notNullable()
    table.string('repo_owner').notNullable()
    table.string('repo_name').notNullable()
    table.string('commit_hash').notNullable()
    table.foreign(['repo_owner', 'repo_name'])
            .references(['repos.owner', 'repos.name'])
  })
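
A search for all usages of a function then becomes a simple query against this table; roughly:

// Roughly how the lookup for code examples of List.map could look
const matches = await knex('references')
    .where({ module: 'List', symbol: 'map' })
    .limit(20)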

The API server can then be implemented as a simple Node service which acts as a thin layer between the frontend and the database. I’ve decided to use zeit.co to host the API server. Zeit is a startup that allows you to deploy Node.js applications to the cloud without any configuration. You can also scale applications easily by running multiple instances, which are automatically load balanced.

Fetching all repos

I had to go through several iterations to solve this problem because unfortunately GitHub doesn’t provide a direct API to get all repos of a specific programming language. Initially I used the GitHub dataset which Felipe also used in his talk. The problem with that approach was that the dataset only contained repos where GitHub could detect an open source license. This meant that a lot of Elm repos were missing from the dataset.

The Elm language itself also has a package repository which contains all the published Elm packages. But this again is just a fraction of all the public Elm repos.

Finally I had another look at the GitHub search API, which allows using the programming language as a filter criterion, but with the caveat that you can only access the first 1000 results. I figured out a trick to get around this restriction: I sort the repositories by the timestamp when they were last updated, then I fetch the first 1000 results. In the next batch I use the timestamp of the last result as an additional filter criterion to get only repos which haven’t been updated more recently. That way I can incrementally crawl all repos. It’s a little bit hacky, but it works.
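
In code, the loop could look roughly like this (a sketch using the @octokit/rest client; pagination within each batch and error handling are omitted):

// Sketch of the incremental crawl; GitHub search only exposes 1000 results per query
const octokit = require('@octokit/rest')()

async function fetchAllElmRepos (handleRepo) {
    let before = new Date().toISOString()
    while (true) {
        // Only fetch repos updated before the oldest repo of the previous batch
        const { data } = await octokit.search.repos({
            q: `language:elm pushed:<${before}`,
            sort: 'updated',
            order: 'desc',
            per_page: 100
        })
        if (data.items.length === 0) break
        data.items.forEach(handleRepo)
        before = data.items[data.items.length - 1].pushed_at
    }
}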

Sending repos to worker queue

I’m using the Serverless framework to set up my application. To connect the fetch-repos function with the parse-repo function, I need to define a queue as an additional resource in my serverless.yml:

RepoQueue:
    Type: "AWS::SQS::Queue"
    Properties:
      QueueName: "RepoQueue"

I also need to add the ACCOUNT_ID and the REPO_QUEUE_NAME to the environment of my Lambda functions so I can address the queue I’ve defined previously:

REPO_QUEUE_NAME:
  Fn::GetAtt:
    - RepoQueue
    - QueueName
ACCOUNT_ID:
  Ref: 'AWS::AccountId'

Before our Lambda function can send messages, I also need to add a new permission to the role statements.

iamRoleStatements:
    - Effect: Allow
      Action:
        - sqs:SendMessage
      Resource: arn:aws:sqs:*:*:*

After setting up the queue and adding the correct permissions, I can use the AWS SDK to send the fetched repos to the queue.

const AWS = require('aws-sdk')
const {REPO_QUEUE_NAME, ACCOUNT_ID} = process.env

// Create an SQS service object
const sqs = new AWS.SQS({apiVersion: '2012-11-05'});

module.exports = async ({ owner, name, stars, lastUpdated, license}) => {
  return new Promise((resolve, reject) => {
    sqs.sendMessage({
      DelaySeconds: 10,
      MessageBody: JSON.stringify({
        owner,
        name,
        stars,
        license,
        lastUpdated
      }),
      QueueUrl: `https://sqs.us-west-1.amazonaws.com/${ACCOUNT_ID}/${REPO_QUEUE_NAME}`
    }, (err, data) => {
      if (err) {
        reject(err)
        return
      }
      resolve(data)
    })
  })
}

Parsing repo

This step also took several iterations to get right. In my prototype I used a modified version of the Elm compiler which spits out all the symbols it discovers during compilation. Alex, whom I met at the Elm meetup in SF, helped me with that. This worked well enough for the prototype, but it wasn’t a stable solution: first of all, I would need to maintain a fork of the compiler, which would be hard for me since I barely know any Haskell.

Next I tried the elm-ast library, an Elm parser implemented in Elm. This looked promising at first and also had the benefit that I could easily run it directly in Node, because all Elm code compiles to JavaScript. But I ran into some issues:

  1. The library didn’t cover all edge cases of the Elm syntax; some valid Elm files would lead to parsing errors.
  2. Positional information mapping the symbols back to the source file was only rudimentarily implemented and had some bugs.
  3. The performance was really slow; sometimes it took several seconds to parse a single file.

Especially the performance issue, combined with the fact that the repo wasn’t actively maintained, made the elm-ast library unusable as well.

In the end I landed on the elm-format library, a tool that automatically formats your Elm code. Brian, whom I met at Elm Europe, introduced me to Aaron, the creator of the library. They were already working on adding a flag which exports the syntax tree of a file, but the effort had been pushed back because so far there hadn’t really been a concrete use case for it. Aaron also added positional information. This approach had only one drawback: the library is written in Haskell, which meant I had to run custom binaries inside AWS Lambda.

Running Haskell code inside of AWS Lambda

In principle you can run any binary inside an AWS Lambda function. The problem is that the environment is an extremely trimmed-down Linux. Most binaries depend on external libraries which are dynamically linked; if you run such a binary inside a Lambda function, it won’t work because the external libraries are missing.

This was also the case with the elm-format binaries. I had to build the binaries myself with all the dependencies linked statically. For this I had to modify the cabal config file of elm-format:

executable elm-format-0.19

    ghc-options:
        -threaded -O2 -Wall -Wno-name-shadowing

    hs-source-dirs:
        src-cli

    main-is:
        Main0_19.hs

    build-depends:
        base >= 4.9.0.0 && < 5,
        elm-format

    -- static linking added so the binary runs inside Lambda
    ld-options: -static

executable elm-format-0.18

    ghc-options:
        -threaded -O2 -Wall -Wno-name-shadowing

    hs-source-dirs:
        src-cli

    main-is:
        Main0_18.hs

    build-depends:
        base >= 4.9.0.0 && < 5,
        elm-format

    -- static linking added so the binary runs inside Lambda
    ld-options: -static

After that I built the binary from source:

stack install --ghc-options="-fPIC"

At first I tried to build it on my MacBook, which didn’t work because macOS doesn’t provide statically linkable versions of all libraries. I solved this problem by quickly spinning up a Linux VM on DigitalOcean and building elm-format there. This illustrates nicely that the cloud is not only useful for scalable deployments but can also help during development.

Writing results to the database

Finally we need a Postgres database which stores all the references. We can add this as another resource in our serverless.yml file. We also need to create a security group which makes the database accessible. Right now I’m using hardcoded values for the credentials; a better solution would be to store them in the AWS Systems Manager Parameter Store.

pgSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Access to Postgres
      SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: '5432'
        ToPort: '5432'
        CidrIp: 0.0.0.0/0

  pgDB:
    Type: "AWS::RDS::DBInstance"
    Properties:
      DBName: "elmFunctionSearch"
      AllocatedStorage: 5
      DBInstanceClass: "db.t2.micro"
      Engine: "postgres"
      EngineVersion: "9.5.4"
      MasterUsername: "master"
      MasterUserPassword: "test12345"
      VPCSecurityGroups:
      - Fn::GetAtt:
        - pgSecurityGroup
        - GroupId
    DeletionPolicy: "Delete"

We also need to add the credentials and the host of the db to our environment so we can access the database later:

DATABASE_HOST:
      Fn::GetAtt:
        - pgDB
        - Endpoint.Address
DATABASE_USER: "master" 
DATABASE_SECRET: "test12345"
DATABASE_NAME: "elmFunctionSearch"

Inside our application we can then construct a connection string to connect to the database:

const {DATABASE_HOST, DATABASE_USER, DATABASE_SECRET, DATABASE_NAME} = process.env

const connectionString = 
    `postgres://${DATABASE_USER}:${DATABASE_SECRET}@${DATABASE_HOST}:5432/${DATABASE_NAME}`
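
This connection string can then be handed to knex, which we already used for the schema definition above:

// Hand the connection string to knex (the query builder used for the schema above)
const knex = require('knex')({
    client: 'pg',
    connection: connectionString
})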

Final Result

The search is very basic at the moment. The user can enter the name of a function.

If there are multiple matching functions the user can select which package they meant.

The result is a list of links to GitHub files which use the function.


YourTube – A simple video platform in your personal Amazon cloud

During the Dev4Cloud lecture I created a simple static webpage that uses Amazon’s S3 service for hosting and video storage, and Amazon Cognito for user authentication and role management.

Design considerations

The platform was designed with simplicity in mind, and therefore I decided to go with as few services as possible. This is why this project uses no database. The features of the application as is are the following:

  • Video storage grouped in channels
  • Public viewing of videos
  • Authorized users can:
    • Create channels
    • Upload Videos
    • Delete Channels
    • Delete Videos

Implementation

The system was implemented using Amazon’s web services but is designed in such a simple way that porting it to other cloud providers with a JavaScript API should not be too difficult.

At first I created two separate buckets on Amazon’s S3 object storage service. I configured one to act as a hosting service for the application and the other one as video storage.

The next part was to create an admin user in Amazon’s Cognito service, give it a role with full access rights to the video storage bucket, and save the user ID and secret key. Furthermore, I created an identity pool with anonymous users enabled and gave those users read permissions on the storage bucket.

Now I could upload the site itself to the hosting bucket.

The site itself makes heavy use of Amazon’s AWS JavaScript SDK and is built as a single-page application. The SDK is used to authenticate and to request the objects from the video storage bucket.
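
A minimal sketch of how this could look with the AWS JavaScript SDK (v2); the region, identity pool ID and bucket name are illustrative placeholders, and the SDK script is assumed to be loaded on the page:

// anonymous credentials via the Cognito identity pool
AWS.config.region = 'eu-central-1';
AWS.config.credentials = new AWS.CognitoIdentityCredentials({
    IdentityPoolId: 'eu-central-1:00000000-0000-0000-0000-000000000000'
});

const s3 = new AWS.S3();

// list the videos of one channel (channels are just key prefixes)
s3.listObjectsV2({ Bucket: 'yourtube-videos', Prefix: 'my-channel/' }, (err, data) => {
    if (err) return console.error(err);
    data.Contents.forEach(obj => console.log(obj.Key));
});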

As a quick fix to add SSL to the site I used Amazon’s CloudFront CDN to cache the site.

Summary

The project itself was an interesting endeavor, and I learned a lot about Amazon’s cloud services. Going with a cloud-based solution in this project had many upsides:

  • It was free – up to 5 GB of storage is covered by Amazon’s free tier
  • It is scalable – the storage grows with the videos; no additional storage expansion needed

The only downside I noticed was a somewhat higher cost in development hours.

Where to take this from here

For the future it is planned to add metadata and thumbnails to both videos and channels via images and text files placed in the corresponding folders. A complete redesign of the UI is also in order.

The code of the project can be found here.

Written by Nils Kristjansson

 

 

Tweets by Donnie – Building a serverless sentiment analysis application with the Twitter streaming API, Lambda and Kinesis

tweets-by-donnie dashboard

 

Thinking of Trump’s tweets, it’s pretty obvious that they are controversial. To gain insight into how controversial they really are, we created tweets-by-donnie.

“It’s freezing and snowing in New York — we need global warming!”
Donald J. Trump

You decide if it’s meant as a joke or not.

But wouldn’t it be nice to know whether the public sees this as a joke or gets upset by it? That’s where our idea originated. By measuring the emotions present in the responses, we can see what the public thinks of Trump’s posts throughout the day.

To generate such insights we decided to create a cloud architecture that can deal with a stream of tweets, enrich them and finish it all up with a simple API to query the results.

Easier said than done.

Home is where your IDE is, right? Writing code in the AWS console wasn’t something we felt good about. Also, it’s not reproducible in any way, which is why we chose the Serverless Framework. It bridges the gap from code in your IDE to the cloud. At first we were overwhelmed by the technologies, as these were our first steps building anything in the cloud. We had never heard of AWS CloudFormation and had never touched YAML files before, but this seemed the way to go, and it turned out to be very handy to have all code and configuration checked into a repo. This way changing, recreating, or deleting code (or even the whole architecture) is a breeze. Check out our repo, fork it and try it yourself.

The serverless.yml file acts as the main description of your architecture. It can go from a single Lambda function to a whole directory containing separate YAML files for different purposes like functions, resources, roles… you name it.
Speaking of roles, it’s easy to maintain the least-privilege principle with Serverless: you can create a role per serverless.yml or go as far as creating a role per function.

Another good thing was creating resources on the fly: we needed some DynamoDB tables.

On a side note: DynamoDB needs some time to get used to. Choose the right set of partition and sort keys for your tables deliberately, because you don’t want to waste time scanning through big tables. In our case we had tweet IDs, but that’s not what we’re querying for. We’re querying by time, and as our data is a time sequence, we chose the day (yyyy-mm-dd) as our partition key and a timestamp as the sort key. This way we can query for days and sort by timestamp, or filter for a time frame within a day.
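
A minimal sketch of such a query with the DynamoDB DocumentClient; the table name, attribute names and timestamps are illustrative assumptions (note that day and timestamp are DynamoDB reserved words, hence the name aliases):

const AWS = require('aws-sdk');
const db = new AWS.DynamoDB.DocumentClient();

// all enriched tweets of one day, restricted to a time frame within that day
db.query({
    TableName: 'TrumpsTweetDynamoDBTable',
    KeyConditionExpression: '#day = :day AND #ts BETWEEN :from AND :to',
    ExpressionAttributeNames: { '#day': 'day', '#ts': 'timestamp' },
    ExpressionAttributeValues: {
        ':day': '2018-06-01',
        ':from': 1527804000,
        ':to': 1527847200
    }
}).promise().then(res => console.log(res.Items));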

You can add resources like this.
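
A sketch of what such a table definition could look like in the resources section of serverless.yml; the throughput values are illustrative:

resources:
  Resources:
    TrumpsTweetDynamoDBTable:
      Type: AWS::DynamoDB::Table
      Properties:
        AttributeDefinitions:
          - AttributeName: day        # partition key, e.g. "2018-06-01"
            AttributeType: S
          - AttributeName: timestamp  # sort key
            AttributeType: N
        KeySchema:
          - AttributeName: day
            KeyType: HASH
          - AttributeName: timestamp
            KeyType: RANGE
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
        StreamSpecification:
          StreamViewType: NEW_IMAGE   # enables the stream used as an event trigger below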

Referencing these resources in other parts of the serverless.yml is quite handy too. For example, to trigger a Lambda function whenever a new item is written to a table, we need the stream ARN in our event trigger.
With ‘Fn::GetAtt: [TrumpsTweetDynamoDBTable, StreamArn]’ the attribute is automatically resolved when deploying.
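
A sketch of such an event trigger; the function and handler names are illustrative assumptions:

functions:
  analyzeSentiment:
    handler: handler.analyze
    events:
      - stream:
          type: dynamodb
          arn:
            Fn::GetAtt: [TrumpsTweetDynamoDBTable, StreamArn]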

If you’re ever in need of examples or documentation, have a look at the following links; they were rather helpful in the learning process.

  • AWS CloudFormation – create and manage AWS infrastructure deployments predictably and repeatedly
  • Serverless Examples – a collection of boilerplates and examples of serverless architectures built with the Serverless Framework (github.com)


Now there is a catch with having our configuration files in source control: secrets. We don’t want them in any repository, so we had to store secrets such as API keys and passwords elsewhere. There are multiple ways to do this in a secure manner. We chose a method where we store the secrets in a separate YAML file (serverless.env.yml) which is then referenced in the actual serverless.yml. You can reference other files with ‘${file(./path/to/file)​:some.key}’. This way we can gitignore the serverless.env.yml containing our secrets but keep the serverless.yml checked in without accidentally committing them to the repo.

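A minimal sketch of how this referencing could look; the key names are illustrative assumptions:

# serverless.yml
provider:
  environment:
    TWITTER_CONSUMER_KEY: ${file(./serverless.env.yml):twitter.consumerKey}
    TWITTER_CONSUMER_SECRET: ${file(./serverless.env.yml):twitter.consumerSecret}

# serverless.env.yml (gitignored)
twitter:
  consumerKey: xxxxxxxx
  consumerSecret: xxxxxxxx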

This method seemed feasible for this rather uncritical POC, but if you are planning a bigger project you should look into a more robust approach to secret management.


Using the power of the Google Cloud API: A dockerized Node app counting words in presentations

For the Dev4Cloud lecture at HdM Stuttgart we created a simple Go/NodeJS/React app which helps people keep track of frequently used words during presentations. In a presentation setting most people tend to use too many filler words; to train against this, we want to introduce our presentation counter to you.

The presentation counter consists of three parts: the React frontend, the Go backend, and the NodeJS speech server for the communication with the Google Cloud Platform. In short: the frontend captures the microphone audio and sends it to the speech server, the speech server gets the audio transcript from Google, and the transcript is then sent to the Go backend, which saves the relevant words in a database in its Alpine container and updates the frontend.

Frontend

The statically compiled React frontend contains code to communicate via WebSocket with the Go and NodeJS servers, as well as code to capture the microphone audio. Capturing audio needs a bit of boilerplate code:

// Set up the Web Audio pipeline; a ScriptProcessor hands us raw audio buffers
const bufferSize = 4096;
const constraints = { audio: true, video: false };

const AudioContext = window.AudioContext || window.webkitAudioContext;
const context = new AudioContext();
const processor = context.createScriptProcessor(bufferSize, 1, 1);
processor.connect(context.destination);
context.resume();

let globalStream;

const handleSuccess = function (stream) {
    // recButton is the UI's record button, defined elsewhere
    recButton.classList.add("rec-effect");

    globalStream = stream;
    const input = context.createMediaStreamSource(stream);
    input.connect(processor);

    // called for every captured audio buffer
    processor.onaudioprocess = function (e) {
        microphoneProcess(e);
    };
};

navigator.mediaDevices.getUserMedia(constraints)
    .then(handleSuccess);

The microphoneProcess function called inside handleSuccess receives an audio buffer of the given buffer size. It does two important things: first it downsamples the stream from 48,000 Hz to 16,000 Hz, and then it uses WebSockets to send the audio to the NodeJS server in close to real time.
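
A minimal sketch of what microphoneProcess could look like; the naive decimation helper and the socket variable are assumptions, not the project’s exact code:

function microphoneProcess(e) {
    // raw 32-bit float samples at the context's native rate (typically 48 kHz)
    const input = e.inputBuffer.getChannelData(0);
    // convert to the 16 kHz / 16-bit PCM format the Speech API expects
    const pcm16 = downsampleTo16kHz(input, context.sampleRate);
    socket.emit('binaryData', pcm16.buffer); // forward to the NodeJS speech server
}

function downsampleTo16kHz(buffer, inputRate) {
    const ratio = inputRate / 16000;
    const result = new Int16Array(Math.floor(buffer.length / ratio));
    for (let i = 0; i < result.length; i++) {
        // naive decimation; clamp float samples into the signed 16-bit range
        const s = Math.max(-1, Math.min(1, buffer[Math.floor(i * ratio)]));
        result[i] = s * 0x7FFF;
    }
    return result;
}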

Our frontend: mobile first with a super simple UI. Tap the mic to start the session and add words at the bottom.

Speech Server

This server is a lightweight NodeJS app. To use the Google Cloud API you need your own user key, and this key should not be shared with the frontend, so we created a layer between them. The server holds the key using the dotenv library: the Google Cloud API expects the key in the process environment variables, and adding it at operating-system level would be a big pain, so we used this great library. Once an audio stream has started, the server uses the key to authorize with the Google Cloud API and calls speechClient.streamingRecognize(request) to open a stream to the cloud. The request parameter contains the configuration and encoding information. Our configuration looks like this:

const request = {
    config: {
        encoding: encoding,
        sampleRateHertz: sampleRateHertz,
        languageCode: languageCode,
        profanityFilter: false,
        enableWordTimeOffsets: true
    },
    interimResults: true
};

Please note the last item, interimResults: true. This tells the Google Cloud API to send unfinished results. Why would we want unfinished results, you might ask? The answer is simple: we do not want to wait until a sentence is finished. Once a sentence is complete, Google can calculate the probabilities more accurately because the context is closed, so whenever Google detects a finished sentence we get a more accurate prediction; but we would have to wait for the end of the sentence. Because of that we use the less accurate interim results to get faster, near-real-time feedback, and we may have to correct the displayed results if they change.

The Google Cloud API returns a transcript of all the spoken words it can recognize in the audio stream. The NodeJS server sends this transcript, divided into a final and an interim part, to the Go backend, where the words are counted – and recounted as soon as the interim part becomes final. That way we get fast results, which is important for a word counter app – nobody wants to wait until the sentence is finished for the counts to update.
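
A minimal sketch of how this wiring could look on the speech server; the WebSocket plumbing and the forwarding function are assumptions:

const recognizeStream = speechClient
    .streamingRecognize(request)
    .on('error', console.error)
    .on('data', data => {
        const result = data.results[0];
        // isFinal marks a closed sentence; interim guesses may still change
        forwardToGoBackend({
            transcript: result.alternatives[0].transcript,
            isFinal: result.isFinal
        });
    });

// every audio chunk arriving from the browser is written into the stream
socket.on('binaryData', chunk => recognizeStream.write(chunk));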

Google Cloud API

For this project we used the Google Speech API. To be able to use it, you first have to get a Google API key, but this is quite straightforward and described here. Next you have to download a library from Google which mirrors the API functions in your code. With NodeJS the installation is npm install @google-cloud/speech --save. Now you can import and initialise it.

const speech = require('@google-cloud/speech');
const speechClient = new speech.SpeechClient();

After those steps you are ready to go! Just follow the steps described in the Speech Server section.

One more thing to add: your Google account comes with $300 of free credit for the Google Cloud Platform. At first we wondered whether this would be enough to develop the project, until we found out that you also get 60 free minutes of analysis with the Speech API. Even after those 60 minutes it is only 15 ct/min. What we want to say is: just play with it, it’s easy, and the possibility to create a nice prototype is really awesome. Thanks Google!

Go-Backend

At the beginning, this project was a fork of a project we did in the same semester. In the base project, people can count the words of a person giving a presentation by clicking a button. For that former project I decided to try a backend language that was new to me: Go. This language is more low-level than Java or JavaScript, but more high-level than C or C++. So if you come from a Java/JavaScript background like we did, your mind will probably struggle from time to time. There are pointers like in C, but fortunately no pointer arithmetic. The main advantage of this language is supposed to be that it can be almost as fast as C++ while taking way less time to compile. Additionally it has some functional constructs in it.

So, long story short: what we wanted to do was extend this server so that it uses Google speech-to-text to get the text form of someone’s presentation in real time. During the development process we found out that the new requirements didn’t really match the previous ones. The server had to update the frontend in real time like it did before, but now we also wanted the server to correct the frontend’s data. Why correct? From the Google API you get text back in real time if you do stream processing. More or less word for word, the sentence you get back is extended. But this sentence isn’t fixed, and neither are the words in it, as long as the sentence hasn’t finished. This is because Google always tries to return a reasonable sentence.

These facts, plus the need to recount the words on every update, made us rewrite almost the whole web server code.

Docker

Both of us, Simon and me (Marius), still don’t know what to think about this tool. The general idea of containers for encapsulation is great, but working with Docker often isn’t.

There is no real debug tool, at least we haven’t found one. And Docker caches so aggressively that you often have to rebuild your container from scratch with the cache disabled, which can cause a very long build step depending on what has to be installed during the build.

But now to the advantages and what we did with it. Both of our servers, the Node one and the Go one, run in their own Alpine containers. Docker Compose manages them: they share a Docker network with a DNS, so they can communicate using their respective service names.
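
A minimal sketch of what such a docker-compose.yml could look like; the service names, paths and port are illustrative assumptions:

version: '3'
services:
  go-backend:
    build: ./go-backend          # Alpine-based image
    ports:
      - "8080:8080"              # frontend WebSocket endpoint
  speech-server:
    build: ./speech-server       # Alpine-based image
    environment:
      # reachable via Compose's built-in DNS under the service name
      - GO_BACKEND_URL=ws://go-backend:8080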

This makes it possible to deploy the application anywhere a Docker daemon is available, which is quite nice. The whole software infrastructure comes with Docker and the definitions in the Dockerfiles.

Summary

Working with cloud services is easier than we thought. Using the technology offered by Google was very straightforward, but the Docker and Go parts weren’t.

This was our first project using the cloud, so we just took a simple piece of functionality it offers and created something with it. Regarding the engineering effort, it might have been easier to use more out-of-the-box functionality. For example, we could have deployed the server code directly to a Go or Node service at Google, where you can create containers by simply dragging and dropping modules.

Finally, we can say that it is worth looking deeper into the cloud topic, not only because you don’t need to buy hardware, but also because it can save you a lot of time when developing an application.

 

Written by: Simon Deussen & Marius Hahn