Using GitLab to set up a CI/CD workflow for an Android app from scratch

Tim Landenberger (tl061)
Johannes Mauthe (jm130)
Maximilian Narr (mn066)

This blog post provides an overview of how to set up a decent CI/CD workflow for an Android app using the capabilities of GitLab. The post was written for GitLab Ultimate. Nevertheless, most features are also available in the free edition.

The goal is mainly to provide an overview of GitLab’s CI/CD capabilities. It is not the aim of this post to develop or test a complex Android app, or to handle special edge cases in Android app development.

The blog post covers the following topics:

Defining a decent pipeline
Automatically running unit tests
Automatically running integration tests
Automatically running static code analysis checks
Automatically running debug/release builds
Automatically distributing the app to testers
Adding GitLab’s drop-in features
- SAST
- Dependency management
- License management

The app

The app used in this post is a dead-simple Android app made of two fragments. The first fragment lets the user enter two digits in a text field, select one of three operations (add, subtract, multiply) and start the calculation via a button.
The result of the calculation is shown in the second fragment.

Gitlab

GitLab, initially a pure web-based Git UI, is a popular source code management and collaboration platform that also provides built-in CI/CD features.

Project management in Gitlab

GitLab also provides support for project management. You can define issues with descriptions and labels, group them into milestones and view them on a clearly arranged board.
For a simple Android application, the offered functionality was sufficient. However, if you want to start a large project with many team members and an agile process, you should consider a tool that focuses solely on project management, as GitLab was not designed to be a PM tool and lacks features such as a sprint backlog.

CI/CD

GitLab builds CI/CD from a few core components. A job is an arbitrary task, for example executing a script in order to build an APK. A pipeline is made of several jobs, and each job is assigned to a stage. Stages are executed sequentially, while jobs within a stage may execute in parallel. Basically, a stage is just a named container for jobs.

As a summary: A pipeline consists of one or more stages. A stage consists of one or more jobs.
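
The relationship between pipeline, stages and jobs can be sketched in a minimal .gitlab-ci.yml. The stage and job names below are purely illustrative and not taken from our actual configuration; only the keywords stages, stage and script are GitLab's.

```yaml
# Stages run sequentially, in the order listed here.
stages:
  - check
  - test
  - build

# Hypothetical job in the "check" stage.
lint:
  stage: check
  script:
    - ./gradlew lint

# These two jobs share the "test" stage and may run in parallel
# if enough runners are available.
unitTests:
  stage: test
  script:
    - ./gradlew testDebug

instrumentedTests:
  stage: test
  script:
    - ./gradlew connectedAndroidTest

# Starts only after every job in the "test" stage has succeeded.
assembleDebug:
  stage: build
  script:
    - ./gradlew assembleDebug
```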

Most of the CI/CD features are configured via the special file .gitlab-ci.yml located in the root of a Git repository. Unlike Jenkins, for example, GitLab does not provide a UI for configuring jobs, stages or pipelines.

Runners

GitLab itself does not execute any jobs. A machine that executes jobs is called a runner. Any machine acting as a runner must have GitLab’s runner daemon installed, which is responsible for polling and executing jobs.
As jobs are typically not run on the machine that hosts the GitLab server, each job may define artifacts. Artifacts are the “result files” of a job (e.g. an APK) and are uploaded to the GitLab server once the job has finished.
Runners can have arbitrary tags assigned (e.g. android-sdk). This is useful to distinguish the capabilities of runners. A job defined in .gitlab-ci.yml can require that it is only executed on a runner with a specific tag.
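
As a small sketch, a job that combines a runner tag with artifacts could look as follows. The job name and the APK path are assumptions for illustration (the path follows the usual Android Gradle output layout); the android-sdk tag is the example tag mentioned above.

```yaml
# Hypothetical job: name and artifact path are illustrative.
buildDebugApk:
  stage: build
  # Only runners registered with this tag pick up the job.
  tags:
    - android-sdk
  script:
    - ./gradlew assembleDebug
  # Files matching these paths are uploaded to the GitLab server
  # after the job has finished successfully.
  artifacts:
    paths:
      - app/build/outputs/apk/debug/*.apk
```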

Runner execution strategies

Jobs can be executed on a runner in different contexts, which GitLab calls executors. The shell runner executes a job directly on the runner machine, in the runner user’s home directory. Any job dependencies (e.g. deb packages) must be installed on the runner machine before the job runs. The docker runner provides more flexibility, as it allows defining a docker image in which the job is executed. The docker-in-docker runner provides even more flexibility by allowing a job to control several docker containers. Its drawback is a set of considerable security issues, as a job might gain root privileges.

In our project, we chose the docker runner for its flexibility and docker image support. However, because the Android Emulator requires KVM support, the docker runner has to run in privileged mode, which leads to the same security issues described for the docker-in-docker runner.

Configuring Pipeline

Adding Gitlab drop-ins

GitLab provides several ready-to-use pipeline jobs as drop-ins. These include a security scan, a dependency scan and a license compliance scan.

The security scan provides static application security testing (SAST), which, for example, searches for secrets that were unintentionally added to the code. The dependency scan searches the repository for third-party dependencies and lists them in the GitLab UI. It also checks whether the used versions of the dependencies have known security issues. The license scan checks the licenses of the used dependencies. Project maintainers can define blacklisted and whitelisted licenses via the GitLab UI. A use case would be to disallow GPL-licensed code.
The results of all three scans are displayed in each merge request and in the repository UI (-> Security & Compliance).
The drop-ins are added to .gitlab-ci.yml by including their templates, one line per drop-in:

include: 
  - template: Dependency-Scanning.gitlab-ci.yml
  - template: SAST.gitlab-ci.yml
  - template: License-Management.gitlab-ci.yml

Because we did not have a runner with docker-in-docker (dind) support, the usage of dind by the dependency scan and SAST is disabled:

variables:
  SAST_DISABLE_DIND: "true"
  DS_DISABLE_DIND: "true"

When executing a pipeline, three additional jobs are added to its first stage.

The SAST-job detected our committed Google service account credentials correctly.

Adding tests

Without any doubt, you cannot do CI/CD without proper test coverage and automated tests running after commits or merge requests.
Android has three different kinds of source sets, two of which are for unit tests and instrumented tests respectively.

Unit tests reside in the test source set and can run on the JVM alone, as the tested classes do not depend on Android itself. This means that no Android emulator or physical device is needed to run them. However, while these tests are very granular and fast, they have low fidelity, meaning they do not closely represent real-world scenarios.

Instrumented tests, on the other hand, reside in the androidTest source set. They have high fidelity, meaning they closely represent real-world scenarios, but they depend on Android. UI tests additionally need either a running emulator or a physical device. Most issues in setting up a proper CI/CD pipeline arise here, as waiting for the emulator to start up can be time-consuming, Android SDKs are large, etc. A more detailed description of these problems and how to mitigate them is given in the section “Problems”.

[Figure: Example UI test using Espresso on an emulator]

According to Mike Cohn’s original test pyramid, you should have the following things in mind:

write tests with different granularity
the higher the fidelity (high-level tests) the fewer tests you should have

This means that you should have many fast running, granular unit tests but only a few instrumented tests. If running your UI tests on the emulator is too slow, one option would be to set up a pipeline that only uses fast running tests, leaving UI tests out entirely. You should definitely avoid ending up with a test ice-cream cone (https://watirmelon.blog/testing-pyramids/).

[Figure: The three different source sets (src, androidTest, test)]

Now that we can differentiate between those two kinds of tests in Android, how do we add them to the GitLab pipeline?

Using a docker runner with a docker image that already has all Android SDKs and Gradle installed (see Custom docker image), we simply define a job in our .gitlab-ci.yml.

debugTests:
  stage: test
  script:
    - ./gradlew -Pci --console=plain :app:testDebug
  artifacts:
    paths:
      - example-app/app/build/test-results/**/TEST-*.xml
    reports:
      junit: example-app/app/build/test-results/**/TEST-*.xml

This defines a job named debugTests in the stage test. To run the unit tests, the job simply invokes the Gradle task testDebug via gradlew. The generated JUnit test reports are uploaded as artifacts from the given path and are available for browsing if the job succeeds. Adding the reports:junit keyword displays a summary of the report on the merge request page.

Similarly to the job for running unit tests, we also define a job for running instrumented tests. Instead of the Gradle task for unit tests, it runs the task connectedAndroidTest. If we plan to run UI tests, however, we need an emulator and have to wait for it to start before running the Gradle task.

The following excerpt shows how to define a job for instrumentation tests and wait for the emulator to start up using adb.

debugInstrumentationTests:
  stage: test
  cache: {}
  script:
    - emulator -avd default -no-audio -no-boot-anim -no-window -accel on -gpu off &
    - adb wait-for-device
    - adb devices
    - ./gradlew connectedAndroidTest
    - exit
  tags:
    - hardware-acceleration
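
One caveat: adb wait-for-device returns as soon as adb can see the emulator, which may be before Android has finished booting. A possible refinement, sketched here as an assumption rather than as part of our pipeline, is to poll the sys.boot_completed system property before starting the tests. The polling interval is arbitrary.

```yaml
debugInstrumentationTests:
  stage: test
  script:
    - emulator -avd default -no-audio -no-boot-anim -no-window -accel on -gpu off &
    # Returns once adb sees the emulator; Android may still be booting.
    - adb wait-for-device
    # Poll until the system reports that booting has completed.
    - |
      until [ "$(adb shell getprop sys.boot_completed | tr -d '\r')" = "1" ]; do
        sleep 5
      done
    - ./gradlew connectedAndroidTest
  tags:
    - hardware-acceleration
```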

Adding test coverage

It is a good idea to use some sort of code coverage tool in your project to help you stay true to the testing pyramid (having a lot of unit tests and fewer instrumented tests). GitLab can parse a job’s output using a regex and display the parsed value as the code coverage percentage on commits or merge requests.

https://docs.gitlab.com/ee/user/project/pipelines/settings.html

In order to define a job that generates our reports and parses the code coverage value from the job’s output, we first need a tool that can create a code coverage report for Android.

JaCoCo is already part of Android’s build system. It can be enabled by applying the jacoco plugin in your build.gradle file.

apply plugin: 'jacoco'

For your build type debug, add testCoverageEnabled and includeNoLocationClasses = true as follows:

buildTypes {
   release {
       ...
   }
   debug {
       testCoverageEnabled true
   }
}

Setting testCoverageEnabled to true enables coverage reports for the instrumented tests.

testOptions {
   unitTests.all {
       jacoco {
           includeNoLocationClasses = true
       }
   }
}

Setting includeNoLocationClasses under testOptions enables code coverage for the unit tests. The build then provides a couple of Gradle tasks that produce .exec files (for unit tests) and .ec files (for instrumented tests), which JaCoCo uses as input to generate a report.

As a last step, we create a Gradle task called jacocoTestReport that can be run by a Job after executing testDebug.

task jacocoTestReport(type: JacocoReport, dependsOn: ['testDebugUnitTest']) {
   reports {
       xml.enabled = true
       html.enabled = true
       csv.enabled = true
   }

   // Generated and test classes that should not count towards coverage
   def fileFilter = [ '**/R.class', '**/R$*.class', '**/BuildConfig.*', '**/Manifest*.*', '**/*Test*.*', 'android/**/*.*' ]
   def mainSrc = "${project.projectDir}/src/main/kotlin"

   sourceDirectories = files([mainSrc])
   // Execution data written by the unit test run
   executionData = fileTree(dir: "$buildDir", includes:
           ["jacoco/testDebugUnitTest.exec"]
   )
}

Running the jacocoTestReport task generates all three available report types. Only the CSV report (csv.enabled = true) is required to display code coverage in GitLab. It was chosen because it is the simplest to parse.

To run the Gradle task right after the unit tests, we add one line (./gradlew jacocoTestReport) to the debugTests job, followed by an awk command that extracts the coverage value.

debugTests:
  stage: test
  script:
    - ./gradlew -Pci --console=plain :app:testDebug
    - ./gradlew jacocoTestReport
    - awk -F"," '{ instructions += $4 + $5; covered += $5 } END { print covered, "/", instructions, " instructions covered"; print 100*covered/instructions, "% covered" }' app/build/reports/jacoco/jacocoTestReport/jacocoTestReport.csv
  artifacts:
    paths:
      - example-app/app/build/test-results/**/TEST-*.xml
      - example-app/app/build/reports/jacoco/jacocoTestReport
    reports:
      junit: example-app/app/build/test-results/**/TEST-*.xml

The awk command parses the CSV report and prints the code coverage value to the job output in the form “<value> % covered”. GitLab can then pick up this value using a regex defined in the pipeline settings. Go to the pipeline settings, search for “Test coverage parsing” and replace the regex with “\d+(\.\d+)? \% covered”. The dot is escaped and the decimal part is optional, so whole-number values such as “50 % covered” are matched as well.
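
As an alternative to the project-wide setting, GitLab also accepts the regex per job via the coverage keyword in .gitlab-ci.yml. The following is a sketch of how that could look for the job above, not the configuration we used. The awk line is abbreviated here.

```yaml
debugTests:
  stage: test
  script:
    - ./gradlew -Pci --console=plain :app:testDebug
    - ./gradlew jacocoTestReport
    # ... awk command from the job above, printing "<value> % covered" ...
  # The regex is written between slashes; GitLab applies it to the job log.
  coverage: '/\d+(\.\d+)? \% covered/'
```

Keeping the regex next to the command that prints the value makes it harder for the two to drift apart.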

Adding static code analysis

To improve code quality, we decided to use another static code analysis tool, detekt, on top of the standard Android linting. This dependency is added in the build.gradle file and requires a configuration file in which you define your rules. Detekt supports many rule sets for various code quality checks. You can, for example, configure thresholds for the complexity of methods, set up rules for naming conventions or restrict the usage of magic numbers in your code.
Running the detekt Gradle task checks your entire codebase and generates a report with all findings. In our pipeline’s check stage, we added a detekt job and configured a threshold of 10 findings for the job to fail.

After some commits we discovered that the default detekt configuration is way too strict, and we ended up refactoring a lot of methods because they were too long or too complex. The question of whether to refactor the code or adjust the detekt config file came up quite often. Until your code analysis rules are finally settled, you will need to deal with this question and figure out the best configuration for your desired code quality.

The good thing is that, once set up, you can reuse the configuration file easily among other android projects.
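
To give an impression of such a configuration file, here is a minimal sketch of a detekt.yml. It assumes the key names of the detekt 1.x releases available at the time of writing (some rules were renamed in later versions), and all thresholds except the limit of 10 findings are made-up values, not our project's settings.

```yaml
build:
  # Fail the detekt task once the findings exceed this number.
  maxIssues: 10

complexity:
  # Flags methods with too many lines (threshold is illustrative).
  LongMethod:
    active: true
    threshold: 60
  # Flags methods with high cyclomatic complexity (threshold is illustrative).
  ComplexMethod:
    active: true
    threshold: 15

naming:
  # Enforces the naming convention for functions.
  FunctionNaming:
    active: true

style:
  # Reports unnamed numeric literals in the code.
  MagicNumber:
    active: true
```

Relaxing a threshold or setting active to false for a single rule is usually less invasive than disabling a whole rule set.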

Adding automated deployment

In order to distribute the application to testers in different states during development, we set up a flow for automated deployment via Firebase. For this we used the App Distribution feature, which allowed us to automate our releases and thereby avoid mistakes like uploading the wrong build artifacts. To integrate this feature, you need to set up a Firebase project and link it with your application. In the Android app, the build.gradle file has to be changed to include the Firebase dependencies and to reference the corresponding Google service account file.
Once set up, you can trigger a Gradle task for every build type, which uploads the correct APK together with the current build version, version code and release notes. Unfortunately, the build is only distributed after you click a button in the Firebase console once you have added your testers, so the process is not fully automated.

By using a “release” Git tag, the pipeline triggers the deploy stage and creates a release only when desired.
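
A tag-triggered deploy job could be sketched as follows. The job name and the exact tag pattern are assumptions. appDistributionUploadRelease is the upload task that the Firebase App Distribution Gradle plugin provides for the release build type.

```yaml
# Hypothetical deploy job: name and tag pattern are illustrative.
deployRelease:
  stage: deploy
  script:
    # Build the release APK and upload it to Firebase App Distribution.
    - ./gradlew assembleRelease appDistributionUploadRelease
  rules:
    # Run only for Git tags whose name starts with "release".
    - if: '$CI_COMMIT_TAG =~ /^release/'
```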

Build cache

Most of the jobs defined in the pipeline require at least a JVM (for running Gradle) or even parts of the Android SDK. The recommended way to run Gradle is via the Gradle wrapper. The wrapper ensures that the requested version of Gradle is used and downloads it if required. As we run the jobs in a docker container, Gradle would be downloaded each time a job requires it, which is very time-consuming and inefficient.

For these cases, GitLab provides the option to cache the contents of specified directories between jobs. Once a job has finished, the contents of the specified folders are zipped. When the job executes the next time, the cached files are unzipped before the job runs.

It is possible to define a cache key per job to control cache sharing. In this post, the job name is used as cache key, which results in a separate cache for each job: a job shares its cache only with previous and subsequent runs of the same job.

The following excerpt from .gitlab-ci.yml enables caching of the gradle files on a per-job basis:

cache:
  key: "$CI_JOB_NAME"
  paths:
    - .gradle/wrapper
    - .gradle/caches
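
GitLab can only cache paths inside the project directory, whereas Gradle keeps its wrapper and dependency caches in the user's home directory by default. One common way to make the relative paths above effective, shown here as an assumption about the setup rather than a quote from our configuration, is to point GRADLE_USER_HOME into the checkout:

```yaml
variables:
  # Keep Gradle's home inside the checkout so that the relative
  # cache paths below cover the wrapper and dependency caches.
  GRADLE_USER_HOME: "$CI_PROJECT_DIR/.gradle"

cache:
  key: "$CI_JOB_NAME"
  paths:
    - .gradle/wrapper
    - .gradle/caches
```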

Custom docker image

Some of the defined jobs required parts of the Android SDK. The base docker image provides the basic Android SDK tools out of the box (emulator, adb, …) but does not include any parts of the Android SDK (neither emulator images nor the SDK itself).

Any job requiring parts of the Android SDK would trigger an SDK download, which takes a lot of time. That’s why a custom docker image was created and pushed to Docker Hub. This image includes the complete Android SDK 29 and an emulator image, resulting in a size of 2.3 GB. Fortunately, docker images are cached on each runner, so the custom image only has to be downloaded once.

The docker image used for jobs is specified at the top of .gitlab-ci.yml:

image: maxee42/hdm_syseng:0.0.2-rev01

Problems

Job execution performance

While setting up the CI/CD scenario, the execution performance of jobs has been a real problem. At first, we had a pipeline execution time* of more than one hour. At this time, the process was not optimized in any way – Gradle and the Android SDK were downloaded on demand. Introducing the described, dedicated docker image containing the Android SDK and enabling the build cache (in fact caching the gradle wrapper) reduced the execution time to about 30 minutes*. The jobs were run on the runners offered by the HdM up to this time.

We decided to set up our own runner to further reduce the build time. Using a VM running on a host with an Intel i9 and 10 GB of RAM assigned, we could reduce the overall build time to 10 minutes*, which seems reasonable.

The main problem, unsolved at the time of writing, was the execution time of the instrumentation tests. These require a running instance of the Android Emulator or an attached physical device. Running the Android Emulator in a docker container on a VM running on a host with an Intel i3 and 24 GB of RAM was not efficient. Executing the instrumentation tests took about 20 minutes. Due to the lack of better hardware, we had to disable the instrumentation tests in the pipeline.

Our experience suggests that running larger pipelines in an Android context requires powerful hardware. Instrumentation tests should be run on physical devices or on very fast hardware (without virtualization and/or with the shell executor).

*excluding instrumentation tests

Gitlab drop-ins

As described in Configuring Pipeline -> Adding Gitlab drop-ins, all available drop-ins were added to the repository. Unfortunately, only the SAST drop-in ran correctly and reported some errors. The license and dependency scans executed successfully but did not report any results. Furthermore, the drop-ins are very difficult to debug, as they write hardly any logs during execution.

The official documentation is not very detailed regarding the drop-ins, which is why we could not fix the license and dependency checks.

Artifact browsing

Every generated report is uploaded in HTML or XML format so that we can see right away what caused a failure in our pipeline. As described earlier, we often had failed pipelines because one of our checks did not succeed. GitLab offers the possibility to browse the uploaded artifacts so that you can inspect the results. Unfortunately, GitLab is not able to display HTML or XML files, so you have to download them. If you have several artifacts, it is quite annoying to always download and open them by hand.
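
Note that, by default, GitLab uploads artifacts only when a job succeeds. To have reports available for failed jobs as well, the artifacts section can set when: always. The following sketch extends the unit test job from above. The HTML report path and the expiry time are assumptions for illustration.

```yaml
debugTests:
  stage: test
  script:
    - ./gradlew -Pci --console=plain :app:testDebug
  artifacts:
    # Upload reports even when the job fails (the default is on_success).
    when: always
    # Remove old reports automatically to save storage.
    expire_in: 1 week
    paths:
      # Assumed location of Gradle's HTML test report.
      - example-app/app/build/reports/tests
    reports:
      junit: example-app/app/build/test-results/**/TEST-*.xml
```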

Further improvements regarding CI/CD

There are some more topics that could be covered regarding CI/CD:

Automated crash reporting: When the Android app crashes for a user, the crash report is uploaded to Firebase Crashlytics, where the stack trace can be accessed. You could set up a flow in which developers are automatically notified by a Slack message and a bug issue is created that includes all the information about the crash.
When the app gets deployed, you could integrate a tool that automatically takes screenshots of the app in all supported languages.
For a huge Android application, you should think about splitting the app into different modules that can be tested independently in order to further reduce the pipeline time.

Conclusion

In retrospect, we can say that setting up a full CI/CD pipeline for Android including instrumentation tests has many performance issues yet to be resolved. Running UI tests on an emulator in an automated pipeline either requires too much hardware or does not comply with the Continuous Integration Certification Test by Martin Fowler, which states that pipelines should not take longer than ten minutes.

One could either set up a pipeline without instrumentation tests entirely, as running unit tests is fast and trivial, or try to run UI tests on a physical device.