Debugging Jobs in GitLab CI

GitLab doesn't support SSH access to debug a job, so we explore how to set up a GitLab runner to run jobs locally and debug them in a container.

A follow-up to An Installer for Drupal 8 and GitLab CI

Unlike other CI tools such as CircleCI or Travis CI, GitLab does not support SSH access to a job for debugging. Instead, it lets developers set up a GitLab Runner to run jobs locally. This article explains how to run a job locally and halt it, so you can jump into the container and debug it.

Background: a failing job at GitLab CI

For the article, An Installer for Drupal 8 and GitLab CI, a demo repository was created to host a Drupal 8 site and a GitLab CI Pipeline was written. This pipeline, among other things, ran end-to-end tests against the Drupal 8 site. These tests needed a browser, so Headless Chrome was added.

While setting up the Existing Site Tests job, it appeared that the job would get stuck when running the Chrome browser in headless mode:

[Screenshot: GitLab UI showing the job is stuck]

In the console output, the job hangs at the google-chrome-unstable command and eventually times out. That command is called by a Robo task. This was the job setup:

[Screenshot: GitLab blob showing the build and test jobs]

Source: https://gitlab.com/juampynr/drupal8-gitlab/-/blob/master/.gitlab-ci.yml

There had to be a way to make the job stop right before it runs vendor/bin/robo run:chrome-headless and somehow allow an interactive session inside the Docker container to debug that command manually. By looking at the GitLab documentation, it was clear that a GitLab Runner would allow it.

Setting up a Docker runner

A search for a way to retry a failed job with SSH access showed that GitLab does not support it. Instead, the recommended approach is to install gitlab-runner and then register a Docker runner so you can run a job locally.

First, install the gitlab-runner command by following the steps listed at Install GitLab Runner manually on GNU/Linux:

$ curl -LJO https://gitlab-runner-downloads.s3.amazonaws.com/latest/deb/gitlab-runner_amd64.deb
$ sudo dpkg -i gitlab-runner_amd64.deb 

Next, follow the instructions at https://docs.gitlab.com/runner/register to register a runner. The command asks a few questions along the way to configure it:

$ sudo gitlab-runner register
Please enter the gitlab-ci coordinator URL (e.g. https://gitlab.com/):
https://gitlab.com
Please enter the gitlab-ci token for this runner:
[Find this token at your repository's Settings > CI / CD > Runners ]
Please enter the gitlab-ci description for this runner:
debugging
Please enter the gitlab-ci tags for this runner (comma separated):
debugging
Registering runner... succeeded                     runner=NZu1SRU5
Please enter the executor: virtualbox, docker, docker-ssh, shell, ssh, docker+machine, docker-ssh+machine, kubernetes, custom, parallels:
docker
Please enter the default Docker image (e.g. ruby:2.6):
juampynr/drupal8ci:latest
Runner registered successfully. Feel free to start it, but if it's running already the config should be automatically reloaded! 

Setup complete. The token requested above is not easy to find: it lives in the repository's Settings > CI / CD > Runners page, as this screenshot shows:

[Screenshot: GitLab UI showing where to find the token to set up a runner (repository runner settings)]

Now you can run jobs locally.
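
The same answers can also be passed as flags instead of typed at the prompts. The following is a minimal sketch, assuming a runner version whose register command accepts --registration-token (newer releases may expect a different kind of token). build_register_command is a hypothetical helper that only prints the command so you can review it before running it, and YOUR_TOKEN is a placeholder.

```shell
# Hypothetical helper: prints a non-interactive equivalent of the prompts above.
# It only echoes the command; copy the output and run it once it looks right.
build_register_command() {
  token="$1"   # registration token from Settings > CI / CD > Runners
  printf 'sudo gitlab-runner register --non-interactive'
  printf ' --url %s' 'https://gitlab.com/'
  printf ' --registration-token %s' "$token"
  printf ' --description %s --tag-list %s' 'debugging' 'debugging'
  printf ' --executor %s --docker-image %s\n' 'docker' 'juampynr/drupal8ci:latest'
}

build_register_command "YOUR_TOKEN"
```

Printing rather than executing keeps the token out of accidental runs and makes the command easy to paste into a setup note.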

Adjusting jobs so they can run locally

This is the command that runs the job you want to debug:

$ gitlab-runner exec docker drupal8ci:existing_site_tests

This command failed with an error stating that the drupal8ci:build dependency did not exist. It turns out that gitlab-runner exec does not support job dependencies, which are one of several GitLab CI features that do not work when running jobs locally. It's a good idea to read the list of limitations so you can adjust your pipeline accordingly.

While these limitations make running jobs locally trickier, you can adjust a job to overcome them. For example, in this case, to make drupal8ci:existing_site_tests not dependent on drupal8ci:build, you'd need to:

  1. Copy the script section from drupal8ci:build into drupal8ci:existing_site_tests.
  2. Remove the dependencies section from drupal8ci:existing_site_tests.

Here is the resulting job:

[Screenshot: GitLab blob showing the adjusted job]

Add a sleep 1h statement where you want the job to halt so you can debug it.
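
If you would rather have the halt only take effect while debugging, one option is to guard it with an environment variable. This is a sketch and not part of the original job: DEBUG_HALT and halt_if_debugging are hypothetical names, and the body of the function would be inlined into the job's script section.

```shell
# Hypothetical guard: pause only when DEBUG_HALT is set to a non-empty value.
# The duration defaults to 1h, matching the sleep statement described above.
halt_if_debugging() {
  duration="${1:-1h}"
  if [ -n "${DEBUG_HALT:-}" ]; then
    echo "Halting for ${duration}; attach with docker exec to debug."
    sleep "$duration"
  fi
}

# Without DEBUG_HALT the call returns immediately and the job carries on.
halt_if_debugging
```

How the variable gets set for a local run depends on your runner version. The help output of gitlab-runner exec docker is the place to check whether an option for injecting environment variables is available.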

With these changes complete, commit them so the GitLab Runner can pick them up:

$ git add .gitlab-ci.yml

$ git commit -m "Debug existing site tests job"

There is no need to push the changes, as gitlab-runner exec reads the code from your local Git repository and not from the remote. This is handy because once you are done debugging, you can undo the above commit with git reset HEAD~1, which keeps the edits in your working tree (add --hard to discard them as well).
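
To see what git reset HEAD~1 does to a throwaway commit, here is a small self-contained sketch that works in a temporary repository. The file contents and the first commit message are made up for the demonstration.

```shell
# Demonstration in a scratch repository; nothing here touches your project.
repo=$(mktemp -d)
cd "$repo"
git init --quiet

# Identity is passed inline so the demo does not depend on global Git config.
commit() { git -c user.name=demo -c user.email=demo@example.com commit --quiet "$@"; }

echo "# pipeline" > .gitlab-ci.yml
git add .gitlab-ci.yml
commit -m "Initial pipeline"

echo "# pipeline with sleep 1h" > .gitlab-ci.yml
git add .gitlab-ci.yml
commit -m "Debug existing site tests job"

# Drop the debugging commit; the edit stays behind as an uncommitted change.
git reset --quiet HEAD~1
commits=$(git rev-list --count HEAD)
pending=$(git status --porcelain)
echo "commits: ${commits}, pending:${pending}"
```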

Finally, you're ready to run jobs and debug them. Take a look at how it went in the following example.

Running and debugging

Here is the output of the command that runs a job locally:

$ gitlab-runner exec docker drupal8ci:existing_site_tests
Running with gitlab-runner 12.2.0 (a987417a)
Using Docker executor with image juampynr/drupal8ci:latest ...
Pulling docker image registry.gitlab.com/juampynr/drupal8-gitlab:latest ...
Waiting for services to be up and running...
Pulling docker image juampynr/drupal8ci:latest ...
Fetching changes...
Initialized empty Git repository in /builds/project-0/.git/
Checking out 44d2e858 as master...
$ robo job:build
 [Filesystem\FilesystemStack] _copy [".gitlab-ci/settings.local.php","web/sites/default/settings.local.php",true]
 [Filesystem\FilesystemStack] _copy [".gitlab-ci/.env",".env",true]
 [Composer\Validate] Validating composer.json: /usr/local/bin/composer validate --no-check-publish
 [Composer\Validate] Running /usr/local/bin/composer validate --no-check-publish
Do not run Composer as root/super user! See https://getcomposer.org/root for details
./composer.json is valid
 [Composer\Validate] Done in 0.174s
 [Composer\Install] Installing Packages: /usr/local/bin/composer install --optimize-autoloader --no-interaction
 [Composer\Install] Running /usr/local/bin/composer install --optimize-autoloader --no-interaction
> DrupalProject\composer\ScriptHandler::checkComposerVersion
Loading composer repositories with package information
Installing dependencies (including require-dev) from lock file
    1/142:        https://codeload.github.com/webflo/drupal-core-require-dev/legacy.zip/d9e72ddd4b353727f6bf9293408bfaf429b7c69e
...
    142/142:        https://codeload.github.com/drupal/core/legacy.zip/39164616332832e1456199d32fc3ed11562f4721
    Finished: success: 142, skipped: 0, failure: 0, total: 142
Package operations: 143 installs, 0 updates, 0 removals
  - Installing cweagans/composer-patches (1.6.6): Loading from cache
...
Generating optimized autoload files
> DrupalProject\composer\ScriptHandler::createRequiredFiles
Created a sites/default/files directory with chmod 0777
 [Composer\Install] Done in 29.56s
$ sleep 1h

It works! Do you see that last line that says sleep 1h? This is the debugging statement that you added before so the job would halt. While the job is halted there, all you need is the identifier of its Docker container so you can open an interactive session and do your debugging. Here is how to find that:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
2aa10ed01ed2        2e1af17c8319        "docker-php-entrypoi…"   8 seconds ago       Up 7 seconds        80/tcp              runner--project-0-concurrent-0-5229b3296d35403b-build-4
df8383d0a1f7        3843695a526c        "mysqld"                 20 seconds ago      Up 19 seconds       3306/tcp            runner--project-0-concurrent-0-5229b3296d35403b-registry.gitlab.com__juampynr__drupal8-gitlab-0

There are two running containers in the above output. GitLab Runner assigns each of them a name, such as runner--project-0-concurrent…. You can identify which is which by their command: the first one, docker-php-entrypoi…, is the entrypoint of the PHP image, the one that hosts PHP and Apache. The second one, mysqld, is the MySQL server. To debug the former, the identifier to use in your next command is runner--project-0-concurrent-0-5229b3296d35403b-build-4:

$ docker exec -it runner--project-0-concurrent-0-5229b3296d35403b-build-4 bash
root@runner--project-0-concurrent-0:/var/www/html#
root@runner--project-0-concurrent-0:/var/www/html# cd /builds/
root@runner--project-0-concurrent-0:/builds# ls
project-0  project-0.tmp
root@runner--project-0-concurrent-0:/builds# cd project-0
root@runner--project-0-concurrent-0:/builds/project-0# ls
LICENSE  README.md  RoboFile.php  composer.json  composer.lock        config        console  docs  drush  load.environment.php  phpunit.xml.dist  scripts  web

Once in the container, you'll need to locate the source code. GitLab Runner creates a directory called builds at the root of the container’s file system, and the project is checked out inside it (here, /builds/project-0). From there, you can see your project files and you're ready to debug the failing job.
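
The lookup of the container name can be scripted as well. The sketch below assumes that the job's build container always has a name ending in -build or -build-N, which is inferred from the single docker ps listing above and is not a documented guarantee. pick_build_container is a hypothetical helper.

```shell
# Hypothetical helper: reads container names on stdin and prints the first one
# that looks like a job's build container (name ending in -build or -build-N).
pick_build_container() {
  grep -E -- '-build(-[0-9]+)?$' | head -n 1
}

# Sample names copied from the docker ps output above.
printf '%s\n' \
  'runner--project-0-concurrent-0-5229b3296d35403b-build-4' \
  'runner--project-0-concurrent-0-5229b3296d35403b-registry.gitlab.com__juampynr__drupal8-gitlab-0' \
  | pick_build_container

# Against a live Docker daemon, the same filter could be fed like this:
#   docker exec -it "$(docker ps --format '{{.Names}}' | pick_build_container)" bash
```

Reading names from standard input keeps the filter easy to try out without a running job.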

Conclusion

The bug turned out to be a missing ampersand at the end of the command that started Headless Chrome. It may sound silly, but these are the kind of errors that may take you hours to figure out if you are not able to debug them properly.
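
To illustrate why the ampersand matters: without it, the shell waits for a command to exit before moving on to the next line, and a browser that keeps running never exits. The sketch below uses sleep 30 as a stand-in for the Chrome process. The actual command lives in the Robo task and is not reproduced here.

```shell
# sleep 30 stands in for a long-running process such as a headless browser.
start=$(date +%s)

# The trailing & sends the process to the background, so the script moves on
# immediately. Without it, this line would block for the full 30 seconds.
sleep 30 > /dev/null 2>&1 &
bg_pid=$!

elapsed=$(( $(date +%s) - start ))
echo "Continued after ${elapsed}s while PID ${bg_pid} is still running"

# Clean up the stand-in process.
kill "$bg_pid"
wait "$bg_pid" 2>/dev/null || true
```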

Here are the steps to follow the next time a GitLab CI job does not work as expected:

  1. Register a Docker runner for the repository that you are working with. Check the screenshot above for where to find the token you'll need to enter.
  2. Check out the unsupported features when running jobs via gitlab-runner exec. Adjust the job accordingly.
  3. Add a sleep 1h statement where you want the job to halt when it runs. Commit that, but don’t push it.
  4. Run the job with the command gitlab-runner exec docker [job name goes here].

Happy debugging!
