Overview¶
Welcome to the documentation for the GitLabHost version of GitLab Environment Toolkit.
These pages should be your one-stop reference for everything regarding the High Availability projects.
Shortcuts¶
These links lead to frequently-used external URLs:
Documentation structure¶
This documentation is split into three parts, interlinked where relevant, scoped to the type of task you may need to perform. Currently, the Operator section contains most of the documentation.
- The Customer section contains high-level explanations and overviews of GET features and components, explanations of how we use certain AWS features, and frequently asked questions. It is intended for copy/pasting to customers, and as an internal reference for Sales and Marketing.
- The Operator section contains documentation on how to deploy, update, customize and debug deployments of GitLab with GET. Most non-code day-to-day tasks for HA team members are covered here.
- The Developer section contains guides on the best practices and conventions used in our codebase, and also covers the code structure of GET and any hacks/workarounds that may not be obvious at first. This section also contains guides on running unreleased versions of GET on AWS.
Where relevant, features will be covered in all three sections. Some examples of this:
- AWS WAF - The Customer pages cover what we try to achieve with the WAF and which security issues it may prevent. The Operator pages cover how to enable or disable the WAF, how to use the customizations available for solutions, and how to debug 403 errors that may be caused by incorrect or incomplete WAF configuration. The Developer pages explain how to add rules to, or remove rules from, the WAF in Terraform.
- GitLab Runner - The Customer pages provide pointers to upstream documentation, and detail which features, platforms and architectures are available for the customer to use. The Operator pages explain how to deploy new runners, how to choose instance types and auto-scaling min/max parameters, and how to debug failed CI jobs caused by system failure. The Developer pages provide context on choices made in code, and explain how to navigate the shared codebase for runners and how to create new or customized AMI types.
Project overview¶
Our deployments of clustered GitLab solutions are fully defined in code, an approach known as Infrastructure as Code (IaC).
For this, we use Terraform to provision all resources in cloud environments (such as AWS), and Ansible to configure and update any Linux hosts that were created by Terraform.
Both components reside in our version of the GitLab Environment Toolkit, abbreviated as GET. We used to build on top of a GitLab-provided project that is also called GET. To solve some of the issues we had with that setup, our GET no longer depends on upstream GET as of version 3.0. Most notably, we've streamlined the Omnibus installation and dropped support for GCP, Azure and Kubernetes deployments.
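The order described above (Terraform first, then Ansible against the hosts Terraform created) can be sketched as a dry-run command list. This is only an illustration of the general workflow, not how GET itself is invoked: the function name `provisioning_commands` and the directory, inventory and playbook paths are hypothetical placeholders, and the real entry points are described in the Operator section.

```python
from pathlib import Path
import shlex


def provisioning_commands(terraform_dir: Path, inventory: Path, playbook: Path) -> list[list[str]]:
    """Return the commands for a Terraform-then-Ansible run, in execution order.

    All paths are hypothetical placeholders, not GET's actual layout.
    """
    return [
        # 1. Terraform provisions the cloud resources (for example on AWS).
        ["terraform", f"-chdir={terraform_dir}", "init"],
        ["terraform", f"-chdir={terraform_dir}", "apply"],
        # 2. Ansible configures the Linux hosts that Terraform created.
        ["ansible-playbook", "-i", str(inventory), str(playbook)],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in provisioning_commands(Path("terraform"), Path("inventory"), Path("site.yml")):
        print(shlex.join(cmd))
```

The sketch only builds and prints the commands, so it is safe to run without cloud credentials.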
Noteworthy supported features include:
- Backup schedules for EBS and S3
- GitLab Pages
- Kroki diagram servers
- Managed Load Balancers which handle TLS traffic
- Public/private network separation with Bastion nodes for SSH access
- Security Groups per resource type
- TLS certificate management and DNS validation
- Web Application Firewall
- An extensive observability stack with Grafana, Loki, Prometheus and Thanos
- Support for deploying dedicated and auto-scaling GitLab runners
Internally shared codebases¶
We have split out some functions that were originally part of GET into separate projects, so we can reuse them in other codebases as well.
These are grouped per application, and there are currently two:
- Terraform Modules - these contain small reusable Terraform snippets.
- Ansible Collections - these contain a shared codebase for basic machine setup, and a collection of non-mainlined extensions.
Satellite projects¶
Apart from the customer deployments, we have some additional projects managing shared infrastructure, in a similar fashion to how we manage our GET codebase. Of note are the following projects:
- AWS Infrastructure contains shared services such as management of the DNS of our glhc.nl domain, a centralized Grafana gateway, and alerting proxies to propagate monitoring messages from solutions to Zulip.
- AWS Identity Management contains Terraform code to provision our root AWS account and grant permissions to everyone inside GitLabHost, based on their specific roles. Please note that the HA team is also responsible for managing permissions for non-HA team members, usually scoped to the relevant subaccounts for these users.
- Reusable Enhancing Command-line Tools contains a few command-line utilities that make working with AWS and our projects faster and less repetitive.