Web Application Firewall¶
To increase security, a WAF is deployed in front of the instance. It applies a set of rules which are managed by AWS. Because some of these rules are too strict, a number of overrides are added.
AWS managed rule overrides¶
Body size restriction¶
By default, all requests with a body larger than 8 KB are blocked. This limit is increased to 1 GB and can be adjusted with the Terraform variable waf_body_size_restriction if needed.
Body injection¶
By default, all requests whose body data looks like an attempt to inject malicious content are blocked. However, this also blocks a number of valid requests, primarily to registry endpoints. To mitigate this, requests which are flagged by these rules but contain a _gitlab_session cookie, a job-token header, or an authorization header are allowed to continue.
URI path restriction¶
By default, URI paths which contain one of a number of "forbidden" extensions are blocked. This also blocks, for example, requests to any file named ".config" opened through GitLab's Web IDE. To mitigate this, requests which are flagged by this rule but contain a _gitlab_session cookie are allowed to continue.
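The override described above can be pictured as a label-based rule. The following is a minimal sketch and not necessarily how GET implements it. It assumes that the managed rule has been switched to "count" so that it only attaches its label, that the label key is the one AWS documents for the core rule set's RestrictedExtensions_URIPATH rule, and that the resource names and capacity value are hypothetical:

```hcl
# Hypothetical sketch: block requests that carry the managed rule's label
# unless they also send a _gitlab_session cookie.
resource "aws_wafv2_rule_group" "uri_path_override_sketch" {
  name     = "uri-path-override-sketch"
  scope    = "REGIONAL"
  capacity = 50 # assumed generous upper bound, not a calculated value

  visibility_config {
    cloudwatch_metrics_enabled = false
    metric_name                = "uri-path-override-sketch"
    sampled_requests_enabled   = false
  }

  rule {
    name     = "block-restricted-extensions-without-session"
    priority = 1

    action {
      block {}
    }

    statement {
      and_statement {
        # The request was flagged by the managed rule (label key is an assumption).
        statement {
          label_match_statement {
            scope = "LABEL"
            key   = "awswaf:managed:aws:core-rule-set:RestrictedExtensions_URIPath"
          }
        }

        # ...and the Cookie header does not contain a _gitlab_session cookie.
        statement {
          not_statement {
            statement {
              byte_match_statement {
                search_string         = "_gitlab_session="
                positional_constraint = "CONTAINS"

                field_to_match {
                  single_header {
                    name = "cookie"
                  }
                }

                text_transformation {
                  priority = 0
                  type     = "NONE"
                }
              }
            }
          }
        }
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name                = "block-restricted-extensions-without-session"
      sampled_requests_enabled   = false
    }
  }
}
```

A label match can only see labels attached by rules that were evaluated earlier, so a rule like this has to run after the managed rule group it refers to.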
Additional rules/overrides from solutions¶
You can edit parts of the WAF rule chain from within your solution. This can be useful to override specific rules, or to add additional rules, such as complex IP allowlisting.
To do so, create the relevant WAF rule groups within your solution and pass their ARNs to GET: waf_custom_rule_group_pre_arn adds a rule group at the start of the rule chain, and waf_custom_rule_group_post_arn adds one at the end.
A small example that injects a new WAF rule at the top of the chain to block all requests not coming from a certain set of IP addresses:
resource "aws_wafv2_ip_set" "ipset" {
  name               = "${var.prefix}-ipset"
  scope              = "REGIONAL"
  ip_address_version = "IPV4"
  addresses = [
    "172.168.16.0/22",
    "8.8.0.0/20",
    "1.2.3.4/32",
  ]
}

resource "aws_wafv2_rule_group" "rule_group_pre" {
  name     = "${var.prefix}-waf-rules-pre"
  capacity = 1
  scope    = "REGIONAL"

  visibility_config {
    cloudwatch_metrics_enabled = false
    metric_name                = "${var.prefix}-rules-pre"
    sampled_requests_enabled   = false
  }

  rule {
    name     = "${var.prefix}-waf-iplist"
    priority = 1

    action {
      block {}
    }

    statement {
      not_statement {
        statement {
          ip_set_reference_statement {
            arn = aws_wafv2_ip_set.ipset.arn
          }
        }
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "${var.prefix}-waf-iplist"
      sampled_requests_enabled   = false
    }
  }
}

module "gitlab_cluster" {
  # ...
  waf_custom_rule_group_pre_arn = aws_wafv2_rule_group.rule_group_pre.arn
}
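A rule group for the end of the chain follows the same pattern. As an illustration only, the sketch below adds a simple per-IP rate limit via waf_custom_rule_group_post_arn. The resource name, the rate limit, and the capacity value are assumptions chosen for the example, not values prescribed by GET:

```hcl
# Hypothetical example: block clients that exceed a request rate, evaluated
# after all other rules in the chain.
resource "aws_wafv2_rule_group" "rule_group_post" {
  name     = "${var.prefix}-waf-rules-post"
  capacity = 10 # assumed to be sufficient for a single rate-based rule
  scope    = "REGIONAL"

  visibility_config {
    cloudwatch_metrics_enabled = false
    metric_name                = "${var.prefix}-rules-post"
    sampled_requests_enabled   = false
  }

  rule {
    name     = "${var.prefix}-waf-ratelimit"
    priority = 1

    action {
      block {}
    }

    statement {
      rate_based_statement {
        limit              = 1000 # illustrative request limit per source IP
        aggregate_key_type = "IP"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "${var.prefix}-waf-ratelimit"
      sampled_requests_enabled   = false
    }
  }
}

module "gitlab_cluster" {
  # ...
  waf_custom_rule_group_post_arn = aws_wafv2_rule_group.rule_group_post.arn
}
```

Because rules at the end of the chain are only reached by requests that no earlier rule has terminated, this position suits catch-all restrictions rather than exceptions to the managed rules.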
Blocked request logging¶
By default, AWS WAF logs to CloudWatch, in a log group named aws-waf-logs-{prefix}.
Only blocked requests are stored, and the logs are retained for 14 days by default.
The primary purpose of logging WAF blocks is to debug the WAF rules, since the logs state which rule was triggered and with what values.
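To make that debugging easier, a saved CloudWatch Logs Insights query can be kept next to the rest of the solution. This is a minimal sketch: the resource name and query are illustrative, the log group name assumes the default naming mentioned above, and the field names are those found in standard AWS WAF log records:

```hcl
# Hypothetical saved query listing the most recent blocked requests and the
# rule that terminated each of them.
resource "aws_cloudwatch_query_definition" "waf_recent_blocks" {
  name            = "${var.prefix}-waf-recent-blocks"
  log_group_names = ["aws-waf-logs-${var.prefix}"] # assumes the default log group name

  query_string = <<-EOT
    fields @timestamp, terminatingRuleId, httpRequest.clientIp, httpRequest.uri
    | filter action = "BLOCK"
    | sort @timestamp desc
    | limit 50
  EOT
}
```

When a managed rule group blocked the request, terminatingRuleId names the rule group. The individual rule inside it is listed under the ruleGroupList field of the same log record.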
Configure logging to S3¶
If desired, you can switch the logging destination to S3 instead. This is primarily useful if the customer wants to retain a full WAF log under their control. You must manage the S3 bucket yourself in your solution.
The S3 bucket needs a policy attached that allows AWS WAF to upload files to it. GET generates the appropriate statements for you and outputs them in waf_logs_s3_bucket_policy_statements. Use waf_logs_s3_bucket_arn to set the ARN of the bucket to log to. A small example can be found below:
module "waf_logs_bucket" {
  source  = "git.glhd.nl/glh/bucket/aws"
  version = "~> 1.0"

  name                                  = "aws-waf-logs-${var.prefix}"
  delete_noncurrent_versions_after_days = local.retention_in_days
  delete_current_versions_after_days    = local.retention_in_days
  sse_kms_key_arn                       = aws_kms_key.default_kms_key.arn
}

resource "aws_s3_bucket_policy" "allow_waf_access" {
  bucket = module.waf_logs_bucket.name
  policy = jsonencode({
    Version   = "2012-10-17"
    Statement = module.gitlab_cluster.waf_logs_s3_bucket_policy_statements
  })
}

module "gitlab_cluster" {
  # ...
  waf_logs_s3_bucket_arn = module.waf_logs_bucket.arn
}
This specific example also avoids a chicken-and-egg problem: Terraform creates the bucket first, then passes the resulting ARN to GET, and finally uses the policy statements returned by GET to actually allow access to the S3 bucket.