CI link checker (#494)

* CI link checker

* fix link errors

* fix ci

* wildcard on *.md
Ludovico Magnocavallo 2022-02-04 13:28:07 +01:00 committed by GitHub
parent 0cef15301b
commit 02e8a3927d
GPG Key ID: 4AEE18F83AFDEB23
5 changed files with 93 additions and 7 deletions

@@ -57,6 +57,11 @@ jobs:
        run: |
          python3 tools/check_documentation.py examples modules fast
      - name: Check documentation links (fabric)
        id: documentation-links-fabric
        run: |
          python3 tools/check_links.py .
  # markdown-link-check:
  # runs-on: ubuntu-latest
  # steps:

@@ -144,7 +144,7 @@ This configuration is battle-tested, and flexible enough to lend itself to simpl
## How to run this stage
This stage is meant to be executed after the [resman](../01-resman) stage has run. It leverages the automation service account and the storage bucket created there, and additional resources configured in the [bootstrap](../00-boostrap) stage.
This stage is meant to be executed after the [resman](../01-resman) stage has run. It leverages the automation service account and the storage bucket created there, and additional resources configured in the [bootstrap](../00-bootstrap) stage.
It's possible to run this stage in isolation, but that's outside of the scope of this document. Please, refer to the previous stages for the environment requirements.
@@ -152,7 +152,7 @@ Before running this stage, you need to make sure you have the correct credential
### Providers configuration
The default way of making sure you have the right permissions, is to use the identity of the service account pre-created for this stage, during the [resource management](./01-resman) stage, and that you are a member of the group that can impersonate it via provider-level configuration (`gcp-devops` or `organization-admins`).
The default way of making sure you have the right permissions, is to use the identity of the service account pre-created for this stage, during the [resource management](../01-resman) stage, and that you are a member of the group that can impersonate it via provider-level configuration (`gcp-devops` or `organization-admins`).
To simplify the setup, the previous stage pre-configures a valid providers file in its output and optionally writes it to a local file if the `outputs_location` variable is set to a valid path.
@@ -289,7 +289,7 @@ Variables managing L7 Internal Load Balancers (`l7ilb_subnets`) and Private Serv
VPC network peering connectivity to the `trusted landing VPC` is managed by the `vpc-peering-*.tf` files.
Copy `vpc-peering-prod.tf` to `vpc-peering-staging.tf` and replace "prod" with "staging", where relevant.
Configure the NVAs deployed or update the sample NVA config files ([ew1](data/nva-startup-script-ew1.tftpl) and [ew4](data/nva-startup-script-ew1.tftpl)), thus making sure they support the new subnets.
Configure the NVAs deployed or update the sample [NVA config file](data/nva-startup-script.tftpl) making sure they support the new subnets.
DNS configurations are managed in the `dns-*.tf` files.
Copy the `dns-prod.tf` to `dns-staging.tf` and replace within the files "prod" with "staging", where relevant.

@@ -121,7 +121,7 @@ This configuration is battle-tested, and flexible enough to lend itself to simpl
## How to run this stage
This stage is meant to be executed after the [resman](../01-resman) stage has run, as it leverages the automation service account and bucket created there, and additional resources configured in the [bootstrap](../00-boostrap) stage.
This stage is meant to be executed after the [resman](../01-resman) stage has run, as it leverages the automation service account and bucket created there, and additional resources configured in the [bootstrap](../00-bootstrap) stage.
It's of course possible to run this stage in isolation, but that's outside the scope of this document, and you would need to refer to the code for the previous stages for the environmental requirements.
@@ -129,7 +129,7 @@ Before running this stage, you need to make sure you have the correct credential
### Providers configuration
The default way of making sure you have the right permissions, is to use the identity of the service account pre-created for this stage during the [resource management](./01-resman) stage, and that you are a member of the group that can impersonate it via provider-level configuration (`gcp-devops` or `organization-admins`).
The default way of making sure you have the right permissions, is to use the identity of the service account pre-created for this stage during the [resource management](../01-resman) stage, and that you are a member of the group that can impersonate it via provider-level configuration (`gcp-devops` or `organization-admins`).
To simplify setup, the previous stage pre-configures a valid providers file in its output, and optionally writes it to a local file if the `outputs_location` variable is set to a valid path.
@@ -209,7 +209,7 @@ To add a new firewall rule, create a new file or edit an existing one in the `da
### DNS architecture
The DNS ([`dns`](https://github.com/terraform-google-modules/cloud-foundation-fabric/tree/master/modules/dns)) infrastructure is defined in [`dns.tf`](dns.tf).
The DNS ([`dns`](https://github.com/terraform-google-modules/cloud-foundation-fabric/tree/master/modules/dns)) infrastructure is defined in the respective `vpc-xxx.tf` files.
Cloud DNS manages onprem forwarding, the main GCP zone (in this example `gcp.example.com`) and is peered to environment-specific zones (i.e. `dev.gcp.example.com` and `prod.gcp.example.com`).
@@ -226,7 +226,7 @@ DNS queries sent to the on-premises infrastructure come from the `35.199.192.0/1
#### On-prem to cloud
The [Inbound DNS Policy](https://cloud.google.com/dns/docs/server-policies-overview#dns-server-policy-in) defined in module `landing-vpc` ([`landing.tf`](./landing.tf)) automatically reserves the first available IP address on each created subnet (typically the third one in a CIDR) to expose the Cloud DNS service so that it can be consumed from outside of GCP.
The [Inbound DNS Policy](https://cloud.google.com/dns/docs/server-policies-overview#dns-server-policy-in) defined in module `landing-vpc` ([`landing.tf`](./vpc-landing.tf)) automatically reserves the first available IP address on each created subnet (typically the third one in a CIDR) to expose the Cloud DNS service so that it can be consumed from outside of GCP.
### Private Google Access

@@ -1,2 +1,3 @@
click
marko
yamale

tools/check_links.py Executable file
@@ -0,0 +1,80 @@
#!/usr/bin/env python3
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
'''Recursively check link destination validity in Markdown files.

This tool recursively checks that local links in Markdown files point to valid
destinations. Its main use is in CI pipelines triggered by pull requests.
'''

import collections
import pathlib
import urllib.parse

import click
import marko

BASEDIR = pathlib.Path(__file__).resolve().parents[1]
DOC = collections.namedtuple('DOC', 'path relpath links')
LINK = collections.namedtuple('LINK', 'dest valid')


def check_docs(dir_name):
  'Traverse dir_name and check links in Markdown files.'
  dir_path = BASEDIR / dir_name
  for readme_path in sorted(dir_path.glob('**/*.md')):
    if '.terraform' in str(readme_path) or '.pytest' in str(readme_path):
      continue
    links = []
    for el in marko.parser.Parser().parse(readme_path.read_text()).children:
      if not isinstance(el, marko.block.Paragraph):
        continue
      for subel in el.children:
        if not isinstance(subel, marko.inline.Link):
          continue
        link_valid = None
        url = urllib.parse.urlparse(subel.dest)
        if url.scheme:
          link_valid = True
        else:
          link_valid = (readme_path.parent / url.path).exists()
        links.append(LINK(subel.dest, link_valid))
    yield DOC(readme_path, str(readme_path.relative_to(dir_path)), links)


@click.command()
@click.argument('dirs', type=str, nargs=-1)
def main(dirs):
  'Check links in Markdown files contained in dirs.'
  errors = 0
  for dir_name in dirs:
    print(f'----- {dir_name} -----')
    for doc in check_docs(dir_name):
      state = '✓' if all(l.valid for l in doc.links) else '✗'
      print(f'[{state}] {doc.relpath} ({len(doc.links)})')
      if state == '✗':
        errors += 1
        for l in doc.links:
          if not l.valid:
            print(f'  {l.dest}')
  if errors:
    raise SystemExit('Errors found.')


if __name__ == '__main__':
  main()
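
As a rough illustration of the per-link check in `check_docs`, the sketch below applies the same rule using only the standard library: a destination with a URL scheme is accepted without being fetched, and anything else is resolved relative to the directory of the Markdown file. The helper name `is_valid_link` and the temporary directory layout are hypothetical and not part of this commit; the sample destinations mirror the links fixed in the README hunks above.

```python
import pathlib
import tempfile
import urllib.parse


def is_valid_link(md_path, dest):
  'Return True if dest has a URL scheme or resolves to an existing local path.'
  url = urllib.parse.urlparse(dest)
  if url.scheme:
    # Remote links (https:, mailto:, ...) are accepted as-is, never fetched.
    return True
  # urlparse splits off any #fragment, so only the path part is checked.
  return (md_path.parent / url.path).exists()


# Hypothetical layout: two stage folders, with a README in the second one.
with tempfile.TemporaryDirectory() as tmp:
  root = pathlib.Path(tmp)
  (root / '00-bootstrap').mkdir()
  (root / '02-networking').mkdir()
  readme = root / '02-networking' / 'README.md'
  readme.write_text('[bootstrap](../00-bootstrap)')
  results = {
      dest: is_valid_link(readme, dest)
      for dest in ('../00-bootstrap', '../00-boostrap', './01-resman',
                   'https://cloud.google.com/dns/docs')
  }

for dest, valid in results.items():
  print(f'{dest}: {valid}')
```

Under these assumptions the misspelled `../00-boostrap` and the wrongly rooted `./01-resman` are reported as invalid, which is the class of error this commit corrects. Because links with a scheme are never fetched, a dead external URL would still pass.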