FSL CI infrastructure
FSL conda packages are automatically built and published using GitLab CI pipeline rules implemented in the fsl/conda/fsl-ci-rules
repository. FSL releases are built and published using CI rules implemented in the fsl/conda/manifest-rules
repository. This page describes the infrastructure on which these jobs are executed.
GitLab runners
macOS jobs are executed on physical, manually managed Apple hardware. Linux and platform-agnostic jobs are executed on automatically managed AWS infrastructure.
All GitLab CI jobs are executed on the following infrastructure:
- Linux and platform-agnostic jobs are executed on AWS EC2 infrastructure, using a GitLab runner configured for auto-scaling.
- Intel macOS jobs are executed on GitLab runners installed on old MacBooks, managed by Paul.
- M1 macOS jobs are executed on a GitLab runner installed on a Mac mini, managed by Duncan.
- Package publishing jobs are executed on the channel server, using a GitLab runner managed by Duncan.
These runners and any newly added runners MUST have the following tags:
- Intel macOS runners have the tags fsl-ci,macOS-64
- Apple Silicon macOS runners have the tags fsl-ci,macOS-M1
- Linux x86/64 runners have the tags fsl-ci,linux,linux-x64
- Linux aarch64 runners have the tags fsl-ci,linux,linux-aarch64
- The package publishing runner has the tag fslconda-channel-host
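The tag requirements above can be written down as data and checked mechanically. The following is a minimal sketch, not part of fsl-ci-rules: the runner-kind labels (`macos-intel`, `linux-x64`, and so on) and both function names are hypothetical, and only the tag strings are taken from the list above.

```python
# Required tags per runner kind. The keys are hypothetical labels chosen for
# this sketch; the tag values are copied from the list above.
REQUIRED_TAGS = {
    "macos-intel": {"fsl-ci", "macOS-64"},
    "macos-apple-silicon": {"fsl-ci", "macOS-M1"},
    "linux-x64": {"fsl-ci", "linux", "linux-x64"},
    "linux-aarch64": {"fsl-ci", "linux", "linux-aarch64"},
    "publishing": {"fslconda-channel-host"},
}


def parse_tag_list(text):
    """Split a comma-separated tag string such as 'fsl-ci,linux' into a list."""
    return [tag.strip() for tag in text.split(",") if tag.strip()]


def missing_tags(kind, tags):
    """Return the required tags for `kind` that are absent from `tags`, sorted."""
    return sorted(REQUIRED_TAGS[kind] - set(tags))
```

For example, `missing_tags("linux-x64", parse_tag_list("fsl-ci,linux"))` reports that the `linux-x64` tag still needs to be added to the runner.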
The Linux runners are automatically configured with Terraform - see below.
More runners (with the same tags as above) can be added as needed. For macOS runners, it is assumed that:
- git and git-lfs are installed.
- A conda environment is installed at ~/micromamba/ (for whichever user the gitlab-runner is running under), which has conda-build and the fsl-ci-rules dependencies installed.
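A new macOS runner host could be checked against these assumptions with a short script. This is a rough sketch only: the function name is hypothetical, it checks only that git and git-lfs are on the PATH and that the ~/micromamba/ directory exists, and it does not verify that conda-build or the fsl-ci-rules dependencies are installed in that environment.

```python
import shutil
from pathlib import Path


def check_macos_runner(home=None, which=shutil.which):
    """Return a list of problems found on a prospective macOS runner host.

    `home` defaults to the home directory of the current user, which should
    be the user that gitlab-runner runs under. `which` is injectable so the
    check can be exercised without touching the real PATH.
    """
    home = Path(home) if home is not None else Path.home()
    problems = []
    for tool in ("git", "git-lfs"):
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    env_dir = home / "micromamba"
    if not env_dir.is_dir():
        problems.append(f"no conda environment directory at {env_dir}")
    return problems
```

An empty list means the basic assumptions hold; anything else lists what is missing.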
When new Linux/platform-agnostic jobs are scheduled, the auto-scaling runner uses docker+machine to create one or more EC2 instances (provisioned with docker), and dispatches the jobs to those instances. When there are no more jobs to execute, the EC2 instances are destroyed.
Docker images used for the jobs are hosted on the Amazon Elastic Container Registry, at https://gallery.ecr.aws/fsl/. These images are built from Dockerfiles, using CI jobs, in the fsl/conda/fsl-ci-rules repository.
Jobs submitted to the Linux runners use the fsldevelopment/fsl-almalinux-64 Docker image. This is a multi-platform image, so it can be used to run platform-agnostic jobs in addition to arm64/amd64 jobs.
CUDA projects are built using the fsldevelopment/fsl-almalinux-64-cuda-11.0 Docker image (also a multi-platform image).
AWS infrastructure
Linux and platform-agnostic CI jobs are executed on AWS infrastructure, which is defined and managed using a Terraform configuration located in the fsl/conda/fsl-ci-rules repository. All of the infrastructure runs within the eu-west-2 region.
An AWS IAM user account called fsldevelopment has been created specifically for use within this system. This account has the following permissions:
- AmazonS3FullAccess, for the GitLab runner S3 cache
- AmazonEC2FullAccess, for creating/managing EC2 instances
- AmazonEC2ContainerRegistryFullAccess, for pulling from / pushing to private ECR repositories (not presently used)
- AmazonElasticContainerRegistryPublicFullAccess, for pushing to public ECR repositories
An S3 bucket called fsldevelopment-bucket is used for storing Terraform state, and is also used by the GitLab runners.
Follow these steps whenever the infrastructure needs to be re-configured:
1. Make sure you have Terraform installed on your local machine.
2. Register two new GitLab runners on the fsl/ group via the GitLab web UI, giving them these tags:
   - fsl-ci,linux,linux-x64
   - fsl-ci,linux,linux-aarch64

   Make a note of the runner registration tokens.
3. Delete the old GitLab runners with the same tags via the GitLab web UI.
4. Set environment variables containing the AWS credentials for the fsldevelopment IAM user.
5. Clone the fsl-ci-rules repository, and change into the terraform directory.
6. Run terraform init (edit backend.config if you need to change the AWS region or S3 bucket name).
7. Generate an SSH key pair, for example with ssh-keygen. This will be used to connect to the new runner manager instance.
8. Fill in the necessary values in configuration.tfvars:
   - The GitLab runner registration tokens from step 2 above.
   - Paths to your private and public SSH key files.
   - AMI IDs if you need to change the host OSes used for running jobs.
9. Destroy the existing infrastructure using terraform destroy.
10. Re-create the infrastructure using terraform apply.
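The command-line part of this procedure can be sketched as a small helper that assembles the commands without running them. Everything here is an assumption rather than the documented procedure: the function names and the key path are hypothetical, the use of `-backend-config` and `-var-file` presumes that backend.config and configuration.tfvars are passed to Terraform in that way, the key type and empty passphrase are illustrative choices, and `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` are the standard AWS credential variables, which may not be the only ones needed.

```python
import shlex


def missing_aws_credentials(env):
    """Return the standard AWS credential variables that are unset in `env`."""
    names = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    return [name for name in names if not env.get(name)]


def reconfigure_commands(var_file="configuration.tfvars",
                         backend_config="backend.config",
                         key_path="runner_manager_key"):
    """Build (but do not run) the commands, in the order of the steps above.

    `key_path` is a hypothetical file name for the new SSH key pair.
    """
    return [
        ["terraform", "init", f"-backend-config={backend_config}"],
        # Assumed key type and empty passphrase; adjust to local policy.
        ["ssh-keygen", "-t", "ed25519", "-f", key_path, "-N", ""],
        ["terraform", "destroy", f"-var-file={var_file}"],
        ["terraform", "apply", f"-var-file={var_file}"],
    ]


def as_shell(command):
    """Render one command as a shell-quoted string for review."""
    return " ".join(shlex.quote(part) for part in command)
```

Printing `as_shell(c)` for each entry of `reconfigure_commands()` gives a list that can be reviewed, and edited to match the repository's actual instructions, before anything is destroyed.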
Channel host GitLab runner
In order to facilitate automatic deployment of built conda packages:
- The GitLab CI runner which is used to run the deployment jobs must be running on a server which has access to the conda channel directories.
- The runner must have the fslconda-channel-host tag.
- In the .gitlab-ci.yml file of the fsl/conda/fsl-ci-rules repository, the following variables must be set, denoting the URLs, and locally accessible directories, of the conda channels:
  - FSLCONDA_PUBLIC_CHANNEL_URL: https:// URL of the public channel
  - FSLCONDA_DEVELOPMENT_CHANNEL_URL: https:// URL of the development channel
  - FSLCONDA_INTERNAL_CHANNEL_URL: https:// URL of the internal channel
  - FSLCONDA_PUBLIC_CHANNEL_DIRECTORY: Locally accessible directory containing the public channel
  - FSLCONDA_DEVELOPMENT_CHANNEL_DIRECTORY: Locally accessible directory containing the development channel
  - FSLCONDA_INTERNAL_CHANNEL_DIRECTORY: Locally accessible directory containing the internal channel
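On the channel host, these six variables could be sanity-checked before a deployment job runs. The sketch below is illustrative only: the function name is hypothetical, it assumes the variables are visible to the job as environment variables, and it treats "starts with https://" as the only URL requirement, which is an interpretation of the descriptions above rather than a documented rule.

```python
from pathlib import Path

CHANNELS = ("PUBLIC", "DEVELOPMENT", "INTERNAL")


def check_channel_variables(env, dir_exists=lambda p: Path(p).is_dir()):
    """Return a list of problems with the FSLCONDA_*_CHANNEL_* variables.

    `env` is a mapping such as os.environ. `dir_exists` is injectable so the
    check can be exercised without real channel directories.
    """
    problems = []
    for channel in CHANNELS:
        url_name = f"FSLCONDA_{channel}_CHANNEL_URL"
        dir_name = f"FSLCONDA_{channel}_CHANNEL_DIRECTORY"
        url = env.get(url_name, "")
        directory = env.get(dir_name, "")
        if not url.startswith("https://"):
            problems.append(f"{url_name} is not an https:// URL")
        if not directory:
            problems.append(f"{dir_name} is not set")
        elif not dir_exists(directory):
            problems.append(f"{dir_name} does not exist locally: {directory}")
    return problems
```

Calling `check_channel_variables(os.environ)` from a job on the fslconda-channel-host runner would return an empty list when all six variables look usable.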