Observatory Terraform Environment
This is a tutorial for deploying the Observatory Platform to Google Cloud with Terraform.
You should have installed the Observatory Platform before following this tutorial.
Install dependencies
The following dependencies are required:
Packer: for automating the creation of the Google Cloud VM images.
Terraform: to automate the deployment of the various Google Cloud services.
Google Cloud SDK: the Google Cloud SDK including the gcloud command line tool.
If you installed the Observatory Platform through the installer script and selected the Terraform configuration, these dependencies were installed for you.
If you wish to install the dependencies manually, see the details below.
Linux
Install Packer:
sudo curl -L "https://releases.hashicorp.com/packer/1.9.2/packer_1.9.2_linux_amd64.zip" -o /usr/local/bin/packer
# When asked to replace, answer 'y'
sudo unzip /usr/local/bin/packer -d /usr/local/bin/
sudo chmod +x /usr/local/bin/packer
Install Google Cloud SDK:
sudo curl -L "https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-330.0.0-linux-x86_64.tar.gz" -o /usr/local/bin/google-cloud-sdk.tar.gz
sudo tar -xzvf /usr/local/bin/google-cloud-sdk.tar.gz -C /usr/local/bin
sudo rm /usr/local/bin/google-cloud-sdk.tar.gz
sudo chmod +x /usr/local/bin/google-cloud-sdk
/usr/local/bin/google-cloud-sdk/install.sh
Install Terraform:
sudo curl -L "https://releases.hashicorp.com/terraform/1.5.5/terraform_1.5.5_linux_amd64.zip" -o /usr/local/bin/terraform
# When asked to replace, answer 'y'
sudo unzip /usr/local/bin/terraform -d /usr/local/bin/
sudo chmod +x /usr/local/bin/terraform
Mac
Install Packer:
sudo curl -L "https://releases.hashicorp.com/packer/1.9.2/packer_1.9.2_darwin_amd64.zip" -o /usr/local/bin/packer
# When asked to replace, answer 'y'
sudo unzip /usr/local/bin/packer -d /usr/local/bin/
sudo chmod +x /usr/local/bin/packer
Install Google Cloud SDK:
sudo curl -L "https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-330.0.0-darwin-x86_64.tar.gz" -o /usr/local/bin/google-cloud-sdk.tar.gz
sudo tar -xzvf /usr/local/bin/google-cloud-sdk.tar.gz -C /usr/local/bin
sudo rm /usr/local/bin/google-cloud-sdk.tar.gz
sudo chmod +x /usr/local/bin/google-cloud-sdk
/usr/local/bin/google-cloud-sdk/install.sh
Install Terraform:
sudo curl -L "https://releases.hashicorp.com/terraform/1.5.5/terraform_1.5.5_darwin_amd64.zip" -o /usr/local/bin/terraform
# When asked to replace, answer 'y'
sudo unzip /usr/local/bin/terraform -d /usr/local/bin/
sudo chmod +x /usr/local/bin/terraform
Prepare Google Cloud project
Each environment (develop, staging, production) requires its own project. See Creating and managing projects for more details on creating a project. The following instructions are for one project only; repeat these steps for each environment you would like to use.
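For example, a project can be created and linked to a billing account with the gcloud command line tool. This is a minimal sketch; the project ID (reusing my-gcp-id from the config example below) and billing account ID are placeholders:
# Create a project for one environment
gcloud projects create my-gcp-id --name="Observatory Develop"
# Link a billing account so that resources can be provisioned
gcloud beta billing projects link my-gcp-id --billing-account=XXXXXX-XXXXXX-XXXXXX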
Prepare permissions for Google Cloud service account
A Google Cloud service account needs to be created and its service account key downloaded to your workstation. See the article Getting Started with Authentication for more details.
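For example, a service account and key can be created with gcloud. A minimal sketch; the account name, project ID and key path are placeholders:
# Create the service account
gcloud iam service-accounts create terraform-sa --project=my-gcp-id --display-name="Terraform service account"
# Download a service account key to your workstation
gcloud iam service-accounts keys create /path/to/google_application_credentials.json --iam-account=terraform-sa@my-gcp-id.iam.gserviceaccount.com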
Development/test project
For the development and staging environments, the following roles need to be assigned to the service account so that Terraform and Packer are able to provision the appropriate services (an example binding command is shown after the list):
BigQuery Admin
Cloud Build Service Account (API)
Cloud Run Admin (API)
Cloud SQL Admin
Compute Admin
Compute Image User
Compute Network Admin
Create Service Accounts
Delete Service Accounts
Project IAM Admin
Service Account Key Admin
Service Account User
Service Management Administrator (API)
Secret Manager Admin
Service Usage Admin
Storage Admin
Storage Transfer Admin
Serverless VPC Access Admin
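Each role can be assigned in the Google Cloud Console or with gcloud. A minimal sketch for one of the roles above, assuming the placeholder project and service account from earlier:
gcloud projects add-iam-policy-binding my-gcp-id --member="serviceAccount:terraform-sa@my-gcp-id.iam.gserviceaccount.com" --role="roles/bigquery.admin"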
Production project
For the production environment, two custom roles with limited permissions need to be created to prevent the storage buckets and the Cloud SQL database instance from accidentally being destroyed. When running terraform destroy with these roles, Terraform will produce an error, because the service account doesn’t have the required permissions to destroy these resources (buckets and SQL database instance). New roles can be created in the Google Cloud Console, under ‘IAM & Admin’ and then ‘Roles’.
The two custom roles are:
Custom Cloud SQL Editor: filter the Roles table on ‘Cloud SQL Editor’, select the role and click ‘Create role from selection’. Click ‘ADD PERMISSIONS’ and add cloudsql.users.create and cloudsql.instances.create. This new role replaces the ‘Cloud SQL Admin’ role used for the development environment above.
Custom Storage Admin: filter the Roles table on ‘Storage Admin’, select the role and click ‘Create role from selection’. In the ‘assigned permissions’ section, filter for and remove storage.buckets.delete and storage.objects.delete. This new role replaces the ‘Storage Admin’ role used for the development environment above.
The full set of roles to assign to the production service account is:
Custom Cloud SQL Editor
Custom Storage Admin
BigQuery Admin
Cloud Build Service Account (API)
Cloud Run Admin (API)
Compute Admin
Compute Image User
Compute Network Admin
Create Service Accounts
Delete Service Accounts
Project IAM Admin
Service Account Key Admin
Service Account User
Service Management Administrator (API)
Secret Manager Admin
Service Usage Admin
Storage Transfer Admin
Serverless VPC Access Admin
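As an alternative to the console steps above, the custom roles can be sketched with gcloud; the role IDs are placeholders and the project ID reuses the placeholder from earlier:
# Copy the predefined Cloud SQL Editor role and add the extra create permissions
gcloud iam roles copy --source="roles/cloudsql.editor" --destination=customCloudSqlEditor --dest-project=my-gcp-id
gcloud iam roles update customCloudSqlEditor --project=my-gcp-id --add-permissions=cloudsql.users.create,cloudsql.instances.create
# Copy the predefined Storage Admin role and remove the delete permissions
gcloud iam roles copy --source="roles/storage.admin" --destination=customStorageAdmin --dest-project=my-gcp-id
gcloud iam roles update customStorageAdmin --project=my-gcp-id --remove-permissions=storage.buckets.delete,storage.objects.delete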
Prepare Google Cloud services
Enable the Compute Engine API for the Google Cloud project. This is required for Packer to create the VM image. Other Google Cloud services are enabled by Terraform itself.
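For example, with gcloud (assuming the placeholder project from earlier):
gcloud services enable compute.googleapis.com --project=my-gcp-id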
Add user as verified domain owner
The Terraform service account needs to be added as a verified domain owner in order to map the Cloud Run domain that is created to a custom domain. The custom domain is used for the API service. See the Google documentation for more info on how to add a verified owner.
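Domain ownership is verified through the Google Search Console; gcloud provides helpers to start and check verification. A sketch, using the example custom domain from the configuration file shown later:
# Start the verification flow for the custom domain (opens the Search Console)
gcloud domains verify api.observatory.academy
# List the domains that the active account has verified
gcloud domains list-user-verified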
Switch to the branch that you would like to deploy
Enter the observatory-platform project folder:
cd observatory-platform
Switch to the branch that you would like to deploy, for example:
git checkout develop
Prepare configuration files
The Observatory Terraform configuration file needs to be created; to generate a default file, run the following command:
observatory generate config terraform
The file is saved to ~/.observatory/config-terraform.yaml. Customise the generated file: parameters marked with ‘<--’ need to be customised and parameters that are commented out are optional.
See below for an example generated file:
# The backend type: terraform
# The environment type: develop, staging or production
backend:
type: terraform
environment: develop
# Apache Airflow settings
airflow:
fernet_key: 4yfYXnxjUZSsh1CefVigTuUGcH-AUnuKC9jJ2sUq-xA= # the fernet key which is used to encrypt the secrets in the airflow database
ui_user_email: my-email@example.com <-- # the email for the Apache Airflow UI's airflow user
ui_user_password: my-password <-- # the password for the Apache Airflow UI's airflow user
# Terraform settings
terraform:
organization: my-terraform-org-name <-- # the terraform cloud organization
# Google Cloud settings
google_cloud:
project_id: my-gcp-id <-- # the Google Cloud project identifier
credentials: /path/to/google_application_credentials.json <-- # the path to the Google Cloud service account credentials
region: us-west1 <-- # the Google Cloud region where the resources will be deployed
zone: us-west1-a <-- # the Google Cloud zone where the resources will be deployed
data_location: us <-- # the location for storing data, including Google Cloud Storage buckets and Cloud SQL backups
# Google Cloud CloudSQL database settings
cloud_sql_database:
tier: db-custom-2-7680 # the machine tier to use for the Observatory Platform Cloud SQL database
backup_start_time: '23:00' # the time for Cloud SQL database backups to start in HH:MM format
postgres_password: my-password <-- # the password for the airflow postgres database user
# Settings for the main VM that runs the Apache Airflow scheduler and webserver
airflow_main_vm:
machine_type: n2-standard-2 # the machine type for the virtual machine
disk_size: 50 # the disk size for the virtual machine in GB
disk_type: pd-ssd # the disk type for the virtual machine
create: true # determines whether virtual machine is created or destroyed
# Settings for the weekly on-demand VM that runs large tasks
airflow_worker_vm:
machine_type: n1-standard-8 # the machine type for the virtual machine
disk_size: 3000 # the disk size for the virtual machine in GB
disk_type: pd-standard # the disk type for the virtual machine
create: false # determines whether virtual machine is created or destroyed
# API settings
api:
domain_name: api.observatory.academy <-- # the custom domain name for the API, used for the google cloud endpoints service
subdomain: project_id # can be either 'project_id' or 'environment', used to determine a prefix for the domain_name
# User defined Apache Airflow variables:
# airflow_variables:
# my_variable_name: my-variable-value
# User defined Apache Airflow Connections:
# airflow_connections:
# my_connection: http://my-username:my-password@
# User defined Observatory DAGs projects:
# workflows_projects:
# - package_name: observatory-dags
# path: /home/user/observatory-platform/observatory-dags
# dags_module: observatory.dags.dags
The config file is read when running observatory terraform create-workspace and observatory terraform update-workspace, and the variables are stored inside the Terraform Cloud workspace.
Fernet key
One of the required variables is a Fernet key; the generated default file includes a newly generated Fernet key that can be used right away. Alternatively, generate a Fernet key yourself with the following command:
observatory generate fernet-key
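A Fernet key can also be generated directly with the cryptography package's Fernet implementation; a one-liner sketch, assuming the cryptography package is installed:
python3 -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"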
Encoding airflow connections
Note that the login and passwords in the ‘airflow_connections’ variables need to be URL encoded, otherwise they will not be parsed correctly.
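For example, a password containing special characters can be URL encoded with Python's standard library; the password here is a placeholder:
python3 -c "from urllib.parse import quote; print(quote('my-p@ss/word', safe=''))"
# Output: my-p%40ss%2Fword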
Building the Google Compute VM image with Packer
First, build and deploy the Observatory Platform Google Compute VM image with Packer:
observatory terraform build-image ~/.observatory/config-terraform.yaml
Use this command if you have:
Created, removed or updated user defined Observatory DAGs projects via the workflows_projects field in the Observatory Terraform config file.
Updated any code in the Observatory Platform.
Updated the backend.environment variable in the Observatory Terraform config file: you need to make sure that an image is built for the other environment.
Also use this command if:
This is the first time you are deploying the Terraform resources.
You have updated any files in the API directory (/home/user/workspace/observatory-platform/observatory-platform/observatory/platform/api).
After rebuilding the image, you will need to taint the VMs and update them so that they use the new image (see Troubleshooting below).
You do not need to run this command if:
You have created, removed or updated user defined Apache Airflow connections or variables in the Observatory Terraform config file: in this case you will need to update the Terraform workspace.
You have changed any other settings in the Observatory Terraform config file (apart from backend.environment): in this case you will need to update the Terraform workspace variables and run terraform apply.
Building the Terraform files
To refresh the files that are built into the ~/.observatory/build/terraform directory, without rebuilding the entire Google Compute VM image, run the following command:
observatory terraform build-terraform ~/.observatory/config-terraform.yaml
Use this command if you have:
Updated the Terraform deployment scripts, but nothing else.
Setting up Terraform
Enter the terraform directory:
cd ~/.observatory/build/terraform/terraform
Create token and login on Terraform Cloud:
terraform login
This should automatically store the token in /home/user/.terraform.d/credentials.tfrc.json; this file is used by the next commands to retrieve the token.
It’s also possible to explicitly set the path to the credentials file using the option ‘--terraform-credentials-file’.
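For example (a sketch; the flag comes from the text above and the path is a placeholder):
observatory terraform create-workspace ~/.observatory/config-terraform.yaml --terraform-credentials-file /path/to/credentials.tfrc.json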
Creating and updating Terraform workspaces
See below for instructions on how to run observatory terraform create-workspace and update-workspace.
Create a workspace
Create a new workspace (this will use the created token file). See Observatory Terraform Environment for more info on the usage of observatory terraform.
observatory terraform create-workspace ~/.observatory/config-terraform.yaml
You should see the following output:
Observatory Terraform: all dependencies found
Config:
- path: /home/user/.observatory/config-terraform.yaml
- file valid
Terraform credentials file:
- path: /home/user/.terraform.d/credentials.tfrc.json
Terraform Cloud Workspace:
Organization: jamie-test
- Name: observatory-develop (prefix: 'observatory-' + suffix: 'develop')
- Settings:
- Auto apply: True
- Terraform Variables:
* environment: develop
* airflow: sensitive
* google_cloud: sensitive
* cloud_sql_database: sensitive
* airflow_main_vm: {"machine_type"="n2-standard-2","disk_size"=20,"disk_type"="pd-standard","create"=true}
* airflow_worker_vm: {"machine_type"="n2-standard-2","disk_size"=20,"disk_type"="pd-standard","create"=false}
* airflow_variables: {}
* airflow_connections: sensitive
Would you like to create a new workspace with these settings? [y/N]:
Creating workspace...
Successfully created workspace
Update a workspace
To update variables in an existing workspace in Terraform Cloud:
observatory terraform update-workspace ~/.observatory/config-terraform.yaml
Depending on which variables are updated, you should see output similar to this:
Config:
- path: /home/user/.observatory/config-terraform.yaml
- file valid
Terraform credentials file:
- path: /home/user/.terraform.d/credentials.tfrc.json
Terraform Cloud Workspace:
Organization: jamie-test
- Name: observatory-develop (prefix: 'observatory-' + suffix: 'develop')
- Settings:
- Auto apply: True
- Terraform Variables:
UPDATE
* airflow: sensitive -> sensitive
* google_cloud: sensitive -> sensitive
* cloud_sql_database: sensitive -> sensitive
* airflow_connections: sensitive -> sensitive
UNCHANGED
* api: {"domain_name"="api.observatory.academy","subdomain"="project_id"}
* environment: develop
* airflow_main_vm: {"machine_type"="n2-standard-2","disk_size"=20,"disk_type"="pd-standard","create"=true}
* airflow_worker_vm: {"machine_type"="n2-standard-2","disk_size"=20,"disk_type"="pd-standard","create"=false}
* airflow_variables: {}
Would you like to update the workspace with these settings? [y/N]: y
Updating workspace...
Successfully updated workspace
Deploy
Once you have created your Terraform workspace, you can deploy the system with Terraform Cloud.
Initialize Terraform using key/value pairs:
terraform init -backend-config="hostname=app.terraform.io" -backend-config="organization=coki"
Or using a backend file:
terraform init -backend-config=backend.hcl
With backend.hcl:
hostname = "app.terraform.io"
organization = "coki"
If Terraform prompts to migrate all workspaces to “remote”, answer “yes”.
Select the correct workspace in case multiple workspaces exist:
terraform workspace list
terraform workspace select <environment>
To preview the plan that will be executed with apply (optional):
terraform plan
To deploy the system with Terraform:
terraform apply
To destroy the system with Terraform:
terraform destroy
Troubleshooting
See below for instructions on troubleshooting.
Undeleting Cloud Endpoints service
If your Cloud Endpoints service is deleted by Terraform and you try to recreate it again, you will get the following error:
Error: googleapi: Error 400: Service <your endpoints service name> has been deleted and will be purged after 30 days. To reuse this service, please undelete the service following https://cloud.google.com/service-infrastructure/docs/create-services#undeleting., failedPrecondition
To restore the Cloud Endpoints service, run the following:
gcloud endpoints services undelete <name of endpoints service>
Rebuild the VMs with a new Google Cloud VM image
If you have re-built the Google Cloud VM image, then you will need to manually taint the VMs and rebuild them:
terraform taint module.airflow_main_vm.google_compute_instance.vm_instance
terraform taint module.airflow_worker_vm.google_compute_instance.vm_instance
terraform apply
Manually destroy the VMs
Run the following commands to manually destroy the VMs:
terraform destroy -target module.airflow_main_vm.google_compute_instance.vm_instance
terraform destroy -target module.airflow_worker_vm.google_compute_instance.vm_instance
Logging into the VMs
To ssh into airflow-main-vm:
gcloud compute ssh airflow-main-vm --project your-project-id --zone your-compute-zone
To ssh into airflow-worker-vm (this VM is off by default; turn it on using the Airflow DAG):
gcloud compute ssh airflow-worker-vm --project your-project-id --zone your-compute-zone
Viewing the Apache Airflow and Flower UIs
To view the Apache Airflow and Flower web user interfaces, you must forward ports 8080 and 5555 from the airflow-main-vm to your local workstation.
To port forward with the gcloud command line tool:
gcloud compute ssh airflow-main-vm --project your-project-id --zone your-compute-zone -- -L 5555:localhost:5555 -L 8080:localhost:8080
Syncing files with a VM
To sync your local Observatory Platform project with a VM run the following commands, making sure to customise the username and vm-hostname for the machine:
rsync --rsync-path 'sudo -u airflow rsync' -av -e ssh --chown=airflow:airflow --exclude='docs' --exclude='*.pyc' \
--exclude='*.tfvars' --exclude='*.tfstate*' --exclude='venv' --exclude='.terraform' --exclude='.git' \
--exclude='*.egg-info' /path/to/observatory-platform username@vm-hostname:/opt/observatory