[DCP - Terraform] Initializes the APIs, Service Account, Cloud Run Service and Spanner Instance and DB.#15
Conversation
Summary of Changes

Hello @gmechali, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces a comprehensive set of Terraform scripts that automate the initial setup and deployment of the Data Commons Platform (DCP) in a new Google Cloud project. The changes streamline the provisioning of core GCP services, including API enablement, service account creation with appropriate permissions, Cloud Run service deployment, and Spanner instance and database setup, ensuring a consistent and repeatable infrastructure foundation for DCP.
Code Review
This pull request introduces Terraform scripts for setting up the Data Commons Platform on GCP, including enabling APIs, creating a service account, and provisioning Cloud Run and Spanner resources. The primary security concern identified is the explicit granting of public access to the Cloud Run service via the allUsers IAM binding, which bypasses standard IAM-based access controls and requires careful evaluation. Other issues include the need to scope down IAM permissions, make public access and database deletion protection configurable (especially for production environments), correct the .gitignore for reproducible builds, and avoid the use of the :latest Docker image tag.
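The configurability points raised here could be addressed with guarded variables whose defaults are the safe choice. A minimal sketch (the variable and resource names below are illustrative, not the PR's actual configuration):

```hcl
# Illustrative variables gating risky settings; defaults favor safety.
variable "allow_unauthenticated" {
  description = "Grant allUsers the run.invoker role on the Cloud Run service"
  type        = bool
  default     = false
}

variable "enable_deletion_protection" {
  description = "Protect the Spanner database from accidental deletion"
  type        = bool
  default     = true
}

# The public-access binding is created only when explicitly opted in.
resource "google_cloud_run_service_iam_member" "public_invoker" {
  count    = var.allow_unauthenticated ? 1 : 0
  service  = google_cloud_run_service.dcp.name
  location = var.region
  role     = "roles/run.invoker"
  member   = "allUsers"
}
```

With this shape, a production deployment leaves both variables at their defaults, and a demo environment opts in explicitly in its tfvars file.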
variable "image_url" {
  description = "Docker image URL to deploy"
  type        = string
  default     = "gcr.io/datcom-ci/datacommons-platform:latest"
}
The default image_url uses the :latest tag. Using mutable tags like :latest is not recommended for deployments as it can lead to unexpected code being deployed if the image is updated. This makes deployments less predictable and rollbacks harder. It is a best practice to use immutable image tags, such as a git commit SHA or a semantic version number (e.g., gcr.io/datcom-ci/datacommons-platform:v1.2.3 or gcr.io/datcom-ci/datacommons-platform:sha-a1b2c3d).
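One way to enforce this is to drop the `:latest` default entirely and reject it with a validation block. A hedged sketch (the check is illustrative and requires Terraform 1.3+ for `endswith`):

```hcl
variable "image_url" {
  description = "Docker image URL to deploy (pin to a commit SHA or version tag)"
  type        = string

  validation {
    condition     = !endswith(var.image_url, ":latest")
    error_message = "Pin image_url to an immutable tag (e.g. a commit SHA), not :latest."
  }
}
```

Without a default, `terraform plan` forces the operator to supply an image explicitly, and the validation rejects mutable tags at plan time rather than at deploy time.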
* **Terraform**: Terraform installed locally (>= 1.0.0).
* **gcloud CLI**: GCP CLI installed and authenticated.

## Setup
Do users need to clone the repo? Is there a command/script we can provide so that they could pull the required files with curl instead of having to clone? Curious to hear your general thoughts on this / what's feasible.
Like maybe we have a script that does the pulling of files. So user flow might look like:
- curl / pull "download_script"
- run "download_script" which downloads the rest of the required files
- edit variable file
- run "setup_script"
We had a conversation about where this should live. For now I'll get this checked in here, but it's an ongoing conversation about whether we will move it to a Terraform Data Commons repo.
This PR is pretty isolated, so it will be very simple to refactor. I think we will get the conversation settled shortly :)
dwnoble left a comment
Thanks Gabe! I had to make a few changes to the tf configuration to get it working on my end, but with these changes my cdc+dcp stacks are up and running.
This group of Terraform scripts controls the setup of a new Data Commons Platform deployment within a GCP project that has nothing set up.
Note that after running this in datcom-website-dev, we succeeded with each of the following resources:
After running the commands in the DCP setup (https://github.com/datacommonsorg/datacommons?tab=readme-ov-file#2-define-your-schema) with the URL changed to the new Cloud Run service, you can inspect the logs to find the requests, and inspect the DB to find the schema and nodes successfully saved!
Note: I have added more optional variables to control the Spanner DB and Cloud Run service setup but have not yet tested them. Thoughts on including those?
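For reference, optional variables of this shape could gate the Spanner and Cloud Run settings. The names and defaults below are illustrative, not the PR's actual variables:

```hcl
# Illustrative optional knobs; the PR's actual variable names may differ.
variable "spanner_processing_units" {
  description = "Capacity of the Spanner instance"
  type        = number
  default     = 100
}

variable "cloud_run_max_instances" {
  description = "Upper bound on Cloud Run autoscaling"
  type        = number
  default     = 2
}
```

Sensible defaults keep the happy path unchanged for new users, so including untested variables is low-risk as long as the defaults match the configuration that was actually verified.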
Lastly, it adds a setup script to create the GCS bucket for remote state management.
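Once that bucket exists, pointing Terraform at it is a one-block change. A sketch with a placeholder bucket name (the setup script's actual naming is not shown here):

```hcl
terraform {
  backend "gcs" {
    bucket = "my-dcp-terraform-state" # placeholder; use the bucket the setup script creates
    prefix = "dcp"
  }
}
```

Because backend blocks cannot reference variables, the bucket name must be hardcoded or passed via `terraform init -backend-config=...`.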