Genai-factory is a collection of end-to-end blueprints for deploying generative AI infrastructure on GCP, following security best practices.
- Embraces IaC best practices. Infrastructure is implemented in Terraform, leveraging Terraform resources and Cloud Foundation Fabric modules.
- Natively leverages shared VPCs.
- Follows the least-privilege principle: no default service accounts, no primitive roles, minimal permissions.
- Compatible with Cloud Foundation Fabric FAST project-factory and application templates.
Works with Cloud Foundation Fabric v56.1.0 and later. Compatibility with the master branch is not guaranteed.
- Agent Engine - An Agent Engine instance that privately accesses your VPC resources and reaches the Internet via an SWP HTTP proxy, running ADK or ADK with A2A support.
- Single Cloud Run - A secure Cloud Run deployment to interact with Gemini, or to run an ADK agent, an ADK agent exposed via A2A, a self-hosted Gemma 3 model with Nvidia L4 GPUs, or a sample MCP server.
- Natural Language to SQL (NL2SQL) - A secure agent on Cloud Run that lets users query data in BigQuery using natural language.
- RAG with Cloud Run and CloudSQL - A "Retrieval-Augmented Generation" (RAG) system leveraging Cloud Run, Cloud SQL and BigQuery.
- RAG with Cloud Run and AlloyDB - A "Retrieval-Augmented Generation" (RAG) system leveraging Cloud Run, AlloyDB and BigQuery.
- RAG with Cloud Run and Vector Search - A "Retrieval-Augmented Generation" (RAG) system leveraging Cloud Run and Vector Search.
- AI Application search (Vertex AI Search) - An AI-based search engine, configured to search content from a connected data store, indexing web pages from public websites.
- GECX Agent Studio - A Gemini Enterprise - CX Agent Studio application connected to an unstructured data store backed by a GCS bucket.
- GECX Dialogflow - A chat engine based on Dialogflow CX, backed by two data stores, reading CSV and JSON data from a GCS bucket.
These sample infrastructure deployments and applications can be extended further and used to ship your own application code.
The quickstart assumes you have permissions to create and manage projects and to link them to the billing account.
# Enter your preferred factory, for example cloud-run-single
cd cloud-run-single
# Create the project, service accounts, and grant permissions.
cd 0-prereqs
cp terraform.tfvars.sample terraform.tfvars # Replace prefix, billing account and parent.
terraform init
terraform apply
cd ..
# Deploy the platform services.
cd 1-apps
cp terraform.tfvars.sample terraform.tfvars # Customize.
terraform init
terraform apply
# Deploy the application and follow the commands in the output.
Each factory contains two stages:
The 0-prereqs stage creates projects and service accounts, enables APIs, grants IAM roles and creates the networking stack (shared VPC, subnets, DNS zones, firewall policies, ...) using Fabric FAST project application templates.
The stage also creates some components in the service project to allow Terraform in 1-apps to run: a service account dedicated to Terraform (with least-privilege roles) and a GCS bucket to store the Terraform state. Finally, the stage creates providers.tf and terraform.auto.tfvars files in the 1-apps folder so that it is ready to run.
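Before running 1-apps, it can help to confirm that the generated files are in place. The following is a minimal sketch, not part of the repository: `stage_ready` is a hypothetical helper name, and it only checks for the two file names mentioned above.

```shell
# Hypothetical helper (not shipped with genai-factory): succeeds only if
# the files that 0-prereqs generates exist in the given 1-apps directory.
stage_ready() {
  apps_dir="${1:-1-apps}"   # defaults to ./1-apps when no argument is given
  for f in providers.tf terraform.auto.tfvars; do
    if [ ! -f "$apps_dir/$f" ]; then
      echo "missing: $apps_dir/$f" >&2
      return 1
    fi
  done
  echo "ready: $apps_dir"
}
```

From inside a factory folder, `stage_ready 1-apps && (cd 1-apps && terraform init)` would then run `terraform init` only when both files are present.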
Running this stage is optional. If you can create projects and manage shared VPCs, use it. Alternatively, give the YAML project templates to your infrastructure team. They can use them with their FAST project factory, or derive the requirements and implement them with their own mechanism.
The 1-apps stage deploys the core platform resources within the project and the generative AI sample application on top.
If you created the project outside genai-factory (instead of using 0-prereqs), make sure to provide the 1-apps stage with the projects, APIs, service accounts, roles and networking components it requires. This information is passed to 1-apps via the terraform.auto.tfvars files that 0-prereqs creates automatically when it runs.
By default, the 0-prereqs stages create the networking components that the generative AI applications need. These include shared VPCs, subnets, routes, firewall policies, DNS zones, Private Google Access (PGA), and more.
You can also create the service project through 0-prereqs while reusing existing networking infrastructure. To do so, set var.networking_config.create = false in every 0-prereqs stage and pass the details of your existing infrastructure.
You can find more details in the 0-prereqs stage of each factory.
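As a rough illustration of the networking override described above, the sketch below appends a `networking_config` block to a stage's tfvars file. Only `networking_config.create` comes from this document; `disable_network_creation` is a hypothetical helper name, and the attributes describing your existing infrastructure are deliberately left as a placeholder because they depend on each stage's variable definitions.

```shell
# Hypothetical helper: append a networking override to a tfvars file.
# Assumes the file does not already define networking_config; a second
# definition of the same variable would make Terraform fail.
disable_network_creation() {
  tfvars_file="$1"
  cat >> "$tfvars_file" <<'EOF'
networking_config = {
  create = false
  # Add the attributes describing your existing network here,
  # as defined by the 0-prereqs stage of the factory you are using.
}
EOF
}
```

For example, `disable_network_creation 0-prereqs/terraform.tfvars` could be run from a factory folder before `terraform apply`. The placeholder comment then needs to be replaced by hand with the details of your existing infrastructure.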
Thanks to the Cloud Foundation Fabric community for ideas, inputs and useful tools.
Contributions are welcome! You can follow the guidelines in the Contributing section.