Cut GCP Costs on Autopilot: Meet the CloudOn GCP FinOps Guardian
Overview
The CloudOn GCP FinOps Guardian is a serverless solution that harnesses Google Cloud’s Recommender API to pinpoint cost savings and optimization opportunities across your entire GCP organization. It automatically detects idle resources, surfaces right-sizing recommendations, and delivers actionable insights directly to your Slack channel — all without any manual effort.
What’s Covered
Idle Resources
- Idle Compute Engine VM instances
- Idle persistent disks
- Idle Cloud SQL instances
- Unattached static IP addresses
Right-Sizing
- Overprovisioned VM instances
- Overprovisioned Cloud SQL instances
Cost Optimization
- Committed use discount recommendations
- Cloud Storage lifecycle policy suggestions
Architecture
The solution is built on fully managed GCP services and deployed entirely via Terraform.
| Component | Details |
|---|---|
| Cloud Function | Python 3.9 runtime, 512 MB memory, 300 s timeout, Pub/Sub trigger |
| Cloud Scheduler | Configurable cron schedule (default: daily) |
| Pub/Sub | Decouples scheduler from function invocation |
| Cloud Storage | Stores the function source archive |
| Service Account | Least-privilege identity for the function |
High-Level Flow
- Cloud Scheduler fires on the configured cron schedule and publishes a message to the Pub/Sub topic.
- The Pub/Sub trigger invokes the Cloud Function.
- The function iterates over every project in the GCP organization and calls the Recommender API for each supported recommender type.
- Findings are aggregated, formatted, and posted to the configured Slack webhook URL.
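The flow above can be sketched in a few lines of Python. This is a minimal illustration, not the Guardian's actual code: `decode_pubsub_message`, `run_scan`, and the injected `fetch` and `notify` callables are hypothetical names. They stand in for the Recommender API call and the Slack post so the orchestration logic can be read (and tested) on its own.

```python
import base64
from typing import Callable, Iterable, List


def decode_pubsub_message(event: dict) -> str:
    """Return the UTF-8 payload of a Pub/Sub-triggered background event."""
    data = event.get("data")  # Pub/Sub delivers the payload base64-encoded
    return base64.b64decode(data).decode("utf-8") if data else ""


def run_scan(
    project_ids: Iterable[str],
    recommender_ids: Iterable[str],
    fetch: Callable[[str, str], List[dict]],
    notify: Callable[[List[dict]], None],
) -> int:
    """Collect findings for every (project, recommender) pair, then notify once."""
    recommender_ids = list(recommender_ids)  # reused for every project
    findings: List[dict] = []
    for project_id in project_ids:
        for recommender_id in recommender_ids:
            for rec in fetch(project_id, recommender_id):
                # Tag each raw recommendation with where it came from.
                findings.append(
                    {**rec, "project": project_id, "recommender": recommender_id}
                )
    if findings:  # stay silent when there is nothing to report
        notify(findings)
    return len(findings)
```

In a deployed function, an entry point such as `main(event, context)` would decode the message and call `run_scan` with real implementations of `fetch` and `notify`.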
Supported Recommenders
| Recommender Name | API Resource Name |
|---|---|
| Idle VM | google.compute.instance.IdleResourceRecommender |
| VM Right-Sizing | google.compute.instance.MachineTypeRecommender |
| Idle Persistent Disk | google.compute.disk.IdleResourceRecommender |
| Idle Cloud SQL | google.cloudsql.instance.IdleRecommender |
| Cloud SQL Right-Sizing | google.cloudsql.instance.OverprovisionedRecommender |
| Idle Static IP | google.compute.address.IdleResourceRecommender |
| Committed Use Discounts (CPU) | google.compute.commitment.UsageCommitmentRecommender |
| Committed Use Discounts (Memory) | google.compute.commitment.UsageCommitmentRecommender |
| Cloud Storage Lifecycle | google.storage.bucket.ActivityInsight |
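The Recommender API addresses each recommender through a parent path that combines a project, a location, and the API resource name from the table. The helper below is a small sketch of building that path. The function name `recommender_parent` is hypothetical, and how the Guardian chooses locations (zone, region, or `global`, depending on the resource type) is not described here, so the location is left as a caller-supplied argument.

```python
def recommender_parent(project_id: str, location: str, recommender_id: str) -> str:
    """Build the parent path used when listing recommendations."""
    parts = {
        "project_id": project_id,
        "location": location,
        "recommender_id": recommender_id,
    }
    for label, value in parts.items():
        # Reject empty segments and stray slashes that would corrupt the path.
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return (
        f"projects/{project_id}/locations/{location}"
        f"/recommenders/{recommender_id}"
    )
```

For example, `recommender_parent("my-finops-project", "us-central1-a", "google.compute.instance.IdleResourceRecommender")` yields the path for idle VM recommendations in that zone.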
Key Features
Idle Resource Detection
The Guardian scans every project for resources that are provisioned but not actively used — including VMs, disks, Cloud SQL instances, and static IP addresses — and reports the projected monthly savings for each finding.
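One way to derive a monthly savings figure is sketched below. It assumes the recommendation arrives as a JSON-style dictionary whose `primaryImpact.costProjection` holds a `cost` amount (split into `units` and `nanos`, negative when the change saves money) and a `duration` in seconds. Treat those field names, the 30-day normalization, and the function name `estimated_monthly_savings` as assumptions for illustration rather than the Guardian's confirmed logic.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def estimated_monthly_savings(recommendation: dict) -> float:
    """Return projected monthly savings as a non-negative number."""
    projection = recommendation.get("primaryImpact", {}).get("costProjection", {})
    cost = projection.get("cost", {})
    # units may arrive as a string; nanos is the fractional part in 1e-9 units.
    amount = int(cost.get("units", 0)) + int(cost.get("nanos", 0)) / 1_000_000_000
    seconds = float(str(projection.get("duration", "2592000s")).rstrip("s"))
    if seconds <= 0:
        return 0.0
    monthly_cost = amount * (SECONDS_PER_MONTH / seconds)
    # A negative projected cost is a saving; a positive one is not reported.
    return max(0.0, -monthly_cost)
```

A projection of `units: "-50"`, `nanos: -120000000` over `2592000s` works out to roughly 50.12 in the recommendation's currency per month.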
Right-Sizing
For overprovisioned VMs and Cloud SQL instances, the function surfaces the recommended machine type or tier along with the expected cost reduction so engineers can act immediately.
Cost-Saving Commitments
The function evaluates your historical usage patterns and recommends committed use discounts (CUDs) where they would reduce spend, including both CPU-based and memory-based commitments.
Slack Notifications
All recommendations are delivered to a Slack channel via an incoming webhook. Each message includes the project ID, resource name, recommender type, priority, and estimated monthly savings.
Prerequisites
Deployment Permissions
The identity deploying the Terraform configuration needs the following roles on the organization or target projects:
- roles/resourcemanager.organizationViewer
- roles/iam.serviceAccountAdmin
- roles/cloudfunctions.admin
- roles/pubsub.admin
- roles/cloudscheduler.admin
- roles/storage.admin
Service Account Roles
The Cloud Function’s service account requires the following IAM roles.
Mandatory roles:
| Role | Purpose |
|---|---|
| roles/recommender.viewer | Read Recommender API results |
| roles/resourcemanager.folderViewer | Enumerate folders in the organization |
| roles/resourcemanager.projectViewer | Enumerate projects |
| roles/browser | Browse the resource hierarchy |
Optional roles (for organization-wide scanning):
| Role | Purpose |
|---|---|
| roles/resourcemanager.organizationViewer | View organization-level metadata |
Required APIs
Enable the following APIs before deployment:
gcloud services enable \
  cloudfunctions.googleapis.com \
  cloudscheduler.googleapis.com \
  pubsub.googleapis.com \
  recommender.googleapis.com \
  cloudresourcemanager.googleapis.com \
  storage.googleapis.com
Configuration
All runtime configuration is passed to the Cloud Function as environment variables, defined in main.tf.
resource "google_cloudfunctions_function" "finops_guardian" {
  name                = "gcp-finops-guardian"
  runtime             = "python39"
  entry_point         = "main"
  available_memory_mb = 512
  timeout             = 300
  trigger_http        = false

  event_trigger {
    event_type = "google.pubsub.topic.publish"
    resource   = google_pubsub_topic.finops_trigger.id
  }

  environment_variables = {
    ORGANIZATION_ID   = var.organization_id
    SLACK_WEBHOOK_URL = var.slack_webhook_url
    PROJECT_ID        = var.project_id
  }
}
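On the Python side, the function can read these three variables at startup and fail fast if any is missing. The sketch below assumes nothing beyond the variable names in the Terraform block. The `Config` dataclass and `load_config` helper are hypothetical names for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_VARS = ("ORGANIZATION_ID", "SLACK_WEBHOOK_URL", "PROJECT_ID")


@dataclass(frozen=True)
class Config:
    organization_id: str
    slack_webhook_url: str
    project_id: str


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Read runtime settings, raising if a required variable is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Config(
        organization_id=env["ORGANIZATION_ID"],
        slack_webhook_url=env["SLACK_WEBHOOK_URL"],
        project_id=env["PROJECT_ID"],
    )
```

Accepting the environment as a parameter keeps the helper easy to exercise in tests with a plain dictionary.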
Deployment
The solution is defined as five Terraform resources:
- google_storage_bucket: stores the function archive
- google_storage_bucket_object: uploads the zipped source code
- google_pubsub_topic: message bus between scheduler and function
- google_cloud_scheduler_job: triggers the function on a cron schedule
- google_cloudfunctions_function: the core scanning function
Example Terraform Variables
organization_id = "123456789012"
project_id = "my-finops-project"
region = "us-central1"
slack_webhook_url = "https://hooks.slack.com/services/XXX/YYY/ZZZ"
scheduler_cron = "0 9 * * 1" # Every Monday at 09:00 UTC
Deploy
terraform init
terraform plan
terraform apply
Notifications
Each Slack message posted by the Guardian contains the following fields:
| Field | Description |
|---|---|
| Project | GCP project ID where the resource lives |
| Resource | Full resource name |
| Recommender | Human-readable recommender type |
| Priority | Recommendation priority (HIGH / MEDIUM / LOW) |
| Estimated Monthly Savings | Projected USD savings per month |
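A message with these fields could be assembled and sent as sketched below, using only the standard library. Slack incoming webhooks accept a JSON body with a `text` field. The finding keys (`project`, `resource`, `recommender`, `priority`, `monthly_savings_usd`) and both function names are assumptions made for this example, and the Guardian's real message layout may differ.

```python
import json
import urllib.request


def build_slack_payload(finding: dict) -> dict:
    """Format one finding as a Slack incoming-webhook payload."""
    lines = [
        f"*Project:* {finding['project']}",
        f"*Resource:* {finding['resource']}",
        f"*Recommender:* {finding['recommender']}",
        f"*Priority:* {finding['priority']}",
        f"*Estimated Monthly Savings:* ${finding['monthly_savings_usd']:,.2f}",
    ]
    return {"text": "\n".join(lines)}


def post_to_slack(webhook_url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping formatting separate from delivery means the message layout can be checked without making a network call.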
Code Structure
| File | Responsibility |
|---|---|
| main.py | Entry point; orchestrates scanning and Slack delivery |
| recommender.py | Wraps the Recommender API; handles pagination and errors |
| resource_recommender.py | Maps recommender names to resource types and formats findings |
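The pagination and error handling attributed to recommender.py might look roughly like the following. This is a sketch under assumptions: `fetch_page` is a hypothetical callable that takes a page token (or `None` for the first page) and returns a dictionary shaped like a list response, with a `recommendations` array and an optional `nextPageToken`. The actual module may use a client library that paginates for you.

```python
import logging
from typing import Callable, Iterator, List, Optional

logger = logging.getLogger(__name__)


def iter_recommendations(
    fetch_page: Callable[[Optional[str]], dict],
) -> Iterator[dict]:
    """Yield recommendations from every page until no token is returned."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("recommendations", [])
        token = page.get("nextPageToken")
        if not token:
            break


def list_recommendations_safely(
    fetch_page: Callable[[Optional[str]], dict],
) -> List[dict]:
    """Return all recommendations, or an empty list if the call fails."""
    try:
        return list(iter_recommendations(fetch_page))
    except Exception as exc:  # e.g. API disabled or permission denied
        logger.warning("Skipping recommender after error: %s", exc)
        return []
```

Note that this version discards partial results when a later page fails. That trade-off keeps one broken project from aborting the whole scan, at the cost of some findings.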
Adding New Recommenders
- Add the new recommender’s API resource name to the RECOMMENDERS list in recommender.py.
- Add a human-readable label for it in resource_recommender.py.
- Test locally against a non-production project before deploying.
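A small consistency check can catch a recommender that was added to the list without a matching label. The sketch below uses a shortened `RECOMMENDERS` list and a hypothetical `LABELS` dictionary. The real structures in recommender.py and resource_recommender.py may be organized differently.

```python
from typing import Dict, Iterable, List

# Shortened example list; entries come from the Supported Recommenders table.
RECOMMENDERS: List[str] = [
    "google.compute.instance.IdleResourceRecommender",
    "google.compute.instance.MachineTypeRecommender",
    "google.compute.disk.IdleResourceRecommender",
]

# Hypothetical label mapping (the real name and location may differ).
LABELS: Dict[str, str] = {
    "google.compute.instance.IdleResourceRecommender": "Idle VM",
    "google.compute.instance.MachineTypeRecommender": "VM Right-Sizing",
    "google.compute.disk.IdleResourceRecommender": "Idle Persistent Disk",
}


def unlabeled_recommenders(
    recommenders: Iterable[str], labels: Dict[str, str]
) -> List[str]:
    """Return recommender names that have no human-readable label, sorted."""
    return sorted({name for name in recommenders if name not in labels})


def label_for(recommender_id: str, labels: Dict[str, str] = LABELS) -> str:
    """Look up a label, falling back to the raw API resource name."""
    return labels.get(recommender_id, recommender_id)
```

A unit test asserting that `unlabeled_recommenders(RECOMMENDERS, LABELS)` is empty would fail as soon as the two files drift apart.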
Contributing
- Fork the repository and create a feature branch.
- Submit a pull request with a clear description of the change and any relevant test evidence.
- Ensure all existing tests pass and add new tests for any new recommender mappings.
Get Started
Clone the repository, fill in your terraform.tfvars, and run terraform apply. Your GCP organization will then receive FinOps recommendations automatically on the schedule you configure (weekly with the example scheduler_cron value shown above).