Question 1

What does the Cloud Run Config Auditor check?

Accepted Answer

It parses a gcloud run deploy command or a Knative Service definition in YAML or JSON and grades it in three areas: security (secrets mounted from Secret Manager rather than plain env vars, a least-privilege service account, scoped ingress), cost (a max-instances bound, CPU allocation mode, the min-instances tradeoff, request timeout), and scaling (concurrency tuned to the instance's CPU and memory). Each finding explains why it matters on Cloud Run and gives the exact flag to change.
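To make the parsing step concrete, here is a minimal sketch of turning a gcloud run deploy command into a flag mapping that checks can read. The helper name parse_deploy_flags is hypothetical and this is not the auditor's actual parser; it only assumes the usual --flag=value and --flag value forms.

```python
import shlex


def parse_deploy_flags(command: str) -> dict:
    """Map each --flag in a gcloud command to its value (hypothetical helper).

    Limitation of this sketch: a boolean flag followed directly by a
    positional argument would swallow that argument as its value.
    """
    tokens = shlex.split(command)
    flags = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("--"):
            name, has_equals, value = token[2:].partition("=")
            if has_equals:
                # --flag=value form
                flags[name] = value
            elif i + 1 < len(tokens) and not tokens[i + 1].startswith("--"):
                # --flag value form
                flags[name] = tokens[i + 1]
                i += 1
            else:
                # bare boolean flag such as --no-cpu-throttling
                flags[name] = True
        i += 1
    return flags


flags = parse_deploy_flags(
    "gcloud run deploy api --image=gcr.io/p/api --max-instances 10 --no-cpu-throttling"
)
```

Positional tokens such as the service name are skipped, so only flags end up in the mapping.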

Question 2

Why should secrets use Secret Manager instead of --set-env-vars?

Accepted Answer

Environment variables set with --set-env-vars are stored in plaintext on the Cloud Run revision and are readable by anyone with the run.services.get permission. A real credential there is effectively published to everyone with view access. Mounting it from Secret Manager with --set-secrets keeps the value out of the revision spec, so the auditor fails a config when an env var name or value looks like a secret.
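A check of this kind could be sketched as a name heuristic over the KEY=VALUE pairs passed to --set-env-vars. The pattern list and the function name suspicious_env_vars are assumptions for illustration, not the auditor's real rules, and this sketch inspects names only, not values.

```python
import re

# Hypothetical list of name fragments that suggest a credential.
SECRET_NAME = re.compile(
    r"password|passwd|secret|token|api[_-]?key|private[_-]?key|credential",
    re.IGNORECASE,
)


def suspicious_env_vars(set_env_vars: str) -> list:
    """Return env var names from a KEY=VALUE,KEY=VALUE string that look like secrets."""
    names = []
    for pair in set_env_vars.split(","):
        name, _, _value = pair.partition("=")
        if SECRET_NAME.search(name):
            names.append(name.strip())
    return names


found = suspicious_env_vars("LOG_LEVEL=info,DB_PASSWORD=hunter2,STRIPE_API_KEY=sk_x")
```

Any name this returns would be a candidate for moving to --set-secrets.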

Question 3

Why does it flag a missing max-instances as high severity?

Accepted Answer

Without --max-instances, a service can scale up to the default ceiling of 100 instances under a traffic spike or a retry storm. That turns a bad afternoon into a large bill and floods your database with connections. Setting an explicit max-instances you can afford caps both the cost and the blast radius, which is why the auditor treats it as a high-severity gap.
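The blast radius is simple multiplication, which a short sketch makes visible. The pool size and the per-instance hourly rate below are made-up illustrative inputs, not Cloud Run prices, and worst_case is a hypothetical helper.

```python
def worst_case(max_instances: int, pool_size: int, cost_per_instance_hour: float):
    """Return (peak database connections, peak hourly cost) at full scale-out."""
    peak_connections = max_instances * pool_size
    peak_hourly_cost = max_instances * cost_per_instance_hour
    return peak_connections, peak_hourly_cost


# 100 instances, each holding a 10-connection pool, at an assumed 0.25 per instance-hour.
uncapped = worst_case(100, 10, 0.25)
# The same service capped with --max-instances 10.
capped = worst_case(10, 10, 0.25)
```

With these assumed inputs the uncapped service can open 1000 database connections, while the capped one stops at 100.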

Question 4

What's the difference between request-based and always-on CPU billing?

Accepted Answer

By default Cloud Run allocates CPU only while a request is being handled (request-based), which is the cheapest mode for an API and throttles background work to near zero between requests. Passing --no-cpu-throttling keeps CPU allocated for the instance's whole life so it bills continuously, which is right for websockets, streaming or background work but pure idle spend for a plain request/response service. The tool flags always-on CPU combined with warm instances as likely waste.
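One way to express that last rule, as a sketch only: treat "warm instances" as --min-instances greater than zero and flag the combination with --no-cpu-throttling. The function name and the flag-mapping shape (flag names without the leading dashes) are assumptions, not the tool's actual code.

```python
def always_on_with_warm_instances(flags: dict) -> bool:
    """True when CPU is always allocated and at least one instance is kept warm."""
    always_on = bool(flags.get("no-cpu-throttling"))
    min_instances = int(flags.get("min-instances", 0))
    return always_on and min_instances > 0


likely_waste = always_on_with_warm_instances(
    {"no-cpu-throttling": True, "min-instances": "2"}
)
```

Either setting on its own passes this check; only the combination is flagged.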

Question 5

Why does high concurrency on a small instance get a warning?

Accepted Answer

Concurrency is how many requests one instance handles at once, and they all share that instance's CPU and memory. A concurrency of 80 on a 1 vCPU, 256Mi instance oversubscribes it, so tail latency climbs and a burst can push it into an out-of-memory kill. When the auditor can see both the concurrency and a small CPU or memory limit, it warns and suggests either lowering concurrency or giving the instance more resources, confirmed with a load test.
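The memory side of that example can be worked out directly: divide the memory limit by the concurrency to see how little each in-flight request gets. The 16 MiB per-request threshold below is an arbitrary assumption for illustration, the helper names are hypothetical, and only the Mi and Gi suffixes are handled.

```python
def parse_memory_mib(value: str) -> float:
    """Convert a limit such as '256Mi' or '2Gi' to MiB (only Mi and Gi handled)."""
    for suffix, factor in (("Mi", 1), ("Gi", 1024)):
        if value.endswith(suffix):
            return float(value[: -len(suffix)]) * factor
    raise ValueError(f"unsupported memory unit: {value}")


def memory_per_request_mib(concurrency: int, memory: str) -> float:
    """Memory available to each request if the instance is fully loaded."""
    return parse_memory_mib(memory) / concurrency


# Assumed threshold, not a Cloud Run or auditor constant.
MIN_MIB_PER_REQUEST = 16


def oversubscribed(concurrency: int, memory: str) -> bool:
    return memory_per_request_mib(concurrency, memory) < MIN_MIB_PER_REQUEST


small = memory_per_request_mib(80, "256Mi")
```

At concurrency 80 on 256Mi each request has about 3.2 MiB, before counting the runtime's own footprint, which is why a load test is the right way to confirm the setting.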

Question 6

Can it read a YAML service definition, and is anything uploaded?

Accepted Answer

Yes, it reads both gcloud commands and Cloud Run service definitions, and no, nothing is uploaded. There is no YAML parser bundled, so YAML support is targeted: it reads the specific Knative keys it grades (containerConcurrency, the autoscaling and cpu-throttling annotations, resource limits, serviceAccountName, probes and ports), while JSON definitions are parsed in full. Everything runs in your browser, so nothing you paste is sent to a server or stored.
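Targeted reading without a YAML parser might look like the sketch below: a line-anchored regular expression pulls the scalar value for one known key. The helper read_scalar is hypothetical and is not the tool's implementation; it ignores nesting and would misread a quoted value that contains a # character.

```python
import re
from typing import Optional


def read_scalar(yaml_text: str, key: str) -> Optional[str]:
    """Return the scalar value on the first 'key: value' line, or None if absent."""
    pattern = re.compile(
        r"^[ \t]*" + re.escape(key) + r"[ \t]*:[ \t]*(.*?)[ \t]*(?:#.*)?$",
        re.MULTILINE,
    )
    match = pattern.search(yaml_text)
    if match is None:
        return None
    # Drop surrounding quotes; treat an empty value (a mapping header) as missing.
    return match.group(1).strip("\"'") or None


service_yaml = """\
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: "10"
        run.googleapis.com/cpu-throttling: "false"
    spec:
      containerConcurrency: 80  # tuned after a load test
"""

max_scale = read_scalar(service_yaml, "autoscaling.knative.dev/maxScale")
concurrency = read_scalar(service_yaml, "containerConcurrency")
```

The tradeoff of this approach is that keys outside the known set are never seen, which matches the "targeted" support described above.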
