Cloud Run

Introduction

Cloud Run is a managed compute platform that lets you run containers directly on top of Google's scalable infrastructure. You can deploy code written in any programming language on Cloud Run if you can build a container image from it.
Cloud Run manages everything else: it generates a valid SSL certificate, configures SSL termination with secure settings, handles incoming requests, decrypts them, and forwards them to your application.
There are 2 workflows

The pricing model on Cloud Run is unique: you pay only for the system resources you use while a container is handling web requests, and while it is starting or shutting down, with a granularity of 100 milliseconds.
By comparison, the App Engine flexible environment runs on top of Compute Engine, cannot scale to zero, and runs inside a VPC network. On Cloud Run, as soon as a container is handling no requests, its CPU is throttled to nearly zero.
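The 100 millisecond granularity can be illustrated with a little arithmetic. The sketch below assumes that active time is rounded up to the next 100 ms increment; the function name `billable_ms` is hypothetical and is not part of any Cloud Run API.

```python
def billable_ms(active_ms: int, granularity_ms: int = 100) -> int:
    """Round active container time up to the next billing increment.

    Assumption: usage is rounded up, never down, to the granularity.
    """
    if active_ms < 0:
        raise ValueError("active_ms must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    increments = -(-active_ms // granularity_ms)
    return increments * granularity_ms


# A request that keeps the container busy for 230 ms counts as 300 ms.
print(billable_ms(230))  # 300
# Time that lands exactly on an increment is not rounded further.
print(billable_ms(100))  # 100
```

An idle container that handles no requests contributes nothing to this total, which is what makes scaling to zero attractive for bursty workloads.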

Cloud Run supports autoscaling: it automatically adds containers when needed to make sure it handles all incoming requests. As soon as demand decreases again, Cloud Run stops sending traffic to the surplus containers and shuts them down.
The maximum number of containers in a single Cloud Run service is 1,000, and you can increase this limit if you send a request to Google Support.
If a container breaks, Cloud Run will remove it.
Cloud Run integrates with a global load balancer, which allows you to expose a single global IP address in front of multiple regional Cloud Run services.
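As a rough mental model of the scaling behaviour described above, the sketch below estimates how many containers a given load would need. It is a simplification and not Cloud Run's actual autoscaling algorithm: the function name is hypothetical, and the figure of 80 concurrent requests per container is an assumed setting that this page does not discuss.

```python
import math

MAX_CONTAINERS = 1000  # per-service limit quoted in the text above


def containers_needed(concurrent_requests: int,
                      per_container: int = 80,  # assumed concurrency setting
                      limit: int = MAX_CONTAINERS) -> int:
    """Estimate the container count for a given number of in-flight requests."""
    if concurrent_requests <= 0:
        return 0  # no demand: the service can scale to zero
    # Each container absorbs up to per_container requests; cap at the limit.
    return min(limit, math.ceil(concurrent_requests / per_container))


for load in (0, 80, 81, 1_000_000):
    print(load, "->", containers_needed(load))
```

The cap in the last line of the function is why a sudden spike beyond the limit would need a raised quota rather than more waiting.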

You can shift the traffic to the new revision gradually to reduce the impact of a failure and maintain application availability during the update.
It is not required to take an incremental approach to deployment. You can also choose to send 100% of your traffic to the new revision immediately.
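To make the effect of a gradual rollout concrete, the sketch below simulates a 90/10 split between two revisions using weighted random selection. It only illustrates the idea of percentage-based routing; how Cloud Run routes requests internally is not described here, and the revision names are hypothetical.

```python
import random
from collections import Counter


def pick_revision(split: dict, rng: random.Random) -> str:
    """Choose a revision for one request according to percentage weights."""
    if sum(split.values()) != 100:
        raise ValueError("traffic percentages must add up to 100")
    names = list(split)
    weights = [split[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical revision names; 10% of requests go to the new revision.
split = {"product-service-old": 90, "product-service-new": 10}
rng = random.Random(42)  # fixed seed so the simulation is repeatable
counts = Counter(pick_revision(split, rng) for _ in range(10_000))
print(counts)
```

If the new revision misbehaves, only about one request in ten is affected, and setting its share back to 0 restores the previous behaviour without a redeploy.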

Idle containers do not serve requests. Cloud Run can shut down idle containers because they are not handling requests.
A container is also shut down if your application exits, for instance due to an error in your application code, or if the container exceeds its memory limit, which defaults to 512 MiB.

Cloud Run deploys your application after every change you make to the service resource. At the same time, it makes an immutable copy of the service resource, called a revision, which also exists at runtime.
While Cloud Run waits for the new revision to start, the current revision still serves production traffic. Cloud Run considers a container ready as soon as it accepts a TCP connection. Once the first container in the new revision is ready, Cloud Run sends all production traffic to it, and the containers in the previously active revision become idle.
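The readiness rule above, that a container counts as ready once it accepts a TCP connection, can be sketched with the standard library. This is only an illustration of the idea; Cloud Run's own check is not shown on this page, and the helper name `accepts_tcp` is hypothetical.

```python
import socket


def accepts_tcp(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or unreachable: not ready.
        return False


# Stand-in for an application that has started listening.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen()
port = listener.getsockname()[1]

print(accepts_tcp("127.0.0.1", port))  # True while the listener is open
listener.close()
```

Note that accepting a connection says nothing about whether the application can answer requests correctly, which is one reason to run your own checks before migrating traffic.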

You can also pin traffic to a specific revision, rather than the latest revision, decoupling deployment of a new revision from the migration of traffic.
You might want to tag every revision automatically with the ID of the version-control commit that was used to create it. A tag can also be used to point to the latest healthy revision. This is how you can support a preview link to the latest revision.
This lets you run a few confidence checks before migrating production traffic to the new revision. You can also split traffic across multiple revisions instead of sending all of it to the latest revision.
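The tagging workflow could be scripted along the following lines. The sketch only builds the command lines and does not run them. It assumes the service already exists, derives a tag from a made-up commit ID, and uses the `--tag`, `--no-traffic`, and `--to-tags` flags of `gcloud run`; the helper names, the `commit-` prefix, and the region are illustrative choices.

```python
import shlex


def deploy_preview_cmd(service: str, image: str, commit_sha: str, region: str) -> list:
    """Build a deploy command that tags the new revision and sends it no traffic."""
    tag = f"commit-{commit_sha[:7].lower()}"  # prefix is a naming choice, not a rule
    return ["gcloud", "run", "deploy", service,
            "--image", image,
            "--tag", tag,
            "--no-traffic",
            "--region", region]


def shift_traffic_cmd(service: str, tag: str, percent: int, region: str) -> list:
    """Build a command that sends a percentage of traffic to a tagged revision."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return ["gcloud", "run", "services", "update-traffic", service,
            "--to-tags", f"{tag}={percent}",
            "--region", region]


# Hypothetical commit ID and region, for illustration only.
image = "gcr.io/qwiklabs-resources/product-status:0.0.1"
deploy = deploy_preview_cmd("product-service", image, "3F9C2AB41D", "us-central1")
shift = shift_traffic_cmd("product-service", "commit-3f9c2ab", 10, "us-central1")
print(shlex.join(deploy))
print(shlex.join(shift))
```

Building the argument list separately from executing it keeps the rollout steps easy to review; the lists could later be passed to `subprocess.run` once the values are real.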

Operation

Deploy the image to Cloud Run

gcloud run deploy product-service \
   --image gcr.io/qwiklabs-resources/product-status:0.0.1 \
   --tag test1 \
   --region $LOCATION \
   --allow-unauthenticated

Update the traffic

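# Send 50% of traffic to the revision tagged test2
gcloud run services update-traffic product-service \
  --to-tags test2=50 \
  --region=$LOCATION
# Describe the service, including the traffic split across its revisions
gcloud run services describe product-service \
  --region=$LOCATION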
