How Kubernetes Updates Work on Container Engine

I often get asked when I talk about Container Engine (GKE):

How are upgrades to Kubernetes handled?


As we spell out in the documentation, upgrades to Kubernetes masters on GKE are handled by us. They get rolled out automatically.  However, you can speed that up if you would like to upgrade before the automatic update happens.  You can do it via the command line:
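For example, something along these lines (the cluster name and zone here are placeholders):

    # Upgrade the cluster master to the latest available Kubernetes version.
    gcloud container clusters upgrade my-cluster --zone us-central1-b --master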

You can also do it via the web interface as illustrated below.

GKE notifies you that upgrades are available.
You can then upgrade the master, if the automatic upgrade hasn’t happened yet.
Once there, you’ll see that the master upgrade is a one way trip.


Updating nodes is a different story. Node upgrades can be a little more disruptive, and therefore you should control when they happen.  

What do I mean by “disruptive?”

GKE will take down each node of your cluster, killing the resident pods.  If your pods are managed by a Replication Controller or are part of a ReplicaSet deployment, they will be rescheduled on other nodes of the cluster, and you shouldn’t see a disruption of the services those pods serve. However, if you are running a PetSet deployment, using a single replica to serve a stateful service, or manually creating your own pods, then you will see a disruption. Basically, if you are being completely “containery” then no problem.  If you are trying to run a Pet as a containerized service, you can see some downtime unless you intervene manually to prevent it, for example with a manually configured backup or another type of replica.  You can also take advantage of node pools to help with that.   But even if you don’t intervene, as long as anything you need to be persistent is hosted on a persistent disk, you will be fine after the upgrade.

You can perform a node update via the command line:
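Again, something like this (the cluster name and zone are placeholders):

    # Upgrade the nodes of the cluster. Without --master this upgrades the
    # node pools; by default they move to the master's current version.
    gcloud container clusters upgrade my-cluster --zone us-central1-b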

Or you can use the web interface.

Again, you get the “Upgrade Available” prompt.
You have a bunch of options. (We recommend you stay within 2 minor revs of your master.)

A couple things to consider:

  • As stated in the caption above, we recommend you stay within 2 minor revs of your master. These recommendations come from the Kubernetes project, and are not unique to GKE.
  • Additionally, you should not upgrade the nodes to a version higher than the master. The web UI specifically prevents this. Again, this comes from Kubernetes.
  • Nodes don’t automatically update, but the masters eventually do.  It’s possible that the masters could automatically update to a version more than 2 minor revs beyond the nodes, which can cause compatibility issues. So we recommend timely upgrades of your nodes. Minor revs come out about once every 3 months, so you are looking at doing this every 6 months or so.

As you can see, it’s pretty straightforward. There are a couple of things to watch out for, so please read the documentation.

Making Kubernetes IP addresses static on Google Container Engine

I’ve been giving a talk and demo about Kubernetes for a few months now, and during my demo, I have to wait for an ephemeral, external IP address from a load balancer to show off that Kubernetes does in fact work.  Consequently, I get asked “Is there any way to have a static address so that you can actually point a hostname at it?” The answer is: of course you can.

Start up your Kubernetes environment, making sure to configure a service with a load balancer.

Once your app is up, make note of the External IP using kubectl get services.


Now go to the Google Cloud Platform Console -> Networking -> External IP Addresses.

Find the IP you were assigned earlier. Switch it from “Ephemeral” to “Static.” You will have to give it a name and it would be good to give it a description so you know why it is static.


Then modify your service (or service yaml file) to point to this static address. I’m going to modify the yaml.   
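Here’s a sketch of what the modified service.yaml might look like; the service name, selector, ports, and address are placeholders, and the key bit is the loadBalancerIP field set to the address you just made static:

    # service.yaml (sketch)
    apiVersion: v1
    kind: Service
    metadata:
      name: my-service
    spec:
      type: LoadBalancer
      # The address you just switched from Ephemeral to Static.
      loadBalancerIP: 104.154.10.20
      selector:
        app: my-app
      ports:
      - port: 80
        targetPort: 8080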


Once your yaml is modified, you just need to apply it: kubectl apply -f service.yaml.

To prove that the IP address sticks around, you can kubectl delete the service and then kubectl apply it again, but you don’t have to. If you do, be aware that although your IP address is locked in, your load balancer still needs a little bit of time to fire up.

Instead of this method, you can create a static IP address ahead of time and create the forwarding rules manually. That’s its own blog post though, and I think it is just easier to let Container Engine do it.

I got lots of help for this post from wernight’s answer on StackOverflow, and the documentation on Kubernetes Services.

I can confirm this works with Google Container Engine. It should work with a Kubernetes cluster installed by hand on Google Cloud Platform.  I couldn’t ascertain if it works on other cloud providers.

Kubernetes Secrets Directly to Environment Variables

I’ve found myself wanting to use Kubernetes Secrets for a while, but every time I did, I ran into the fact that secrets had to be mounted as files in the container, and then you had to programmatically grab those secrets and turn them into environment variables.  This works, and there are posts like this great one from my coworker Aja Hammerly that tell you how to do it.

It always seemed a little suboptimal to me though, mostly because you had to alter your Docker image in order to use secrets. Then you lose some of the flexibility of using a Dockerfile in both Docker and Kubernetes. It’s not the end of the world – you can write a conditional script –  but I never liked doing this.  It would be awesome if you could just write Secrets directly to ENV variables.

Well it turns out you can. Right there in the documentation there’s a whole section on Using Secrets as Environment Variables. It’s pretty straightforward:

Make a Secrets file, remembering to base64 encode your secrets.
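Something along these lines, borrowing the shape of the example in the Kubernetes docs (the secret name and values are placeholders):

    # secrets.yaml (sketch); values are base64 encoded,
    # e.g. echo -n "admin" | base64
    apiVersion: v1
    kind: Secret
    metadata:
      name: mysecret
    type: Opaque
    data:
      username: YWRtaW4=          # "admin"
      password: MWYyZDFlMmU2N2Rm  # "1f2d1e2e67df"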

Then configure your pod definition to use the secrets.
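And then a pod definition that maps those keys straight to environment variables, roughly like this (the image and names are placeholders):

    # pod.yaml (sketch)
    apiVersion: v1
    kind: Pod
    metadata:
      name: secret-env-pod
    spec:
      containers:
      - name: app
        image: gcr.io/my-project/my-app
        env:
        - name: SECRET_USERNAME
          valueFrom:
            secretKeyRef:
              name: mysecret
              key: username
        - name: SECRET_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysecret
              key: password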

That’s it. It’s a great addition to the secrets API.  I’m trying to track down when it was added. It looks like it came in 1.2.  The first reference I could find to it in the docs was in this commit  updating Kubernetes Documentation for 1.2.

Autoresizing Persistent Disks in Compute Engine

Got a challenge the other day:

Is it possible to automatically resize a Persistent Disk in Google Compute Engine?

The answer is yes – with a few caveats.  

This solution really only works well with Persistent Disks that are not root disks. Root disks seem to need a reboot for the extra space to show up, and automatically rebooting seems like a bad idea. If you run it on a root disk, the resize will work, but the extra space won’t be available until you manually reboot the machine.

Be careful with quotas. My solution here has a default max disk size of 64TB, because that is the maximum size a GCE persistent disk can be. You may want to be more conservative with your limits, because disk size = money. Also, you have a quota on your account for the amount of SSD you can assign.  As of this writing it is 2TB.  You can always raise it, but this script cannot get around your quota, and will fail if it tries to exceed it.

All that out of the way, let’s give this a shot.

Step 1 – Script it

The first step is to put together a script that:

  • Checks the utilization of a disk.
  • If the utilization is too high, resizes the disk in Google Cloud Platform.
  • Then also resizes the disk on the host OS.

There are a couple of other things we want to configure in this script:

  • What is the threshold percent that is high enough to resize the disk?
  • What is the factor by which we’ll increase the disk? Double it? Triple it?
  • What is the maximum limit to which we will increase the disk?

Keeping all of that in mind, here is my solution in Bash for Debian (our default OS choice on Compute Engine). As you can see, it’s a mix of gcloud commands and df.
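What follows is a sketch of that approach rather than the exact script from the repo; the disk name, device, mount point, and zone are placeholders you would need to change:

    #!/bin/bash
    # Sketch of the auto-resize check. Placeholders: disk, device, mount, zone.

    DISK_NAME="my-data-disk"     # name of the persistent disk in GCE
    DEVICE="/dev/sdb"            # device the disk is attached as
    MOUNT_POINT="/mnt/data"      # where the disk is mounted
    ZONE="us-central1-b"

    THRESHOLD=80                 # resize once utilization exceeds this percent
    FACTOR=2                     # grow the disk by this factor
    MAX_SIZE_GB=65536            # never grow past 64TB, the GCE per-disk limit

    # Current utilization percent (df) and current size in GB (gcloud).
    USED_PCT=$(df --output=pcent "$MOUNT_POINT" | tail -1 | tr -dc '0-9')
    CURRENT_GB=$(gcloud compute disks describe "$DISK_NAME" --zone "$ZONE" \
      --format="value(sizeGb)")

    if [ "$USED_PCT" -lt "$THRESHOLD" ]; then
      echo "Utilization ${USED_PCT}% is below ${THRESHOLD}%; nothing to do."
      exit 0
    fi

    NEW_GB=$((CURRENT_GB * FACTOR))
    if [ "$NEW_GB" -gt "$MAX_SIZE_GB" ]; then
      NEW_GB=$MAX_SIZE_GB
    fi

    if [ "$NEW_GB" -le "$CURRENT_GB" ]; then
      echo "Disk is already at the maximum allowed size; not resizing."
      exit 1
    fi

    # Resize the disk in Google Cloud Platform, then grow the filesystem on
    # the host (ext4 with no partition table, hence resize2fs on the device).
    gcloud compute disks resize "$DISK_NAME" --zone "$ZONE" --size "${NEW_GB}GB" --quiet
    resize2fs "$DEVICE"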

The source is also available on GitHub.

You can find the reference for the gcloud commands in the documentation.

Step 2 – Authorize it

The next step is to make sure this script can run at all.  To do that we have to delve into Cloud IAM.

First we want to create a service account. During this process, we have the option to ‘Furnish a new private key’, which causes a key file to be downloaded when the account is created. Choose JSON and keep track of the JSON file that gets downloaded after you click ‘Create’.


Add the service account to the IAM role – Compute Storage Admin. Then remove the service account from the project-level role – Editor. We want it to have only the permissions it needs.


Copy the JSON file to the Compute Engine machine to which the disk you wish to monitor is attached.

Authorize the service account using the following command.
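It looks something like this, assuming the key file you copied over is named service-account.json:

    # Activate the service account on the VM using the downloaded JSON key.
    gcloud auth activate-service-account --key-file=service-account.json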


My co-worker, Sandeep, has a good video tutorial about service accounts if you need more information.

Step 3 – Test it

Assuming you have installed the autoscale-disk script from step 1,  and you set up permissions correctly, you are ready to test it.  

To check the permissions, run:
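The exact check isn’t critical; one reasonable way, now that the service account is activated, is to simply list disks, for example:

    # List disks as the newly activated service account.
    gcloud compute disks list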

If you see the output of a gcloud compute disk list there, you got it right. If you do not, you will see a FAILURE message.

Step 4 – Cron it

Once you have the script installed, and you have tested it – it’s time to set it and forget it. Add it to crontab with your desired settings.
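For example, to run the check every minute (the script path here is an assumption):

    # crontab entry: run the resize check every minute
    * * * * * /usr/local/bin/autoscale-disk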


I’m setting this up to check every minute, because it’s pretty lightweight when it isn’t actually resizing disks. However do what you will. You might also want to pipe the output to a log. Again, your call.


There you have it, autoscaling a disk based on utilization with a cron job. What I love about this idea is that it is so very cloudy. On prem, even if you have a pool of storage, eventually you run out, so sizing up a disk isn’t a sure thing.  But in a cloud world, if you need more it’s always just an API call away.


Migrating App Engine Standard to Cloud SQL v2

I recently discovered that Google Cloud SQL v2 now supports App Engine standard runtime.  This is very exciting for me. I wanted to try out the process and make sure there were no gotchas.


  • I created a new Cloud SQL v2 instance.
  • I used my syncing script from my blog post Migrating between Cloud SQL databases to move the data to the new instance.
  • I created a new App Engine module by deploying a new version of my app using the old code base.
  • I changed the connection string from the old database to the new one. The pattern for these strings has changed a bit; more on that below.
  • That was it.  The new code served up just fine. I kept serving from the old module until I had made the connection string config tweak to the old code base.

New connection string

The new connection strings are only a little different than the old ones, and should require just a change to one string in your config.

The directions will tell you to look for your Instance connection name in the Instance properties of your Cloud SQL Developer Console. These connection names come in two patterns:

  • V1 connection names follow the pattern projectid:instancename
  • V2 connection names follow the pattern projectid:regionname:instancename.

It’s a pretty simple change, but I can see someone accidentally (or willfully) not reading the documentation and getting tripped up on this. The new connection strings require the region name; that’s all there is to it.  I’ve tested this on PHP; I assume it works everywhere, but your mileage may vary.  Golang tests are coming soon; I will update when I make that change.
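To make that concrete, here is roughly what the change looks like in PHP on App Engine standard; everything here except the /cloudsql/ prefix and the connection-name pattern is a placeholder:

    <?php
    // Hypothetical names; only the project:region:instance part of the
    // socket path changes between v1 and v2 connection strings.
    $dsn = 'mysql:unix_socket=/cloudsql/my-project:us-central1:my-instance;dbname=mydb';
    $db = new PDO($dsn, 'dbuser', 'dbpassword');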

Cloud SQL v1 vs v2

Adri Van Houdt asked me on Twitter: @tpryan @googlecloud what are the major differences [between Cloud SQL v1 and v2]?

It demanded more than a 140-character answer.

Most of the differences are explained in the Cloud SQL Documentation, but to sum up salient details: Cloud SQL v2 has better performance, more capacity, and will probably run cheaper for the same amount of work.

  • Up to 7X throughput and 20X storage capacity of first generation instances
  • Less expensive than first generation for most use cases
  • Option to add high availability failover and read replication
  • Configurable backup period and maintenance window
  • Proxy support (for better security and access using dynamic IP addresses)

Wait what? Bigger, Faster, Cheaper? Is there a reason why I wouldn’t run on Cloud SQL v2?  It’s in beta, so there are some features missing.  Most are not deal breakers, unless, you know, they are for you:

  • No Service Level Agreement (SLA)
  • Only MySQL 5.6 is supported.
  • No point-in-time recovery from backups
  • No standard Persistent Disk (HDD) support
    • But yes Solid-state Persistent Disk (SSD)
  • No automatic scaling of storage capacity
    • But you can do manual increases with no downtime
  • No IPv6 connectivity

The big one here for most larger customers is going to be no SLA.

Not listed here is another important difference: if you are an App Engine customer who looked at v2 before, it did not support App Engine. However, that appears to have changed: you can now access Cloud SQL v2 from the App Engine standard environment or the App Engine flexible environment.

The pricing for v1 and the pricing for v2 are a bit different; v2 pricing looks a lot more like the pricing tier structure for Compute Engine now.

Also, I’m not sure if it is related, but we have a new logo now.  Instead of the on-the-nose “SQL” on a blue hexagon, it’s a hard-angled database symbol on a blue hexagon.


So there you have it, bigger, better, cheaper, but no SLA. There are more subtle differences that are explained at length in the documentation, so you should definitely test out and see for yourself. I, for one, am moving my stuff over now that there is App Engine standard support.

Migrating between Cloud SQL databases

I run a bunch of SQL databases on Cloud SQL v1, and I wanted to move them over to Cloud SQL v2.   I like to automate this sort of thing.  I also like to have the new database essentially mirror the old one, until I’m ready to cut over.

I looked into writing a script that could do that with gcloud.  Turns out,  it is incredibly simple. The sql tools in gcloud can import and export directly to Cloud Storage.
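The commands look roughly like this (instance names, bucket, and database are placeholders; the exact gcloud sql subcommand syntax has shifted a bit between releases, and the Cloud SQL service account may need access to the bucket):

    # Placeholders throughout: instance names, bucket, and database.
    gsutil mb gs://my-sql-migration
    gcloud sql export sql old-v1-instance gs://my-sql-migration/dump.sql --database=mydb
    gcloud sql import sql new-v2-instance gs://my-sql-migration/dump.sql --quiet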

There you go — three lines of commands.  The only thing you need to do to make the new DB work is make sure all of the database accounts are set up correctly on the new server; otherwise, application calls will bomb.

Now keep in mind that your mileage (or kilometerage) may vary.  In this case, I am going between MySQL 5.5 and MySQL 5.6, and I had no issues. If there is a reason that your old DB won’t run in the new target, it will fail.  This script also assumes that you are in the same project with appropriate permissions to all.

There’s a lot more you can do with gcloud to manage your Cloud SQL installation. Make sure to check out the rest of the documentation.


Working with Cloud Vision API from JavaScript

I ran into a case where I wanted to fool around with Cloud Vision API from pure JavaScript. Not node.js, just JavaScript running in a browser. There were no samples, so I figured I’d whip up some. So here is a little primer on how to do this from JavaScript in a browser.

First you have to take care of a few prerequisites:

Once you do this you’re ready to start developing. Make sure you hold on to the API key you created above.

The first thing you need to do is create an upload form.  This is pretty basic in HTML5.

Note that I’m using a select box to drive the type of detection I am doing.  There are more choices, but I’m sticking with landmark detection for now.

Next you need to convert the image to Base64 encoding to transmit the image data via a REST API. I looked around for how to do this “properly” and the best I came up with was the “easy way” mentioned in this Stack Overflow post – Get Base64 encode file-data from Input Form.

I use  readAsDataURL(). 

Then I massage the content into the JSON format that the Cloud Vision API expects. Note that I strip out “data:image/jpeg;base64,”. Otherwise Cloud Vision sends you errors. And you don’t want that. 

And then I send. With the API key.  That’s it. Nothing to it really.
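Pulling those last few steps together, a minimal sketch looks something like the following; the element ids, the apiKey variable, and the maxResults value are assumptions rather than names from the original sample:

    // Minimal sketch of the browser-side flow described above.
    var apiKey = 'YOUR_API_KEY';

    function annotate() {
      var file = document.getElementById('fileInput').files[0];
      var type = document.getElementById('detectionType').value; // e.g. LANDMARK_DETECTION

      var reader = new FileReader();
      reader.onload = function () {
        // readAsDataURL() returns "data:image/jpeg;base64,....";
        // strip the prefix so the API gets raw base64 content.
        var content = reader.result.replace(/^data:image\/(png|jpe?g);base64,/, '');

        var request = {
          requests: [{
            image: { content: content },
            features: [{ type: type, maxResults: 5 }]
          }]
        };

        var xhr = new XMLHttpRequest();
        xhr.open('POST', 'https://vision.googleapis.com/v1/images:annotate?key=' + apiKey);
        xhr.setRequestHeader('Content-Type', 'application/json');
        xhr.onload = function () { console.log(JSON.parse(xhr.responseText)); };
        xhr.send(JSON.stringify(request));
      };
      reader.readAsDataURL(file);
    }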

If you want to dig deeper into the Cloud Vision API you can

The code for all of this is now shared in the Cloud Vision repo on GitHub.

Working with Cloud Vision API from PHP

I have been very excited by the Cloud Vision API recently put into Beta by Google Cloud Platform. I haven’t had a chance to play with it much, and I wanted to fool around with it from PHP on App Engine (or vanilla PHP for that matter), but there is no documentation for PHP yet.

So here is a little primer on how to do this from PHP on App Engine.

First you have to complete a few prerequisites:

Once you do this you’re ready to start developing. Because I am running PHP on App Engine I want the App Engine SDK for PHP.

I’m going to use the GUI to run this app, but you can use the command line just as easily.


The first thing I need to do is write a php.ini that properly allows use of cURL and has a good limit on uploaded files.
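Something roughly like this; the cURL-lite directive name is from memory, so treat it as an assumption, and the upload limit is arbitrary:

    ; php.ini (sketch)
    ; Enable App Engine's cURL-lite implementation (assumed directive name).
    google_app_engine.enable_curl_lite = "1"
    ; Allow reasonably large image uploads.
    upload_max_filesize = "8M"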

Then I set up a page named creds.php to hold my API key for Cloud Storage and my Cloud Storage Bucket name.

Then I create a form page named index.php that creates an App Engine Upload URL for me. (If I wanted to not use App Engine, I could just skip the call to Cloud Storage Tools and post directly to the next file in the example: process.php.)

Then process.php does the hard work of taking the uploaded file, converting it to base64 and uploading to the Cloud Vision API.
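A rough sketch of what process.php does, assuming creds.php defines an API_KEY constant and the form field is named image; the real version is in the GitHub repo linked below:

    <?php
    // process.php: a rough sketch; API_KEY and the "image" field name are assumptions.
    require_once 'creds.php';

    // Read the uploaded file and base64 encode it for the Vision API.
    $imageData = base64_encode(file_get_contents($_FILES['image']['tmp_name']));

    $request = json_encode([
        'requests' => [[
            'image'    => ['content' => $imageData],
            'features' => [['type' => 'LANDMARK_DETECTION', 'maxResults' => 5]],
        ]],
    ]);

    // POST the request to the Cloud Vision API with the API key.
    $ch = curl_init('https://vision.googleapis.com/v1/images:annotate?key=' . API_KEY);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $request);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $response = curl_exec($ch);
    curl_close($ch);

    header('Content-Type: application/json');
    echo $response;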

Finally I have to create an app.yaml to serve up the two pages.
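A minimal app.yaml for this layout might look like the following (the handler paths are assumptions):

    runtime: php55
    api_version: 1

    handlers:
    # Route the upload handler explicitly, and everything else to the form page.
    - url: /process\.php
      script: process.php
    - url: /.*
      script: index.php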

Use GoogleAppEngineLauncher to start your app.

You should get this.


Assuming you upload a picture from the top of the Eiffel Tower looking at the Champ de Mars, you’ll get something like this:


There you go, bare bones but simple Cloud Vision example in PHP.

If you want to dig deeper into the Cloud Vision API you can

The code for all of this is available on GitHub.

Compute Engine and App Engine – a Comparison


I want to show you a little demo of how Compute Engine and App Engine work. Both techs have their strengths and weaknesses, and I wanted to make something to showcase them. 

Compute Engine allows you to spin up Virtual Machines (henceforth to be referred to as “VMs” due to the fact that I can’t be bothered to write “irtual” and “achine”.) VMs give you a lot of control over your system. You can run a number of OSes, with variable processor, memory, and disk configurations. You interact with it by configuring a VM through the Developer Console or on the command line. You then SSH into your VM.

App Engine, on the other hand, just takes code.  You upload it and we run it. No SSH, no machine, just an upload site and a URL. App Engine by default gives you no control over the hardware running the code. The trade-off is that we can immediately scale from zero load to any load you can muster.

So how do these compare? App Engine scales in milliseconds; what does that look like? Compute Engine starts up in tens of seconds; what does that mean? This demo shows off how fast you can build Compute Engine machines versus how fast you can spin up App Engine instances. This isn’t a one-is-better-than-the-other comparison; there are reasons to use both of these techs, and they aren’t mutually exclusive. Let me know what you think.