Automate upgrades with Consul Enterprise
Enterprise Only
The functionality described in this tutorial is available only in Consul Enterprise. To explore Consul Enterprise features, you can sign up for a free 30-day trial from here.
Introduction
Upgrades are always a delicate moment for any production environment and it is important to have to have best practices in place that simplify the process where possible.
Consul Enterprise provides automated upgrades with the autopilot feature that enable you to test the upgrade process before making changes to production. The feature allows you to start new Consul servers alongside the old ones and automatically switch to the new servers, as soon as they are able to reach quorum.
This process automates the leader election process and ensures that the new leader is elected among the new nodes so that removing the old nodes from the datacenter does not trigger a leader election.
Warning
This tutorial describes a cluster upgrade process that deploys new servers and then removes the old ones. To upgrade the version of the consul
binary running on servers in your cluster without deploying new servers, follow the general upgrade process and disable autopilot's upgrade migration feature.
Using this tutorial you will test the functionality and get examples on how to:
- Upgrade Consul servers to a newer version
- Patch configuration for server nodes without changing Consul version
This will help you create a standardized and repeatable internal process for server upgrades.
Prerequisites
To test the automated upgrades feature explained in this tutorial you will need:
- A Consul Enterprise datacenter with three servers running Consul Enterprise 1.7.0.
- Three extra nodes with Consul Enterprise 1.7.2 binary installed to be used as the new servers once the upgrade is concluded.
You will also need a text editor, the curl
executable to test the API
endpoints, and optionally the jq
command to format the output for curl
.
Upgrade to a new Consul version
The automated upgrades feature is enabled by default in Consul Enterprise. You
can verify the configuration for your datacenter by using the consul operator autopilot
command.
$ consul operator autopilot get-config
##...DisableUpgradeMigration = falseUpgradeVersionTag = ""
In case the feature is disabled in your datacenter you can enable it using
the consul operator autopilot set-config -disable-upgrade-migration=false
command.
Add new servers
Once you have verified that DisableUpgradeMigration
is set to false
, you can
start the new servers and make sure they join the existing datacenter with the Consul 1.7.0 servers.
In this example the old servers with Consul version 1.7.0 are identified as server-1
,
server-2
, and server-3
. While the new ones with Consul version 1.7.2 are identified as server-1a
,
server-2a
, and server-3a
.
When autopilot detects that a server with a newer version of Consul has joined the datacenter, it will wait to promote the new server as voter until enough newer-versioned servers have been added to the datacenter to reach quorum. When the count of new servers equals or exceeds that of the old servers, autopilot will begin promoting the new servers to voters and demoting the old servers. After this is finished, the old servers can be safely removed from the datacenter.
To check the Consul version on the old servers, you can either use the autopilot
health endpoint or the consul members
command.
$ consul members
Node Address Status Type Build Protocol DC Segmentserver-1 10.20.10.11:8301 alive server 1.7.0+ent 2 dc1 <all>server-2 10.20.10.12:8301 alive server 1.7.0+ent 2 dc1 <all>server-3 10.20.10.13:8301 alive server 1.7.0+ent 2 dc1 <all>
Now that you have checked the autopilot configuration and Consul version on your existing three servers, start two new servers with Consul version 1.7.2. Be sure to join them to the existing datacenter.
Verify the new servers are not yet part of the voters pool.
$ consul operator raft list-peers
Node ID Address State Voter RaftProtocolserver-1 64858627-0565-0c81-234f-4d84b3455303 10.20.10.11:8300 leader true 3server-2 5b10789f-3fe6-2407-e6f2-9e85cf9cfeb3 10.20.10.12:8300 follower true 3server-3 b0ecba67-a422-e46d-ac53-8f48aa403944 10.20.10.13:8300 follower true 3server-1a 15a7b243-95ad-8c7a-4335-4c1690120bbd 10.20.10.21:8300 follower false 3server-2a fd1e7110-cc90-cf0c-d1e1-a22190208b0a 10.20.10.22:8300 follower false 3
Finally, start the third new server on Consul version 1.7.2. Once the third new server is started and autopilot detects the possibility to reach a quorum among the new servers only it promotes them as voters, triggers a new leader election, and demotes the old nodes as non-voters.
Verify the three new servers on Consul version 1.7.2 are now voters.
$ consul operator raft list-peers
Node ID Address State Voter RaftProtocolserver-1 64858627-0565-0c81-234f-4d84b3455303 10.20.10.11:8300 follower false 3server-2 5b10789f-3fe6-2407-e6f2-9e85cf9cfeb3 10.20.10.12:8300 follower false 3server-3 b0ecba67-a422-e46d-ac53-8f48aa403944 10.20.10.13:8300 follower false 3server-1a 15a7b243-95ad-8c7a-4335-4c1690120bbd 10.20.10.21:8300 leader true 3server-2a fd1e7110-cc90-cf0c-d1e1-a22190208b0a 10.20.10.22:8300 follower true 3server-3a 36d57c89-dc39-056d-f8eb-b1a63faa85b8 10.20.10.23:8300 follower true 3
Stop old servers
Once the upgrade is rolled out you can proceed by removing the old servers from
the datacenter using the consul leave
command. This will leave only the new
servers and will prevent a leader election during the process that you would
have experienced without autopilot.
Migrations without a Consul version change
In some environments, Consul is not the only component that needs to be upgraded to ensure security or stability for the node. For example, you might need to roll out node updates to apply OS patches or new configuration settings.
To cover these cases, Consul autopilot offers the possibility to set the
UpgradeVersionTag
parameter to override the version information used during
automated upgrades. The same logic can be used for updating the
datacenter nodes even without the need to change Consul version.
If the UpgradeVersionTag
is configured, Consul will use its value to look for
a version in each server's specified
-node-meta
tag.
For example, if UpgradeVersionTag
is set to build
, and -node-meta build:0.0.2
is used when starting a server, that server's version will be
0.0.2
when considered in a migration. The upgrade logic will follow semantic
versioning and the version string must be in the form of either X
, X.Y
, or
X.Y.Z
.
Configure autopilot for existing servers
For this example, you can use the three servers started in the previous example (server-1a
, server-2a
and server-3a
). To perform the migration you will start three extra servers called server-1b
, server-2b
and server-3b
. Using three new servers (server-1b
, server-2b
and server-3b
) will help you differentiate the command output from the previous example. If you do not want to start three new nodes you can reuse the ones used previously as long as the Consul binary on the servers is changed to match the one running on server-1a
, server-2a
and server-3a
.
Before proceeding with the upgrade it is necessary to prepare the existing
servers to be compared with the new ones. For all the existing servers, add the node_meta
option to the
configuration.
## ...node_meta { build = "0.0.1"}## ...
This change requires a reload.
$ consul reload
Configuration reload triggered
Once the configuration is updated, you should
modify the Consul autopilot configuration to reflect the node_meta
configuration.
$ consul operator autopilot set-config -upgrade-version-tag=build
Configuration updated!
Ensure the autopilot configuration has been successfully updated.
$ consul operator autopilot get-config
## ...DisableUpgradeMigration = falseUpgradeVersionTag = "build"
The UpgradeVersionTag
should be set to build
.
Start new servers
For the new servers, you will define the UpgradeVersionTag
directly in the
configuration.
## ...node_meta { build = "0.0.2"}autopilot { upgrade_version_tag = "build"}## ...
Once the third new server is started and autopilot detects the possibility to
reach a quorum among the new servers only it promotes them as voters, triggers a
new leader election, and demotes the old nodes as non-voters. This time the
version comparison happened not using the Consul internal version but the value
expressed in the build
value.
$ consul operator raft list-peers
Node ID Address State Voter RaftProtocolserver-1a dcc542fc-ff83-2cab-5bbb-7615584be019 10.20.10.11:8300 follower false 3server-2a f38a430b-dda0-ad3d-a394-08c59052723c 10.20.10.12:8300 follower false 3server-3a 647a98a1-d2ab-f5d3-e911-91c1c10a92f4 10.20.10.13:8300 follower false 3server-1b 7770d10c-fcca-f6f9-222a-e1777a7fd29e 10.20.10.21:8300 leader true 3server-2b bace6b4d-5e2e-8942-d553-ca527cbc2386 10.20.10.22:8300 follower true 3server-3b 221de624-0f5c-29da-ffaf-39481adcd234 10.20.10.23:8300 follower true 3
Stop old servers
Once the migration is rolled out you can proceed by removing the old servers from
the datacenter using the consul leave
command. This will leave only the new
servers but will prevent a leader election during the process that you would
have experienced without autopilot.
Next steps
In this tutorial you upgraded your Consul datacenter by using autopilot's automated upgrades functionality. The automated upgrade feature allows you to both upgrade the Consul version of your servers and to deploy updated version for your server instances without the need to change the Consul version to distribute patches. You now have an extra tool to use when planning upgrades of critical system and to reduce manual steps to a minimum.
Before planning any Consul upgrade make sure to review version specific instructions provided by:
You can learn more on other autopilot functionalities by checking our autopilot tutorial.