Updating Tungsten Fabric using the Zero Impact Upgrade Process in an Environment using Red Hat OpenStack
This document provides the steps needed to update a Tungsten Fabric deployment that is using Red Hat OpenStack as its orchestration platform. The procedure provides a zero impact upgrade (ZIU) with minimal disruption to network operations.
When to Use This Procedure
This procedure is used to upgrade Tungsten Fabric when it is running in environments using RHOSP13.
The procedure in this document has been validated for the following Tungsten Fabric upgrade scenarios:
Table 1: Tungsten Fabric with RHOSP13 Validated Upgrade Scenarios
Starting Tungsten Fabric Release
Target Upgraded Tungsten Fabric Release
A different procedure is followed for upgrading to earlier target Tungsten Fabric releases in environments using RHOSP13 orchestration. See Upgrading Tungsten Fabric with Red Hat Openstack 13 using ISSU.
If you want to use this procedure to upgrade your Tungsten Fabric release to other releases, you must engage Juniper Networks professional services. Contact your Juniper representative for additional information.
This document makes the following assumptions about your environment:
A Tungsten Fabric deployment using Red Hat OpenStack version 13 (RHOSP13) as the orchestration platform is already operational.
The overcloud nodes in the RHOSP13 environment have an enabled Red Hat Enterprise Linux (RHEL) subscription.
Your environment is running TF Release 1912 and upgrading to TF Release 1912-L1 or to TF Release 2003 or later.
If you are updating Red Hat OpenStack simultaneously with Tungsten Fabric, we assume that the undercloud node is updated to the latest minor version and that new overcloud images are prepared for the upgrade, if needed. See the Upgrading the Undercloud section of the Keeping Red Hat OpenStack Platform Updated guide from Red Hat.
If the undercloud has been updated and a copy of the Heat templates is used for the deployment, update the copy of the Heat templates from Red Hat’s core Heat template collection at /usr/share/openstack-tripleo-heat-templates. See the Understanding Heat Templates document from Red Hat for information on this process.
Before You Begin
We recommend performing these procedures before starting the update:
Back up your TF configuration database before starting this procedure. See How to Backup and Restore TF databases in JSON Format.
Each compute node agent will go down during this procedure, causing some compute node downtime. The estimated downtime for a compute node varies by environment, but typically took between 12 and 15 minutes in our testing environments.
If you have compute nodes with workloads that cannot tolerate this downtime, consider migrating workloads or taking other steps to accommodate this downtime in your environment.
If you are updating Red Hat OpenStack simultaneously with Tungsten Fabric, update Red Hat OpenStack to the latest minor release version and ensure that the new overcloud images are prepared for the upgrade. See the Upgrading the Undercloud section of the Keeping Red Hat OpenStack Platform Updated guide from Red Hat for additional information.
If the undercloud has been updated and a copy of the Heat templates is used for the deployment, update the Heat templates from Red Hat’s core Heat template collection at /usr/share/openstack-tripleo-heat-templates. See the Understanding Heat Templates document from Red Hat for additional information.
Updating Tungsten Fabric in an Environment using Red Hat OpenStack
To update Tungsten Fabric in an environment that is using Red Hat OpenStack as the orchestration platform:
Prepare your Docker registry. The registry is often included in the undercloud, but it can also be a separate node.
Docker registry setup is environment independent. See Docker Registry from Docker for additional information on Docker registry setup.
Back up the TF TripleO Heat Templates. See Using the TF Heat Template.
Get the TF TripleO Heat Templates (stable/queens branch) from https://github.com/Juniper/contrail-tripleo-heat-templates.
(Optional) Update the TF TripleO Puppet module to the latest version and prepare Swift Artifacts, as applicable.
Below are sample commands entered in the undercloud:
[stack@queensa ~]$ mkdir -p ~/usr/share/openstack-puppet/modules/tripleo
[stack@queensa ~]$ git clone -b stable/queens https://github.com/Juniper/contrail-tripleo-puppet usr/share/openstack-puppet/modules/tripleo
[stack@queensa ~]$ tar czvf puppet-modules.tgz usr/
[stack@queensa ~]$ upload-swift-artifacts -c contrail-artifacts -f puppet-modules.tgz
Update the ContrailImageTag parameter to the new version.
The location of the ContrailImageTag variable varies by environment. In the most commonly used environments, this variable is set in the
You can obtain the ContrailImageTag parameter value from the README Access to Contrail Registry 20XX.
(Recommended) If you are upgrading to Tungsten Fabric Release 2005 or later, check and, if needed, enable kernel vRouter huge page support to support future compute node upgrades without rebooting.
You can enable or verify kernel-mode vRouter huge page support in the contrail-services.yaml file using either the ContrailVrouterHugepages1GB: or the ContrailVrouterHugepages2MB: parameter:
parameter_defaults:
  …
  ContrailVrouterHugepages1GB: '2'

parameter_defaults:
  …
  # ContrailVrouterHugepages2MB: '1024'
Notes about kernel-mode vRouter huge page support in Red Hat OpenStack environments:
Kernel-mode vRouter huge page support was introduced in TF Release 2005, and is configured to support two 1GB huge pages by default in Tungsten Fabric Release 2005 or later.
A compute node has to be rebooted once for a huge page configuration to finalize. After this initial reboot, the compute node can perform future Tungsten Fabric software upgrades without rebooting.
Notably, a compute node in an environment running Tungsten Fabric 2005 or later does not have huge page support for kernel-mode vRouters enabled until it is rebooted. The 2x1GB huge page support configuration is present by default, but it isn’t enabled until the compute node is rebooted.
We recommend using either 1GB or 2MB kernel-mode vRouter huge pages, not both, in most environments. You can, however, simultaneously enable 1GB and 2MB kernel-mode vRouter huge pages in Red Hat OpenStack environments if your environment requires enablement of both huge page options.
Changing vRouter huge page configuration settings in a Red Hat OpenStack environment typically requires a compute node reboot.
1 GB pages: Reboot required.
2 MB pages: Reboot usually required. The reboot can sometimes be avoided in environments where memory isn’t highly fragmented or the required number of pages can be easily allocated.
We recommend allotting 2GB of memory—either using the default 1024x2MB huge page size setting or the 2x1GB size setting—for huge pages in most environments. Some larger environments might require additional huge page memory settings for scale. Other huge page size settings should only be set by expert users in specialized circumstances.
If the ContrailVrouterHugepages1GB: and ContrailVrouterHugepages2MB: parameters are both set to an empty value in the contrail-services.yaml file, vRouter huge pages are disabled.
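As a quick check of the sizing guidance above, the total memory reserved by a pair of settings is 1024 MB per 1 GB page plus 2 MB per 2 MB page. The sketch below only illustrates that arithmetic. The helper name hugepage_total_mb is hypothetical, and the function does not read or change any configuration.

```shell
# Hypothetical helper: total huge page memory in MB for the given page counts.
# $1 = value of ContrailVrouterHugepages1GB (number of 1 GB pages)
# $2 = value of ContrailVrouterHugepages2MB (number of 2 MB pages)
hugepage_total_mb() {
  echo $(( $1 * 1024 + $2 * 2 ))
}

hugepage_total_mb 2 0      # 2 x 1GB pages
hugepage_total_mb 0 1024   # 1024 x 2MB pages
```

Both calls print 2048, which matches the recommended 2GB allotment.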
Update the overcloud by entering the openstack overcloud update prepare command, including the files that were updated during the previous steps.
openstack overcloud update prepare \
  --templates tripleo-heat-templates/ \
  --roles-file tripleo-heat-templates/roles_data_contrail_aio.yaml \
  -e environment-rhel-registration.yaml \
  -e tripleo-heat-templates/extraconfig/pre_deploy/rhel-registration/rhel-registration-resource-registry.yaml \
  -e tripleo-heat-templates/environments/contrail/contrail-services.yaml \
  -e tripleo-heat-templates/environments/contrail/contrail-net-single.yaml \
  -e tripleo-heat-templates/environments/contrail/contrail-plugins.yaml \
  -e misc_opts.yaml \
  -e contrail-parameters.yaml \
  -e docker_registry.yaml
Prepare the overcloud nodes that include TF containers for the update.
Pull the images in the repository onto the overcloud nodes.
There are multiple methods for performing this step. Commonly used methods include using the docker pull command for Docker containers and the openstack overcloud container image upload command for OpenStack containers, or running the tripleo-heat-templates/upload.containers.sh and tools/contrail/update_contrail_preparation.sh scripts.
(Not required in all setups) Export variables for the script if the predefined values aren’t appropriate for your environment. The script location:
The following variables within the script are particularly significant for this upgrade:
CONTRAIL_NEW_IMAGE_TAG—The image tag of the target upgrade version of TF. The default value is latest.
If needed, you can obtain this parameter for a specific image from the README Access to Contrail Registry 20XX.
Some older deployments use the CONTRAIL_IMAGE_TAG variable in place of the CONTRAIL_NEW_IMAGE_TAG variable. Both variables are recognized by the update_contrail_preparation.sh script and perform the same function.
SSH_USER—The SSH username for logging into overcloud nodes. The default value is heat-admin.
SSH_OPTIONS—Custom SSH option values.
The default SSH options for your environment are typically predefined. You typically change this value only if you want to customize your update.
STOP_CONTAINERS—The list of containers that must be stopped before the upgrade can proceed. The default value is contrail_config_api contrail_analytics_api.
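The variables above could be exported in the undercloud shell before running the script, for example as in the following sketch. It assumes the script reads these names from the environment, and it uses only the default values listed above; in practice you would replace latest with the image tag of your target release. SSH_OPTIONS is left untouched because its defaults are environment specific.

```shell
# Keep any value that is already exported; otherwise use the documented default.
export CONTRAIL_NEW_IMAGE_TAG="${CONTRAIL_NEW_IMAGE_TAG:-latest}"
export SSH_USER="${SSH_USER:-heat-admin}"
export STOP_CONTAINERS="${STOP_CONTAINERS:-contrail_config_api contrail_analytics_api}"

# Show the values that a script started from this shell would inherit.
echo "tag=$CONTRAIL_NEW_IMAGE_TAG user=$SSH_USER stop=$STOP_CONTAINERS"
```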
Run the script:
TF services stop working when the script starts running.
Update the Tungsten Fabric Controller nodes:
Run the openstack overcloud update run command on the first TF controller and, if needed, on a Tungsten Fabric Analytics node. The purpose of this step is to update one Tungsten Fabric Controller and one Tungsten Fabric Analytics node to support the environment so the other Tungsten Fabric Controllers and analytics nodes can be updated without incurring additional downtime.
openstack overcloud update run --nodes overcloud-contrailcontroller-0
Ensure that the TF status is ok on overcloud-contrailcontroller-0 before proceeding.
If the analytics and analyticsdb roles are deployed on separate nodes, you may have to update the nodes individually:
openstack overcloud update run --nodes overcloud-contrailcontroller-0
openstack overcloud update run --roles ContrailAnalytics,ContrailAnalyticsDatabase
After the upgrade, check the docker container status and versions for the Tungsten Fabric Controllers and the Tungsten Fabric Analytics and AnalyticsDB nodes.
docker ps -a
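To make the version check easier to scan, the container list can be reduced to the containers that are not yet on the target tag. The following is a minimal sketch, not part of the documented procedure. The helper name stale_contrail_containers is hypothetical, and the sketch assumes that Tungsten Fabric image names contain the string contrail and that the image tag follows the last colon.

```shell
# Hypothetical helper: read "NAME IMAGE" lines on stdin and print the containers
# whose image name contains "contrail" but whose tag differs from the expected one.
# $1 = expected image tag
stale_contrail_containers() {
  awk -v tag="$1" '
    $2 ~ /contrail/ {
      n = split($2, parts, ":")    # the tag is the part after the last colon
      if (n < 2 || parts[n] != tag) print $1, $2
    }'
}

# On a node, the input would come from docker itself, for example:
#   docker ps -a --format "{{.Names}} {{.Image}}" | stale_contrail_containers <new-tag>
```

An empty result means that every matching container already runs an image with the expected tag.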
Update the remaining Tungsten Fabric Controller nodes:
openstack overcloud update run --nodes overcloud-contrailcontroller-1
openstack overcloud update run --nodes overcloud-contrailcontroller-2
openstack overcloud update run --nodes overcloud-contrailcontroller-3
...
Update the OpenStack Controllers using the openstack overcloud update run commands:
openstack overcloud update run --nodes overcloud-controller-0
openstack overcloud update run --nodes overcloud-controller-1
openstack overcloud update run --nodes overcloud-controller-2
...
Individually update the compute nodes.
The compute node agent will be down during this step. The estimated downtime varies by environment, but is typically between 1 and 5 minutes.
Consider migrating workloads that can’t tolerate this downtime before performing this step.
openstack overcloud update run --nodes overcloud-novacompute-1
openstack overcloud update run --nodes overcloud-novacompute-2
openstack overcloud update run --nodes overcloud-novacompute-3
...
Reboot your compute node to complete the update.
A reboot is required to complete this procedure only if a kernel update is also needed. If you would like to avoid rebooting your compute node, check the /var/log/yum.log file to see if kernel packages were updated during the compute node update. A reboot is required only if kernel updates occurred as part of the compute node update procedure.
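The yum.log check can be scripted along the following lines. This is a sketch under stated assumptions: the helper name kernel_changed_in_yum_log is hypothetical, and the pattern assumes the usual yum.log entry format, in which a line such as "Installed: kernel-3.10.0-..." records a kernel package change. The log also keeps entries from earlier updates, so compare the timestamps of any matches against the time of this update.

```shell
# Hypothetical helper: exit 0 if the yum log records a kernel package being
# installed or updated, and non-zero otherwise (including an unreadable log).
# $1 = log path (defaults to /var/log/yum.log)
kernel_changed_in_yum_log() {
  grep -Eqs '(Installed|Updated): kernel-[0-9]' "${1:-/var/log/yum.log}"
}

if kernel_changed_in_yum_log; then
  echo "kernel package change recorded: plan a reboot"
else
  echo "no kernel package change found (or log not readable)"
fi
```

The pattern deliberately skips packages such as kernel-tools, because only a new kernel package itself calls for a reboot.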
Use the contrail-status command to monitor upgrade status. Ensure all pods reach the running state and all services reach the
The contrail-status command provides output similar to the following after a successful upgrade:
Some output fields and data have been removed from this contrail-status command sample for readability.
Pod              Service         Original Name                      State
analytics        api             contrail-analytics-api             running
analytics        collector       contrail-analytics-collector       running
analytics        nodemgr         contrail-nodemgr                   running
analytics        provisioner     contrail-provisioner               running
analytics        redis           contrail-external-redis            running
analytics-alarm  alarm-gen       contrail-analytics-alarm-gen       running
analytics-alarm  kafka           contrail-external-kafka            running
analytics-alarm  nodemgr         contrail-nodemgr                   running
analytics-alarm  provisioner     contrail-provisioner               running
analytics-alarm  zookeeper       contrail-external-zookeeper        running
analytics-snmp   nodemgr         contrail-nodemgr                   running
analytics-snmp   provisioner     contrail-provisioner               running
analytics-snmp   snmp-collector  contrail-analytics-snmp-collector  running
analytics-snmp   topology        contrail-analytics-snmp-topology   running
config           api             contrail-controller-config-api     running
<trimmed>

== Contrail control ==
control: active
nodemgr: active
named: active
dns: active

== Contrail analytics-alarm ==
nodemgr: active
kafka: active
alarm-gen: active

== Contrail database ==
nodemgr: active
query-engine: active
cassandra: active

== Contrail analytics ==
nodemgr: active
api: active
collector: active

== Contrail config-database ==
nodemgr: active
zookeeper: active
rabbitmq: active
cassandra: active

== Contrail webui ==
web: active
job: active

== Contrail analytics-snmp ==
snmp-collector: active
nodemgr: active
topology: active

== Contrail config ==
svc-monitor: active
nodemgr: active
device-manager: active
api: active
schema: active
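On nodes with many services, the output above can be filtered so that only entries needing attention remain. The following sketch assumes output shaped like the sample: pod rows that include a state column, and service lines in the form name: state. The helper name contrail_status_problems is hypothetical. It treats every service state other than active as a problem, so adjust it if your deployment legitimately reports other healthy states.

```shell
# Hypothetical helper: read contrail-status output on stdin and print every pod
# row with no "running" field and every service line whose state is not "active".
# The exit status is 0 only when nothing is printed.
contrail_status_problems() {
  awk '
    /^==/ || NF == 0 || $1 == "Pod" { next }   # section titles, blank lines, table header
    $1 ~ /:$/ {                                # service line such as "nodemgr: active"
      if ($2 != "active") { print; bad = 1 }
      next
    }
    NF >= 4 {                                  # pod table row
      ok = 0
      for (i = 1; i <= NF; i++) if ($i == "running") ok = 1
      if (!ok) { print; bad = 1 }
    }
    END { exit bad }
  '
}

# Example use on a node:
#   contrail-status | contrail_status_problems && echo "all pods running, all services active"
```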
Enter the openstack overcloud update converge command to finalize the update.
The options used with the openstack overcloud update converge command in this step must match the options used with the openstack overcloud update prepare command entered in step 7.
openstack overcloud update converge \
  --templates tripleo-heat-templates/ \
  --roles-file tripleo-heat-templates/roles_data_contrail_aio.yaml \
  -e environment-rhel-registration.yaml \
  -e tripleo-heat-templates/extraconfig/pre_deploy/rhel-registration/rhel-registration-resource-registry.yaml \
  -e tripleo-heat-templates/environments/contrail/contrail-services.yaml \
  -e tripleo-heat-templates/environments/contrail/contrail-net-single.yaml \
  -e tripleo-heat-templates/environments/contrail/contrail-plugins.yaml \
  -e misc_opts.yaml \
  -e contrail-parameters.yaml \
  -e docker_registry.yaml
Monitor screen messages indicating SUCCESS to confirm that the updates made in this step are successful.
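Because the converge options have to match the prepare options, one way to keep them from drifting apart is to define the option list once and reuse it for both phases. The sketch below only prints the commands instead of running them. The names UPDATE_OPTS and update_cmd are hypothetical, and the option list is shortened for illustration; use the complete, identically ordered list from your own prepare command.

```shell
# Options shared by the prepare and converge phases (shortened for illustration).
UPDATE_OPTS="--templates tripleo-heat-templates/ \
--roles-file tripleo-heat-templates/roles_data_contrail_aio.yaml \
-e tripleo-heat-templates/environments/contrail/contrail-services.yaml \
-e tripleo-heat-templates/environments/contrail/contrail-net-single.yaml \
-e tripleo-heat-templates/environments/contrail/contrail-plugins.yaml"

# Print the full command for a phase ("prepare" or "converge") without running it.
update_cmd() {
  echo "openstack overcloud update $1 $UPDATE_OPTS"
}

update_cmd prepare
update_cmd converge
```

Printing the commands first makes it easy to review them before running either phase.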