PART 1: UPDATE STRATEGIES | FULL & PARTIAL CLUSTER UPDATES
TL;DR
Letting users update either the full cluster or only part of it can reduce downtime, improve control plane functionality, preserve compatibility with custom configurations, and require fewer resources. In turn, this reduces support tickets about long maintenance times.
1 UNDERSTAND THE CLI WORKFLOW
To ensure a consistent user experience across both the web UI and the CLI, I researched how users typically update their clusters using the CLI, based on existing documentation. A successful update involves numerous sub-steps, but this simplified version gave me a clear understanding of the CLI update workflow, which enabled me to develop a workable workflow for the web UI.
![](img/cliworkflow.svg)
2 DEVELOP WEB UI WORKFLOW
In collaboration with a back-end engineer and a PM, I defined various use cases for update strategies and created workflows for each of them. I particularly enjoy creating workflows as they provide a bird's-eye view that allows me to understand various use cases. Moreover, my workflow designs are helpful for collaborating with other designers and developers.
![](img/uiflow.svg)
3 HI-FI DESIGNS
USE CASE 1: When everything looks good to update
Full cluster update
![](img/usecase1-full.png)
Partial cluster update
![](img/usecase1.png)
USE CASE 2: There are paused worker nodes, but users can still update
![](img/usecase2.png)
USE CASE 3: There are paused worker nodes, but users cannot update until paused worker nodes are resumed and done updating
![](img/usecase3.png)
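The three use cases above can be read as a small decision rule. The sketch below is only an illustration of that rule, not the product's implementation: the names `UseCase`, `classify`, and `update_allowed` are hypothetical, and the flag `paused_nodes_block_update` is an assumption standing in for whatever back-end condition separates use case 2 from use case 3.

```python
from enum import Enum


class UseCase(Enum):
    READY = 1           # Use case 1: everything looks good to update
    PAUSED_ALLOWED = 2  # Use case 2: paused worker nodes, update still allowed
    PAUSED_BLOCKED = 3  # Use case 3: paused worker nodes must resume and finish updating first


def classify(paused_worker_nodes: int, paused_nodes_block_update: bool) -> UseCase:
    """Map the cluster state to one of the three use cases.

    `paused_nodes_block_update` is a hypothetical input; the real condition
    that blocks an update is decided by the back end and is not modeled here.
    """
    if paused_worker_nodes == 0:
        return UseCase.READY
    if paused_nodes_block_update:
        return UseCase.PAUSED_BLOCKED
    return UseCase.PAUSED_ALLOWED


def update_allowed(case: UseCase) -> bool:
    """Only use case 3 prevents the user from starting an update."""
    return case is not UseCase.PAUSED_BLOCKED
```

Keeping the classification separate from the "can the user update" check mirrors the designs: use cases 2 and 3 both surface paused worker nodes, but only use case 3 blocks the update.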
PART 2: IMPLEMENTING CONDITIONAL UPDATES
1 WHAT IS A CONDITIONAL UPDATE?
TL;DR
Prior to version 4.9, versions with known bugs were not recommended as update targets. From version 4.9 onward, the known risks of such versions are identified, and the update is supported if users choose to proceed. These versions are known as conditional updates.
Once again, providing users with more flexibility in choosing updates that suit their needs requires additional guardrails from Red Hat. We strive to strike a balance between user autonomy and ensuring the necessary safeguards are in place to maintain the stability and security of the system.
2 CONDITIONAL UPDATE AND UPGRADEABLE=FALSE
The Upgradeable=False issue can prevent users from updating to specific minor versions due to a previously addressed bug associated with conditional updates. To tackle this effectively, it is crucial to understand the relevant use cases through documentation or by collaborating with peers. Let's shift our attention from technical terms to the user interactions necessary to proceed when encountering these use cases.
SCENARIO 1
Upgradeable=False is most likely to happen in the next minor version.
WHAT CAN USERS DO?
In order to update to the next minor version or a patch version (z-stream), users are required to take action to resolve this issue. The steps to resolve this issue will be listed in the alert banner.
SCENARIO 2
Conditional updates are most likely to happen in patch versions, but can also happen in BOTH patch and minor versions.
WHAT CAN USERS DO?
Once users become aware of known risks, they have the freedom to decide whether they want to proceed with the update, choose a different recommended version, or postpone the update entirely.
SCENARIO 3
When BOTH Upgradeable=False and conditional updates happen to a minor version, Upgradeable=False will precede conditional update issues.
WHAT CAN USERS DO?
Before updating to a certain minor version, users may encounter an Upgradeable=False issue that needs to be resolved first. After that, they may come across conditional update issues, which they need to review before deciding whether to proceed with the update, opt for another recommended version, or wait.
3 DESIGN
WORKFLOW
I will never stop talking about how much I love creating workflows.
![](img/workflow.png)
USE CASES
![](img/updatemaster.png)
PART 3: ESTABLISHING A NOTIFICATION SYSTEM
Notifications are important during a cluster update: they keep users informed about progress and any issues that arise, so users can take the necessary actions and avoid potential problems.
I created a notification system for cluster updates with input from the PM and internal SREs. Built with PatternFly components, it includes severity levels and actionable CTAs while avoiding intrusiveness.
![](img/alerts.png)