Solution Live: Echoes Lost in Orbit | 🟡 The Silent Canary

The solution for Level 2 (Intermediate) of the “Echoes Lost in Orbit” adventure is now available! If you’ve been wrestling with stuck Argo Rollouts, broken canary deployments, or mysterious Prometheus errors, this is your chance to check your approach, learn best practices, and take your GitOps skills to the next level.

:warning: Spoiler Alert: The following post contains a full walkthrough of the solution. If you want to solve the challenge on your own, consider coming back only if you get stuck or want to verify your work.

:memo: In a Nutshell: How to Fix The Silent Canary

This is a high-level overview of the solution steps. For detailed reasoning, troubleshooting tips, and a step-by-step walkthrough, check out the full guide: Intermediate Solution: The Silent Canary.

  1. Check the Objectives:
    • Ensure you understand the requirements: correct podinfo version, automatic canary progression, working health metrics, and successful rollouts.
  2. Diagnose Stuck Rollouts:
    • Use kubectl and the Argo Rollouts UI to inspect rollout status and error messages.
    • Identify configuration errors that prevent progression.
  3. Fix Prometheus Service Reference:
    • Update the AnalysisTemplate to use the correct Prometheus service name (prometheus-server instead of prom-server) to resolve connectivity issues for health checks.
  4. Correct and Implement Health Metrics:
    • For container-restarts, ensure the metric succeeds only when there are zero restarts:
      successCondition: result[0] == 0
      
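    • For ready-containers, add the missing PromQL query. It sums the ready containers for the matching pods and falls back to 0 when no series match, so the check passes only if at least one container is ready: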
      query: |
        sum(kube_pod_container_status_ready{
          namespace="{{args.namespace}}",
          pod=~"echo-server-.*"
        }) or vector(0)
      successCondition: result[0] >= 1
      
    • Run each query in the Prometheus UI first and confirm it returns the value you expect before updating the manifest.
  5. Verify Rollout Progression:
    • Retry the rollout and confirm that analysis runs are successful and the deployment completes in both environments.
    • Use the Argo Rollouts UI and CLI for real-time feedback.
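
To show how steps 3 and 4 fit together, here is a minimal sketch of what the corrected AnalysisTemplate might look like. The template name, the Prometheus namespace and port, and the interval, count, and failureLimit values are assumptions for illustration and are not taken from the challenge repository. Only the service name, the query, and the success condition come from the steps above.

```yaml
# Sketch only: fields marked "assumption" are illustrative, not from the challenge.
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: echo-server-health          # assumption: hypothetical template name
spec:
  args:
    - name: namespace               # passed in by the Rollout's analysis step
  metrics:
    - name: ready-containers
      interval: 30s                 # assumption: illustrative timing
      count: 3                      # assumption: number of measurements
      failureLimit: 1               # assumption: tolerated failed measurements
      successCondition: result[0] >= 1
      provider:
        prometheus:
          # Service name from step 3; the namespace and port are assumptions.
          address: http://prometheus-server.monitoring.svc.cluster.local:80
          query: |
            sum(kube_pod_container_status_ready{
              namespace="{{args.namespace}}",
              pod=~"echo-server-.*"
            }) or vector(0)
```

The container-restarts metric would follow the same shape, with its own query and successCondition: result[0] == 0.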

:open_book: Want the Full Walkthrough?

For a detailed, step-by-step guide (with screenshots, reasoning, and troubleshooting tips), check out the full solution docs:

:backhand_index_pointing_right: Intermediate Solution: The Silent Canary

Happy learning and good luck with the next challenge! :rocket:
