Shift Left with Multibranch Pipeline Using Argo Workflows

Gosha Dozoretz | Sr. DevOps Engineer

This blog is a follow-up to our earlier discussion of multibranch pipelines and how they can help streamline software development processes. There we explored the benefits of managing pipelines in the same repository as code and how that gives developers the ability to version their pipelines alongside their code, ensuring they remain in sync. 

Now, we’ll dive deeper into this concept by looking at how to implement a multibranch pipeline using Argo Workflows. After running a short POC without a multibranch pipeline architecture, we understood just how crucial it is. With this tool, developers can define and manage pipelines for different branches of the codebase and debug a specific branch efficiently, which further automates and standardizes the development process.

So, let’s get started and see how Argo Workflows can help us shift left and streamline our development workflows.

Seeder Workflow

I want to be real with you for a second. While I previously discussed using Argo Events to dynamically template a Workflow CRD for a multibranch pipeline, I haven’t actually utilized it for this purpose yet. That’s because one of the main challenges with unmarshalling a JSON payload into a struct is that we need to have a predefined struct in place. To simplify this process, I created a Workflow called Seeder which generates a new Workflow CRD that can be submitted to the Argo Workflow server. 

This offers a great deal of flexibility for generating new CRDs and can greatly simplify the process of implementing a multibranch pipeline. With the Seeder Workflow, we can automate the process of creating new workflows, making it much easier to manage pipelines in the same repository as the code.

How To Implement

The process of injecting parameters, labels, and steps into the Workflow CRD begins with a webhook from the source code repository that sends a payload to the Argo Events event bus. A Sensor then picks up this payload and creates a Seeder Workflow, which is responsible for injecting the necessary parameters, labels, and steps into the Workflow CRD.

To implement the Seeder Workflow, we begin by extracting important information from the payload, including the repository name, user, and pull request address. This information is then utilized to create a new Workflow CRD based on a predefined template CRD. The template CRD contains important workflow configurations such as TTL, volumes, and tolerations, as well as an exit handler and entry point. By leveraging this template, we can ensure that all new Workflow CRDs generated by the Seeder Workflow adhere to the same standards and guidelines, thus improving consistency and streamlining the development process.

Here’s an example:



apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
 generateName: WILLBEOVERWRITTEN-
 labels:
   label: WILLBEOVERWRITTEN
spec:
 volumes:
 - name: shared-volume
   emptyDir: { }
 activeDeadlineSeconds: 28800
 archiveLogs: true
 arguments:
   parameters:
     - name: WILLBEOVERWRITTEN
 artifactRepositoryRef:
   configMap: artifact-repositories
 onExit: exit-handler
 entrypoint: entrypoint
 nodeSelector:
   node_pool: workflows
 serviceAccountName: argo-wf
 tolerations:
   - effect: NoSchedule
     key: node_pool
     operator: Equal
     value: workflows
 ttlStrategy:
   secondsAfterCompletion: 259200
 templates:
   - name: entrypoint
     steps:
       - - name: pipeline-init
           template: main
   - name: exit-handler
     dag:
       tasks:
         - name: github-status
           templateRef:
             name: common-toolkit
             template: github-notify
         - name: slack-notify
           templateRef:
             name: common-toolkit
             template: slack-channel-notify
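
The post doesn’t show the Seeder’s own code, so here is a minimal Python sketch of the first two steps described above: pulling fields out of the webhook payload and stamping them onto a copy of the template CRD. It assumes a GitHub `pull_request` webhook payload. The function names, the label keys, and the trimmed-down `TEMPLATE_CRD` dictionary are hypothetical and only there for illustration.

```python
import copy

# Hypothetical, trimmed-down stand-in for the template CRD (already parsed from YAML).
TEMPLATE_CRD = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Workflow",
    "metadata": {"generateName": "PLACEHOLDER-", "labels": {}},
    "spec": {"arguments": {"parameters": []}, "entrypoint": "entrypoint"},
}


def extract_fields(payload):
    """Pull the values the Seeder needs out of a GitHub pull_request payload."""
    pr = payload["pull_request"]
    return {
        "repo_name": payload["repository"]["name"],
        "user": pr["user"]["login"],
        "branch": pr["head"]["ref"],
        "pr_url": pr["html_url"],
    }


def stamp_template(template_crd, fields):
    """Return a copy of the template CRD with the placeholders replaced."""
    wf = copy.deepcopy(template_crd)  # never mutate the shared template
    # Kubernetes object names must be lowercase, so normalize the repo name.
    wf["metadata"]["generateName"] = f"{fields['repo_name'].lower()}-"
    wf["metadata"]["labels"] = {"repo": fields["repo_name"], "user": fields["user"]}
    # Every extracted field becomes a workflow-scope parameter.
    wf["spec"]["arguments"]["parameters"] = [
        {"name": name, "value": value} for name, value in fields.items()
    ]
    return wf
```

Working on a deep copy keeps the template itself untouched, so every generated Workflow starts from the same baseline of TTL, volumes, and tolerations.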

Next, the Seeder Workflow downloads the .workflows folder from the source code repository branch. This folder contains several YAML files that define the structure and contents of the workflow. The most important of these is main.yaml, which contains a lean YAML configuration of the main DAG. Its tasks reference either WorkflowTemplate templates or “local” ones from the template.yaml file, which holds the implementations of workflow-scoped templates.

main.yaml example:

- name: git-clone
  templateRef:
    name: git-toolkit
    template: git-clone
  arguments:
    parameters:
      - name: branch
        value: "{{ workflow.parameters.branch }}" # Workflow scope parameters
      - name: repo_name
        value: "{{ workflow.parameters.repo_name }}"
- name: hello-world
  template: local-template
  arguments:
    parameters:
      - name: msg
        value: "I am referencing a workflow scope template"

template.yaml example:

- name: local-template
 inputs:
   parameters:
     - name: msg
 script:
   image: alpine
   command: [ sh ]
   source: |
     echo "{{inputs.parameters.msg}}"

The Seeder Workflow then merges these templates into the Workflow CRD it created, injecting the necessary steps and parameters into the workflow. The Seeder Workflow can also read the parameters.yaml file to add further workflow scope parameters, allowing for more fine-grained control over the workflow’s execution.
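
The merge step could look roughly like the Python sketch below. It is an illustration, not the actual Seeder code. It assumes the three files have already been parsed into lists of dictionaries, that the tasks from main.yaml are wrapped in a DAG template named `main` (the name the template CRD’s entrypoint step points at), and that parameters.yaml is a plain list of name/value pairs. The function name and that file format are assumptions.

```python
import copy


def merge_into_workflow(workflow, main_tasks, local_templates, extra_params=()):
    """Return a copy of `workflow` with the branch's pipeline merged in.

    main_tasks      -- list parsed from main.yaml (DAG tasks)
    local_templates -- list parsed from template.yaml
    extra_params    -- list parsed from parameters.yaml (assumed format)
    """
    wf = copy.deepcopy(workflow)
    spec = wf["spec"]

    # Wrap the branch's tasks in a DAG template called "main".
    main_template = {"name": "main", "dag": {"tasks": list(main_tasks)}}
    spec.setdefault("templates", []).extend([main_template, *local_templates])

    # Merge parameters by name; a later entry replaces an earlier one.
    params = spec.setdefault("arguments", {}).setdefault("parameters", [])
    by_name = {p["name"]: p for p in params}
    for p in extra_params:
        by_name[p["name"]] = p
    spec["arguments"]["parameters"] = list(by_name.values())
    return wf
```

Letting parameters.yaml win on a name clash is a design choice made for this sketch. A Seeder could just as well reject duplicates so a branch cannot silently override a value taken from the webhook payload.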

Finally, the Seeder Workflow submits the resulting Workflow CRD, after linting, to the Argo Workflow server, creating a new workflow instance that adheres to the template we have defined. By using the Seeder Workflow to automate the process of creating new workflows, we can reduce errors caused by manual entry, ensure that all workflows adhere to a standardized structure, and streamline the development process for our team.
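
One way to express the lint-then-submit step is the short Python sketch below, which shells out to the `argo` CLI. It assumes the CLI is installed and already configured to reach the cluster. The function name and the injectable `run` argument are hypothetical and exist only to keep the example easy to test.

```python
import subprocess


def lint_and_submit(manifest_path, run=subprocess.run):
    """Lint the generated manifest, then submit it only if linting passed."""
    # check=True raises CalledProcessError on a non-zero exit code,
    # so a failed lint stops the function before submit is reached.
    run(["argo", "lint", manifest_path], check=True)
    run(["argo", "submit", manifest_path], check=True)
```

Calling `lint_and_submit("workflow.yaml")` runs both commands against the generated file, and a lint failure surfaces as an exception instead of an invalid Workflow reaching the server.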

Debug Pause

Enabling developers to debug and pause their pipelines is crucial in modern software development practices, as it gives them more control over the process. Argo Workflows already has a feature called debug pause, which pauses a step before and after its target script runs. Still, the feature is limited: it only checks whether the environment variable exists, not what its value is, which makes it difficult to turn on and off as a feature flag for specific steps.

To enable the debugging of specific steps, I contributed to the Argo Workflows open-source project by adding a simple check for the value of environment variables ARGO_DEBUG_PAUSE_AFTER and ARGO_DEBUG_PAUSE_BEFORE. These environment variables were added to each of the templates with a default value of ‘false’ and now they are waiting to be changed for debug execution.

To enable this feature for a step, I added a debug.yaml file to the ‘.workflows’ folder in the source code repository. This file declares which steps to debug and ensures that the debug functionality is limited to a specific branch only. The Seeder pipeline injects ‘true’ values into the two environment variables at the specified step as declared in the debug.yaml file. By doing so, developers can easily debug and troubleshoot issues before and after executing target scripts.

Example of this file:

steps:
 - git-clone
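
The injection itself could be sketched in Python as follows. This is an assumption-laden illustration: it treats each name in debug.yaml as the name of a template that is available in the workflow spec and runs a `script` or `container`, and it does not cover how the Seeder reaches templates that live in a separate WorkflowTemplate, which the post doesn’t describe. The function name is hypothetical.

```python
import copy

DEBUG_VARS = ("ARGO_DEBUG_PAUSE_BEFORE", "ARGO_DEBUG_PAUSE_AFTER")


def enable_debug_pause(templates, debug_steps):
    """Return a copy of `templates` with both pause flags set to "true"
    on every template whose name appears in `debug_steps`."""
    result = copy.deepcopy(templates)
    for template in result:
        if template["name"] not in debug_steps:
            continue
        # Only script/container templates run code; DAG and steps templates are skipped.
        runner = template.get("script") or template.get("container")
        if runner is None:
            continue
        # Drop the default "false" entries, keep unrelated variables, then add "true".
        env = [e for e in runner.get("env", []) if e["name"] not in DEBUG_VARS]
        env.extend({"name": var, "value": "true"} for var in DEBUG_VARS)
        runner["env"] = env
    return result
```

Because the flags are only flipped for the names listed in the branch’s debug.yaml, every other step keeps its default of ‘false’ and runs without pausing.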

Summing Up 

Implementing a multibranch pipeline using Argo Workflows can help streamline software development by automating and standardizing the development process. The Seeder Workflow, which generates a new Workflow CRD that can be submitted to the Argo Workflow server, simplifies the process of injecting parameters, labels, and steps into the Workflow CRD. By leveraging the template CRD, we can ensure that all new Workflow CRDs adhere to the same standards and guidelines, improving consistency across pipelines.

Additionally, enabling developers to debug and pause their pipelines is crucial in modern software development practices, giving them more control over the process. By utilizing Argo Workflows’ features, developers can create a more efficient and effective development process for their team.
