If your project uses GitHub Actions as its continuous integration and continuous delivery (CI/CD) platform, your build jobs will likely fail from time to time. Fear not: GitHub Actions lets you re-run the failed jobs in a workflow, which starts a new workflow run for all failed jobs and their dependents. You can do so in your web browser using the GitHub Actions UI, but there are faster ways of doing this.
As documented by GitHub, you can use the GitHub CLI’s run rerun subcommand with the --failed flag. You will need the ID of the run whose failed jobs you want to re-run. If you don’t specify an ID, GitHub CLI shows an interactive menu from which you can choose a recent failed run.
gh run rerun $RUN_ID --failed
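If you want to look up candidate run IDs by hand first, listing recent failed runs is one option. The helper below is only a sketch: the function name list_failed_runs and its defaults are made up for illustration, and it assumes a gh version whose run list subcommand supports the --status and --json flags. The databaseId field in the output is the value to pass as the run ID.

```shell
# Hypothetical helper: list recent failed runs on a branch as JSON.
# $1 = branch (defaults to master), $2 = number of runs to show (defaults to 5).
list_failed_runs() {
  local branch=${1:-master}
  local limit=${2:-5}
  # databaseId is the run ID that `gh run rerun` expects.
  gh run list --branch "$branch" --status failure --limit "$limit" \
    --json databaseId,workflowName,headSha,createdAt
}
```

Run it from inside a clone of the repository, for example list_failed_runs master 3.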
This works but it’s slightly tedious to find the ID. Can we find that automatically?
The following script attempts to find the latest failed job triggered off of the master branch of a GitHub repository. It then tries to find the workflow run identifiers of the failed runs, and then invokes the same command as above to rerun the jobs:
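function ghrfj() {
  local repo=${1:-org/repository}
  local branch=${2:-master}
  local workflow="$3"
  local sha json_data fid
  # Resolve the latest commit on the branch.
  sha=$(gh api "repos/$repo/branches/$branch" | jq -r '.commit.sha')
  echo "Latest $branch commit SHA is $sha for repository: $repo"
  # List the failed workflow runs for that commit. No per_page=1 here:
  # with several failed workflows, a single result could be the wrong one.
  json_data=$(gh api "repos/$repo/actions/runs?status=failure&branch=$branch&head_sha=$sha")
  # Keep the ID of the first listed run whose workflow name matches.
  fid=$(echo "$json_data" | jq --arg wf "$workflow" -r '[.workflow_runs[] | select(.name == $wf) | .id] | first // empty')
  if [ -n "$fid" ]; then
    echo "Rerunning failed workflow run with id $fid"
    gh run rerun "$fid" --failed
  else
    echo "$workflow: Passing!"
  fi
}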
For example, assuming your repository has a single Build workflow, you can rerun the failed jobs in this workflow with:
# ghrfj: GitHub Run Failed Jobs
ghrfj org/repository master Build
Of course, replace org/repository with your own.
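If a repository has more than one workflow, a thin wrapper can call ghrfj once per workflow name. This is a sketch only: ghrfj_all is a hypothetical name, and it assumes the ghrfj function defined earlier is already loaded in your shell.

```shell
# Hypothetical wrapper: rerun failed jobs for several workflows in turn.
# Usage: ghrfj_all <repo> <branch> <workflow>...
ghrfj_all() {
  if [ "$#" -lt 3 ]; then
    echo "usage: ghrfj_all <repo> <branch> <workflow>..." >&2
    return 1
  fi
  local repo=$1 branch=$2
  shift 2
  local workflow
  # Quote "$@" so workflow names containing spaces stay intact.
  for workflow in "$@"; do
    ghrfj "$repo" "$branch" "$workflow"
  done
}
```

For instance, ghrfj_all org/repository master Build "Unit Tests" would check both workflows, where "Unit Tests" is a made-up workflow name.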
The above approach is certainly a step forward and with small customizations, you can make it a lot more flexible. However, things tend to get slightly complicated when:
You’ll need to tweak the filtering logic above to find the right workflow runs and IDs. What’s more, you have to notice that a workflow has failed and then invoke the above function manually. That is tedious. Can we automate this process entirely?
One possible solution would be to define a GitHub Actions workflow whose sole responsibility is to find failed workflow runs and rerun them. This workflow only needs to run when certain designated workflows have completed with a failure. So it might be something like this:
name: Rerun Workflows
env:
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
on:
  workflow_run:
    workflows:
      - Build # Replace with your workflow(s)
    types:
      - completed
    branches:
      - master # Replace with your branch
jobs:
  rerun-failed-jobs:
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    steps:
      - name: Rerunning ${{ github.event.workflow_run.name }}
        run: |
          echo "Workflow run ID: ${{ github.event.workflow_run.id }}"
          # Rerun stuff here...
The github.event.workflow_run.id is the ID of the workflow run that actually failed, i.e. a run of Build.
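With that ID in hand, one way to trigger the rerun from a script is GitHub’s REST endpoint for re-running failed jobs, called through gh api. The sketch below makes a few assumptions: rerun_failed_jobs_api is a hypothetical name, gh is authenticated through the GITHUB_TOKEN environment variable, and that token is allowed to write to Actions (in a workflow this likely means granting the actions: write permission).

```shell
# Hypothetical helper: rerun only the failed jobs of a run via the REST API.
# $1 = repository as owner/name, $2 = workflow run ID.
rerun_failed_jobs_api() {
  local repo=$1 run_id=$2
  gh api --method POST "repos/$repo/actions/runs/$run_id/rerun-failed-jobs"
}
```

Inside the workflow step, the arguments could come from the event context, for example rerun_failed_jobs_api "${{ github.repository }}" "${{ github.event.workflow_run.id }}". Passing the repository explicitly avoids relying on a checked-out working copy.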
Finally, when you rerun the workflow, make sure your logic guards against endless loops. The Build workflow can keep failing and rerunning indefinitely if you don’t keep tabs on the number of retry attempts. Your rerun logic should check the current attempt count via github.event.workflow_run.run_attempt and only rerun the jobs if that number is below a certain threshold.
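A minimal sketch of such a guard follows. The function name rerun_if_below_limit and the default limit of 3 attempts are assumptions chosen for illustration, not values taken from GitHub.

```shell
# Hypothetical guard: rerun failed jobs only while the attempt count
# is below a limit.
# $1 = workflow run ID, $2 = current attempt number, $3 = limit (defaults to 3).
rerun_if_below_limit() {
  local run_id=$1 attempt=$2 max_attempts=${3:-3}
  if [ "$attempt" -lt "$max_attempts" ]; then
    echo "Attempt $attempt is below the limit of $max_attempts: rerunning run $run_id"
    gh run rerun "$run_id" --failed
  else
    echo "Attempt $attempt reached the limit of $max_attempts: not rerunning run $run_id"
  fi
}
```

In the workflow step this could be called as rerun_if_below_limit "${{ github.event.workflow_run.id }}" "${{ github.event.workflow_run.run_attempt }}" 3. Because the job has no checked-out repository, gh will probably also need to be told which repository to use, for example by setting the GH_REPO environment variable to ${{ github.repository }}.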
If you have questions about the contents and the topic of this blog post, or if you need additional guidance and support, feel free to send us a note and ask about consulting and support services.
Happy Coding,
Monday-Friday
9am-6pm, Central European Time
7am-1pm, U.S. Eastern Time