Putting The Rust In GitHub Actions
11-01-2024
GitHub Actions feels like a great idea that is poorly executed. Automating all secondary processes in software development (code formatting, linting, testing, build, release, etc.) is key to being able to work productively on large software projects. GitHub Actions solves this problem in the worst possible way. But since it is baked into GitHub, it is now the de facto standard for workflow automation.
What GitHub Actions is: Config files thrown together to run commands on certain repository events, automating processes that would otherwise be manual. What it could be: A metaprogramming environment that allows developers to program workflows to manage their codebases in the same way they write software.
Unfortunately, YAML files and a clunky UI don't make for a great developer experience. Awaiting a better solution, we can hide the pain by painting over GitHub Actions with a thick layer of Rust.
The idea is: Make a workflow in Rust with the excellent octocrab crate. Use a minimal GitHub Actions workflow to run it.
For storytelling purposes, let us create a workflow that ensures that all open issues in our repo have at least one of a required set of labels.
I opted to put the Rust project inside the .github directory, close to where the rest of the CI stuff lives, in .github/actions/issue_labels_check:
$ tree -a
.
└── .github
    └── actions
        └── issue_labels_check
            ├── Cargo.toml
            └── src
                └── main.rs
We will be using three crates:
- anyhow for error handling.
- octocrab for interacting with GitHub.
- tokio with a minimal feature set for our async executor (required for octocrab).
Let's start with the project Cargo.toml:
[package]
name = "issue_labels_check"
version = "0.1.0"
edition = "2021"

[dependencies]
anyhow = "1.0"
octocrab = "0.32"
tokio = { version = "1.0", default-features = false, features = [
    "macros",
    "rt-multi-thread",
] }
This is the list of accepted labels:
static LABEL_SET: [&str; 3] = ["bug", "enhancement", "idea"];
Inside our main function we'll load some environment variables that we are going to need to interact with the GitHub API:
- GITHUB_TOKEN: The token we will use to authenticate with GitHub. This will be provided by GitHub Actions and we'll assume the github-actions bot identity with it. We must pass it manually from the workflow later.
- GITHUB_REPOSITORY: This is the repository slug: organization/repo. It is set automatically in the GitHub Actions runner context, so there is no need to pass it manually.
Here's the code:
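use anyhow::{anyhow, Context, Result};

#[tokio::main]
async fn main() -> Result<()> {
    // The token is passed in explicitly by the workflow.
    let github_token = std::env::var("GITHUB_TOKEN").map_err(|_| {
        anyhow!("missing github token: make sure to set the `GITHUB_TOKEN` environment variable")
    })?;
    // Set automatically by the GitHub Actions runner.
    let repository = std::env::var("GITHUB_REPOSITORY")
        .context("environment variable `GITHUB_REPOSITORY` not set")?;
    // ...
}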
There will be two triggers for this GitHub workflow:
- All issues and pull requests will be checked periodically.
- When an issue or pull request is opened it is checked.
The caller of our Rust program can choose to provide a number as argument. If they do, we only check the corresponding issue. If they don't pass a number, we go through all issues and PRs.
let issue_number = match std::env::args().nth(1) {
    Some(issue_number) => Some(
        issue_number
            .parse::<u64>()
            .context("invalid issue number")?,
    ),
    None => None,
};
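The same parsing can be written more compactly with Option::map and Option::transpose. Below is a minimal sketch using only the standard library; the helper name parse_issue_number is made up for illustration, and it returns the plain ParseIntError instead of an anyhow error.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: turns the optional first CLI argument into an
/// optional issue number. `Ok(None)` means "check every open issue".
fn parse_issue_number(arg: Option<String>) -> Result<Option<u64>, ParseIntError> {
    // `map` parses only when an argument is present; `transpose` flips
    // Option<Result<_, _>> into Result<Option<_>, _> so `?` can be applied.
    arg.map(|raw| raw.parse::<u64>()).transpose()
}
```

In main this could be called as parse_issue_number(std::env::args().nth(1)), followed by .context("invalid issue number")? to keep the original error message.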
Here comes the part where we use octocrab to connect to GitHub, and do all the magic.
First, we extract the owner and repository name from the repository slug we pulled from the environment variable:
// `ok_or_else` builds the error only when the slug has no '/'.
let (owner, repository) = repository
    .split_once('/')
    .ok_or_else(|| anyhow!("invalid repository"))?;
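One thing split_once does not catch is a slug with an empty half, such as "owner/". The sketch below is a slightly stricter, standard-library-only variant; the helper name split_slug is hypothetical and not part of the original program.

```rust
/// Hypothetical helper: splits an `owner/repo` slug into its two parts.
/// Returns `None` when the separator is missing or either part is empty.
fn split_slug(slug: &str) -> Option<(&str, &str)> {
    // Split at the first '/', bailing out if there is none.
    let (owner, repo) = slug.split_once('/')?;
    if owner.is_empty() || repo.is_empty() {
        return None;
    }
    Some((owner, repo))
}
```

Since GITHUB_REPOSITORY is set by the runner, the extra check mostly matters when running the program by hand.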
Second, we initialize octocrab with the token and create an issue handler object:
let octocrab = octocrab::OctocrabBuilder::default()
    .personal_token(github_token)
    .build()
    .context("failed to initialize octocrab")?;
let issue_handler = octocrab.issues(owner, repository);
Third, depending on whether the caller passed a specific issue number, we check either that single issue or every open issue:
if let Some(issue_number) = issue_number {
    let issue = issue_handler.get(issue_number).await?;
    check_issue(&issue, &issue_handler).await?;
} else {
    let issues = octocrab
        .all_pages(
            issue_handler
                .list()
                .state(octocrab::params::State::Open)
                .per_page(100)
                .send()
                .await?,
        )
        .await?;
    for issue in &issues {
        check_issue(issue, &issue_handler).await?;
    }
}
Note that we use the all_pages function in octocrab to take some of the page enumeration boilerplate code away.
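To get a feel for the boilerplate being avoided, here is a rough, synchronous sketch of a page-following loop. Everything in it is hypothetical: the Page struct is a stand-in and not octocrab's type, and fetch is an assumed callback that loads one page by number.

```rust
/// Hypothetical stand-in for one page of API results. This is NOT
/// octocrab's `Page` type, just enough structure to show the loop.
struct Page<T> {
    items: Vec<T>,
    /// Number of the next page, or `None` on the last page.
    next: Option<u32>,
}

/// Collects the items of every page by following `next` until it runs out.
/// `fetch` is an assumed callback that loads a single page by number.
fn collect_all_pages<T>(mut fetch: impl FnMut(u32) -> Page<T>) -> Vec<T> {
    let mut all = Vec::new();
    let mut current = Some(1);
    while let Some(number) = current {
        let page = fetch(number);
        all.extend(page.items);
        current = page.next;
    }
    all
}
```

The real thing also has to deal with async requests and errors on every page, which is the part that is nice to leave to the library.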
And finally the check_issue function:
async fn check_issue(
    issue: &octocrab::models::issues::Issue,
    issue_handler: &octocrab::issues::IssueHandler<'_>,
) -> Result<()> {
    // An issue passes as soon as it carries any label from LABEL_SET.
    let have_label = issue
        .labels
        .iter()
        .any(|label| LABEL_SET.contains(&label.name.as_str()));
    if !have_label {
        // Post a comment listing the accepted labels.
        issue_handler
            .create_comment(
                issue.number,
                format!(
                    ":warning: Missing one of the required labels: {}.",
                    LABEL_SET
                        .iter()
                        .map(|label| format!("`{label}`"))
                        .collect::<Vec<_>>()
                        .join(", "),
                ),
            )
            .await?;
    }
    Ok(())
}
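A nice side effect of writing the workflow in Rust is that the decision logic can be pulled out of the API call and unit tested. The sketch below shows one way that might look, using only the standard library; the helper name missing_label_comment is hypothetical and takes plain label names instead of octocrab's Label type.

```rust
/// Same label set as in the article.
static LABEL_SET: [&str; 3] = ["bug", "enhancement", "idea"];

/// Hypothetical helper: returns the warning comment to post, or `None`
/// when the issue already carries at least one required label.
fn missing_label_comment(issue_labels: &[&str]) -> Option<String> {
    let has_label = issue_labels.iter().any(|label| LABEL_SET.contains(label));
    if has_label {
        return None;
    }
    // Render the accepted labels as a comma-separated list of code spans.
    let required = LABEL_SET
        .iter()
        .map(|label| format!("`{label}`"))
        .collect::<Vec<_>>()
        .join(", ");
    Some(format!(":warning: Missing one of the required labels: {required}."))
}
```

With this split, check_issue would only need to collect the label names and call create_comment when the helper returns Some.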
We still need a GitHub Actions workflow to actually run the thing though. Let's name it issue_labels_check.yaml, and start off with:
name: "issue labels check"
env:
  CARGO_TERM_COLOR: always
(I guess we want to see those sweet Cargo colors.)
Since our Rust project does not live in the root directory, we have to change the working directory:
defaults:
  run:
    working-directory: .github/actions/issue_labels_check
This is how we can configure the workflow to run periodically, as well as when an issue or pull request is created:
on:
  schedule:
    - cron: '0 8 * * 1' # every Monday at 8am
  issues:
    types: [opened]
  pull_request:
    types: [opened]
(Monday 8am is the perfect time to bother people.)
The workflow has a single job: check. In it, we pass on the GitHub Actions built-in token to the environment variable that our Rust program expects. For security reasons, GitHub does not do this automatically. We also define an environment variable that contains the event name, which will be either schedule, issues or pull_request. Lastly, we put the issue or pull request number in a variable.
I somehow never noticed that issues and pull requests share their IDs. If you create a new repository and create an issue, then create a pull request, they will have numbers #1 and #2. The same happens in the API. Listing issues with the GitHub API (/issues) returns both issues and pull requests. Listing pull requests does not return issues. Regardless, the issue opened event in Actions is not triggered when a pull request is opened. You have to use pull_request.opened for that. Moreover, to access the "number", you must use github.event.pull_request.number if a pull request event was triggered, github.event.issue.number if it was triggered by an "issues" event. Guess what API endpoint you have to use to post comments on a pull request? It's /issues/comments...
Anyway, let's initialize the check job:
jobs:
  check:
    runs-on: ubuntu-latest
    env:
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      GITHUB_EVENT: ${{ github.event_name }}
      GITHUB_EVENT_NUMBER: ${{ github.event.pull_request.number || github.event.issue.number }}
The job contains the steps needed to compile and run our Rust action. The last step has some logic to decide whether or not to pass on the event number, depending on what event triggered the workflow.
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Rust
        uses: dtolnay/rust-toolchain@v1
        with:
          toolchain: stable
      - name: Cache dependencies
        uses: Swatinem/rust-cache@v2.2.1
        with:
          workspaces: .github/actions/issue_labels_check
      - name: Check issue labels
        run: |
          if [ "${GITHUB_EVENT}" == "schedule" ]; then
            cargo run --release
          elif [ "${GITHUB_EVENT}" == "issues" ]; then
            cargo run --release "${GITHUB_EVENT_NUMBER}"
          elif [ "${GITHUB_EVENT}" == "pull_request" ]; then
            cargo run --release "${GITHUB_EVENT_NUMBER}"
          fi
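In the spirit of keeping logic out of YAML, the branching in that last step could arguably live in the Rust program too. Here is a minimal sketch, assuming the program reads GITHUB_EVENT and GITHUB_EVENT_NUMBER itself; the helper name issue_number_for_event is hypothetical, and unlike the shell version it reports an error for unknown events instead of silently doing nothing.

```rust
/// Hypothetical helper: decides which issue to check from the values of
/// `GITHUB_EVENT` and `GITHUB_EVENT_NUMBER` set in the workflow's `env`.
/// `Ok(None)` means "check every open issue"; `Err` carries a message.
fn issue_number_for_event(event: &str, number: Option<&str>) -> Result<Option<u64>, String> {
    match event {
        // Scheduled runs sweep the whole repository.
        "schedule" => Ok(None),
        // Issue and pull request events target a single number.
        "issues" | "pull_request" => {
            let raw = number.ok_or_else(|| format!("no number for `{event}` event"))?;
            raw.parse::<u64>()
                .map(Some)
                .map_err(|err| format!("invalid number `{raw}`: {err}"))
        }
        other => Err(format!("unsupported event `{other}`")),
    }
}
```

With something like this in main, fed from std::env::var, the run step would shrink to a single cargo run --release.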
The rust-cache step helps to speed up the workflow. It still takes a bit too long for a simple action, especially compared to running a bash file. It would probably be better to compile the action in a separate workflow, upload the binary as an artifact and then download and run it.
Anyway, our bot now works and posts comments on unlabeled issues.
I enjoyed this way of writing a GitHub workflow, especially compared to the bash-based monstrosities that GitHub themselves propose for a similar task:
name: Report remaining open issues
on:
  schedule:
    # Daily at 8:20 UTC
    - cron: '20 8 * * *'
jobs:
  track_pr:
    runs-on: ubuntu-latest
    steps:
      - run: |
          numOpenIssues="$(gh api graphql -F owner=$OWNER -F name=$REPO -f query='
            query($name: String!, $owner: String!) {
              repository(owner: $owner, name: $name) {
                issues(states:OPEN){
                  totalCount
                }
              }
            }
          ' --jq '.data.repository.issues.totalCount')"
          echo 'NUM_OPEN_ISSUES='$numOpenIssues >> $GITHUB_ENV
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          OWNER: ${{ github.repository_owner }}
          REPO: ${{ github.event.repository.name }}
      - run: |
          gh issue create --title "Issue report" --body "$NUM_OPEN_ISSUES issues remaining" --repo $GITHUB_REPOSITORY
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
Please don't do this. Write Rusty workflows instead!