Testing Azure Pipeline Artifacts

Azure Pipelines supports several types of artifacts. This article is about Pipeline Artifacts, the kind managed with the publish and download tasks.

Any pipeline that builds code should do at least these two things:

  • Build an artifact that can be deployed later.
  • Test that artifact.

Specifically, it should test the artifact. Not just the same version of application code used to build the artifact, but the actual artifact file that was built. If it only tests the same version of code, it won’t detect bugs in how that code was built.

This is a case where more errors are better. If we hide errors at build time, we’ll have to diagnose them at deploy time. That could cause release failures and maybe even outages. Like the Zen of Python says:

Errors should never pass silently.

Testing the artifact itself is a best practice for any build pipeline. It’s better to find out right away that the code wasn’t built correctly.

First we’ll create a pipeline that tests its code but not the artifact it builds. We’ll include an artificial bug in the build process and show that the pipeline passes its tests but still creates a broken artifact. Then we’ll rework the pipeline to point the tests at the artifact, so the build bug gets caught by the tests and becomes visible.

These examples use Python’s tox, but the principles are the same for any tooling. Tox is a testing tool that creates isolated Python environments, installs the app being tested into those environments, and runs test commands.

Setup

First we need a Python package. We’ll make a super-simple one called app:

.
├── app
│   ├── __init__.py
│   └── main.py
├── setup.py
└── tox.ini

app is a single package with an empty __init__.py. The package contains one main.py module that defines one function:

def main():
    print('Success!')

setup.py contains config that lets us build app into a Python wheel file (an artifact that can be installed into a Python environment):

from setuptools import setup

setup(
    author='Demo Author',
    license='MIT',
    description='Demo app.',
    name='app',
    packages=['app'],
    version='0.0.1'
)

tox.ini tells tox to call our main() function:

[tox]
envlist = py38

[testenv]
commands = python -c 'from app.main import main; main()'

That’s not a real test, but it’ll be enough to show the difference between exercising source code and built artifacts. A real project would use the unittest library or pytest or another tool here.

This test passes locally:

(env3) PS /Users/adam/Local/laboratory/pipelines/testing_pipeline_artifacts> tox -e py38
GLOB sdist-make: /Users/adam/Local/laboratory/pipelines/testing_pipeline_artifacts/setup.py
py38 recreate: /Users/adam/Local/laboratory/pipelines/testing_pipeline_artifacts/.tox/py38
py38 inst: /Users/adam/Local/laboratory/pipelines/testing_pipeline_artifacts/.tox/.tmp/package/1/app-0.0.1.zip
py38 installed: app @ file:///Users/adam/Local/laboratory/pipelines/testing_pipeline_artifacts/.tox/.tmp/package/1/app-0.0.1.zip
py38 run-test-pre: PYTHONHASHSEED='3356214888'
py38 run-test: commands[0] | python -c 'from app.main import main; main()'
Success!
___________________________________________________________________________ summary ____________________________________________________________________________
  py38: commands succeeded
  congratulations :)

Negative Case

Our code works locally; now we need a build pipeline to make an artifact we can deploy. We’ll start with the negative case, a broken build that still passes tests:

jobs:
- job: Build
  pool:
    vmImage: ubuntu-20.04
  workspace:
    clean: outputs
  steps:
  - task: UsePythonVersion@0
    displayName: Use Python 3.8
    inputs:
      versionSpec: '3.8'
  - pwsh: pip install --upgrade pip setuptools wheel
    displayName: Install build tools
  - pwsh: Remove-Item app/main.py
    workingDirectory: $(Build.SourcesDirectory)/pipelines/testing_pipeline_artifacts
    displayName: BUILD BUG
  - pwsh: python setup.py bdist_wheel --dist-dir $(Build.BinariesDirectory)
    workingDirectory: $(Build.SourcesDirectory)/pipelines/testing_pipeline_artifacts
    displayName: Build wheel
  - publish: $(Build.BinariesDirectory)/app-0.0.1-py3-none-any.whl
    displayName: Publish wheel
    artifact: wheel

- job: Test
  dependsOn: Build
  pool:
    vmImage: ubuntu-20.04
  steps:
  - task: UsePythonVersion@0
    displayName: Use Python 3.8
    inputs:
      versionSpec: '3.8'
  - pwsh: pip install --upgrade tox
    displayName: Install tox

    # This tests the version of code used to build the 'wheel' artifact, but it
    # doesn't test the artifact itself.
  - pwsh: tox -e py38
    workingDirectory: $(Build.SourcesDirectory)/pipelines/testing_pipeline_artifacts
    displayName: Run tox on code

The Build and Test jobs both succeed even though the BUILD BUG task ran. An artifact is published to the pipeline, and we can download it. But, if we install it and try to import from app, we get errors:

(env3) PS /Users/adam/Downloads> pip install ./app-0.0.1-py3-none-any.whl
Processing ./app-0.0.1-py3-none-any.whl
Installing collected packages: app
Successfully installed app-0.0.1
(env3) PS /Users/adam/Downloads> python
Python 3.8.3 (default, Jul  1 2020, 07:50:15) 
[Clang 11.0.0 (clang-1100.0.33.16)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from app.main import main
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'app.main'

It couldn’t find app.main because that module doesn’t exist. Our BUILD BUG task deleted it before the artifact was built. Later, the Test job checked out a fresh copy of the code, which included a fresh copy of the file we accidentally deleted in the Build job. Tox ran in that fresh environment and passed because all the files it needed were present. It was testing the code from the repo, not the artifact we built.

The Fix

If the artifact created by the pipeline doesn’t work, the pipeline should fail. To make tox test the app-0.0.1-py3-none-any.whl file built in the Build job, we need to do two things:

  • Download the artifact in the Test job.
  • Tell tox to test that artifact instead of the files from the repo. Normally, tox builds its own artifacts from source when it runs (that’s what you want when you’re testing locally). We can override this and tell it to install our pipeline’s artifact with the --installpkg flag.
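
Before wiring --installpkg into the pipeline, it helps to see what the override looks like by hand. A minimal sketch, assuming the wheel has already been downloaded into the current directory:

# Install the pre-built wheel into tox's test environment instead of
# letting tox build one from source.
tox --installpkg ./app-0.0.1-py3-none-any.whl -e py38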

First we need to modify the Test job from our pipeline:

- job: Test
  dependsOn: Build
  pool:
    vmImage: ubuntu-20.04
  steps:
  - task: UsePythonVersion@0
    displayName: Use Python 3.8
    inputs:
      versionSpec: '3.8'
  - pwsh: pip install --upgrade tox
    displayName: Install tox
  - download: current
    displayName: Download wheel
    artifact: wheel

    # This tests the version of code used to build the 'wheel' artifact, but it
    # doesn't test the artifact itself.
  - pwsh: tox -e py38
    workingDirectory: $(Build.SourcesDirectory)/pipelines/testing_pipeline_artifacts
    displayName: Run tox on code

    # This tests the artifact built in the build job above.
    # https://tox.readthedocs.io/en/latest/config.html#conf-sdistsrc
  - pwsh: tox --installpkg $(Pipeline.Workspace)/wheel/app-0.0.1-py3-none-any.whl -e py38
    workingDirectory: $(Build.SourcesDirectory)/pipelines/testing_pipeline_artifacts
    displayName: Run tox on artifact

We kept the old (invalid) test so we can compare it to the new test in the next run.

These two changes are needed for pretty much any build and test system. For Python and tox in this lab specifically, we also need to:

  • Recreate the Python environments between tests. In the “Run tox on code” task, tox will automatically build and install version 0.0.1 of app from the code in the repo. Unless we get rid of that, the “Run tox on artifact” task will see that version 0.0.1 of the app is already installed, so it won’t install the artifact we pass with --installpkg.
  • Change directory away from the repo root. Otherwise the test may import files from the current directory instead of the artifact we pass with --installpkg.

We can do this with two changes to tox.ini:

[tox]
envlist = py38

[testenv]
# Recreate venvs so previously-installed packages aren't importable.
# https://tox.readthedocs.io/en/latest/config.html#conf-recreate
recreate = true

# Change directory so packages in the current directory aren't importable.
# https://tox.readthedocs.io/en/latest/config.html#conf-changedir
# It's convenient to use the {toxworkdir}, but other directories work.
# https://tox.readthedocs.io/en/latest/config.html#globally-available-substitutions
changedir = {toxworkdir}

commands = python -c 'from app.main import main; main()'

The new test fails with the same ModuleNotFoundError we got when we installed the artifact manually and tried to import from it. That shows us the new test is exercising the artifact built in the Build job, not just the code that’s in the repo.

Now when there’s a bug in the build, the pipeline will fail at build time. Fixes can be engineered before release, and broken artifacts won’t go live.

Happy building!


PowerShell: Sorting Version Strings

Recently we had a large array of version strings we needed to sort. Like this, but way too long to sort by hand:

$Versions = @(
    '2.1.3',
    '1.2.3',
    '1.2.12'
)

Piping this array to Sort-Object changes the order, but not correctly.

$Versions | Sort-Object
1.2.12
1.2.3
2.1.3

It thinks 1.2.12 comes before 1.2.3. Comparing character-by-character, that’s true. 1 is less than 3. We need it to interpret everything after the period as one number. Then it’ll see that 3 is less than 12.
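
The comparison operators show the difference directly. A quick sketch:

# String comparison goes character by character; '1' sorts before '3',
# so the string '1.2.12' looks smaller. Wrong for versions.
'1.2.12' -lt '1.2.3'
True

# Casting first compares numeric components, so 12 beats 3. Correct.
[version]'1.2.12' -lt [version]'1.2.3'
False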

We can do this by casting the elements of the array to version before sorting.

[version[]]$Versions | Sort-Object

Major  Minor  Build  Revision
-----  -----  -----  --------
1      2      3      -1
1      2      12     -1
2      1      3      -1

The components are parsed out and stored individually as Major, Minor, and Build. Now that we’re sending versions instead of strings to Sort-Object, it compares the 3 build to the 12 build and gets the order right.

Of course, now we have version objects instead of the strings we started with. We can convert back with the ToString() method.

[version[]]$Versions | Sort-Object | foreach {$_.ToString()}
1.2.3
1.2.12
2.1.3

That one-liner is usually all that’s needed. The main limitation is the version class. It works with up to four integer components delimited by dots. That doesn’t handle some common conventions.

Versions are often prefixed with v, like v1.2.3. Fortunately, the prefix carries no ordering information, so it doesn’t change the sorting. Just trim it out.

'v1.2.3'.TrimStart('v')
1.2.3

TrimStart() removes the v from the start of the string if it’s present, otherwise it’s a no-op. It’s safe to call on a mix of prefixed and non-prefixed strings. Run it on everything and sort like before.
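
Putting that together, a sketch that sorts a mixed array of prefixed and non-prefixed strings:

$Mixed = @('v2.1.3', '1.2.3', 'v1.2.12')

# Trim any leading 'v', cast to version, sort, then convert back to strings.
[version[]]($Mixed | ForEach-Object { $_.TrimStart('v') }) |
    Sort-Object |
    ForEach-Object { $_.ToString() }
1.2.3
1.2.12
2.1.3

Note that the output loses the original v prefixes; if you need those back, you’ll have to track them separately.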

Some of the patterns defined in the ubiquitous semver allow more characters and delimiters.

  • 1.0.0-alpha.1
  • 1.0.0+21AF26D3---117B344092BD

The second of these adds build metadata, and semver doesn’t consider build metadata in precedence, so depending on your situation you might be able to just trim off the problem characters. If not, you’ll need a different parser.
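
If build metadata is your only problem, the trim is simple. A sketch, using semver.org’s example string:

# Semver ignores build metadata in precedence, so for sorting purposes
# we can discard everything after the '+'.
'1.0.0+21AF26D3---117B344092BD'.Split('+')[0]
1.0.0

Pre-release tags like 1.0.0-alpha.1 are the other case: they do affect precedence and won’t cast to [version], so they need a real semver parser.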

Hope this helped!


Azure Pipelines: Loops

This is about Azure YAML Pipelines, not Azure Classic Pipelines. The docs explain the differences.

Everything shown here is in the docs, but the pieces are scattered and the syntax is fiddly. It took some digging and testing to figure out the options. This article collects all the findings.

In Azure Pipelines, the matrix job strategy and the each keyword both do loop operations. This article shows how to use them in five patterns that dynamically create jobs and steps, but you’re not limited to these examples. The each keyword, especially, can be adapted to many other cases.

Jobs Created by a Hardcoded Matrix

Pipeline jobs with a matrix strategy dynamically create copies of themselves that each have different variables. This is essentially the same as looping over the matrix and creating one job for each set of those variables. Microsoft uses it for things like testing versions in parallel.

jobs:
- job: MatrixHardcoded
  pool:
    vmImage: ubuntu-20.04
  strategy:
    matrix:
      Thing1:
        thing: foo
      Thing2:
        thing: bar
  steps:
  - pwsh: Write-Output $(thing)
    displayName: Show thing

This creates MatrixHardcoded Thing1 and MatrixHardcoded Thing2 jobs that each print the value of their thing variable in a Show thing step.

Jobs Created by an Each Loop over an Array

Pipelines have an each keyword in their expression syntax that implements loops closer to what you’d write in programming languages like PowerShell and Python. Microsoft has great examples of its uses in their azure-pipelines-yaml repo.

parameters:
- name: thingsArray
  type: object
  default:
  - foo
  - bar

jobs:
- ${{each thing in parameters.thingsArray}}:
  - job: EachArrayJobsThing_${{thing}}
    pool:
      vmImage: ubuntu-20.04
    steps:
    - pwsh: Write-Output ${{thing}}
      displayName: Show thing

Fiddly details:

  • The ${{ }} syntax resolves into values. Since those values are prefixed with a dash (-), YAML interprets them as elements of an array. You need that dash on both the expression line and the job definition line. This feels like it’ll create an array of arrays that each contain one job, instead of a flat array of jobs, which seems like it would break. Maybe the pipeline interprets this syntax as a flat array, maybe it handles a nested one. Either way, you need both those dashes.
  • The each line has to end with a colon (:), but references to the ${{thing}} loop variable after it don’t.
  • Parameters are different from variables. Parameters support complex types (like arrays we can loop over). Variables are always strings.
  • If you need variables in your loop code, you can reference them in the expression syntax.
  • Parameters are mostly documented in the context of templates, but they can be used directly in pipelines.

This is mostly the same as a hardcoded matrix, but it creates jobs from a parameter that can be passed in dynamically.

There are some cosmetic differences. Since we used an array of values instead of a map of keys and values, there are no ThingN keys to use in the job names. They’re differentiated with values instead (foo and bar). The delimiters are underscores because job names don’t allow spaces (we could work around this with the displayName property).

We still get two jobs that each output their thing variable in a Show thing step.

Jobs Created by an Each Loop over a Map

This is the same as the previous pattern except it processes a map instead of an array.

parameters:
- name: thingsMap
  type: object
  default:
    Thing1: foo
    Thing2: bar

jobs:
- ${{each thing in parameters.thingsMap}}:
  - job: EachMapJobs${{thing.key}}
    pool:
      vmImage: ubuntu-20.04
    steps:
    - pwsh: Write-Output ${{thing.value}}
      displayName: Show thing

Since it’s processing a map, it references thing.key and thing.value instead of just thing. Again it creates two jobs with one step each.

Jobs Created by a Matrix Defined by an Each Loop over a Map

This combines the previous patterns to dynamically define a matrix using an each loop over a map parameter.

parameters:
- name: thingsMap
  type: object
  default:
    Thing1: foo
    Thing2: bar

jobs:
- job: MatrixEachMap
  pool:
    vmImage: ubuntu-20.04
  strategy:
    matrix:
      ${{each thing in parameters.thingsMap}}:
        ${{thing.key}}:
          thing: ${{thing.value}}
  steps:
  - pwsh: Write-Output $(thing)
    displayName: Show thing

Fiddly details:

  • We don’t need the YAML dashes (-) like we did in the two previous examples because we’re creating a map of config for the matrix, not an array of jobs. The ${{ }} syntax resolves to values that we want YAML to interpret as map keys, not array elements.
  • The each line still has to end with a colon (:).
  • We need a new colon (:) after ${{thing.key}} to tell YAML these are keys of a map.

This is the same as a hardcoded matrix except that its variables are dynamically referenced from a map parameter.

Steps with an Each Loop over an Array

The previous patterns used loops to dynamically create multiple jobs. This statically defines one job and dynamically creates multiple steps inside of it.

parameters:
- name: thingsArray
  type: object
  default:
  - foo
  - bar

jobs:
- job: EachArraySteps
  pool:
    vmImage: ubuntu-20.04
  steps:
  - ${{each thing in parameters.thingsArray}}:
    - pwsh: Write-Output ${{thing}}
      displayName: Show thing

As expected, we get one job that contains two Show thing steps.

The differences between these patterns are syntactically small, but they give you a lot of implementation options. Hopefully these examples help you find one that works for your use case.

Happy automating!


PowerShell: The Programmer’s Shell

A couple years ago, I switched all my workstations to PowerShell. Folks often ask me why I did that. What made it worth the trouble of learning a new syntax? Here are the things about Posh that made me switch:

Mac, Linux, and Windows Support

I usually run PowerShell on Mac OS, but it also supports Linux and Windows. It gives me a standardized interface to all three operating systems.

Object Oriented Model

This is the big reason. It’s the thing that makes PowerShell a programming language like Python or Ruby instead of just a scripting language like Bash.

In Bash everything is a string. Say we’ve found something on the filesystem:

bash-3.2$ ls -l | grep tmp
drwxr-xr-x   4 adam  staff   128 Oct 15 18:10 tmp

If we need that Oct 15 date, we’d parse it out with something like awk:

bash-3.2$ ls -l | grep tmp | awk '{print $6, $7}'
Oct 15

That splits the line on whitespace and prints out the 6th and 7th fields. If the whitespacing of that output string changes (like if you run this on someone else’s workstation and they’ve tweaked their terminal), this will silently break. It won’t error, it just won’t parse out the right data. You’ll get downstream failures in code that expected a date but got something different.

PowerShell is object oriented, so it doesn’t rely on parsing strings. If we find the same directory on the filesystem:

PS /Users/adam> Get-ChildItem | Where-Object Name -Match tmp

    Directory: /Users/adam

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d----          10/15/2020  6:10 PM                tmp

It displays similarly, but that’s just formatting goodness. Underneath, it found an object that represents the directory. That object has properties (Mode, LastWriteTime, Length, Name). We can get them by reference:

PS /Users/adam> Get-ChildItem | Where-Object Name -Match tmp | Select-Object LastWriteTime

LastWriteTime
-------------
10/15/2020 6:10:55 PM

We tell the shell we want the LastWriteTime property and it gets the value. It’ll get the same value no matter how it was displayed. We’re referencing a property, not parsing output strings.
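
Select-Object still wraps the property in an object. When you want the bare value, say a DateTime you can do date math on, -ExpandProperty unwraps it. A quick sketch:

PS /Users/adam> $When = Get-ChildItem | Where-Object Name -Match tmp | Select-Object -ExpandProperty LastWriteTime
PS /Users/adam> $When.Year
2020
PS /Users/adam> $When.DayOfWeek
Thursday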

This makes Posh less fragile, but also gives us access to the standard toolbox of programming techniques. Its functions and variable scopes and arrays and dictionaries and conditions and loops and comparisons and everything else work similarly to languages like Python and Ruby. There’s less Weird Stuff. Ever have to set and unset $IFS in Bash? You don’t have to do that in PowerShell.

Streams

Streams are a huge feature of PowerShell, and there are already great articles that cover the details. I’m only going to highlight one thing that makes me love them: they let me add informative output similar to a DEBUG log line in Python and other programming languages. Let’s convert our search for tmp into a super-simple script:

[CmdletBinding()]
param()

function Get-Thing {
    [CmdletBinding()]
    param()
    $AllItems = Get-ChildItem
    Write-Verbose "Filtering for 'tmp'."
    return $AllItems | Where-Object Name -Match 'tmp'
}

Get-Thing

If we run this normally we just get the tmp directory:

PS /Users/adam> ./streams.ps1

    Directory: /Users/adam

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d----          10/15/2020  6:10 PM                tmp

If we run it with -Verbose, we also see our message:

PS /Users/adam> ./streams.ps1 -Verbose
VERBOSE: Filtering for 'tmp'.

    Directory: /Users/adam

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d----          10/15/2020  6:10 PM                tmp

We can still pipe to the same command to get the LastWriteTime:

PS /Users/adam> ./streams.ps1 -Verbose | Select-Object LastWriteTime
VERBOSE: Filtering for 'tmp'.

LastWriteTime
-------------
10/15/2020 6:10:55 PM

The pipeline reads objects from a different stream, so we can send whatever we want to the verbose stream without impacting what the user may pipe to later. More on this in a future article. For today, I’m just showing that scripts can present information to the user without making it harder for them to use the rest of the output.

The closest you can get to this in Bash is stderr, but that stream is used for more than just information, and realistically you can’t guess the impact of sending messages to it. Having a dedicated stream for verbose messages makes it trivial to provide information without disrupting behavior.
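
Because verbose messages ride their own stream (stream 4), the user can also redirect them independently of the output. A sketch (verbose.log is just an illustrative filename):

# Send verbose messages to a file; output objects still flow to the pipeline.
./streams.ps1 -Verbose 4> verbose.log

# Or merge the verbose stream into the success stream, if you really do
# want everything mixed together.
./streams.ps1 -Verbose 4>&1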

PowerShell is a big language and there’s a lot more to it than what I’ve covered here. These are just the things that I get daily value from. To me, they more than compensate for the (minimal) overhead of learning a new syntax.

Happy scripting!


Passing Parameters to Docker Builds

Hello!

When I’m building Docker images, sometimes I need to pass data from the build agent (e.g. my CI pipeline) into the build process. Often, I also want to echo that data into the logs so I can use it for troubleshooting or validation later. Docker supports this!

These examples were all tested in Docker for Mac:

docker --version
Docker version 19.03.13, build 4484c46d9d

First, declare your build-time data as an ARG in your Dockerfile:

FROM alpine:3.7
 
ARG USEFUL_INFORMATION
ENV USEFUL_INFORMATION=$USEFUL_INFORMATION
RUN echo "Useful information: $USEFUL_INFORMATION"

In this example, I’ve also copied the ARG into an ENV variable. A RUN step can read the ARG directly during the build, but copying it into ENV keeps the value available in the final image too; the RUN step then echoes it into the build log.

Now, just build like usual:

docker build --tag test_build_args --build-arg USEFUL_INFORMATION=1337 .
Sending build context to Docker daemon  10.24kB
Step 1/4 : FROM alpine:3.7
 ---> 6d1ef012b567
Step 2/4 : ARG USEFUL_INFORMATION
 ---> Using cache
 ---> 18d20c437445
Step 3/4 : ENV USEFUL_INFORMATION=$USEFUL_INFORMATION
 ---> Using cache
 ---> b8bbdd03a1d1
Step 4/4 : RUN echo "Useful information: $USEFUL_INFORMATION"
 ---> Running in a2161bfb75cd
Useful information: 1337
Removing intermediate container a2161bfb75cd
 ---> 9ca56256cc19
Successfully built 9ca56256cc19
Successfully tagged test_build_args:latest

If you don’t pass in a value for the new ARG, it resolves to an empty string:

docker build --tag test_build_args .
Sending build context to Docker daemon  10.24kB
Step 1/4 : FROM alpine:3.7
 ---> 6d1ef012b567
Step 2/4 : ARG USEFUL_INFORMATION
 ---> Using cache
 ---> 18d20c437445
Step 3/4 : ENV USEFUL_INFORMATION=$USEFUL_INFORMATION
 ---> Running in 63e4b0ce1fb7
Removing intermediate container 63e4b0ce1fb7
 ---> 919769a93b7d
Step 4/4 : RUN echo "Useful information: $USEFUL_INFORMATION"
 ---> Running in 73e158d1bfa6
Useful information:
Removing intermediate container 73e158d1bfa6
 ---> f928fc025270
Successfully built f928fc025270
Successfully tagged test_build_args:latest
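
In a real pipeline, the value usually comes from a CI variable instead of being typed by hand. A hedged sketch from a pwsh step; BUILD_BUILDID is how Azure Pipelines exposes its Build.BuildId variable to scripts, so substitute whatever your CI system provides:

# Hypothetical CI step: bake the build ID into the image's build log and
# its USEFUL_INFORMATION env var.
docker build --tag test_build_args `
    --build-arg USEFUL_INFORMATION=$env:BUILD_BUILDID .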


That’s it! Happy building,

Adam


PowerShell Scripts with Arguments

Hello!

I write a lot of utility scripts. Little helpers to automate repetitive work. Like going through all those YAML files and updating that one config item, or reading through all those database entries and finding the two that are messed up because of that weird bug I just found.

These scripts are usually small. I often don’t keep them very long. I also usually have to run them against multiple environments, and sometimes I have to hand them to other engineers. They need to behave predictably everywhere, and they need to be easy to read and run. They can’t be hacks that only I can use.

In my work, that means a script that takes arguments and passes them to internal functions that implement whatever I’m trying to do. Let’s say I need to find a thing with a known index, then reset it. Here’s the pattern I use in PowerShell:

[CmdletBinding()]
param(
    [int]$Index
)

function Get-Thing {
    [CmdletBinding()]
    param(
        [int]$Index
    )
    return "Thing$Index"
}

function Reset-Thing {
    [CmdletBinding()]
    param(
        [string]$Thing
    )
    # We'd do the reset here if this were a real script.
    Write-Verbose "Reset $Thing"
}

$Thing = Get-Thing -Index $Index
Reset-Thing -Thing $Thing

We can run that from a prompt with the Index argument:

./Reset-Thing.ps1 -Index 12 -Verbose
VERBOSE: Reset Thing12

Some details:

  • The param() call for the script has to be at the top. Posh throws errors if you put it anywhere else, like down where the functions are invoked.
  • CmdletBinding() makes the script and its functions handle standard arguments like -Verbose.
  • This uses Write-Verbose to send informative output to the verbose “stream”. This is similar to setting the log level of a Python script to INFO. It allows the operator to select how much output they want to see.
  • As always, use verbs from Get-Verb when you’re naming things.
  • I could have written this with just straight commands instead of splitting them into Get and Reset functions, especially for an example this small, but it’s almost always better to separate out distinct pieces of logic. It’ll be easier to read if I have to hand it to someone else who’s not familiar with the operation. Same if I have to put it aside for a while and come back to it after I’ve forgotten how it works.

This is my starting point when I’m writing a helper script. It’s usually enough to let me sanely automate a one-off without getting derailed into full-scale application development.

Happy scripting,

Adam


PowerShell on OS X: Git Hooks

Hello!

PowerShell works great on Mac OS X. It’s my default shell. I usually only do things the Posh way, but sometimes the underlying system bubbles back up. Like when I’m writing git hooks.

Git hooks still live in the same place and still have to be executable on your platform. That doesn’t change, even with Posh. But the scripts themselves can be different. You have two options.

Option 1: Don’t Use PowerShell

Your existing hooks written in bash or zsh or whatever Linux-ey shell you were using will still work. That’s great if you already have a bunch and you don’t want to port them all.

If you’re writing anything new, though, use PowerShell. When I get into a mess on my Posh Apple, it’s usually because I mixed PowerShell with the legacy shell. You’re better off using just one.

Option 2: Update the Shebang

The shebang (#!) is the first line of executable scripts on Unix-like systems. It sets the program that’s used to run the script. We just need to write one in our hook script that points at pwsh (the PowerShell executable):

#!/usr/local/microsoft/powershell/7/pwsh
 
Write-Verbose -Verbose "We're about to commit!"

If you don’t have the path to your pwsh, you can find it with Get-Command pwsh.
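
Get-Command returns an object, and its Source property holds the path. A sketch (your path will likely differ):

(Get-Command pwsh).Source
/usr/local/microsoft/powershell/7/pwsh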

After that, our hook works like normal:

git commit --allow-empty -m "Example commit."
VERBOSE: We're about to commit!
[master a905079] Example commit.

If you don’t set the shebang at all (leaving nothing but the Write-Verbose command in our example), your hook will run but OS X won’t treat it like PowerShell. You get “not found” errors:

git commit --allow-empty -m "Example commit."
.git/hooks/pre-commit: line 2: Write-Verbose: command not found
[master 1b2ebac] Example commit.

That’s actually good. If you have old hook scripts without shebang lines, they won’t break. Just make sure any new Posh scripts do have a shebang and everything should work.

Enjoy the Posh life!

Adam


A Checklist for Submitting Pull Requests

Reviewing code is hard, especially because reviewers tend to inherit some responsibility for problems the code causes later. That can lead to churn while they try to develop confidence that new submissions are ready to merge.

I submit a lot of code for review, so I’ve been through a lot of that churn. Over the years I’ve found a few things that help make it easier for my reviewers to develop confidence in my submissions, so I decided to write a checklist. ✔️

The code I write lives in diverse repos governed by diverse requirements. A lot of the items in my checklist are there to help make sure I don’t mix up the issues I’m working on or the requirements of the repos I’m working in.

This isn’t a guide on writing good code. You can spend a lifetime on that topic. This is a quick checklist I use to avoid common mistakes.

This is written for Pull Requests submitted in git repos hosted on GitHub, but most of its steps are portable to other platforms (e.g. Perforce). It assumes common project features, like a contributing guide. Adjust as needed.

The Checklist

Immediately before submitting:

  1. Reread the issue.
  2. Merge the latest changes from the target branch (e.g. master).
  3. Reread the diff line by line.
  4. Rerun all tests. If the project doesn’t have automated tests, you can still:
    • Run static analysis tools on every file you changed.
    • Manually exercise new functionality.
    • Manually exercise existing functionality to make sure it hasn’t changed.
  5. Check if any documentation needs to be updated to reflect your changes.
  6. Check the rendering of any markup files (e.g. README.md) in the GitHub UI.
    • There are remarkable differences in how markup files render on different platforms, so it’s important to check them in the UI where they’ll live.
  7. Reread the project’s contributing guide.
  8. Write a description that:
    1. Links to the issue it addresses.
    2. Gives a plain English summary of the change.
    3. Explains decisions you had to make. Like:
      • Why you didn’t clean up that one piece of messy code.
      • How you chose the libraries you used.
      • Why you expanded an existing module instead of writing a new one.
      • How you chose the directory and file names you did.
      • Why you put your changes in this repo, instead of that other one.
    4. Lists all the tests you ran. Include relevant output or screenshots from manual tests.

There’s no perfect way to submit code for review. That’s why we still need humans to do it. The creativity and diligence of the engineer doing the work are more important than this checklist. Still, I’ve found that these reminders help me get code through review more easily.

Adam


How to Grep in PowerShell

Hello!

In oldschool Linux shells, you search files for a string with grep. You’re probably used to commands like this (example results from a random temp directory I had lying around):

grep -r things .
./terraform.tfstate.backup:              "./count_things.py"
./count_things.py:def count_things(query):
./count_things.py:    count_things()
./terraform.tf:  program = ["python", "${path.module}/count_things.py"]

It outputs strings that concatenate the filename and the matching line. You can pipe those into awk or whatever other command to process them. Standard stuff.

You can achieve the same results in PowerShell, but it’s pretty different. Here’s the basic command:

Get-ChildItem -Recurse | Select-String 'things'
 
count_things.py:7:def count_things(query):
count_things.py:17:    count_things()
terraform.tf:6:  program = ["python", "${path.module}/count_things.py"]
terraform.tfstate.backup:25:              "./count_things.py"

This part is similar. Get-ChildItem recurses through the filesystem and passes the results to Select-String, which searches those files for the string things. The output looks the same. File on the left, matching line on the right. That’s just friendly formatting, though. Really what you’re getting is an array of objects that each represent one match. Posh summarizes that array with formatting that’s familiar, but actually processing these results is completely different.
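
A quick way to see those objects underneath, assuming the same temp directory:

$Results = Get-ChildItem -Recurse | Select-String 'things'
$Results.Count
4
$Results[0].GetType().FullName
Microsoft.PowerShell.Commands.MatchInfo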

We could parse out details the Linux way by piping into Out-String to convert the results into strings, splitting on :, and so on, but that’s not idiomatic PowerShell. Posh is object-oriented, so instead of manipulating strings we can just process whichever properties contain the information we’re searching for.

First, we need to know what properties are available:

Get-ChildItem -Recurse | Select-String 'things' | Get-Member
 
   TypeName: Microsoft.PowerShell.Commands.MatchInfo
 
Name               MemberType Definition
----               ---------- ----------
Equals             Method     bool Equals(System.Object obj)
GetHashCode        Method     int GetHashCode()
GetType            Method     type GetType()
RelativePath       Method     string RelativePath(string directory)
ToEmphasizedString Method     string ToEmphasizedString(string directory)
ToString           Method     string ToString(), string ToString(string directory)
Context            Property   Microsoft.PowerShell.Commands.MatchInfoContext Context {get;set;}
Filename           Property   string Filename {get;}
IgnoreCase         Property   bool IgnoreCase {get;set;}
Line               Property   string Line {get;set;}
LineNumber         Property   int LineNumber {get;set;}
Matches            Property   System.Text.RegularExpressions.Match[] Matches {get;set;}
Path               Property   string Path {get;set;}
Pattern            Property   string Pattern {get;set;}

Get-Member tells us the properties of the MatchInfo objects we piped into it. Now we can process them however we need.

Select One Property

If we only want the matched lines, not all the other info, we can select just the Line property with Select-Object.

Get-ChildItem -Recurse | Select-String 'things' | Select-Object 'Line'
 
Line
----
def count_things(query):
    count_things()
  program = ["python", "${path.module}/count_things.py"]
              "./count_things.py"

Sort Results

We can sort results by the content of a property with Sort-Object.

Get-ChildItem -Recurse | Select-String 'things' | Sort-Object -Property 'Line'
 
terraform.tfstate.backup:25:              "./count_things.py"
count_things.py:17:    count_things()
terraform.tf:6:  program = ["python", "${path.module}/count_things.py"]
count_things.py:7:def count_things(query):

Add More Filters

Often, I search for a basic pattern like ‘things’ and then chain in Where-Object to filter down to more specific results. It can be easier to chain matches as I go than to write a complex match pattern at the start.

Get-ChildItem -Recurse | Select-String 'things' | Where-Object 'Line' -Match 'def'
 
count_things.py:7:def count_things(query):

We’re not limited to filters on the matched text, either:

Get-ChildItem -Recurse | Select-String 'things' | Where-Object 'Filename' -Match 'terraform'
 
terraform.tf:6:  program = ["python", "${path.module}/count_things.py"]
terraform.tfstate.backup:25:              "./count_things.py"

There are tons of things you can do. The main detail to remember is that you need Get-Member to tell you what properties are available, then you can use any Posh command to process those properties.
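
For example, Group-Object can summarize matches per file, the kind of thing that takes a sort | uniq -c chain in Bash. A sketch against the same directory, with output roughly like this:

Get-ChildItem -Recurse | Select-String 'things' | Group-Object 'Filename'

Count Name                     Group
----- ----                     -----
    2 count_things.py          {count_things.py:7:def count_things(query):, count_things.py:17:…}
    1 terraform.tf             {terraform.tf:6:  program = ["python", "${path.module}/count_things.py"]}
    1 terraform.tfstate.backup {terraform.tfstate.backup:25:              "./count_things.py"}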

Enjoy freedom from strings!

Adam


Tox: Testing Multiple Python Versions with Pyenv

Hello!

I use Python’s tox to orchestrate a lot of my tests. It lets you set a list of versions in a tox.ini file (in the same directory as your setup.py), like this:

[tox]
envlist = py37, py38
 
[testenv]
allowlist_externals = echo
commands = echo "success"

Then you can run the tox command; it’ll create a venv for each version and run your tests in each of those environments. It’s an easy way to ensure your code works across all the versions of Python you want to support.

But, if I install tox into a 3.8 environment and run the tox command in the directory where we created the tox.ini above, I get this:

tox
GLOB sdist-make: /Users/adam/Local/fiddle/setup.py
py37 create: /Users/adam/Local/fiddle/.tox/py37
ERROR: InterpreterNotFound: python3.7
py38 create: /Users/adam/Local/fiddle/.tox/py38
py38 inst: /Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py38 installed: example @ file:///Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py38 run-test-pre: PYTHONHASHSEED='2325607949'
py38 run-test: commands[0] | echo success
success
___________________________________________________________________________ summary ____________________________________________________________________________
ERROR:  py37: InterpreterNotFound: python3.7
  py38: commands succeeded

It found the 3.8 interpreter I ran it with, but it couldn’t find 3.7.

pyenv can get you past this. It’s a utility for installing and switching between multiple Python versions. I use it on OS X; pyenv’s docs have instructions to get set up, if you’re not already. Here’s how it looks when I have Python 3.6, 3.7, and 3.8 installed, and I’m using 3.8:

pyenv versions
  system
  3.6.11
  3.7.9
* 3.8.5 (set by /Users/adam/.pyenv/version)

Just having those versions installed isn’t enough, though. You still get the error from tox about missing versions. You have to specifically enable each version:

pyenv local 3.8.5 3.7.9
pyenv versions
  system
  3.6.11
* 3.7.9 (set by /Users/adam/Local/fiddle/.python-version)
* 3.8.5 (set by /Users/adam/Local/fiddle/.python-version)

This will create a .python-version file in the current directory that sets your Python versions. pyenv will read that file whenever you’re in that directory. You can also set versions that’ll be picked up in any folder with the pyenv global command.
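
The global variant is one command. A sketch with the same versions (use whatever you’ve installed):

# Make these versions available in every directory, not just this one.
pyenv global 3.8.5 3.7.9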

Now, tox will pick up both versions:

tox
GLOB sdist-make: /Users/adam/Local/fiddle/setup.py
py37 inst-nodeps: /Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py37 installed: example @ file:///Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py37 run-test-pre: PYTHONHASHSEED='1664367937'
py37 run-test: commands[0] | echo success
success
py38 inst-nodeps: /Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py38 installed: example @ file:///Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py38 run-test-pre: PYTHONHASHSEED='1664367937'
py38 run-test: commands[0] | echo success
success
___________________________________________________________________________ summary ____________________________________________________________________________
  py37: commands succeeded
  py38: commands succeeded
  congratulations :)

That’s it! Now you can run your tests in as many versions of Python as you need.

Happy testing,

Adam
