• Introduction
  • Why?
    • Speed
    • Cost
    • Containers + k8s
  • Rejected Strategies
    • Using x86_64-pc-windows-gnu
    • Using wine to run the MSVC toolchain
  • How?
    • Prerequisites
    • 1. Setup toolchain(s)
    • 2. Acquire Rust std lib
    • 3. Acquire CRT and Windows 10 SDK
    • 4. Override cc defaults
    • 5. Profit
  • Bonus: Headless testing
    • 1. Install
    • 2. Specify runner
    • 3. Test
  • Final image definition
  • Common issues
    • CMake
    • MASM
    • Compiler Target Confusion
  • Conclusion


Last November I added a new job to our CI to cross compile our project for x86_64-pc-windows-msvc from an x86_64-unknown-linux-gnu host. I wanted to blog about it at the time but never got around to it. After making some changes and improvements to that setup last month, and writing a new utility along the way, I figured now was as good a time as any to share some knowledge in this area for those who might be interested.


Why?

Before we get started with the how, I want to talk about why one might want to do this in the first place, since natively targeting Windows is a "known quantity" with the least amount of surprise. While there are other reasons, my primary use case for cross compiling to Windows is the Continuous Delivery pipeline for my main project at Embark.


Speed

It's fairly common knowledge that, generally speaking, Linux is faster than Windows on equivalent hardware. Between faster file I/O, better utilization of high core count machines, and faster process and thread creation, many operations done in a typical CI job, such as compilation and linking, tend to be faster on Linux. And since I am lazy, I'll let another blog post about cross compiling Firefox from Linux to Windows present some actual numbers in defense of this assertion.


Cost

Though we're now running a Windows VM in our on-premise data center for our normal Windows CD jobs, we used to run it in GCP. It was a single VM with a modest 32 CPUs, but the licensing costs alone (Windows Server is licensed by core) accounted for >20% of our total costs for this particular GCP project.

While this single VM is not a huge deal relative to the total costs of our project, it's still a budget item that provides no substantive value. On principle, I'd rather spend that money on more/better CPUs, RAM, disk, or GPUs that provide immediate, concrete value in our CI, or just for local development.

Containers + k8s

This one is probably the most subjective, so strap in!

While fast CI is a high priority, it really doesn't matter how fast it is if it gives unreliable results. Since I am (mostly) the sole maintainer of our CD pipeline in a team of almost 40 people (which, yes, we're trying to fix), my goal early on was to get it into a reliably working state that I could maintain with a minimal amount of my time, since I have other, more fun, things to do.

The primary way I did this was to build buildkite-jobify (we use Buildkite as our CI provider). This is just a small service that spawns Kubernetes (k8s) jobs for each of the CI jobs we run on Linux, based on configuration from the repo itself.

This has a few advantages and disadvantages over a more typical VM approach, which we use for x86_64-pc-windows-msvc (for now?), x86_64-apple-darwin, and aarch64-apple-darwin.


Advantages:

  • Consistency - Every job run from the same container image has the exact same starting environment.
  • Versioned - The image definitions are part of our monorepo, as well as the k8s job descriptions, so we get atomic updates of the environment CI jobs execute in with the code itself. This also makes rollbacks trivial if needed.
  • Scalability - Scaling a k8s cluster up or down is fairly easy (especially in e.g. GKE, because $) as long as you have the compute resources. k8s also makes it easy to specify resource requests, so that individual jobs can dynamically spin up on the most appropriate node at the time, based on the other workloads currently running on the cluster.
  • Movability - Since k8s is just running containers, it's trivial to move build jobs between different clusters, for example in our case, from GKE to our on-premise cluster.


Disadvantages:

  • Clean builds - Clean builds are quite slow compared to incremental builds. However, we mitigate this by using cargo-fetcher for faster crate fetching and sccache for compiler output caching.
  • Startup times - Changing the image used for a build job means that every k8s node that doesn't already have the image needs to pull it before running the job. For example, the pull can take almost 2 minutes for our aarch64-linux-android image, which is by far our largest at almost 3GiB (the Android NDK/SDK are incredibly bloated). However, this is generally a one-time cost per image per node, and we don't update images often enough for it to be a problem in practice.
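
As a rough illustration of the sccache half of that mitigation, cargo can be pointed at sccache through its `build.rustc-wrapper` setting. This is a minimal sketch, assuming `sccache` is already installed and on the `PATH` of the build image; it is not necessarily the exact configuration we use, and the cache storage backend has to be configured separately.

```toml
# .cargo/config.toml (illustrative sketch, not copied from our repo)
[build]
# Route every rustc invocation through sccache so that compiler outputs
# can be reused across otherwise clean builds.
rustc-wrapper = "sccache"
```

Setting the `RUSTC_WRAPPER=sccache` environment variable in the job has the same effect and can be easier to toggle per CI job.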

Rejected Strategies

Before we get into the how, I want to cover two other strategies for cross compilation that you might want to consider if your needs are different from ours.

Using x86_64-pc-windows-gnu

To be honest, I rejected this one pretty much immediately, simply because the gnu environment is not the "native" msvc environment for Windows. Targeting x86_64-pc-windows-gnu would not be representative of the actual builds used by users, and it would differ from the local builds made by developers on Windows, which made it an unappealing option. That being said, generally speaking, Rust crates tend to support x86_64-pc-windows-gnu fairly well, which, as we'll see later, is a good thing due to my chosen strategy.

Using wine to run the MSVC toolchain

I briefly considered using wine to run the various components of the MSVC compiler toolchain, as that would be the most accurate way to match the native compilation for x86_64-pc-windows-msvc. However, we already use LLD when linking on Windows since it is vastly faster than the MSVC linker, so why not just replace the rest of the toolchain while we're at it?
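
For reference, pointing cargo at LLD for the MSVC target is a small configuration change. The following is a sketch, assuming LLVM's `lld-link` is installed and on the `PATH`; it is one common way to do this, not necessarily the setup we use.

```toml
# .cargo/config.toml (illustrative sketch)
[target.x86_64-pc-windows-msvc]
# Use LLVM's MSVC-compatible linker driver instead of the MSVC link.exe.
linker = "lld-link"
```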

