Commit Graph

395 Commits

Author SHA1 Message Date
Dylan Conway
ee0aed4836 [publish windows images] 2026-02-13 21:23:26 -08:00
Dylan Conway
5590cd7d47 D16 VMs, pin Node 24.3.0, vs-shell for tests, ARM64 installers
- Use D16 VMs (16 vCPU) for all Windows CI runners
- Pin Node.js to 24.3.0 (ABI 137) for duckdb prebuilt + test compat
- Wrap test runner in vs-shell.ps1 so node-gyp has cl.exe
- Revert ccache persistent config (VMs are ephemeral)
- Restore Uninstall-Windows-Defender (reboot clears pending state)
- Add Windows ARM64 to install.ps1, install.sh, bun-release
- Set parallelism to 2 for Windows tests
- Clarify Packer vs CI runner VM sizes in comments

[build images]
2026-02-13 14:08:40 -08:00
Dylan Conway
b4908e7415 Fix Unix tools PATH: use Machine scope, add git usr/bin
- Cygwin: fix path to root/bin, use Machine scope (survives Sysprep)
- Git: add usr/bin to Machine PATH (ships cat, head, tail, echo, etc.)

Previously cygwin used User scope PATH which was wiped by Sysprep,
and the path was wrong (missing root/ prefix).

[build images]
2026-02-13 11:29:39 -08:00
Dylan Conway
1c2ecc63e9 Use D4 VMs (4 vCPU) for all Windows CI
D16 (16 cores) only allows 6 VMs per 100-core quota, not enough
for builds + 4 test shards. D4 (4 cores) allows 25 VMs.

[build images]
2026-02-13 11:02:36 -08:00
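
The quota arithmetic behind the D4 commit above can be checked with a short sketch. The 100-core quota and the per-VM core counts come from the commit message; the helper name `maxConcurrentVms` is illustrative and is not part of the repository.

```javascript
// Hypothetical helper: how many VMs of a given size fit in a regional core quota.
function maxConcurrentVms(quotaCores, coresPerVm) {
  return Math.floor(quotaCores / coresPerVm);
}

const QUOTA_CORES = 100; // quota figure taken from the commit message

console.log(maxConcurrentVms(QUOTA_CORES, 16)); // 6
console.log(maxConcurrentVms(QUOTA_CORES, 4)); // 25
```

With builds plus 4 test shards running concurrently, 6 slots are exhausted quickly, which is the constraint the commit message describes.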
Dylan Conway
08b348935d Fix image naming: publish-image must match ci.mjs getImageName()
[publish images] and normal CI expect 'windows-x64-2019-v13' but
machine.mjs was publishing as 'windows-x64-2019-build-v13'.

Now image_name is passed directly to Packer, matching ci.mjs:
  - publish-image: windows-x64-2019-v13
  - create-image:  windows-x64-2019-build-37194

[build images]
2026-02-13 10:59:50 -08:00
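
The two naming schemes in the commit above can be sketched as follows. This is not the actual `getImageName()` from ci.mjs: the function name `imageNameFor`, its parameters, and the way the name is assembled are assumptions made for illustration. Only the two resulting strings come from the commit message.

```javascript
// Hypothetical sketch, NOT the real ci.mjs getImageName().
// Published images carry a version suffix; per-build images carry a build number.
function imageNameFor({ os, arch, release, action, version, buildNumber }) {
  const base = `${os}-${arch}-${release}`;
  if (action === "publish-image") {
    return `${base}-v${version}`;
  }
  if (action === "create-image") {
    return `${base}-build-${buildNumber}`;
  }
  throw new Error(`Unknown action: ${action}`);
}

const windows2019 = { os: "windows", arch: "x64", release: "2019" };
console.log(imageNameFor({ ...windows2019, action: "publish-image", version: 13 }));
console.log(imageNameFor({ ...windows2019, action: "create-image", buildNumber: 37194 }));
```

Whatever the real implementation looks like, the point of the fix is that the publisher and the consumer derive the name from one shared rule, so the two cannot drift apart.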
Dylan Conway
f21e99a8dc Copy bun to System32 so it survives Sysprep
bun.sh/install.ps1 installs to user profile PATH which is lost
after Sysprep generalizes the image. Copy to C:\Windows\System32
like we do for ARM64.

[build images]
2026-02-13 09:07:48 -08:00
Dylan Conway
e739d3e0ac Add windows-restart provisioner before Sysprep
Registry key clearing alone doesn't satisfy spopk.dll's reboot
check. A real reboot between VS install and Sysprep clears the
Component Based Servicing pending state.

[build images]
2026-02-13 02:58:15 -08:00
Dylan Conway
f6f75ce393 Clear pending reboot flags before Sysprep
VS Build Tools installer and Windows Updates leave RebootPending
and PendingFileRenameOperations flags that cause Sysprep validation
to fail with 'one or more Windows updates that require a reboot'.

[build images]
2026-02-13 02:16:18 -08:00
Dylan Conway
f00eed8c6d Fix sysprep: reset LASTEXITCODE before running (stale from cygwin)
$LASTEXITCODE was polluted by cygwin setup failure. Clear it before
sysprep to prevent false positive exit code check. The timeout-based
polling loop catches real sysprep failures.

[build images]
2026-02-13 01:40:12 -08:00
Dylan Conway
1f1118940a Fix nssm fallback: check if installed instead of try/catch
Install-Scoop-Package uses SilentlyContinue which never throws,
so the catch block (blob storage mirror) was unreachable. Now we
check if nssm is actually on PATH after the scoop attempt.

[build images]
2026-02-13 01:13:01 -08:00
Dylan Conway
e10529e4ae Escape PowerShell variables in HCL: $${var} not ${var}
[build images]
2026-02-13 00:44:06 -08:00
Dylan Conway
f4e9cdef6f Fix sysprep hang: exit code check, timeout, Panther cleanup, pin plugin 2.5.0
- Check sysprep exit code and dump Panther logs on failure
- Add 5-minute timeout to IMAGE_STATE polling loop
- Clean stale Panther directory before sysprep
- Pin azure plugin to 2.5.0 (2.5.1 has WinRM regression #568)

[build images]
2026-02-13 00:41:42 -08:00
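
The timeout added to the IMAGE_STATE polling loop in the commit above follows a common pattern: poll until a wanted state appears or a deadline passes. The real loop lives in the PowerShell bootstrap, so the JavaScript below only illustrates the control flow. The function name `waitForState` and all of its parameters are hypothetical.

```javascript
// Hypothetical sketch of a polling loop with a deadline.
// `readState`, `sleep`, and `now` are injected so the loop can run without a real VM.
async function waitForState({ readState, wanted, timeoutMs, intervalMs, sleep, now = Date.now }) {
  const deadline = now() + timeoutMs;
  while (now() < deadline) {
    if ((await readState()) === wanted) {
      return true;
    }
    await sleep(intervalMs);
  }
  return false; // timed out; the caller decides how to report the failure
}

// A real caller could pass a timer-based sleep like this one.
const realSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
```

Returning `false` rather than looping forever is what turns a hang into a visible failure, at which point logs can be collected.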
Dylan Conway
8e9aeb7ad4 Don't uninstall Windows Defender feature — reboot blocks Sysprep
Uninstall-WindowsFeature -Name Windows-Defender returns
ExitCode=SuccessRestartRequired, and the pending reboot prevents
Sysprep from transitioning to IMAGE_STATE_GENERALIZE_RESEAL_TO_OOBE.
Disable-Windows-Defender already disables real-time monitoring.

[build images]
2026-02-13 00:38:12 -08:00
Dylan Conway
f1539f843e Route LLVM install through Install-Scoop-Package for error suppression
[build images]
2026-02-12 22:35:19 -08:00
Dylan Conway
aa098cfa6b Suppress scoop post_install errors for ALL packages
Move error suppression from Install-7zip to Install-Scoop-Package so
all scoop installs are resilient to non-fatal post_install errors
(7zip 7zr.exe locked, llvm-arm64 missing Uninstall.exe, etc).

[build images]
2026-02-12 22:22:24 -08:00
Dylan Conway
8f37779b8f Fix 7zip ARM64: suppress post_install error with SilentlyContinue
The Remove-Item error on 7zr.exe happens on ARM64 regardless of user
context (packer user, not just SYSTEM). Temporarily set ErrorActionPreference
to SilentlyContinue and convert all output streams to strings so the
error doesn't propagate to PowerShell's exit code.

[build images]
2026-02-12 22:11:27 -08:00
Dylan Conway
04528baa4a Revert manual 7zip install — Packer WinRM doesn't have the SYSTEM error
The ARM64 Remove-Item error only happens under Azure Run Command
SYSTEM context. With Packer's WinRM, scoop runs as the packer user
and the post_install cleanup works fine.

[build images]
2026-02-12 22:06:20 -08:00
Dylan Conway
f5fe6da89a Fix Packer: pass directory (not single file) + -only flag
Packer needs the directory to find variables.pkr.hcl. Use -only to
select the right source (windows-x64 or windows-arm64).

[build images]
2026-02-12 21:50:23 -08:00
Dylan Conway
cd3734895d Fix Packer integration: version, CI flag, gallery setup, build RG
- Packer 1.15.0 (1.12.0 had plugin checksum issues)
- bootstrap.ps1 reads CI from env var (Packer sets environment_vars)
- machine.mjs creates gallery image definition before Packer build
- Build RG is BUN-CI-EASTUS2 (eastus2 quota), gallery stays in BUN-CI
- ARM64 bun installed from blob storage

[build images]
2026-02-12 21:48:29 -08:00
Dylan Conway
e477c3a33b Fix Packer templates from local testing
- Remove managed_image (incompatible with TrustedLaunch)
- Use build_resource_group_name for eastus2
- Remove VNet config (WinRM needs public IP)
- Add gallery_resource_group variable (gallery in BUN-CI, build in BUN-CI-EASTUS2)

[build images]
2026-02-12 21:44:06 -08:00
Dylan Conway
34c893a3ed Install native ARM64 bun from blob storage on ARM64 machines 2026-02-12 21:43:22 -08:00
Dylan Conway
8d34a9303c Add Packer-based Windows image building
Replaces Azure Run Command approach with Packer for Windows CI images.
Packer connects via WinRM (native, no x64 emulation on ARM64),
handles sysprep automatically, and provides full output logging.

- scripts/packer/windows-x64.pkr.hcl: Windows Server 2019 x64
- scripts/packer/windows-arm64.pkr.hcl: Windows 11 ARM64 (direct to gallery)
- scripts/packer/variables.pkr.hcl: shared variables
- machine.mjs: routes Azure Windows builds through Packer

[build images]
2026-02-12 21:12:00 -08:00
Dylan Conway
cb23d5ced0 Upgrade to D16 VMs (16 vCPU, 64 GiB) for faster builds
100 core quota in eastus2 allows 6 concurrent D16 VMs.

[build images]
2026-02-12 20:45:59 -08:00
Dylan Conway
a44351d839 Refresh PATH before running agent.mjs install
Azure Run Command sessions have stale PATH that doesn't include
buildkite-agent added by bootstrap. Reload from registry first.

[build images]
2026-02-12 20:40:15 -08:00
Dylan Conway
71a5a1bf4d Don't treat stderr as error — rustup/cargo write info to stderr
Only use Azure Run Command displayStatus to detect real failures.
stderr contains non-error output from rustup, cargo, PowerShell
warnings, etc.

[build images]
2026-02-12 20:21:38 -08:00
Dylan Conway
288b247f11 Re-launch bootstrap as native ARM64 PowerShell
Azure Run Command uses x64-emulated PowerShell on ARM64 VMs, which
causes all tools to think they're on x64. Use Sysnative path to
re-launch bootstrap as native ARM64 so PROCESSOR_ARCHITECTURE,
package installs, cmake, and everything else sees the real arch.

[build images]
2026-02-12 20:01:34 -08:00
Dylan Conway
da3246fd6b Fix ARM64 detection: use registry instead of env var
Azure Run Command runs x64-emulated PowerShell on ARM64 VMs, so
$env:PROCESSOR_ARCHITECTURE reports AMD64. The registry value at
HKLM:\SYSTEM\CurrentControlSet\Control\Session Manager\Environment
always reports the real architecture.

[build images]
2026-02-12 19:46:42 -08:00
Dylan Conway
85607af74f Debug: log PROCESSOR_ARCHITECTURE value
[build images]
2026-02-12 19:38:34 -08:00
Dylan Conway
4d3425702f Fix ARM64 detection: use PROCESSOR_ARCHITECTURE env var
RuntimeInformation.OSArchitecture doesn't exist in PowerShell 5.1
(which Azure Run Command uses), so IsARM64 was always false. This
caused ALL ARM64-specific code paths to be skipped.

[build images]
2026-02-12 19:29:50 -08:00
Dylan Conway
be321246fb Install 7zip manually on ARM64 to avoid Scoop post_install error
Scoop's 7zip ARM64 post_install script has an unfixable Remove-Item
error. Install 7zip directly from 7-zip.org instead.

[build images]
2026-02-12 19:19:12 -08:00
Dylan Conway
0620e2dba3 Fix 7zip ARM64: stringify error output to prevent re-throw
PowerShell 2>&1 | Out-Host passes ErrorRecord objects which re-trigger
ErrorActionPreference=Stop. Convert to strings first.

[build images]
2026-02-12 19:18:09 -08:00
autofix-ci[bot]
d208189f17 [autofix.ci] apply automated fixes 2026-02-13 03:04:50 +00:00
Dylan Conway
57a7345a39 Fix agent install: use full node path, throw on spawnSafe errors
- machine.mjs: use C:\Scoop\apps\nodejs\current\node.exe instead of
  bare 'node' which isn't in Azure Run Command PATH
- azure.mjs: spawnSafeFn now throws on non-zero exit code so bootstrap
  failures actually stop the build instead of capturing broken images

[build images]
2026-02-12 19:03:01 -08:00
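
The "throw on non-zero exit code" behaviour described in the commit above can be sketched as a check on a process result. This is not the actual `spawnSafeFn` from azure.mjs. The name `assertSpawnSucceeded` is hypothetical, and the `result` object is assumed to have the shape returned by Node's `child_process.spawnSync` (`status`, `signal`, `error`).

```javascript
// Hypothetical sketch, not the real azure.mjs code.
// `result` mirrors the object returned by child_process.spawnSync.
function assertSpawnSucceeded(command, result) {
  if (result.error) {
    throw new Error(`${command} failed to start: ${result.error.message}`);
  }
  if (result.signal) {
    throw new Error(`${command} was killed by signal ${result.signal}`);
  }
  if (result.status !== 0) {
    throw new Error(`${command} exited with code ${result.status}`);
  }
  return result;
}

try {
  assertSpawnSucceeded("bootstrap.ps1", { status: 1, signal: null });
} catch (e) {
  console.log(e.message); // bootstrap.ps1 exited with code 1
}
```

Failing loudly at this point stops the pipeline before an image is captured from a half-provisioned VM.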
Dylan Conway
4ec90d886b Fix bootstrap: run scoop in-process, isolate only 7zip ARM64
Child process scoop installs break PATH/shims. Only use child process
for 7zip on ARM64 (post_install error). Everything else runs in-process.
nssm falls back to blob storage mirror when nssm.cc is down.

[build images]
2026-02-12 17:56:47 -08:00
Dylan Conway
7a0c9047be Fix bootstrap: VS 3010 exit code, mingw command name
- VS installer exit code 3010 (reboot required) is not a real error
- mingw command is gcc, not mingw
- Use powershell -Command for scoop install isolation

[build images]
2026-02-12 17:44:23 -08:00
Dylan Conway
632b5bb2e8 Make cygwin install non-fatal
Cygwin mirror can be unreachable from Azure VMs. Don't block
the entire bootstrap if cygwin fails to install.

[build images]
2026-02-12 17:17:32 -08:00
Dylan Conway
0cefa2c753 Run scoop install in child process to isolate errors
Scoop's 7zip post_install throws a terminating error on ARM64 that
propagates regardless of ErrorActionPreference. Running scoop in a
child PowerShell process isolates the error completely.

[build images]
2026-02-12 17:08:23 -08:00
Dylan Conway
24e3598d2b Also pre-delete 7zr.exe before 7zip install
[build images]
2026-02-12 17:08:09 -08:00
Dylan Conway
4f0546f919 Fix 7zip ARM64: install before git + suppress scoop errors
7zip is a Scoop dependency of git. On ARM64 SYSTEM, 7zip's post_install
fails trying to delete 7zr.exe from TEMP. Fix by:
1. Moving Install-7zip before Install-Git so it's already installed
2. Using SilentlyContinue in Install-Scoop-Package then verifying

[build images]
2026-02-12 17:04:50 -08:00
Dylan Conway
928fc7888b Fix 7zip: use SilentlyContinue for scoop install
The Remove-Item error in Scoop's 7zip post_install is a terminating
error that propagates through try/catch. Use SilentlyContinue to
suppress it, then verify 7z is actually installed.

[build images]
2026-02-12 16:54:38 -08:00
Dylan Conway
1980c8e12a Fix 7zip install: use try/catch for post_install cleanup error
[build images]
2026-02-12 16:43:56 -08:00
Dylan Conway
684d577d14 Update default gallery to bunCIGallery2 (eastus2) 2026-02-12 16:30:20 -08:00
Dylan Conway
9546b6d395 Switch Azure to eastus2 (100 core quota) and restore D8 VMs
[build images]
2026-02-12 16:27:13 -08:00
Dylan Conway
4056e5927f Fix bootstrap failures on Azure VMs
- 7zip: tolerate post_install cleanup error (access denied on TEMP)
- Rust: set CARGO_HOME/RUSTUP_HOME before rustup-init to avoid
  SYSTEM profile path issue (Move-Item failure)
- nssm: fall back to blob storage mirror when nssm.cc returns 503
2026-02-12 16:12:20 -08:00
Dylan Conway
3313ff96b8 [build images] 2026-02-12 15:53:02 -08:00
autofix-ci[bot]
48c0d5d3b5 [autofix.ci] apply automated fixes 2026-02-12 22:00:58 +00:00
Dylan Conway
346d662547 [build images] 2026-02-12 13:58:13 -08:00
Dylan Conway
ecd6d8ea68 [build images] 2026-02-12 11:38:12 -08:00
Dylan Conway
4aafb2a29f [build images] 2026-02-11 21:56:09 -08:00
Dylan Conway
76c2b12821 [build images] 2026-02-11 19:45:06 -08:00