Implementation Plan

Key Enabler: Zero-Risk Deployment

GlobalProtect stays installed and running on all endpoints throughout the entire migration. NetBird is purely additive. If NetBird fails for any user, they continue using GlobalProtect exactly as they do today. As long as the coexistence flag described in the next section is set, a failed NetBird deployment should not cause a user outage.

Critical Flag: `--network-monitor=false`

All NetBird clients must be deployed with the `--network-monitor=false` flag while they run in parallel with GlobalProtect. This prevents a known coexistence conflict (GitHub issue #5077) in which GlobalProtect’s network changes trigger WireGuard tunnel restarts and drop TCP sessions. The flag is already included in the TRMM deployment script. Remove it only as part of GlobalProtect decommissioning (Phase 6).
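
Because this flag is easy to lose when the deployment script is edited, a small automated check can help. The following is a minimal sketch in Python, assuming the script is available as text. Only the flag spelling comes from this plan; the sample line is a hypothetical excerpt, not the contents of the real `trmm-deploy-netbird.ps1`.

```python
# Flag spelling is taken from this plan; the sample script line is hypothetical.
REQUIRED_FLAG = "--network-monitor=false"


def has_required_flag(script_text: str) -> bool:
    """Return True if any non-comment part of the script contains the flag.

    Naive on purpose: everything after the first '#' on a line is treated
    as a PowerShell comment and ignored.
    """
    for line in script_text.splitlines():
        code = line.split("#", 1)[0]
        if REQUIRED_FLAG in code:
            return True
    return False


# Hypothetical excerpt used only to exercise the check.
sample = r'''
# hypothetical excerpt, not the real deployment script
& "C:\Program Files\NetBird\netbird.exe" up --setup-key $SetupKey --network-monitor=false
'''

print(has_required_flag(sample))
```

A check like this could run before every TRMM push during the parallel-operation period, and be retired once the flag is removed in Phase 6.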

Phase 1: Pre-Work (Monday — Thursday)

All prerequisites are staged before the migration weekend. Nothing user-facing changes during this phase.

Monday

#	Task	Details
1	Verify Entra ID license tier	P1 minimum required for SSPR with password writeback (included in M365 Business Premium or E3)
2	Enable SSPR + password writeback	Entra admin center > Protection > Password reset. Enable for All or targeted group. Open Entra Connect wizard > Optional features > check “Password writeback.”
3	Test SSPR with one IT account	Reset password at `aka.ms/sspr`, verify writeback to on-prem AD
4	Provision Azure B2s VM	Ubuntu 24.04 LTS, West US 3 (Phoenix), 2 vCPU, 4 GB RAM
5	Configure NSG rules	Inbound: TCP 80, 443 + UDP 3478
6	Assign static public IP	Required for DNS stability
7	Install Docker + docker-compose	On the Azure VM
8	Create DNS A record	`netbird.gsisg.com` pointing to VM public IP

Tuesday

#	Task	Details
1	Create Entra ID App Registration	Name: “NetBird”, Single tenant, Redirect URIs (SPA): `https://netbird.gsisg.com/auth` and `https://netbird.gsisg.com/silent-auth`, Mobile/desktop redirect: `http://localhost:53000`
2	Configure App Registration details	Create application scope “api”, grant `User.Read.All` permission (admin consent), set `accessTokenAcceptedVersion` = 2, generate client secret
3	Record credentials	Application (client) ID, Directory (tenant) ID, Object ID, Client Secret
4	Pre-build Honolulu routing peer VM	Ubuntu 24.04, 1 vCPU, 1 GB RAM on Hyper-V (DATA003 or DATA004). Do NOT connect to NetBird yet.
5	Pre-build Boulder routing peer VM	Ubuntu 24.04, 1 vCPU, 1 GB RAM on Hyper-V (DATA001 or DATA007). Do NOT connect to NetBird yet.

Wednesday

#	Task	Details
1	Finalize TRMM deployment script	Existing: `trmm-deploy-netbird.ps1`. Verify `--network-monitor=false` flag is present.
2	Test MSI install on 1 IT machine	Via TRMM. Will fail to connect (management server not yet up) — verify MSI install + service creation only.
3	Push AV/EDR exclusions	`C:\Program Files\NetBird\` excluded on ALL endpoints via TRMM
4	Push GPO firewall rules	Allow NetBird `wt0` interface in Windows Firewall (prevents GPO from overriding NetBird’s auto-created rules)
5	Prepare communications	Pilot user briefing (email/Slack). All-staff Monday email with SSPR instructions (`aka.ms/sspr`).
6	Confirm DNS propagation	Verify `netbird.gsisg.com` resolves to the Azure VM public IP
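
The DNS propagation check in task 6 can be scripted. This is a minimal Python sketch, assuming an IPv4-only A record. The expected address is whatever static IP was assigned to the Azure VM; the `203.0.113.10` value in the usage note is a documentation placeholder, not the real address. The resolver is injectable so the logic can be exercised without network access.

```python
import socket
from typing import Callable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Resolve a hostname to its set of IPv4 addresses via the system resolver."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return {info[4][0] for info in infos}


def dns_points_to(
    hostname: str,
    expected_ip: str,
    resolver: Callable[[str], Set[str]] = resolve_ipv4,
) -> bool:
    """True only if the name resolves and every returned address is expected_ip."""
    try:
        addresses = resolver(hostname)
    except socket.gaierror:
        # Name does not resolve yet (or the resolver is unreachable).
        return False
    return addresses == {expected_ip}
```

For example, `dns_points_to("netbird.gsisg.com", "203.0.113.10")` would return `True` only once the record has propagated to the resolver in use, with the placeholder replaced by the VM's actual public IP.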

Thursday

#	Task	Details
1	Final pre-flight check	VM accessible via SSH, Docker running, DNS resolved, Entra app registration complete
2	Brief helpdesk team	What NetBird is, what to tell users, escalation path
3	Confirm pilot users	Available Saturday for testing (8-10 users across 5 scenarios)
4	Prepare network route configs	Print/document subnets (`10.15.0.0/24`, `10.100.7.0/24`), groups, ACL policies
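
When documenting the route configuration, it may help to sanity-check the subnets before the migration weekend. The sketch below uses Python's standard `ipaddress` module. The subnet-to-site mapping follows the route step in Phase 2 of this plan; the function names are illustrative.

```python
import ipaddress
from typing import Optional

# Subnet-to-site mapping as stated in this plan's route configuration.
ROUTES = {
    "Honolulu": ipaddress.ip_network("10.100.7.0/24"),
    "Boulder": ipaddress.ip_network("10.15.0.0/24"),
}


def routes_overlap() -> bool:
    """True if any two routed subnets overlap, which would make routing ambiguous."""
    nets = list(ROUTES.values())
    return any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:])


def site_for(address: str) -> Optional[str]:
    """Return the site whose routed subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for site, net in ROUTES.items():
        if ip in net:
            return site
    return None


print(site_for("10.100.7.40"))  # the Sage server address used in Scenario 3
```

The same lookup can confirm that every server address named in the pilot scenarios falls inside one of the two routed subnets.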

Phase 2: Friday Evening (4 hours: 6 PM — 10 PM)

Infrastructure deployment. No users are affected.

Time	Task	Duration
6:00 PM	Deploy NetBird management server — `docker-compose up -d` on Azure VM. Verify dashboard loads at `https://netbird.gsisg.com`.	15 min
6:15 PM	Configure Entra OIDC integration — Populate `setup.env` with Entra variables (client ID, tenant ID, secret, OIDC endpoint). Run configure script, restart containers. Test SSO login.	30 min
6:45 PM	Create break-glass local admin account	5 min
6:50 PM	Create setup keys + groups — “Routing-Peers” (no expiration), “Company-Laptops” (with expiration), “IT-Admins”, “Hawaii-Engineers”, “Boulder-Engineers”	15 min
7:05 PM	Deploy routing peers (PARALLEL):	45 min
	Honolulu: SSH to pre-built VM, install NetBird, connect with routing peer setup key, enable IP forwarding (`sysctl -w net.ipv4.ip_forward=1`, plus a matching entry under `/etc/sysctl.d/` so the setting survives a reboot), enable systemd service
	Boulder: install NetBird on Hyper-V VM (`gsi-nb-bld-01`), connect with routing peer setup key, verify peer in dashboard
7:50 PM	Configure network routes — Honolulu: `10.100.7.0/24` via Honolulu peer (masquerade ON). Boulder: `10.15.0.0/24` via Boulder peer (masquerade ON). Distribution group: “Company-Laptops”.	20 min
8:10 PM	Configure access control policies (see ACL table below)	15 min
8:25 PM	IT team self-test — Install NetBird on 2-3 IT laptops. Test: ping DCs at both sites, SMB share access, RDP to a test VM, `netbird status`, SSPR password reset + cached credential update.	90 min
9:55 PM	Go/No-Go decision for Saturday pilot — All routes working? OIDC login working? SMB/RDP verified? If No: debug or postpone to next weekend. Zero user impact.	5 min
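
A tight evening schedule is easy to break when one task is edited. The following Python sketch checks that each task's start time plus its duration lands exactly on the next task's start time. The rows are copied from the Friday table above; the helper names are illustrative.

```python
from datetime import datetime, timedelta

# (start time, duration in minutes), copied from the Friday table above.
FRIDAY_SCHEDULE = [
    ("6:00 PM", 15), ("6:15 PM", 30), ("6:45 PM", 5), ("6:50 PM", 15),
    ("7:05 PM", 45), ("7:50 PM", 20), ("8:10 PM", 15), ("8:25 PM", 90),
    ("9:55 PM", 5),
]


def parse_clock(clock: str) -> datetime:
    """Parse a 12-hour clock string such as '6:15 PM'."""
    return datetime.strptime(clock, "%I:%M %p")


def check_schedule(rows):
    """Return (mismatches, end_time).

    mismatches lists (start, next_start) pairs where start + duration
    does not equal next_start; end_time is when the last task finishes.
    """
    mismatches = []
    for (start, minutes), (next_start, _) in zip(rows, rows[1:]):
        if parse_clock(start) + timedelta(minutes=minutes) != parse_clock(next_start):
            mismatches.append((start, next_start))
    last_start, last_minutes = rows[-1]
    end = parse_clock(last_start) + timedelta(minutes=last_minutes)
    return mismatches, end.strftime("%I:%M %p").lstrip("0")


print(check_schedule(FRIDAY_SCHEDULE))
```

For the table as written, the check reports no mismatches and a 10:00 PM finish, which matches the four-hour window in the phase heading.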

Access Control Policies

Policy	Source Group	Destination	Protocols
All Staff — DC Access	All Users	DCs at both sites	TCP/UDP 53, 88, 123, 135, 389, 445, 464, 636, 3268, 3269
Hawaii Engineers	Hawaii-Engineers	Honolulu network	All
Boulder Engineers	Boulder-Engineers	Boulder network	All
IT Full Access	IT-Admins	All networks	All
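
To reason about who can reach what before entering the policies in the dashboard, the table can be modeled as data. This is a simplified Python sketch, not NetBird's policy engine: it ignores the TCP/UDP distinction, and the destination tags (`dcs`, `honolulu`, `boulder`) are labels invented here for illustration.

```python
from typing import FrozenSet, Iterable, NamedTuple, Optional

# Port list copied from the "All Staff" DC access row of the table above.
DC_PORTS = frozenset({53, 88, 123, 135, 389, 445, 464, 636, 3268, 3269})


class Policy(NamedTuple):
    name: str
    source_group: str
    destinations: FrozenSet[str]      # illustrative destination tags
    ports: Optional[FrozenSet[int]]   # None means "All"


POLICIES = [
    Policy("All Staff DC Access", "All Users", frozenset({"dcs"}), DC_PORTS),
    Policy("Hawaii Engineers", "Hawaii-Engineers", frozenset({"honolulu"}), None),
    Policy("Boulder Engineers", "Boulder-Engineers", frozenset({"boulder"}), None),
    Policy("IT Full Access", "IT-Admins", frozenset({"dcs", "honolulu", "boulder"}), None),
]


def is_allowed(user_groups: Iterable[str], dest_tags: Iterable[str], port: int) -> bool:
    """True if any policy matches one of the user's groups, one of the
    destination's tags, and the port."""
    groups, tags = set(user_groups), set(dest_tags)
    for policy in POLICIES:
        if policy.source_group not in groups:
            continue
        if not (policy.destinations & tags):
            continue
        if policy.ports is None or port in policy.ports:
            return True
    return False


# A general staff member reaching a Boulder DC over SMB (445) vs. RDP (3389).
print(is_allowed({"All Users"}, {"dcs", "boulder"}, 445))
print(is_allowed({"All Users"}, {"dcs", "boulder"}, 3389))
```

A domain controller located in Boulder carries both the `dcs` and `boulder` tags in this model, so general staff reach it only on the listed ports while Boulder engineers reach it on any port.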

Phase 3: Saturday Pilot (8 hours: 9 AM — 5 PM)

Deploy to the pilot group (8-10 users) via TRMM. Validate all 5 scenarios.

Time	Task	Duration
9:00 AM	Deploy to pilot group via TRMM — push script to pilot users’ machines, monitor for successful installs	30 min
9:30 AM	Contact pilot users — brief each on what to test, provide direct Slack/Teams/phone support	30 min
10:00 AM	Scenario 1: Hawaii remote to Boulder SMB — map drive to `\\10.15.0.x\share`, copy 100 MB file, compare performance to GlobalProtect	60 min
11:00 AM	Scenario 2: Maryland to Boulder RDP - RDP session, assess input lag, run CAD/GIS/Sage applications	60 min
12:00 PM	Lunch break	60 min
1:00 PM	Scenario 3: Honolulu field to local Sage on cellular — access Sage at `10.100.7.40` from cellular, walk between locations to test handoff	60 min
2:00 PM	Scenario 4: Boulder office to Honolulu files — access `\\10.100.7.15\share` via routing peers	60 min
3:00 PM	Scenario 5: Password reset (SSPR) — navigate to `aka.ms/sspr`, reset password, Win+L, unlock with new password, verify cached credentials updated through NetBird tunnel	60 min
4:00 PM	Collect pilot feedback — connection failures? DNS issues? Performance problems?	30 min
4:30 PM	Go/No-Go decision for Sunday full deployment — if all 5 scenarios pass: proceed. If issues found: fix and re-test, or postpone. Postponing has ZERO user impact (GP still works).	30 min
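
The Saturday Go/No-Go rule (proceed only if all 5 scenarios pass) can be written down explicitly so the decision is not made from memory at 4:30 PM. This Python sketch uses scenario keys invented here as shorthand for the five scenarios in the table; a scenario with no recorded result counts as a failure.

```python
from typing import Dict

# Shorthand keys for the five pilot scenarios (names are illustrative).
SCENARIOS = (
    "hawaii_to_boulder_smb",
    "maryland_to_boulder_rdp",
    "honolulu_field_sage_cellular",
    "boulder_to_honolulu_files",
    "sspr_password_reset",
)


def go_no_go(results: Dict[str, bool]) -> str:
    """Return 'GO' only if every scenario has an explicit passing result."""
    failed = [name for name in SCENARIOS if not results.get(name, False)]
    if not failed:
        return "GO"
    return "NO-GO: " + ", ".join(failed)


print(go_no_go({name: True for name in SCENARIOS}))
```

Treating a missing result as a failure is a deliberate choice in this sketch: an untested scenario should block Sunday's deployment just as a failed one would.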

Pilot Group Design

Recommended size: 8-10 users — large enough to cover all 5 scenarios with redundancy, small enough to provide hands-on support.

Scenario	User Profile	Selection Criteria	Count
1. Hawaii remote to Boulder SMB	Hawaii-based worker who regularly accesses Boulder file shares	Tech-comfortable, good at reporting issues, tests hairpin elimination	1-2
2. Maryland to Boulder RDP	East Coast worker who RDPs to Boulder VMs for CAD/GIS/Sage	Tests maximum latency improvement, likely to notice and report performance difference	1-2
3. Honolulu field to local Sage	Field worker on cellular accessing Sage (10.100.7.40)	Tests cellular handoff, WireGuard roaming, relay performance	1-2
4. Boulder office to Honolulu files	Boulder office worker accessing FILES server (10.100.7.15) or GIS data	Tests site-to-site via routing peers	1-2
5. Password reset	Any general staff user (the 90% group)	Tests the primary use case for the majority of users; needs only SSPR + tunnel to DC	2-3
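
As a quick consistency check, the per-scenario counts in the table can be summed and compared with the recommended pilot size. The Python sketch below assumes each pilot user is assigned to exactly one scenario, which the plan does not state explicitly.

```python
# (minimum, maximum) users per scenario, copied from the Count column above.
SCENARIO_COUNTS = {1: (1, 2), 2: (1, 2), 3: (1, 2), 4: (1, 2), 5: (2, 3)}


def pilot_size_range():
    """Smallest and largest pilot group that satisfies every scenario's count."""
    low = sum(minimum for minimum, _ in SCENARIO_COUNTS.values())
    high = sum(maximum for _, maximum in SCENARIO_COUNTS.values())
    return low, high


def size_is_feasible(size: int) -> bool:
    """True if a pilot of this size can be split across scenarios within the counts."""
    low, high = pilot_size_range()
    return low <= size <= high


print(pilot_size_range())
```

Under that assumption the table supports between 6 and 11 users, so the recommended 8-10 fits with room for redundancy in most scenarios.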

Selection criteria for all pilot users:

Tech-savvy enough to report issues clearly (able to describe what happened vs. what they expected)
Good communicators — willing to provide feedback, respond to Slack/Teams messages
Variety of OS versions — include at least one Windows 10 and one Windows 11 machine
Variety of hardware ages — include at least one older machine to catch edge cases
At least 1-2 IT staff members for deep troubleshooting
Voluntary participation — interested and engaged, not coerced

Phase 4: Sunday Full Deployment (4 hours: 10 AM — 2 PM)

Time	Task	Duration
10:00 AM	Update TRMM script with any changes from pilot feedback — adjust setup key, management URL, flags if needed	15 min
10:15 AM	Full deployment via TRMM — bulk execution to all remaining ~90 agents. Monitor for install success/failure. Expect ~90% success on first push.	30-60 min
11:15 AM	Troubleshoot failures — re-run script on failed endpoints. Check for AV blocking, network issues, disk space.	60 min
12:15 PM	Verify dashboard — peer count matches expected, routing peers healthy, spot-check `netbird status` on endpoints via TRMM	30 min
12:45 PM	Send Monday communication — “A new always-on network service (NetBird) has been deployed to all company laptops. No action needed. For password resets, use `aka.ms/sspr`. Contact helpdesk if you experience any connectivity issues.”	15 min
1:00 PM	Configure monitoring — Zabbix alerts for Azure VM, Docker restart policies (`unless-stopped`), login expiration policy (24h recommended)	30 min
1:30 PM	Final status check — document issues encountered and resolutions, update helpdesk troubleshooting guide	30 min
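
The first-push expectation translates into a rough troubleshooting workload. The following Python sketch works through the arithmetic using the plan's own approximate figures (~90 agents, ~90% first-push success, a 60-minute troubleshooting slot); the function names are illustrative.

```python
def first_push_estimate(agents: int, success_rate: float):
    """Return (expected successes, expected failures) for the first push."""
    succeeded = round(agents * success_rate)
    return succeeded, agents - succeeded


def minutes_per_failure(slot_minutes: int, failures: int) -> float:
    """Average time available per failed endpoint within the troubleshooting slot."""
    if failures <= 0:
        return float(slot_minutes)
    return slot_minutes / failures


succeeded, failed = first_push_estimate(90, 0.90)
print(succeeded, failed, round(minutes_per_failure(60, failed), 1))
```

With these inputs roughly 9 endpoints would need a second attempt, leaving a little under 7 minutes each in the 11:15 AM slot. If the pilot suggests a lower success rate, the slot may need to be extended.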

Phase 5: Monday — Monitor and Support

Monitor NetBird dashboard throughout the day — watch for disconnected peers
Check TRMM for any endpoints that failed to install
Be available on Slack/Teams for user questions
Track helpdesk tickets related to NetBird (expect very few since it is a silent install)
Verify field workers on cellular have connectivity
Run `netbird status` spot checks via TRMM on random endpoints
Address stragglers (machines that were off during Sunday deployment)
Schedule follow-up TRMM task to catch machines that were offline
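
Finding stragglers comes down to comparing two lists: the endpoints that should have NetBird and the peers that actually appear in the dashboard. This Python sketch assumes both lists have already been exported as hostnames; how they are exported from TRMM and NetBird is not covered here, and the hostnames in the example are hypothetical.

```python
from typing import Iterable, List


def find_stragglers(expected_hosts: Iterable[str], connected_peers: Iterable[str]) -> List[str]:
    """Return expected hosts that do not appear among connected peers.

    Hostnames are compared case-insensitively; the result is sorted so
    successive runs are easy to diff.
    """
    connected = {peer.lower() for peer in connected_peers}
    return sorted(host for host in expected_hosts if host.lower() not in connected)


# Hypothetical hostnames for illustration only.
print(find_stragglers(["LT-001", "LT-002", "lt-003"], ["lt-001", "LT-003"]))
```

The resulting list is a natural target for the follow-up TRMM task that catches machines that were offline on Sunday.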

Phase 6: Decommission GlobalProtect

Begin after 2+ weeks of stable operation with all users on NetBird.

Step	Task	Notes
1	Disable GP auto-connect on endpoints	Do NOT uninstall yet
2	Remove `--network-monitor=false` flag	Safe once GP no longer connects, since GP’s network changes were the coexistence trigger
3	Monitor for 30 days of sole NetBird operation	Confirm stability without GP fallback
4	Uninstall GlobalProtect client from all endpoints via TRMM	Bulk execution
5	Back up PA-2020 configuration	Power down, retain 90 days before disposal
6	Remove `vpn.gsisg.com` DNS record
7	Notify cyber insurance broker	Migration to ZTNA architecture
8	Document final architecture
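
The waiting periods in this phase can be turned into earliest calendar dates. This Python sketch assumes that steps 1 and 2 happen on the day Phase 6 begins and that the PA-2020 is powered down on the day GlobalProtect is uninstalled; neither is stated in the plan, and the deployment date in the example is hypothetical.

```python
from datetime import date, timedelta

STABLE_DAYS = 14         # "2+ weeks of stable operation" before Phase 6 begins
SOLE_NETBIRD_DAYS = 30   # step 3: monitoring period without GP fallback
RETENTION_DAYS = 90      # step 5: keep the powered-down PA-2020 before disposal


def decommission_dates(full_deploy: date) -> dict:
    """Earliest dates for the main Phase 6 milestones, under the stated assumptions."""
    phase6_start = full_deploy + timedelta(days=STABLE_DAYS)
    gp_uninstall = phase6_start + timedelta(days=SOLE_NETBIRD_DAYS)
    hardware_disposal = gp_uninstall + timedelta(days=RETENTION_DAYS)
    return {
        "phase6_start": phase6_start,
        "gp_uninstall": gp_uninstall,
        "hardware_disposal": hardware_disposal,
    }


# Hypothetical full-deployment date (a Sunday).
print(decommission_dates(date(2025, 1, 5)))
```

These are lower bounds. Any instability during the monitoring window would push every later date out.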

Rollback Plan

The fundamental safety net: NetBird is additive. GlobalProtect stays installed, configured, and running on all endpoints throughout the migration and beyond. A NetBird failure should therefore not cause a user outage, because affected users fall back to GlobalProtect.

Level 1: User Self-Remediation (0 minutes)

User simply uses GlobalProtect as before. No action needed — GP is still installed and running.

Level 2: Individual Endpoint Removal via TRMM (5 minutes)

Run as Administrator via TRMM on a single machine:

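# Stop the service first so the uninstaller can remove its files
Stop-Service "NetBird" -Force -ErrorAction SilentlyContinue
Start-Sleep -Seconds 2
# Run the bundled uninstaller in silent mode (/S)
& "C:\Program Files\NetBird\netbird_uninstall.exe" /S
# Give the uninstaller time to finish before cleaning up
Start-Sleep -Seconds 5
# Remove leftover configuration and state
Remove-Item -Path "C:\ProgramData\Netbird" -Recurse -Force -ErrorAction SilentlyContinue
Write-Host "NetBird removed. GlobalProtect remains active."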

Level 3: Mass Uninstall via TRMM (15-30 minutes)

Create the uninstall script above in TRMM
Create an Automation Policy targeting all clients/sites
Run as “Fire and Forget” on all agents
Verify removal via TRMM script that checks for the NetBird service
GlobalProtect continues to function — users experience zero disruption

Level 4: Infrastructure Teardown

Stop NetBird containers on Azure VM: `docker compose down`
Remove Honolulu routing peer: `sudo netbird down && sudo apt remove netbird`
Remove Boulder routing peer: `sudo apt remove netbird` on the Hyper-V VM
Delete Azure VM to stop billing (optional)
Remove `netbird.gsisg.com` DNS record

Timeline Summary

Day	Phase	Hours	Milestone
Mon-Thu	Phase 1: Pre-Work	~8 hrs total	Azure VM, DNS, Docker, Entra App Registration, SSPR, TRMM script, AV exclusions, GPO rules, communications
Friday	Phase 2: Infrastructure	4 hrs (6-10 PM)	Management server, OIDC, routing peers, routes, ACLs, IT self-test, Go/No-Go
Saturday	Phase 3: Pilot	8 hrs (9 AM-5 PM)	8-10 pilot users, 5 scenarios validated, Go/No-Go
Sunday	Phase 4: Full Deploy	4 hrs (10 AM-2 PM)	TRMM bulk deployment to ~90 remaining endpoints, monitoring configured
Monday	Phase 5: Support	Full day	Monitor dashboard, catch stragglers, support users
Week 3+	Phase 6: Decommission	30+ days	Disable GP, remove `--network-monitor=false`, uninstall GP, decommission PA-2020

Total active deployment time: ~24 hours across one week (pre-work + weekend). The compressed timeline is possible because NetBird deployment is silent, additive, and GlobalProtect provides a complete fallback for every failure scenario.
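
The ~24-hour figure can be reproduced from the Hours column of the timeline table. The short Python sketch below sums Phases 1 through 4 only; Phase 5 (Monday support) and Phase 6 (decommissioning) are excluded because the total covers pre-work and the weekend.

```python
# Hours per phase, copied from the Timeline Summary table (Phase 1 is approximate).
ACTIVE_HOURS = {
    "Phase 1: Pre-Work": 8,
    "Phase 2: Infrastructure": 4,
    "Phase 3: Pilot": 8,
    "Phase 4: Full Deploy": 4,
}


def total_active_hours() -> int:
    """Sum of hands-on hours across pre-work and the migration weekend."""
    return sum(ACTIVE_HOURS.values())


print(total_active_hours())
```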