Azure VM Sizing and GlobalProtect Coexistence

Executive Summary

B2s (2 vCPU / 4 GB RAM) is the correct VM size for the NetBird management server. B1ms (1 vCPU / 2 GB) meets the official minimum for the management plane but is risky under relay load or mass reconnection events. The extra $15/month eliminates scaling risk.

SQLite is adequate for 100-150 peers with modest policy sets. PostgreSQL becomes necessary at 300+ peers. A separate relay server is unnecessary because the embedded relay handles the ~5-15% of connections that require relaying. Total Azure cost: ~$39/month pay-as-you-go or ~$28/month with a 1-year reserved instance.

Regarding GlobalProtect coexistence during migration: the issue is real but well-understood. GP triggers NetBird’s network monitor to restart the WireGuard interface, killing active sessions. The workaround (--network-monitor=false) is safe with manageable side effects. Both VPNs CAN run simultaneously because their routes do not overlap (100.64.0.0/10 overlay vs. corporate subnets).

Azure VM Sizing for 100-150 Peers

Why B2s Wins

Factor	B1ms (1 vCPU / 2 GB)	B2s (2 vCPU / 4 GB)
Normal operation	Sufficient	Comfortable
Mass reconnection event	CPU-constrained; peers queue	Handles gracefully
Relay under load	Risk of CPU starvation	Adequate headroom
Future growth (150-250 peers)	Requires migration	Handles without changes
Docker overhead (4 containers)	Tight on 2 GB RAM	Comfortable on 4 GB RAM
Monthly cost (pay-as-you-go)	~$15.11	~$30.37
Cost difference	Baseline	+$15.26/mo ($183/yr)

Management plane only: B1ms is sufficient. The management server is a lightweight Go binary. Official docs confirm “1 CPU and 2 GB of memory” as the minimum. One user reports running 1,000 active users successfully.

With relay traffic: B1ms becomes risky. The v0.62+ QUIC relay is more CPU-efficient than the old Coturn relay, but relay spikes during network events can saturate a single vCPU.

Policy complexity matters more than peer count. With simple policies (5-15 for GSISG), compute overhead is trivial. The pathological case (70 peers, 480 policies requiring 8 vCPU) does not apply here.

Resource Consumption Estimates

Component	CPU	RAM
Management server (peer sync)	~5-15% of 1 vCPU	~200-400 MB
Signal server	<5% of 1 vCPU	~50-100 MB
Embedded relay	0-30% of 1 vCPU	~50-150 MB
Dashboard + Traefik	<5% of 1 vCPU	~100-200 MB
Total (normal)	~15-50% of 1 vCPU	~400-850 MB
Total (peak, all reconnecting)	~80-100% of 1 vCPU	~1-1.5 GB
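The RAM totals in the table can be checked with a short sketch. The per-component figures are copied from the rows above; treating 1 GB as 1024 MB and using the upper peak estimate (1.5 GB) are assumptions made here for illustration.

```python
# Per-component RAM estimates from the table above, as (low, high) in MB.
RAM_MB = {
    "management": (200, 400),
    "signal": (50, 100),
    "relay": (50, 150),
    "dashboard_traefik": (100, 200),
}


def total_range(estimates):
    """Sum the low ends and the high ends of each (low, high) estimate."""
    low = sum(lo for lo, _ in estimates.values())
    high = sum(hi for _, hi in estimates.values())
    return low, high


def headroom_mb(vm_ram_mb, peak_ram_mb):
    """RAM left over for the OS and Docker once the peak estimate is used."""
    return vm_ram_mb - peak_ram_mb


PEAK_MB = 1536  # upper peak estimate of 1.5 GB (assumes 1 GB = 1024 MB)

print(total_range(RAM_MB))         # (400, 850)
print(headroom_mb(2048, PEAK_MB))  # B1ms: 512
print(headroom_mb(4096, PEAK_MB))  # B2s: 2560
```

Under these assumptions, B1ms keeps about 0.5 GB free at peak while B2s keeps about 2.5 GB, which is the margin the comparison table describes as "tight" versus "comfortable".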

Embedded Relay Sufficiency

A separate relay server is not needed for GSISG’s deployment.

GSISG’s network profile:

User Category	Count	Expected Connection Type
Office workers (Honolulu)	~60-70	P2P in most cases
Office workers (Boulder)	~20-30	P2P in most cases
Remote workers (home)	~20-30	P2P (home router = easy NAT)
Field workers (cellular)	~5-10	Likely relayed (CGNAT)

Estimated relay percentage: 5-15% of active connections. At peak: 1-3 simultaneous relayed connections at 2-5 Mbps each = 2-15 Mbps total relay bandwidth, well within B2s capacity.

A separate relay makes sense only when more than 50 peers relay simultaneously, when peers are geographically distributed across Azure regions, or when sustained relay bandwidth exceeds 100 Mbps. None of these apply to GSISG.
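The relay decision above can be written as a small check. This is an illustrative sketch only: the function names are hypothetical, and the thresholds are the ones stated in this section rather than limits published by NetBird.

```python
def peak_relay_mbps(max_connections, max_mbps_each):
    """Worst-case aggregate relay bandwidth: every relayed peer at full rate."""
    return max_connections * max_mbps_each


def needs_separate_relay(relayed_peers, sustained_mbps, multi_region):
    """Apply the three criteria from this section; any one is enough."""
    return relayed_peers > 50 or sustained_mbps > 100 or multi_region


# GSISG worst case from the estimate above: 3 connections at 5 Mbps each.
worst_case = peak_relay_mbps(3, 5)
print(worst_case)  # 15
print(needs_separate_relay(relayed_peers=3,
                           sustained_mbps=worst_case,
                           multi_region=False))  # False
```

Even if every field worker on cellular (up to 10) were relayed at 5 Mbps, the total would be 50 Mbps, still below the 100 Mbps criterion.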

SQLite vs. PostgreSQL

SQLite is fine for 100-150 peers; PostgreSQL is overkill at this scale.

The primary SQLite concern is events.db growth — event logging can grow to multiple GB within months with 100+ peers connecting/disconnecting daily. Mitigation: periodic archive/truncate via cron job.
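A cleanup job of the kind described could look like the sketch below. The table name (events), the timestamp column name, and the ISO-8601 text timestamp format are all assumptions made for illustration; inspect the real events.db schema before adapting this, and take a copy of the file first so the old rows are archived rather than lost.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def prune_events(conn, keep_days, table="events", ts_column="timestamp", now=None):
    """Delete rows older than keep_days and return how many were removed.

    ASSUMPTION: timestamps are stored as ISO-8601 text in one consistent
    format, so string comparison matches chronological order.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=keep_days)).isoformat()
    # Identifiers cannot be bound as parameters; pass trusted values only.
    cur = conn.execute(f'DELETE FROM "{table}" WHERE "{ts_column}" < ?', (cutoff,))
    conn.commit()
    conn.execute("VACUUM")  # rebuild the file so freed pages are released
    return cur.rowcount


# Demonstration against an in-memory database with the assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, timestamp TEXT)")
now = datetime(2026, 3, 1, tzinfo=timezone.utc)
for age_days in (1, 45, 120):
    stamp = (now - timedelta(days=age_days)).isoformat()
    conn.execute("INSERT INTO events (timestamp) VALUES (?)", (stamp,))
conn.commit()

deleted = prune_events(conn, keep_days=30, now=now)
remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(deleted, remaining)  # 2 1
```

In practice a script like this would run from cron against the real database path, ideally while the management service is stopped or idle, since it rewrites the file.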

Threshold	Action
<150 peers, <20 policies	Stay on SQLite
150-300 peers	Consider PostgreSQL or events cleanup
300+ peers OR 100+ policies	Migrate to PostgreSQL
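The threshold table translates into a simple lookup. One case is not covered by the table (fewer than 150 peers but 20-99 policies); the sketch below treats it as "consider", which is an assumption rather than something the table states.

```python
def database_recommendation(peers, policies):
    """Map peer and policy counts to the action in the threshold table."""
    if peers >= 300 or policies >= 100:
        return "Migrate to PostgreSQL"
    if peers < 150 and policies < 20:
        return "Stay on SQLite"
    # Covers 150-299 peers, plus the 20-99 policy gap (assumed, see lead-in).
    return "Consider PostgreSQL or events cleanup"


print(database_recommendation(peers=120, policies=10))   # Stay on SQLite
print(database_recommendation(peers=200, policies=10))   # Consider PostgreSQL or events cleanup
print(database_recommendation(peers=100, policies=120))  # Migrate to PostgreSQL
```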

Azure Cost Breakdown

Component	Specification	Monthly	Annual
VM: Standard_B2s	2 vCPU, 4 GB RAM, Linux	$30.37	$364.44
OS Disk: P4 Premium SSD	32 GB	$5.28	$63.36
Public IP: Standard Static	Required for peer connectivity	$3.65	$43.80
Bandwidth (egress)	2-6 GB/mo (within 100 GB free tier)	$0.00	$0.00
Total (pay-as-you-go)		~$39.30	~$471.60
Total (1-yr reserved)		~$28.06	~$336.72
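The totals can be reproduced from the line items. The sketch uses the prices in the table as given (they are estimates, not quotes) and Decimal to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

# Monthly pay-as-you-go line items from the table above (USD).
MONTHLY = {
    "vm_standard_b2s": Decimal("30.37"),
    "os_disk_p4": Decimal("5.28"),
    "public_ip_static": Decimal("3.65"),
    "egress": Decimal("0.00"),
}
RESERVED_MONTHLY_TOTAL = Decimal("28.06")  # 1-yr reserved total from the table

monthly_total = sum(MONTHLY.values())
annual_total = monthly_total * 12
reserved_annual = RESERVED_MONTHLY_TOTAL * 12
annual_savings = annual_total - reserved_annual

print(monthly_total, annual_total)  # 39.30 471.60
print(reserved_annual)              # 336.72
print(annual_savings)               # 134.88
```

On these figures the 1-year reservation saves roughly $135 per year, at the cost of committing to the B2s size for that term.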

GlobalProtect + NetBird Coexistence

The Problem (GitHub #5077)

When a user activates GlobalProtect while NetBird is running:

GP creates “PANGP Virtual Ethernet Adapter Secure” and adds routes
NetBird’s network monitor detects the route change as a “significant network change”
NetBird restarts the WireGuard interface, destroying all active TCP sessions
NetBird reconnects, but established SSH/RDP sessions are lost

Root cause: NetBird’s network monitor watches for default route changes and does not distinguish between physical network changes and VPN virtual adapter route additions.

The `--network-monitor=false` Workaround

Scenario	With network-monitor (default)	With --network-monitor=false
Switch Wi-Fi to Ethernet	Auto-reconnects in seconds	May take 25+ seconds
Switch between Wi-Fi networks	Auto-reconnects in seconds	May lose connection; manual toggle needed
VPN connects/disconnects	Interface restart (the bug)	No disruption (desired behavior)
Resume from sleep/hibernate	Quick reconnection	May require manual reconnection

Assessment for GSISG migration: Side effects are manageable. Most users are on stable office Ethernet or home Wi-Fi. The 25-second WireGuard keepalive timeout provides passive reconnection.

PR Status (March 2026)

PR	Status	Notes
#5155	CLOSED (not merged)	Test failures and regression risk
#5156	OPEN (not merged)	Passed quality gate; awaiting maintainer approval

No version includes the fix yet. --network-monitor=false remains the only workaround.

Routing and DNS Coexistence

The routes do NOT conflict:

GlobalProtect: Corporate subnets (e.g., 10.x.x.x)
NetBird: Overlay network (100.64.0.0/10) + configured routes

NetBird does NOT add a default route (unless an exit node is configured). If GP is in full-tunnel mode (0.0.0.0/0), add the NetBird management server IP to GP's split-tunnel exclusion list.
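The non-overlap claim can be verified with Python's standard ipaddress module once the actual GP routes are known. In the sketch below, 10.0.0.0/8 stands in for the corporate subnets (the real GP route list is an open question for GSISG IT), and conflicting_routes is a hypothetical helper.

```python
import ipaddress

NETBIRD_OVERLAY = "100.64.0.0/10"


def conflicting_routes(gp_routes, netbird_routes):
    """Return every (gp, netbird) pair of CIDR routes that overlap."""
    conflicts = []
    for gp in map(ipaddress.ip_network, gp_routes):
        for nb in map(ipaddress.ip_network, netbird_routes):
            if gp.overlaps(nb):
                conflicts.append((str(gp), str(nb)))
    return conflicts


# Split-tunnel GP with corporate subnets only (placeholder range): no conflict.
print(conflicting_routes(["10.0.0.0/8"], [NETBIRD_OVERLAY]))  # []

# Full-tunnel GP: the default route covers the overlay range as well.
print(conflicting_routes(["0.0.0.0/0"], [NETBIRD_OVERLAY]))
# [('0.0.0.0/0', '100.64.0.0/10')]
```

The second case is why full-tunnel mode needs the exclusion described above. Any routes configured in NetBird for office subnets should be added to netbird_routes and checked the same way.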

For DNS: configure NetBird with match-domain DNS for the AD domain; leave primary DNS to GP or the system default.

Recommended Migration Sequence

Phase 1: Install NetBird (GP Remains Primary) — Week 1-2

Deploy management server, configure Entra ID
Install NetBird on pilot machines with --network-monitor=false
Verify overlay connectivity; GP handles all production traffic

Phase 2: Configure Routing (GP Still Active) — Week 2-3

Deploy routing peers at both offices
Pilot users have TWO paths to resources
Test disconnecting GP on pilot machines

Phase 3: Expand to All Users (GP as Fallback) — Week 3-5

Deploy via TacticalRMM to all endpoints
GP remains installed and functional

Phase 4: Disable GP (NetBird Primary) — Week 5-8

Disable GP auto-connect (do NOT uninstall)
Remove --network-monitor=false only if PR #5156 has merged and the deployed client version includes the fix
Monitor for 2 weeks

Phase 5: Decommission GP — Week 8+

Uninstall GP from all endpoints
Remove --network-monitor=false flag (if still set)
Power down PA-2020 (retain 30 days before decommission)

Critical Rules:

NEVER remove GP before NetBird is verified for all users
Always use --network-monitor=false while both VPNs are installed
Test rollback before expanding beyond pilot
Field workers on cellular should be in a later wave
Keep PA-2020 powered on for 30 days after full migration

Gaps & Uncertainties

Gap	Impact	Mitigation
PR #5156 merge timeline	MEDIUM	Flag is safe to run indefinitely
SQLite events.db growth rate	LOW	Implement cleanup cron job
GP full-tunnel vs split-tunnel mode	MEDIUM	Ask GSISG IT admin
Exact Azure reserved pricing	LOW	Use Azure Pricing Calculator

Sources

GitHub: #5077, #5155, #5156, #4488, #1473, #1824

Azure Pricing: cloudprice.net, Azure bandwidth pricing, Azure managed disks pricing

Community: HN NetBird discussion, carlpearson.net self-hosting guide, Cloudron forum

Official: docs.netbird.io (scaling, NAT, how-netbird-works, CLI reference)