IQ Warehouse Dual-WAN Auto-Failover: Deployment Guide

Cisco 4500-X · IP SLA + Tracking
For Local IT Team · Every step comes with a verification command and a rollback plan

§0 Current Topology

Horizon Fiber  +  Nexus Wireless   →   Cisco 4500-X   →   MikroTik RB4011iGS+   →   Cisco 9400 / 2960X   →   LAN + WLAN
Goal: Horizon fiber (primary) fails over automatically to Nexus wireless (backup) and reverts automatically once the fiber recovers. Existing connections should survive the switch, apart from a pause of roughly 10-15 seconds (see §4).

§1 Safety Rules (Read First)

DO NOT: reload, reboot, or unplug cables to test. Apart from the two scheduled exceptions (the port conversion in step 3 and the failover test in step 8), no operation in this guide affects live traffic.
DO NOT: remove existing routes. Add new routes only. The single exception is a conflicting old default route, which step 6 covers.
DO NOT: work during peak hours. Use the lunch break (13:00-14:00) or any time after 21:00 local time.
MUST: verify after every step. Proceed only once you see the expected output.
MUST: copy the current running-config to a text file before starting, so that a rollback takes about 30 seconds.
Zero downtime: every configuration change is an addition. Worst case: the new routes do not take effect and traffic keeps flowing over the original path.
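
The add-only rule can be checked mechanically by comparing the running-config saved before the change with one captured afterwards. The following is a minimal sketch, assuming both snapshots were saved as plain text; the function names and the sample addresses (documentation ranges) are illustrative and not part of the original procedure. It compares lines without regard to which interface block they sit in, so treat a clean result as a hint rather than proof, and expect the step 3 port conversion to show up as removed switchport lines.

```python
def config_lines(text: str) -> list[str]:
    """Return meaningful config lines, dropping blanks and '!' comment lines."""
    lines = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.strip() and not line.lstrip().startswith("!"):
            lines.append(line)
    return lines


def removed_lines(before: str, after: str) -> list[str]:
    """Lines present in the 'before' snapshot but absent from 'after'."""
    after_set = set(config_lines(after))
    return [line for line in config_lines(before) if line not in after_set]


# Illustrative snapshots only; real ones come from "show running-config".
BEFORE = """\
ip routing
!
ip route 10.20.0.0 255.255.0.0 10.0.0.2
"""
AFTER = BEFORE + """\
ip route 0.0.0.0 0.0.0.0 192.0.2.1 track 1
ip route 0.0.0.0 0.0.0.0 198.51.100.1 10
"""

if __name__ == "__main__":
    missing = removed_lines(BEFORE, AFTER)
    print("add-only" if not missing else f"removed: {missing}")
```

Header lines such as the configuration size or timestamp may differ between two captures and can be ignored.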

§2 Step One: Pre-flight, Document the Current State

1 Log in to the 4500-X, run these 5 commands, and save all output (screenshots or text)

    ! ① Current routing table
    show ip route 0.0.0.0
    ! ② All interface status (note the ISP interface names and IPs)
    show ip interface brief | exclude unassigned
    ! ③ Check whether ip routing is enabled
    show running-config | include ip routing
    ! ④ Current ISP interface config (adjust the port names if the ISP links use other ports)
    show running-config interface TenGigabitEthernet1/1/1
    show running-config interface TenGigabitEthernet1/1/2
    ! ⑤ Switchport / VLAN status of the ISP ports
    show interfaces status | include Te1/1/1|Te1/1/2
Confirm from the output:
① Is ip routing enabled? (If command ③ printed a line, skip step 2 in §3.)
② Which interfaces connect to Horizon and Nexus? (Record the full name exactly as show ip interface brief prints it, e.g. TenGigabitEthernet1/1/1, because the include filters used later match the text literally.)
③ What are the current IP address and netmask of each of the two interfaces?
④ Are the ports already L3? (In show running-config, an ip address line means L3; switchport lines mean L2.)
⑤ Which gateway does the current default route point to?
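
One sanity check worth running on the recorded values is that each ISP gateway actually sits in the same subnet as the interface that will reach it; otherwise the static route in step 6 has no directly connected next hop. The guide does not ask for this check, so the following is only a sketch of it. The function name is hypothetical and the addresses come from the documentation ranges, not from this site.

```python
import ipaddress


def gateway_on_link(interface_ip: str, netmask: str, gateway: str) -> bool:
    """True if the gateway lies in the interface's subnet and is not the interface itself."""
    network = ipaddress.ip_interface(f"{interface_ip}/{netmask}").network
    gw = ipaddress.ip_address(gateway)
    return gw in network and gw != ipaddress.ip_address(interface_ip)


if __name__ == "__main__":
    # Example values only: a /30 point-to-point link.
    print(gateway_on_link("192.0.2.2", "255.255.255.252", "192.0.2.1"))
    print(gateway_on_link("192.0.2.2", "255.255.255.252", "198.51.100.1"))
```

A malformed address or netmask raises ValueError, which is itself a useful signal that a value was copied down incorrectly.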

§3 Step Two: Configure Auto-Failover (Add Only, Never Remove)

⚠️ Replace placeholders: every <...> in the commands below must be replaced with the actual value recorded in §2.
Checklist: Horizon interface, IP, and gateway; Nexus interface, IP, and gateway. That is 6 values in total, plus the netmask of each interface.
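
A placeholder left unreplaced is an easy mistake when pasting commands. As a minimal sketch (not part of the original procedure), the commands can be filled in from the recorded values by a small script that refuses to produce output while any <...> is still missing. The function name, the template, and the sample gateway addresses are assumptions for illustration.

```python
import re

# Matches placeholders such as <Horizon-GW> or <Nexus-interface>.
PLACEHOLDER = re.compile(r"<([A-Za-z][A-Za-z0-9-]*)>")


def fill(template: str, values: dict[str, str]) -> str:
    """Replace every <name> with values[name]; raise KeyError if one is missing or blank."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        value = values.get(name, "").strip()
        if not value:
            raise KeyError(f"no value recorded for <{name}>")
        return value

    return PLACEHOLDER.sub(substitute, template)


TEMPLATE = """\
ip route 0.0.0.0 0.0.0.0 <Horizon-GW> track 1
ip route 0.0.0.0 0.0.0.0 <Nexus-GW> 10
"""

# Example values from the documentation address ranges, not the real gateways.
EXAMPLE_VALUES = {"Horizon-GW": "192.0.2.1", "Nexus-GW": "198.51.100.1"}

if __name__ == "__main__":
    print(fill(TEMPLATE, EXAMPLE_VALUES))
```

Review the generated text by eye before pasting it into the switch; the script only guards against empty placeholders, not against wrong values.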

2 Enable IP Routing (skip if already enabled)

If command ③ in §2 printed a line, skip this step.

Run this only if command ③ in §2 printed nothing:

    configure terminal
    ip routing
    end
    ! Verify: the output should contain "ip routing"
    show running-config | include ip routing

3 Confirm the ISP Ports Are in L3 Mode

Check the output of command ④ in §2: if the port shows ip address x.x.x.x, it is already L3. Skip this step.

If it shows switchport, the port must be converted. The link drops for about 2-3 seconds, so do this during the lunch break, one port at a time:

    ! Horizon first. Verify before touching Nexus.
    configure terminal
    interface <Horizon-interface>
    no switchport
    ip address <Horizon-IP> <netmask>
    no shutdown
    end
    ! Verify: Status and Protocol are both up
    show ip interface brief | include <Horizon-interface>
    ! Once the Horizon link is confirmed back, do Nexus
    configure terminal
    interface <Nexus-interface>
    no switchport
    ip address <Nexus-IP> <netmask>
    no shutdown
    end
    ! Verify: Status and Protocol are both up
    show ip interface brief | include <Nexus-interface>
Rollback: interface <name> → no ip address → switchport → end. Remove the IP address before returning the port to L2, then restore the port's original switchport lines (access VLAN or trunk settings) from the saved running-config.

4 Configure IP SLA Probes (zero impact)

    configure terminal
    ! Horizon probe: ping 8.8.8.8 every 5 s
    ! threshold is lowered first because IOS rejects a timeout below the threshold (default 5000 ms)
    ip sla 1
     icmp-echo 8.8.8.8 source-interface <Horizon-interface>
     threshold 1000
     timeout 1000
     frequency 5
    ip sla schedule 1 life forever start-time now
    ! Nexus probe: ping 8.8.4.4 every 10 s (longer timers tolerate wireless jitter)
    ip sla 2
     icmp-echo 8.8.4.4 source-interface <Nexus-interface>
     threshold 2000
     timeout 2000
     frequency 10
    ip sla schedule 2 life forever start-time now
    end
    ! Verify: both operations should report "Latest operation return code: OK"
    show ip sla statistics

5 Configure Tracking Objects (zero impact)

    configure terminal
    track 1 ip sla 1 reachability
     delay down 15 up 10
    track 2 ip sla 2 reachability
     delay down 30 up 20
    end
    ! Verify: both tracks should show "Up"
    show track brief
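
The probe and delay timers together determine how long an upstream failure can go unnoticed. The sketch below estimates a rough upper bound under a simplified model, which is an assumption rather than documented IOS behavior: the failure begins just after a successful probe, the next probe runs one frequency interval later and has to time out, and only then does the down delay start counting. The function name is hypothetical.

```python
def worst_case_failover_s(frequency_s: float, timeout_ms: float, delay_down_s: float) -> float:
    """Rough upper bound, in seconds, from link failure to the track object going Down."""
    return frequency_s + timeout_ms / 1000 + delay_down_s


# Timer values taken from steps 4 and 5 of this guide.
HORIZON = worst_case_failover_s(frequency_s=5, timeout_ms=1000, delay_down_s=15)
NEXUS = worst_case_failover_s(frequency_s=10, timeout_ms=2000, delay_down_s=30)

if __name__ == "__main__":
    print(f"track 1 (Horizon): up to ~{HORIZON:.0f} s")
    print(f"track 2 (Nexus):   up to ~{NEXUS:.0f} s")
```

Under this model the Horizon track can take about 21 seconds to react to a failure that only the probe can see. An administrative shutdown, as used in the step 8 test, usually withdraws the route as soon as the interface goes down, so that test can switch faster than this estimate.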

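6 Add Dual Default Routes (core step)

Critical: add the new routes first and do not remove anything yet. A static route entered without a distance gets distance 1, so Horizon is preferred; the Nexus route at distance 10 is the backup. If an old default route also has distance 1 and points at a different gateway, deal with it as described after the commands to avoid a conflict.

    configure terminal
    ! Primary: Horizon, default distance 1 (preferred), tied to track 1
    ip route 0.0.0.0 0.0.0.0 <Horizon-GW> track 1
    ! Backup: Nexus, distance 10 (installed only when the primary is withdrawn)
    ip route 0.0.0.0 0.0.0.0 <Nexus-GW> 10
    end
    ! Verify: the default route shows <Horizon-GW> marked with "*"
    ! The backup does not appear in the routing table while the primary is installed;
    ! confirm it is configured with: show running-config | include ip route
    show ip route 0.0.0.0
If the old default route also has distance 1 with a different gateway: remove it with no ip route 0.0.0.0 0.0.0.0 <old-GW>, or re-enter it with distance 5. Be aware that distance 5 ranks it ahead of the Nexus backup at distance 10; use a value above 10 if Nexus should take over first. Never leave two distance-1 default routes in place at the same time.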
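
The selection logic behind these routes can be sketched in a few lines: a route tied to a track object that is Down is withdrawn, and among the remaining routes the lowest administrative distance wins, with equal distances installed side by side. This is a simplified model for illustration, not IOS code; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StaticRoute:
    gateway: str
    distance: int = 1                 # IOS default distance for a static route
    track_up: Optional[bool] = None   # None means the route is not tied to a track object


def installed(routes: list[StaticRoute]) -> list[str]:
    """Gateways installed for the default route under the simplified model."""
    usable = [r for r in routes if r.track_up is not False]
    if not usable:
        return []
    best = min(r.distance for r in usable)
    return [r.gateway for r in usable if r.distance == best]


if __name__ == "__main__":
    nexus = StaticRoute("Nexus-GW", distance=10)
    old = StaticRoute("old-GW")  # untracked, distance 1
    print(installed([StaticRoute("Horizon-GW", track_up=True), nexus]))
    print(installed([StaticRoute("Horizon-GW", track_up=False), nexus]))
    print(installed([StaticRoute("Horizon-GW", track_up=True), nexus, old]))
    print(installed([StaticRoute("Horizon-GW", track_up=False), nexus, old]))
```

The last two cases show why a leftover distance-1 default is a problem: while Horizon is up, traffic is shared between Horizon and the old gateway, and when Horizon fails the old route wins instead of Nexus.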

§4 Step Three: Non-disruptive Verification

7 Confirm Normal State

    ! ① Primary route active: default route via <Horizon-GW>, marked with "*"
    show ip route 0.0.0.0
    ! ② Both SLA operations succeeding: "Latest operation return code: OK"
    show ip sla statistics | include Latest
    ! ③ Both tracks Up
    show track brief
    ! ④ On any warehouse PC (not on the switch): ping 8.8.8.8, which should succeed via Horizon

8 Simulated Failover Test (minimal disruption)

Do not pull the cable. Administratively shut down the Horizon interface instead. The WMS may stall for 10-15 seconds and then recover. Do this during the lunch break.

    ! ① Confirm the current route first. Expected: via <Horizon-GW>
    show ip route 0.0.0.0
    ! ② Shut down the Horizon interface (simulates a fiber cut)
    configure terminal
    interface <Horizon-interface>
    shutdown
    end
    ! ③ Wait 10-15 seconds, then check the failover. Expected: via <Nexus-GW>
    show ip route 0.0.0.0
    ! Track 1 shows Down once its 15 s down delay has expired
    show track brief
    ! ④ On a warehouse PC (not on the switch): ping 8.8.8.8, which should succeed via Nexus
    ! ⑤ Bring Horizon back
    configure terminal
    interface <Horizon-interface>
    no shutdown
    end
    ! ⑥ Wait 10-15 seconds, then confirm auto-revert. Expected: via <Horizon-GW>, track 1 Up
    show ip route 0.0.0.0
    show track brief
Switch interruption: 10-15 seconds, during which the WMS reconnects briefly. If traffic has not switched after 30 seconds, run the rollback in §5.

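§5 Rollback Plan

Plan A: Remove the new routes. This is the fastest option, about 30 seconds.

    configure terminal
    no ip route 0.0.0.0 0.0.0.0 <Horizon-GW> track 1
    no ip route 0.0.0.0 0.0.0.0 <Nexus-GW> 10
    end
    ! Confirm traffic is back on the original default route
    show ip route 0.0.0.0

If no default route is left afterwards, the original one probably used the same gateway as one of the new routes and was overwritten in step 6. Re-enter the original ip route line from the saved running-config.

Plan B: Re-enable the interface, in case you shut down a port and forgot no shutdown.

    configure terminal
    interface <Horizon-interface>
    no shutdown
    end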

§6 Post-Deployment Checklist

  • With the fiber up, traceroute 8.8.8.8 goes via the Horizon gateway
  • After shutdown on the Horizon port, traffic fails over to Nexus within 15 s
  • After no shutdown, traffic reverts to Horizon within about 15 s (track 1 delay up 10 plus one 5 s probe interval)
  • The WMS opens normally after a switch (refreshing the browser page is enough)
  • All config saved with write memory
  • Current running-config backed up to a text file (send a copy to 无重)
  • Warehouse staff notified: call IT if the network is down for more than 30 seconds
    ! Save configuration
    write memory
    ! or: copy running-config startup-config

§7 Troubleshooting

No traffic after failover?

Check whether the backup gateway is reachable:
    ping <Nexus-GW> source <Nexus-interface>

If the ping fails, the problem is on the ISP wireless link, not in the 4500-X configuration.

SLA probe always failing?

ICMP toward the public probe target may be filtered on the ISP path. Change the probe target from 8.8.8.8 to the ISP gateway IP itself. A scheduled IP SLA operation cannot be edited in place: remove it with no ip sla 1 and re-enter it with the new target. Track 1 may report Down while the operation is missing, so do this in a quiet window.

Track flapping?

The wireless link is unstable. Increase the delay timers on track 2:
    configure terminal
    track 2 ip sla 2 reachability
     delay down 60 up 30
    end

This prevents repeated switching on brief signal drops.
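
To see why a longer down delay suppresses flapping, the tracking behavior can be approximated as follows: the track state changes only after the probe result has disagreed with it continuously for the configured delay. This is a simplified model and an assumption about timing granularity, not a description of the IOS implementation; the function name is hypothetical.

```python
def track_states(probe_ok: list[bool], interval_s: int,
                 delay_down_s: int, delay_up_s: int,
                 initially_up: bool = True) -> list[bool]:
    """Track state after each probe, given probe results taken every interval_s seconds."""
    state = initially_up
    pending_s = 0  # how long the probe result has disagreed with the current state
    history = []
    for ok in probe_ok:
        if ok == state:
            pending_s = 0
        else:
            pending_s += interval_s
            needed_s = delay_up_s if ok else delay_down_s
            if pending_s >= needed_s:
                state = ok
                pending_s = 0
        history.append(state)
    return history


# A 40-second wireless dropout seen by a probe that runs every 10 seconds.
DROPOUT = [True] + [False] * 4 + [True] * 3

if __name__ == "__main__":
    print(track_states(DROPOUT, 10, delay_down_s=30, delay_up_s=20))
    print(track_states(DROPOUT, 10, delay_down_s=60, delay_up_s=30))
```

With delay down 30 the model reports the track going Down and later coming back Up; with delay down 60 the same dropout never changes the state. The trade-off is that a genuine outage is also detected later.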

All config lost?

The configuration was never saved with write memory. Enter configure terminal again and redo all steps from the start (they are add-only and will not break anything). Run write memory immediately afterwards.

§8 Quick Reference Card (Print and Post on Rack)

Fill in before deployment

Horizon interface ___________ Horizon IP ___________
Nexus interface ___________ Nexus IP ___________
Horizon GW ___________ Nexus GW ___________
Verify: show ip route 0.0.0.0 · show track brief · show ip sla statistics
Test failover: int <H> → shutdown → show ip route 0.0.0.0 → no shutdown
Rollback: no ip route 0.0.0.0 0.0.0.0 <GW> · Save: write memory
Core design principle: add-only from start to finish. No reboot, no cable pull, and no route removal other than a conflicting old default (step 6). Worst case: the new routes do not take effect, the old route keeps working, and the warehouse notices nothing. The only planned interruptions are the step 3 port conversion (needed only if a port is still L2) and the §4 step 8 shutdown test. Schedule both during the lunch break, and complete the test's verification and recovery within 30 seconds.