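What to do when you can no longer log in to Linux through VRRP
To reach my home Kubernetes cluster, I ssh from my work machine (a Mac) through VyOS, because the Mac and the k8s cluster sit on separate networks.
At some point ssh stopped working, so I looked into the cause.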
Design
- mac belongs only to the home network
- kubernetes (k8s-master01) belongs only to the public network
- mac has a static route, so traffic to the public network is sent to the VyOS VIP
Host | home network(VLAN1) | public network(VLAN100) |
---|---|---|
mac | 192.168.0.3 | N/A |
k8s-master01 | N/A | 192.168.100.4 |
vyos01 | 192.168.0.33(VIP: 192.168.0.4) | 192.168.100.2(VIP: 192.168.100.1) |
vyos02 | 192.168.0.34(VIP: 192.168.0.4) | 192.168.100.3(VIP: 192.168.100.1) |
$ ip r get 192.168.100.0/24
192.168.100.0 via 192.168.0.4 dev en0 src 192.168.0.3
Result of ssh
$ ssh kube@192.168.100.4 -i ~/.ssh/id_rsa.kube -v
OpenSSH_8.1p1, LibreSSL 2.7.3
debug1: Reading configuration data /Users/yumenomatayume/.ssh/config
debug1: /Users/yumenomatayume/.ssh/config line 1: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
debug1: Connecting to 192.168.100.4 [192.168.100.4] port 22.
debug1: Connection established.
debug1: identity file /Users/yumenomatayume/.ssh/id_rsa.kube type 0
debug1: identity file /Users/yumenomatayume/.ssh/id_rsa.kube-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.1
kex_exchange_identification: read: Connection reset by peer
With ProxyCommand, I was able to connect.
$ ssh kube@192.168.100.4 -i ~/.ssh/id_rsa.kube -o ProxyCommand='ssh -W %h:%p vyos@192.168.0.4'
Warning: Permanently added '192.168.0.4' (ECDSA) to the list of known hosts.
Welcome to VyOS
vyos@192.168.0.4's password:
Warning: Permanently added '192.168.100.4' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-42-generic x86_64)
* Documentation: <https://help.ubuntu.com>
* Management: <https://landscape.canonical.com>
* Support: <https://ubuntu.com/advantage>
Last login: Thu Jul 22 11:52:21 2021 from 192.168.100.2
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.
kube@k8s-master01:~$
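Cause
The cause was that the VRRP state differed between the network the Mac is on (VLAN1) and the network k8s is on (VLAN100): vyos01 was MASTER for VLAN1, while vyos02 was MASTER for VLAN100.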
vyos@vyos01:~$ show vrrp │vyos@vyos02:~$ show vrrp
Name Interface VRID State Priority Last Transition │Name Interface VRID State Priority Last Transition
------- ----------- ------ ------- ---------- ----------------- │------- ----------- ------ ------- ---------- -----------------
VLAN1 eth0 1 MASTER 150 26m13s │VLAN1 eth0 1 BACKUP 100 4h27m1s
VLAN100 eth1 100 BACKUP 150 29m15s │VLAN100 eth1 100 MASTER 100 4h30m45s
VLAN101 eth2 101 BACKUP 150 29m15s │VLAN101 eth2 101 MASTER 100 4h30m45s
vyos@vyos01:~$ │vyos@vyos02:~$
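The split state above is easy to miss by eye, so here is a minimal sketch of a check for it. It assumes the `show vrrp` output of a single router has been saved or piped in as plain text (not the side-by-side capture shown above); the function name `vrrp_state_check` is hypothetical.

```shell
# vrrp_state_check: read `show vrrp` output for ONE router on stdin and
# report whether every VRRP group on that router is in the same state.
# Exit status: 0 = all groups agree, 1 = states are split, 2 = no groups found.
vrrp_state_check() {
  awk '
    # Data rows look like: NAME IFACE VRID STATE PRIORITY LAST-TRANSITION
    $3 ~ /^[0-9]+$/ && $4 ~ /^(MASTER|BACKUP|FAULT)$/ {
      rows++
      if (!($4 in seen)) { seen[$4] = 1; kinds++ }
      detail = detail sprintf("%s=%s ", $1, $4)
    }
    END {
      if (rows == 0)  { print "no vrrp groups found"; exit 2 }
      sub(/ $/, "", detail)
      if (kinds == 1) { print "consistent: " detail; exit 0 }
      print "split: " detail; exit 1
    }'
}
```

For example, `vrrp_state_check < vrrp-vyos01.txt` (a hypothetical file holding vyos01's output) would report a split for the state captured above, because VLAN1 is MASTER while VLAN100 and VLAN101 are BACKUP.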
However, I have not been able to pin down the root cause, that is, why the kex_exchange_identification error occurs.
$ ssh kube@192.168.100.4 -i ~/.ssh/id_rsa.kube -o ProxyCommand='ssh -W %h:%p vyos@192.168.0.4' -v
OpenSSH_8.1p1, LibreSSL 2.7.3
debug1: Reading configuration data /Users/yumenomatayume/.ssh/config
debug1: /Users/yumenomatayume/.ssh/config line 1: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
debug1: Executing proxy command: exec ssh -W 192.168.100.4:22 vyos@192.168.0.4
debug1: identity file /Users/yumenomatayume/.ssh/id_rsa.kube type 0
debug1: identity file /Users/yumenomatayume/.ssh/id_rsa.kube-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.1
Warning: Permanently added '192.168.0.4' (ECDSA) to the list of known hosts.
Welcome to VyOS
vyos@192.168.0.4's password:
debug1: Remote protocol version 2.0, remote software version OpenSSH_8.2p1 Ubuntu-4ubuntu0.2
debug1: match: OpenSSH_8.2p1 Ubuntu-4ubuntu0.2 pat OpenSSH* compat 0x04000000
debug1: Authenticating to 192.168.100.4:22 as 'kube'
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ecdsa-sha2-nistp256 SHA256:aVa3cvRE6m3Y5eu2dVCB8FuDbmvZKadl1wi00p53fDI
Warning: Permanently added '192.168.100.4' (ECDSA) to the list of known hosts.
debug1: rekey out after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: rekey in after 134217728 blocks
debug1: Will attempt key: /Users/yumenomatayume/.ssh/id_rsa.kube RSA SHA256:3LjTzRjOs44NxAxrbRf6DNA6+Q3yzNHm883DOrSxvnQ explicit
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<ssh-ed25519,sk-ssh-ed25519@openssh.com,ssh-rsa,rsa-sha2-256,rsa-sha2-512,ssh-dss,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ecdsa-sha2-nistp256@openssh.com>
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey
debug1: Next authentication method: publickey
debug1: Offering public key: /Users/yumenomatayume/.ssh/id_rsa.kube RSA SHA256:3LjTzRjOs44NxAxrbRf6DNA6+Q3yzNHm883DOrSxvnQ explicit
debug1: Server accepts key: /Users/yumenomatayume/.ssh/id_rsa.kube RSA SHA256:3LjTzRjOs44NxAxrbRf6DNA6+Q3yzNHm883DOrSxvnQ explicit
debug1: Authentication succeeded (publickey).
Authenticated to 192.168.100.4 (via proxy).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: pledge: proc
debug1: client_input_global_request: rtype hostkeys-00@openssh.com want_reply 0
debug1: Remote: /home/kube/.ssh/authorized_keys:1: key options: agent-forwarding port-forwarding pty user-rc x11-forwarding
debug1: Remote: /home/kube/.ssh/authorized_keys:1: key options: agent-forwarding port-forwarding pty user-rc x11-forwarding
debug1: Sending environment.
debug1: Sending env LANG = ja_JP.UTF-8
Welcome to Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-42-generic x86_64)
* Documentation: <https://help.ubuntu.com>
* Management: <https://landscape.canonical.com>
* Support: <https://ubuntu.com/advantage>
* Super-optimized for small spaces - read how we shrank the memory
footprint of MicroK8s to make it the smallest full K8s around.
<https://ubuntu.com/blog/microk8s-memory-optimisation>
Last login: Thu Jul 22 12:23:34 2021 from 192.168.100.2
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.
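In this case, the home network VRRP has selected vyos01 as the holder of the VIP.
So the path is the one below, yet both routers have a directly connected route to 192.168.100.0/24.
- mac -> vyos01(192.168.0.33) -> k8s-master01(192.168.100.4)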
vyos@vyos01:~$ show ip route │vyos@vyos02:~$
Codes: K - kernel route, C - connected, S - static, R - RIP, │vyos@vyos02:~$
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP, │vyos@vyos02:~$
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP, │vyos@vyos02:~$ show ip route
F - PBR, f - OpenFabric, │Codes: K - kernel route, C - connected, S - static, R - RIP,
> - selected route, * - FIB route, q - queued, r - rejected, b │ O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
- backup │ T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
│ F - PBR, f - OpenFabric,
S>* 0.0.0.0/0 [210/0] via 192.168.0.1, eth0, weight 1, 08:29:46 │ > - selected route, * - FIB route, q - queued, r - rejected, b
C * 192.168.0.0/24 is directly connected, eth0, 08:29:42 │- backup
C>* 192.168.0.0/24 is directly connected, eth0, 08:29:46 │
C * 192.168.100.0/24 is directly connected, eth1, 08:29:42 │S>* 0.0.0.0/0 [210/0] via 192.168.0.1, eth0, weight 1, 08:28:50
C>* 192.168.100.0/24 is directly connected, eth1, 08:29:47 │C>* 192.168.0.0/24 is directly connected, eth0, 08:28:50
C * 192.168.101.0/24 is directly connected, eth2, 08:29:42 │C>* 192.168.100.0/24 is directly connected, eth1, 08:28:51
C>* 192.168.101.0/24 is directly connected, eth2, 08:29:47 │C>* 192.168.101.0/24 is directly connected, eth2, 08:28:51
vyos@vyos01:~$ │vyos@vyos02:~$
Running traceroute from either router also shows k8s-master01 as directly reachable.
vyos@vyos01:~$ traceroute 192.168.100.4 │vyos@vyos02:~$ traceroute 192.168.100.4
traceroute to 192.168.100.4 (192.168.100.4), 30 hops max, 60 byte pack│traceroute to 192.168.100.4 (192.168.100.4), 30 hops max, 60 byte pack
ets │ets
1 192.168.100.4 (192.168.100.4) 0.092 ms 0.096 ms 0.079 ms │ 1 192.168.100.4 (192.168.100.4) 0.048 ms 0.058 ms 0.053 ms
vyos@vyos01:~$ │vyos@vyos02:~$
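One way to read the outputs above, offered as an assumption rather than a confirmed diagnosis: if k8s-master01 sends its replies to the VLAN100 VIP (192.168.100.1), the forward and return paths cross different routers whenever the two groups have different masters. The article does not state k8s-master01's gateway, so that part is a guess, and the function name `path_check` is hypothetical.

```shell
# path_check HOME_MASTER PUBLIC_MASTER
#   HOME_MASTER:   router holding the VLAN1 VIP (192.168.0.4), used for mac -> k8s
#   PUBLIC_MASTER: router holding the VLAN100 VIP (192.168.100.1)
# ASSUMPTION: k8s-master01 replies via 192.168.100.1 (not stated in the article).
# Returns 0 when both directions use the same router, 1 otherwise.
path_check() {
  home_master="$1"
  public_master="$2"
  echo "forward: mac -> $home_master -> k8s-master01"
  echo "return:  k8s-master01 -> $public_master -> mac"
  if [ "$home_master" = "$public_master" ]; then
    echo "result:  symmetric"
  else
    echo "result:  asymmetric"
    return 1
  fi
}
```

With the state captured earlier, `path_check vyos01 vyos02` reports an asymmetric path. This only makes the mismatch explicit; it does not by itself explain the `Connection reset by peer`, which remains open.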
Solutions
Option 1: Align master/backup across multiple interfaces with sync-group
Add the groups whose master/backup state you want to keep aligned to a sync-group.
Here I added every group, so master/backup is effectively fixed per router.
set high-availability vrrp sync-group MAIN member VLAN1
set high-availability vrrp sync-group MAIN member VLAN100
set high-availability vrrp sync-group MAIN member VLAN101
The member groups on each router need to use the same priority (150 on vyos01 and 100 on vyos02 in the output below).
vyos@vyos01:~$ show vrrp │vyos@vyos02:~$ show vrrp
Name Interface VRID State Priority Last Transition │Name Interface VRID State Priority Last Transition
------- ----------- ------ ------- ---------- ----------------- │------- ----------- ------ ------- ---------- -----------------
VLAN1 eth0 1 MASTER 150 38s │VLAN1 eth0 1 BACKUP 100 41s
VLAN100 eth1 100 MASTER 150 38s │VLAN100 eth1 100 BACKUP 100 41s
VLAN101 eth2 101 MASTER 150 38s │VLAN101 eth2 101 BACKUP 100 41s
vyos@vyos01:~$ │vyos@vyos02:~$
However, this approach has a drawback: if different interfaces go down on the two routers, as below, VRRP ends up in FAULT on both.
- vyos01: disable eth1
- vyos02: disable eth2
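The article does not show how the interfaces were disabled. A minimal sketch for reproducing the scenario, assuming an administrative disable in VyOS configuration mode (each line is followed by `commit` on the respective router):

```shell
# On vyos01 (assumption: eth1 is taken down administratively)
set interfaces ethernet eth1 disable

# On vyos02 (assumption: eth2 is taken down administratively)
set interfaces ethernet eth2 disable
```

Removing the same `disable` leaf with `delete` and committing again should bring the interfaces back.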
vyos@vyos01:~$ show vrrp │vyos@vyos02:~$ show vrrp
Name Interface VRID State Priority Last Transition │Name Interface VRID State Priority Last Transition
------- ----------- ------ ------- ---------- ----------------- │------- ----------- ------ ------- ---------- -----------------
VLAN1 eth0 1 FAULT 150 2m37s │VLAN1 eth0 1 FAULT 100 2m37s
VLAN100 eth1 100 FAULT 150 2m37s │VLAN100 eth1 100 FAULT 100 2m37s
VLAN101 eth2 101 FAULT 150 2m37s │VLAN101 eth2 101 FAULT 100 2m37s
Option 2: Pin the master with preempt
This is what I use at the moment.
Enable preempt on each interface.
Then set the priority of vyos01, the intended master, higher than that of vyos02.
delete high-availability vrrp group VLAN1 no-preempt
delete high-availability vrrp group VLAN100 no-preempt
delete high-availability vrrp group VLAN101 no-preempt
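The commands above only re-enable preemption; the priorities themselves are not shown in the article. A sketch of how they might be set, assuming the values 150 and 100 that appear in the `show vrrp` output and the same group names:

```shell
# On vyos01 (intended master): higher priority on every group
set high-availability vrrp group VLAN1 priority 150
set high-availability vrrp group VLAN100 priority 150
set high-availability vrrp group VLAN101 priority 150

# On vyos02 (intended backup): lower priority on every group
set high-availability vrrp group VLAN1 priority 100
set high-availability vrrp group VLAN100 priority 100
set high-availability vrrp group VLAN101 priority 100
```

Any pair of values works as long as vyos01's priority is the higher one for each group.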
With this setup, when an active interface is disabled, only the group tied to that interface fails over.
When it is enabled again, vyos01 always becomes master again.
Because the interfaces are not synced, a single interface going down does not put the other groups into FAULT.
Note, however, that ssh to a network beyond VyOS works only while the master for every interface is on the same VyOS router.