
What to do when you can no longer log in to Linux through VRRP

yumenomatayume

My home kubernetes cluster sits on a different network from my work machine (a Mac), so I ssh to the k8s cluster through vyos.

At some point ssh stopped working, so I looked into the cause.

Design

  • mac belongs only to the home network
  • kubernetes (k8s-master01) belongs only to the public network
  • mac has a static route that sends traffic for the public network to the vyos VIP

Host          home network (VLAN1)             public network (VLAN100)
------------  -------------------------------  ----------------------------------
mac           192.168.0.3                      N/A
k8s-master01  N/A                              192.168.100.4
vyos01        192.168.0.33 (VIP: 192.168.0.4)  192.168.100.2 (VIP: 192.168.100.1)
vyos02        192.168.0.34 (VIP: 192.168.0.4)  192.168.100.3 (VIP: 192.168.100.1)

$ ip r get 192.168.100.0/24
192.168.100.0 via 192.168.0.4 dev en0  src 192.168.0.3

ssh result

$ ssh kube@192.168.100.4 -i ~/.ssh/id_rsa.kube -v
OpenSSH_8.1p1, LibreSSL 2.7.3
debug1: Reading configuration data /Users/yumenomatayume/.ssh/config
debug1: /Users/yumenomatayume/.ssh/config line 1: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
debug1: Connecting to 192.168.100.4 [192.168.100.4] port 22.
debug1: Connection established.
debug1: identity file /Users/yumenomatayume/.ssh/id_rsa.kube type 0
debug1: identity file /Users/yumenomatayume/.ssh/id_rsa.kube-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.1
kex_exchange_identification: read: Connection reset by peer

With ProxyCommand, I was able to connect.

$ ssh kube@192.168.100.4 -i ~/.ssh/id_rsa.kube -o ProxyCommand='ssh -W %h:%p vyos@192.168.0.4'
Warning: Permanently added '192.168.0.4' (ECDSA) to the list of known hosts.
Welcome to VyOS
vyos@192.168.0.4's password: 
Warning: Permanently added '192.168.100.4' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-42-generic x86_64)

 * Documentation:  <https://help.ubuntu.com>
 * Management:     <https://landscape.canonical.com>
 * Support:        <https://ubuntu.com/advantage>

Last login: Thu Jul 22 11:52:21 2021 from 192.168.100.2
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

kube@k8s-master01:~$


Cause

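The cause was that the vrrp State differed between the network the mac is on (VLAN1) and the network k8s is on (VLAN100): vyos01 was MASTER for VLAN1 but BACKUP for VLAN100.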

vyos@vyos01:~$ show vrrp                                              │vyos@vyos02:~$ show vrrp 
Name     Interface      VRID  State      Priority  Last Transition    │Name     Interface      VRID  State      Priority  Last Transition
-------  -----------  ------  -------  ----------  -----------------  │-------  -----------  ------  -------  ----------  -----------------
VLAN1    eth0              1  MASTER          150  26m13s             │VLAN1    eth0              1  BACKUP          100  4h27m1s
VLAN100  eth1            100  BACKUP          150  29m15s             │VLAN100  eth1            100  MASTER          100  4h30m45s
VLAN101  eth2            101  BACKUP          150  29m15s             │VLAN101  eth2            101  MASTER          100  4h30m45s
vyos@vyos01:~$                                                        │vyos@vyos02:~$
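
The comparison above can also be done mechanically. The following is a minimal Python sketch, assuming the `show vrrp` text of each router has already been captured into a string by some means. The function names (`parse_show_vrrp`, `masters_by_group`, `is_split`) are hypothetical helpers written for this note, not part of VyOS.

```python
def parse_show_vrrp(text):
    """Return {group_name: state} from the table printed by `show vrrp`."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows: Name, Interface, VRID, State, Priority, Last Transition.
        # The header and the dashed separator have no numeric VRID column.
        if len(fields) >= 5 and fields[2].isdigit():
            states[fields[0]] = fields[3]
    return states


def masters_by_group(outputs):
    """outputs: {router: show_vrrp_text} -> {group: router that is MASTER}."""
    masters = {}
    for router, text in outputs.items():
        for group, state in parse_show_vrrp(text).items():
            if state == "MASTER":
                masters[group] = router
    return masters


def is_split(masters):
    """True when the MASTER role is spread over more than one router."""
    return len(set(masters.values())) > 1


# Sample input copied from the output shown above.
VYOS01 = """\
Name     Interface      VRID  State      Priority  Last Transition
-------  -----------  ------  -------  ----------  -----------------
VLAN1    eth0              1  MASTER          150  26m13s
VLAN100  eth1            100  BACKUP          150  29m15s
VLAN101  eth2            101  BACKUP          150  29m15s
"""
VYOS02 = """\
Name     Interface      VRID  State      Priority  Last Transition
-------  -----------  ------  -------  ----------  -----------------
VLAN1    eth0              1  BACKUP          100  4h27m1s
VLAN100  eth1            100  MASTER          100  4h30m45s
VLAN101  eth2            101  MASTER          100  4h30m45s
"""

if __name__ == "__main__":
    masters = masters_by_group({"vyos01": VYOS01, "vyos02": VYOS02})
    print(masters)
    print("split:", is_split(masters))
```

For the state captured in this scrap, VLAN1 maps to vyos01 while VLAN100 and VLAN101 map to vyos02, so the check reports a split.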

However, I have not been able to work out the root cause, namely why the kex_exchange_identification error occurs.

$ ssh kube@192.168.100.4 -i ~/.ssh/id_rsa.kube -o ProxyCommand='ssh -W %h:%p vyos@192.168.0.4' -v
OpenSSH_8.1p1, LibreSSL 2.7.3
debug1: Reading configuration data /Users/yumenomatayume/.ssh/config
debug1: /Users/yumenomatayume/.ssh/config line 1: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
debug1: Executing proxy command: exec ssh -W 192.168.100.4:22 vyos@192.168.0.4
debug1: identity file /Users/yumenomatayume/.ssh/id_rsa.kube type 0
debug1: identity file /Users/yumenomatayume/.ssh/id_rsa.kube-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.1
Warning: Permanently added '192.168.0.4' (ECDSA) to the list of known hosts.
Welcome to VyOS
vyos@192.168.0.4's password: 
debug1: Remote protocol version 2.0, remote software version OpenSSH_8.2p1 Ubuntu-4ubuntu0.2
debug1: match: OpenSSH_8.2p1 Ubuntu-4ubuntu0.2 pat OpenSSH* compat 0x04000000
debug1: Authenticating to 192.168.100.4:22 as 'kube'
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ecdsa-sha2-nistp256 SHA256:aVa3cvRE6m3Y5eu2dVCB8FuDbmvZKadl1wi00p53fDI
Warning: Permanently added '192.168.100.4' (ECDSA) to the list of known hosts.
debug1: rekey out after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: rekey in after 134217728 blocks
debug1: Will attempt key: /Users/yumenomatayume/.ssh/id_rsa.kube RSA SHA256:3LjTzRjOs44NxAxrbRf6DNA6+Q3yzNHm883DOrSxvnQ explicit
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<ssh-ed25519,sk-ssh-ed25519@openssh.com,ssh-rsa,rsa-sha2-256,rsa-sha2-512,ssh-dss,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ecdsa-sha2-nistp256@openssh.com>
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey
debug1: Next authentication method: publickey
debug1: Offering public key: /Users/yumenomatayume/.ssh/id_rsa.kube RSA SHA256:3LjTzRjOs44NxAxrbRf6DNA6+Q3yzNHm883DOrSxvnQ explicit
debug1: Server accepts key: /Users/yumenomatayume/.ssh/id_rsa.kube RSA SHA256:3LjTzRjOs44NxAxrbRf6DNA6+Q3yzNHm883DOrSxvnQ explicit
debug1: Authentication succeeded (publickey).
Authenticated to 192.168.100.4 (via proxy).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: pledge: proc
debug1: client_input_global_request: rtype hostkeys-00@openssh.com want_reply 0
debug1: Remote: /home/kube/.ssh/authorized_keys:1: key options: agent-forwarding port-forwarding pty user-rc x11-forwarding
debug1: Remote: /home/kube/.ssh/authorized_keys:1: key options: agent-forwarding port-forwarding pty user-rc x11-forwarding
debug1: Sending environment.
debug1: Sending env LANG = ja_JP.UTF-8
Welcome to Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-42-generic x86_64)

 * Documentation:  <https://help.ubuntu.com>
 * Management:     <https://landscape.canonical.com>
 * Support:        <https://ubuntu.com/advantage>

 * Super-optimized for small spaces - read how we shrank the memory
   footprint of MicroK8s to make it the smallest full K8s around.

   <https://ubuntu.com/blog/microk8s-memory-optimisation>
Last login: Thu Jul 22 12:23:34 2021 from 192.168.100.2
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

In this case, vyos01 holds the VIP of the home network vrrp group.
So the forward path is the one below, yet the routing tables on both routers show nothing unusual.

  • mac -> vyos01(192.168.0.33) -> k8s-master01(192.168.100.4)

vyos@vyos01:~$ show ip route                                          │vyos@vyos02:~$ 
Codes: K - kernel route, C - connected, S - static, R - RIP,          │vyos@vyos02:~$ 
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,             │vyos@vyos02:~$ 
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,      │vyos@vyos02:~$ show ip route 
       F - PBR, f - OpenFabric,                                       │Codes: K - kernel route, C - connected, S - static, R - RIP,
       > - selected route, * - FIB route, q - queued, r - rejected, b │       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
- backup                                                              │       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
                                                                      │       F - PBR, f - OpenFabric,
S>* 0.0.0.0/0 [210/0] via 192.168.0.1, eth0, weight 1, 08:29:46       │       > - selected route, * - FIB route, q - queued, r - rejected, b 
C * 192.168.0.0/24 is directly connected, eth0, 08:29:42              │- backup
C>* 192.168.0.0/24 is directly connected, eth0, 08:29:46              │
C * 192.168.100.0/24 is directly connected, eth1, 08:29:42            │S>* 0.0.0.0/0 [210/0] via 192.168.0.1, eth0, weight 1, 08:28:50
C>* 192.168.100.0/24 is directly connected, eth1, 08:29:47            │C>* 192.168.0.0/24 is directly connected, eth0, 08:28:50
C * 192.168.101.0/24 is directly connected, eth2, 08:29:42            │C>* 192.168.100.0/24 is directly connected, eth1, 08:28:51
C>* 192.168.101.0/24 is directly connected, eth2, 08:29:47            │C>* 192.168.101.0/24 is directly connected, eth2, 08:28:51
vyos@vyos01:~$                                                        │vyos@vyos02:~$ 

Running traceroute from both routers also makes it look like the host is directly reachable.

vyos@vyos01:~$ traceroute 192.168.100.4                               │vyos@vyos02:~$ traceroute 192.168.100.4
traceroute to 192.168.100.4 (192.168.100.4), 30 hops max, 60 byte pack│traceroute to 192.168.100.4 (192.168.100.4), 30 hops max, 60 byte pack
ets                                                                   │ets
 1  192.168.100.4 (192.168.100.4)  0.092 ms  0.096 ms  0.079 ms       │ 1  192.168.100.4 (192.168.100.4)  0.048 ms  0.058 ms  0.053 ms
vyos@vyos01:~$                                                        │vyos@vyos02:~$ 

Solutions

Option 1: Align master/backup across multiple interfaces with sync-group

Add the groups whose master/backup state you want to keep aligned to a sync-group.

Here I put every group into one sync-group, which effectively pins master/backup per router.

set high-availability vrrp sync-group MAIN member VLAN1
set high-availability vrrp sync-group MAIN member VLAN100
set high-availability vrrp sync-group MAIN member VLAN101

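The Priority values need to be kept consistent across the member groups (here, 150 for every group on vyos01 and 100 for every group on vyos02).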

vyos@vyos01:~$ show vrrp                                              │vyos@vyos02:~$ show vrrp 
Name     Interface      VRID  State      Priority  Last Transition    │Name     Interface      VRID  State      Priority  Last Transition
-------  -----------  ------  -------  ----------  -----------------  │-------  -----------  ------  -------  ----------  -----------------
VLAN1    eth0              1  MASTER          150  38s                │VLAN1    eth0              1  BACKUP          100  41s
VLAN100  eth1            100  MASTER          150  38s                │VLAN100  eth1            100  BACKUP          100  41s
VLAN101  eth2            101  MASTER          150  38s                │VLAN101  eth2            101  BACKUP          100  41s
vyos@vyos01:~$                                                        │vyos@vyos02:~$

However, this has a drawback: if a different interface goes down on each router, as below, vrrp ends up in FAULT on both.

  • vyos01: disable eth1
  • vyos02: disable eth2

vyos@vyos01:~$ show vrrp                                              │vyos@vyos02:~$ show vrrp
Name     Interface      VRID  State      Priority  Last Transition    │Name     Interface      VRID  State      Priority  Last Transition
-------  -----------  ------  -------  ----------  -----------------  │-------  -----------  ------  -------  ----------  -----------------
VLAN1    eth0              1  FAULT           150  2m37s              │VLAN1    eth0              1  FAULT           100  2m37s
VLAN100  eth1            100  FAULT           150  2m37s              │VLAN100  eth1            100  FAULT           100  2m37s
VLAN101  eth2            101  FAULT           150  2m37s              │VLAN101  eth2            101  FAULT           100  2m37s
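
The result above can be reasoned about with a small model. This is a sketch of the behavior observed in this scrap, not of how VyOS or keepalived implement sync-groups. It assumes that one failed member interface puts every group of the sync-group on that router into FAULT. The names `MEMBERS`, `synced_states`, and `routers_able_to_serve` are hypothetical.

```python
# Group -> interface mapping taken from the `show vrrp` output in this scrap.
MEMBERS = {"VLAN1": "eth0", "VLAN100": "eth1", "VLAN101": "eth2"}


def synced_states(down_ifaces):
    """Per-group state on one router when all groups share one sync-group.

    Assumption (matches the output above): if any member interface is down,
    every member of the sync-group on that router goes to FAULT.
    """
    any_down = any(iface in down_ifaces for iface in MEMBERS.values())
    return {group: ("FAULT" if any_down else "ELIGIBLE") for group in MEMBERS}


def routers_able_to_serve(down_by_router):
    """Routers on which no group is in FAULT, i.e. candidates for MASTER."""
    return [
        router
        for router, down in down_by_router.items()
        if "FAULT" not in synced_states(down).values()
    ]


if __name__ == "__main__":
    # vyos01 loses eth1 and vyos02 loses eth2, as in the test above.
    scenario = {"vyos01": {"eth1"}, "vyos02": {"eth2"}}
    for router, down in scenario.items():
        print(router, synced_states(down))
    print("can serve:", routers_able_to_serve(scenario))
```

Under this model, the scenario leaves no router able to serve any VLAN, even though each VLAN still has one router with a working interface. That is the cost of tying all groups together.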

Option 2: Pin the master for each interface with preempt

This is what I currently use.

Enable preempt on each interface.
Then give vyos01, which should be master, a higher priority than vyos02.

delete high-availability vrrp group VLAN1 no-preempt
delete high-availability vrrp group VLAN100 no-preempt
delete high-availability vrrp group VLAN101 no-preempt

With this setup, when an active interface is disabled, only the group tied to that interface fails over.
When the interface is enabled again, vyos01 always becomes master again.

The interfaces are not synced, so a single interface going down does not put anything into FAULT.
However, ssh to a network on the other side of vyos works only while the master for every interface is on the same vyos.
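
The trade-off of this option can be sketched in Python as well. This is a simplified model, not the actual VRRP election. It assumes that with preempt enabled, each group independently picks the highest-priority router whose interface is up, and it reuses the priorities 150 and 100 from the earlier `show vrrp` output. The names `elect_masters` and `all_on_one_router` are hypothetical.

```python
# Assumed values, taken from the `show vrrp` output earlier in this scrap.
PRIORITY = {"vyos01": 150, "vyos02": 100}
GROUP_IFACE = {"VLAN1": "eth0", "VLAN100": "eth1", "VLAN101": "eth2"}


def elect_masters(down_by_router):
    """Pick a master per group independently (no sync-group, preempt on).

    down_by_router: {router: set of interfaces that are down}.
    Returns {group: router or None}.
    """
    masters = {}
    for group, iface in GROUP_IFACE.items():
        candidates = [
            router
            for router in PRIORITY
            if iface not in down_by_router.get(router, set())
        ]
        masters[group] = max(candidates, key=PRIORITY.get) if candidates else None
    return masters


def all_on_one_router(masters):
    """Condition stated in the text: every group has the same router as master."""
    owners = set(masters.values())
    return len(owners) == 1 and None not in owners


if __name__ == "__main__":
    healthy = elect_masters({})
    print(healthy, all_on_one_router(healthy))

    # vyos01 loses eth1: only VLAN100 fails over to vyos02.
    degraded = elect_masters({"vyos01": {"eth1"}})
    print(degraded, all_on_one_router(degraded))
```

In the healthy case every group lands on vyos01. With eth1 down on vyos01, only VLAN100 moves to vyos02, so nothing goes to FAULT but the masters are split again, which is the condition under which ssh across vyos stopped working.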

This scrap was closed on 2021/08/17.