OVN-Kubernetes - My VirtualMachine cannot access most external addresses

Let us switch gears and take a look at an OpenShift Virtualization use case where we have a Virtual Machine using a secondary network interface on a VLAN.

Issues with VirtualMachine external access

The VirtualMachine can only access certain websites and external resources.

For reference, we have one NodeNetworkConfigurationPolicy called br-ex-localnet on the br-ex bridge interface:

oc get nncp br-ex-localnet -o yaml
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: br-ex-localnet
spec:
  desiredState:
    ovn:
      bridge-mappings:
      - bridge: br-ex
        localnet: br-ex-localnet
        state: present
  nodeSelector:
    node-role.kubernetes.io/worker: ""

Our VM uses a NetworkAttachmentDefinition called vlan530 that specifies a physicalNetworkName of br-ex-localnet, matching the NodeNetworkConfigurationPolicy spec.desiredState.ovn.bridge-mappings.localnet field, to attach directly to the host's physical network interface, tagged with vlanId 530.

oc get network-attachment-definitions -n default vlan530 -o yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: vlan530
  namespace: default
spec:
  config: '{
    "name":"vlan530",
    "type":"ovn-k8s-cni-overlay",
    "cniVersion":"0.4.0",
    "physicalNetworkName":"br-ex-localnet",
    "vlanId": 530,
    "topology":"localnet",
    "netAttachDefName":"default/vlan530"
    }'
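For context, a VirtualMachine consumes this NAD through a Multus network entry in its spec. A minimal sketch follows; the VM name and bridge interface binding are illustrative, and only the networkName reference is taken from the lab:

```yaml
# Hypothetical VirtualMachine fragment showing the vlan530 attachment.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: external-vm
  namespace: default
spec:
  template:
    spec:
      domain:
        devices:
          interfaces:
          - name: vlan530      # matches the network entry below
            bridge: {}
      networks:
      - name: vlan530
        multus:
          networkName: default/vlan530   # the NAD shown above
```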

Debug the Traffic Flow

In the previous modules we provided ovnkube-trace examples so you could see how easy it is to run traces in a running OpenShift cluster. As of the writing of this lab, the ovnkube-trace wrapper does not support VMs or secondary interfaces.

If you think that functionality would be valuable, add your input or customer case to the feature request: Debugging ovnkube-trace improvements - add ability to trace VMs on secondary networks

Running the Trace

Let’s craft a trace from scratch by filling in the following pieces:

  1. datapath → the logical router or switch associated with the workload, from the northbound database

microflow:

  1. inport → the OVN port of the instance we are tracing from

  2. eth.src → source pod MAC address

  3. eth.dst → destination MAC address

  4. ip4.src → source pod IP address

  5. ip4.dst → external destination address

  6. tcp.dst → destination port

  7. tcp.src → source port

First, find the datapath by looking at the logical switch list:

ovn-nbctl ls-list
cc63f8ab-f7ff-4906-a15b-fb7b24c62eb2 (br.ex.localnet_ovn_localnet_switch)
1524ecbe-0704-4e9f-b97d-9a09e984ef02 (cluster_udn_drenard.udn.tamlab_ovn_layer2_switch)
4168b09e-0580-4d54-9ef2-3c5dc361a0bf (ext_worker-5.ocpv.tamlab.rdu2.redhat.com)
26008754-5629-419b-87cb-bf9fbbf9ff7f (join)
57b99e34-e732-4ce1-8244-eb507ee02ffe (ovs.bridge.vlan530_ovn_localnet_switch)
8763ad07-147d-4f38-b87f-41b5c152a7bb (transit_switch)
16ccd735-09f4-41c1-a546-2df09f22c2f7 (vlan.636.localnet_ovn_localnet_switch)
217a444d-ef97-4d51-89d8-581d673e3cf0 (worker-5.ocpv.tamlab.rdu2.redhat.com)

Based on our NNCP and NAD configuration, we know br.ex.localnet_ovn_localnet_switch (or its associated UUID) is the correct logical switch.

Now that we know the logical switch, we can find the switch port, which will act as the inport for our microflow. Locate the logical switch port (by name or UUID) of our virtual machine using lsp-list on the br.ex.localnet_ovn_localnet_switch logical switch:

ovn-nbctl lsp-list br.ex.localnet_ovn_localnet_switch
ef35b56d-78c1-4464-b53c-4c33c82cfca1 (br.ex.localnet_ovn_localnet_port)
a213268a-3bc1-4988-80d8-7bc18f84818d (default.vlan530_external_virt-launcher-external-vm-x7gnx)
156a0c30-938b-4295-a736-0baaa3f73d94 (default.vlan530_faatam_virt-launcher-faatam-idm1-ghrg2)

We can then fill in the remaining fields from the VMI, the destination IP address we are targeting, and some default values for the TTL and our source and destination ports:

  1. eth.src==02:a1:3b:00:00:15

  2. ip4.src==10.6.153.247

  3. ip4.dst==140.82.112.2

  4. ip.ttl==64

  5. tcp.dst==80

  6. tcp.src==60000

A difference from previous traces is that we do not set an eth.dst MAC address, which we will discuss later in the trace analysis.
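The microflow is simply these key==value pairs joined with &&. A small sketch (illustrative Python, using the values gathered above) shows how the expression is assembled:

```python
# Assemble the ovn-trace microflow expression from the fields above.
# The values are the ones gathered from the VMI in this lab.
fields = {
    "inport": '"default.vlan530_external_virt-launcher-external-vm-x7gnx"',
    "eth.src": "02:a1:3b:00:00:15",
    "ip4.src": "10.6.153.247",
    "ip4.dst": "140.82.112.2",
    "ip.ttl": "64",
    "tcp.dst": "80",
    "tcp.src": "60000",
}
microflow = " && ".join(f"{key}=={value}" for key, value in fields.items())
print(microflow)
```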

Putting all of that together, you can run the following ovn-trace from your execution environment:

ovn-trace --no-leader-only  --db unix:/var/run/ovn/ovnsb_db.sock br.ex.localnet_ovn_localnet_switch 'inport=="default.vlan530_external_virt-launcher-external-vm-x7gnx" && eth.src==02:a1:3b:00:00:15 && ip4.src==10.6.153.247 && ip4.dst==140.82.112.2 && ip.ttl==64 && tcp.dst==80 && tcp.src==60000'

When the ovn-trace runs, you get a full accounting of how that packet would traverse the SDN and where it would exit to hit the desired external resource.

ingress(dp="br.ex.localnet_ovn_localnet_switch", inport="default.vlan530_external_virt-launcher-external-vm-x7gnx")
-------------------------------------------------------------------------------------------------------------------
 0. ls_in_check_port_sec (northd.c:9437): 1, priority 50, uuid fc8d6873
    reg0[15] = check_in_port_sec();
    next;
 4. ls_in_pre_acl (northd.c:6092): ip, priority 100, uuid 6aeebe54
    reg0[0] = 1;
    next;
 6. ls_in_pre_stateful (northd.c:6342): reg0[0] == 1, priority 100, uuid 516d2bfd
    ct_next(dnat);

ct_next(ct_state=est|trk /* default (use --ct to customize) */)
---------------------------------------------------------------
 7. ls_in_acl_hint (northd.c:6437): !ct.new && ct.est && !ct.rpl && ct_mark.blocked == 0, priority 4, uuid cdda5685
    reg0[8] = 1;
    reg0[10] = 1;
    next;
10. ls_in_acl_action (northd.c:7335): 1, priority 0, uuid 21f3b456
    reg8[16] = 0;
    reg8[17] = 0;
    reg8[18] = 0;
    next;
20. ls_in_acl_after_lb_action (northd.c:7346): reg8[30..31] == 0, priority 500, uuid 2ad8154b
    reg8[30..31] = 1;
    next(18);
20. ls_in_acl_after_lb_action (northd.c:7346): reg8[30..31] == 1, priority 500, uuid 2b82ad72
    reg8[30..31] = 2;
    next(18);
18. ls_in_acl_after_lb_eval (northd.c:7175): reg8[30..31] == 2 && reg0[10] == 1 && (inport == @a10139399715150233253), priority 2000, uuid a20fd388
    reg8[17] = 1;
    ct_commit { ct_mark.blocked = 1; ct_label.obs_point_id = 0; };
    next;
20. ls_in_acl_after_lb_action (northd.c:7319): reg8[17] == 1, priority 1000, uuid e3c8afab
    reg8[16] = 0;
    reg8[17] = 0;
    reg8[18] = 0;
    reg8[30..31] = 0;

If we look at the output, we can see ls_in_acl_after_lb_eval is evaluating inport == @a10139399715150233253.

As with previous failures, we see an abrupt end with no clear indication of success or failure. We also commit ct_mark.blocked = 1, which means the connection is no longer allowed by policy and any reply traffic will be dropped.

18. ls_in_acl_after_lb_eval (northd.c:7175): reg8[30..31] == 2 && reg0[10] == 1 && (inport == @a10139399715150233253), priority 2000, uuid a20fd388
    reg8[17] = 1;
    ct_commit { ct_mark.blocked = 1; ct_label.obs_point_id = 0; };

We can also look at the inport a10139399715150233253 as a port group, just like we did with the outport in previous traces.

ovn-nbctl find Port_Group name="a10139399715150233253"
_uuid               : 87422145-5e09-4d4c-be45-2690faa9482a
acls                : [14b9c1ad-15fa-4049-96e6-84da7e674128, cce499ff-ddff-48a8-b74c-fdeb707fb685]
external_ids        : {direction=Egress, "k8s.ovn.org/id"="br-ex-localnet-network-controller:NetpolNamespace:external:Egress", "k8s.ovn.org/name"=external, "k8s.ovn.org/owner-controller"=br-ex-localnet-network-controller, "k8s.ovn.org/owner-type"=NetpolNamespace}
name                : a10139399715150233253
ports               : [a213268a-3bc1-4988-80d8-7bc18f84818d]

From the output there are 2 pieces of important information:

  1. acls → there are 2 UUIDs: [14b9c1ad-15fa-4049-96e6-84da7e674128, cce499ff-ddff-48a8-b74c-fdeb707fb685]

  2. ports → there is 1 UUID: [a213268a-3bc1-4988-80d8-7bc18f84818d]

You can see that the port UUID corresponds to the UUID of the logical switch port entry for default.vlan530_external_virt-launcher-external-vm-x7gnx, which confirms the port group is influencing our VM.

Taking a look at the 2 access control list (ACL) objects, we can see:

ovn-nbctl list ACL 14b9c1ad-15fa-4049-96e6-84da7e674128
_uuid               : 14b9c1ad-15fa-4049-96e6-84da7e674128
action              : allow
direction           : from-lport
external_ids        : {direction=Egress, "k8s.ovn.org/id"="br-ex-localnet-network-controller:NetpolNamespace:external:Egress:arpAllow", "k8s.ovn.org/name"=external, "k8s.ovn.org/owner-controller"=br-ex-localnet-network-controller, "k8s.ovn.org/owner-type"=NetpolNamespace, type=arpAllow}
label               : 0
log                 : false
match               : "inport == @a10139399715150233253 && (arp || nd)"
meter               : acl-logging
name                : "NP:external:Egress"
options             : {apply-after-lb="true"}
priority            : 1001
sample_est          : []
sample_new          : []
severity            : []
tier                : 2

You can ignore the first ACL, as it matches on arp || nd and we are sending TCP requests.

ovn-nbctl list ACL cce499ff-ddff-48a8-b74c-fdeb707fb685
_uuid               : cce499ff-ddff-48a8-b74c-fdeb707fb685
action              : drop
direction           : from-lport
external_ids        : {direction=Egress, "k8s.ovn.org/id"="br-ex-localnet-network-controller:NetpolNamespace:external:Egress:defaultDeny", "k8s.ovn.org/name"=external, "k8s.ovn.org/owner-controller"=br-ex-localnet-network-controller, "k8s.ovn.org/owner-type"=NetpolNamespace, type=defaultDeny}
label               : 0
log                 : false
match               : "inport == @a10139399715150233253"
meter               : acl-logging
name                : "NP:external:Egress"
options             : {apply-after-lb="true"}
priority            : 1000
sample_est          : []
sample_new          : []
severity            : []
tier                : 2

Looking at the output from the second ACL, we can see a few key points:

  1. action → drop

  2. direction → from-lport

  3. external_ids → external:Egress:defaultDeny

  4. match → inport == @a10139399715150233253

This tells us that any traffic sent from a logical port matching inport == @a10139399715150233253 will be dropped due to a defaultDeny policy in the external namespace.

Looking back at our Lab Setup Network Policies, we know that the cluster has a deny-by-default MultiNetworkPolicy in every Namespace. That aligns with the policy above that is impacting our external communication.
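The lab's exact manifest is not reproduced here, but a deny-by-default egress MultiNetworkPolicy of the kind described would look roughly like this (a sketch; the name is hypothetical, and the policy-for annotation scopes the policy to the vlan530 secondary network):

```yaml
# Hypothetical deny-by-default egress policy for the secondary network.
apiVersion: k8s.cni.cncf.io/v1beta1
kind: MultiNetworkPolicy
metadata:
  name: default-deny-egress
  namespace: external
  annotations:
    k8s.v1.cni.cncf.io/policy-for: default/vlan530
spec:
  podSelector: {}    # selects every pod in the namespace
  policyTypes:
  - Egress           # no egress rules listed, so all egress is denied
```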

Now let’s run a trace that is successful.

We change our destination address to something that has been working: ip4.dst==140.82.113.2

ovn-trace --no-leader-only  --db unix:/var/run/ovn/ovnsb_db.sock br.ex.localnet_ovn_localnet_switch 'inport=="default.vlan530_external_virt-launcher-external-vm-x7gnx" && eth.src==02:a1:3b:00:00:15 && ip4.src==10.6.153.247 && ip4.dst==140.82.113.2 && ip.ttl==64 && tcp.dst==80 && tcp.src==60000'
# tcp,reg14=0x5,vlan_tci=0x0000,dl_src=02:a1:3b:00:00:15,dl_dst=00:00:00:00:00:00,nw_src=10.6.153.247,nw_dst=140.82.113.2,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=60000,tp_dst=80,tcp_flags=0

ingress(dp="br.ex.localnet_ovn_localnet_switch", inport="default.vlan530_external_virt-launcher-external-vm-x7gnx")
-------------------------------------------------------------------------------------------------------------------
 0. ls_in_check_port_sec (northd.c:9437): 1, priority 50, uuid fc8d6873
    reg0[15] = check_in_port_sec();
    next;
 4. ls_in_pre_acl (northd.c:6092): ip, priority 100, uuid 6aeebe54
    reg0[0] = 1;
    next;
 6. ls_in_pre_stateful (northd.c:6342): reg0[0] == 1, priority 100, uuid 516d2bfd
    ct_next(dnat);

ct_next(ct_state=est|trk /* default (use --ct to customize) */)
---------------------------------------------------------------
 7. ls_in_acl_hint (northd.c:6437): !ct.new && ct.est && !ct.rpl && ct_mark.blocked == 0, priority 4, uuid cdda5685
    reg0[8] = 1;
    reg0[10] = 1;
    next;
10. ls_in_acl_action (northd.c:7335): 1, priority 0, uuid 21f3b456
    reg8[16] = 0;
    reg8[17] = 0;
    reg8[18] = 0;
    next;
20. ls_in_acl_after_lb_action (northd.c:7346): reg8[30..31] == 0, priority 500, uuid 2ad8154b
    reg8[30..31] = 1;
    next(18);
20. ls_in_acl_after_lb_action (northd.c:7346): reg8[30..31] == 1, priority 500, uuid 2b82ad72
    reg8[30..31] = 2;
    next(18);
18. ls_in_acl_after_lb_eval (northd.c:7127): reg8[30..31] == 2 && reg0[8] == 1 && (ip4.dst == 140.82.113.0/24 && inport == @a6086215553377573259), priority 2001, uuid 22969843
    reg8[16] = 1;
    next;
20. ls_in_acl_after_lb_action (northd.c:7314): reg8[16] == 1, priority 1000, uuid a00e31d8
    reg8[16] = 0;
    reg8[17] = 0;
    reg8[18] = 0;
    reg8[30..31] = 0;
    next;
28. ls_in_l2_lkup (northd.c:5883): 1, priority 0, uuid 222432ad
    outport = get_fdb(eth.dst);
    next;
29. ls_in_l2_unknown (northd.c:9374): outport == "none", priority 50, uuid 64d6fbdf
    outport = "_MC_unknown";
    output;

multicast(dp="br.ex.localnet_ovn_localnet_switch", mcgroup="_MC_unknown")
-------------------------------------------------------------------------

    egress(dp="br.ex.localnet_ovn_localnet_switch", inport="default.vlan530_external_virt-launcher-external-vm-x7gnx", outport="br.ex.localnet_ovn_localnet_port")
    --------------------------------------------------------------------------------------------------------------------------------------------------------------
         2. ls_out_pre_acl (northd.c:5944): ip && outport == "br.ex.localnet_ovn_localnet_port", priority 110, uuid bf10c02c
            next;
         3. ls_out_pre_lb (northd.c:5944): ip && outport == "br.ex.localnet_ovn_localnet_port", priority 110, uuid 9ee8a726
            next;
         5. ls_out_acl_hint (northd.c:6437): !ct.new && ct.est && !ct.rpl && ct_mark.blocked == 0, priority 4, uuid cfae0de0
            reg0[8] = 1;
            reg0[10] = 1;
            next;
         8. ls_out_acl_action (northd.c:7346): reg8[30..31] == 0, priority 500, uuid 9ba0e4eb
            reg8[30..31] = 1;
            next(6);
         8. ls_out_acl_action (northd.c:7346): reg8[30..31] == 1, priority 500, uuid a72c3c38
            reg8[30..31] = 2;
            next(6);
         8. ls_out_acl_action (northd.c:7335): 1, priority 0, uuid 2e305a31
            reg8[16] = 0;
            reg8[17] = 0;
            reg8[18] = 0;
            reg8[30..31] = 0;
            next;
        11. ls_out_check_port_sec (northd.c:5904): 1, priority 0, uuid a87e1d41
            reg0[15] = check_out_port_sec();
            next;
        12. ls_out_apply_port_sec (northd.c:5912): 1, priority 0, uuid 7e9dcb6d
            output;
            /* output to "br.ex.localnet_ovn_localnet_port", type "localnet" */

We can see that the request is successful, outputting to "br.ex.localnet_ovn_localnet_port", type "localnet".

Unique to this output is the multicast flood:

28. ls_in_l2_lkup (northd.c:5883): 1, priority 0, uuid 222432ad
    outport = get_fdb(eth.dst);
    next;
29. ls_in_l2_unknown (northd.c:9374): outport == "none", priority 50, uuid 64d6fbdf
    outport = "_MC_unknown";
    output;

multicast(dp="br.ex.localnet_ovn_localnet_switch", mcgroup="_MC_unknown")

Earlier in the lab I mentioned we were not setting an eth.dst. This is because the destination MAC of the external resource is unknown inside the SDN, so even if we set it, it will still be treated as unknown.

You can see this when it tries to do a layer 2 lookup, get_fdb(eth.dst), to find the outport but gets nothing back.

This means we flood the request to the logical switch ports that have addresses set to "unknown" aka the _MC_unknown "multicast" group.

This can be seen by executing a show on the logical switch our VM is attached to (br.ex.localnet_ovn_localnet_switch):

ovn-nbctl show br.ex.localnet_ovn_localnet_switch

In it, you can see br.ex.localnet_ovn_localnet_port has addresses set to unknown and that is ultimately where the request went:

switch cc63f8ab-f7ff-4906-a15b-fb7b24c62eb2 (br.ex.localnet_ovn_localnet_switch)
    port default.vlan530_faatam_virt-launcher-faatam-idm1-ghrg2
        addresses: ["02:a1:3b:00:00:2f"]
    port default.vlan530_external_virt-launcher-external-vm-x7gnx
        addresses: ["02:a1:3b:00:00:15"]
    port br.ex.localnet_ovn_localnet_port
        type: localnet
        addresses: ["unknown"]
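The flood-to-unknown behavior can be modeled with a small sketch (illustrative Python, using the MACs and port names from the show output above):

```python
# Toy model of ls_in_l2_lkup / ls_in_l2_unknown: a known destination MAC
# forwards to its logical switch port; an unknown MAC floods to the ports
# whose addresses are "unknown" (the _MC_unknown group).
fdb = {
    "02:a1:3b:00:00:2f": "default.vlan530_faatam_virt-launcher-faatam-idm1-ghrg2",
    "02:a1:3b:00:00:15": "default.vlan530_external_virt-launcher-external-vm-x7gnx",
}
unknown_ports = ["br.ex.localnet_ovn_localnet_port"]  # addresses: ["unknown"]

def lookup_outports(eth_dst):
    port = fdb.get(eth_dst)                    # outport = get_fdb(eth.dst)
    return [port] if port else unknown_ports   # outport = "_MC_unknown"

# An external gateway MAC is not in the FDB, so the packet exits via the
# localnet port (the hypothetical MAC below stands in for any external host):
print(lookup_outports("00:11:22:33:44:55"))  # → ['br.ex.localnet_ovn_localnet_port']
```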

Moving on, we can see ls_in_acl_after_lb_eval is evaluating ip4.dst == 140.82.113.0/24 && inport == @a6086215553377573259:

18. ls_in_acl_after_lb_eval (northd.c:7127): reg8[30..31] == 2 && reg0[8] == 1 && (ip4.dst == 140.82.113.0/24 && inport == @a6086215553377573259), priority 2001, uuid 22969843

We can see that the ip4.dst subnet range 140.82.113.0/24 includes our target destination address of 140.82.113.2.
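You can confirm that containment quickly with Python's standard ipaddress module:

```python
# Verify which trace destinations fall inside the allowed range from the ACL.
import ipaddress

allowed = ipaddress.ip_network("140.82.113.0/24")
print(ipaddress.ip_address("140.82.113.2") in allowed)  # True: the successful trace
print(ipaddress.ip_address("140.82.112.2") in allowed)  # False: the dropped trace
```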

Looking at the inport a6086215553377573259 as a portgroup:

ovn-nbctl find Port_Group name="a6086215553377573259"
_uuid               : 47add0e0-a59a-4aca-94b0-047bf42c6c10
acls                : [a2a890ed-68b4-46f2-9d72-a6f8bc82e242, e408d3fb-9869-435a-ab9a-e56186f2cc02]
external_ids        : {"k8s.ovn.org/id"="br-ex-localnet-network-controller:NetworkPolicy:external:egressallow-gh", "k8s.ovn.org/name"="external:egressallow-gh", "k8s.ovn.org/owner-controller"=br-ex-localnet-network-controller, "k8s.ovn.org/owner-type"=NetworkPolicy}
name                : a6086215553377573259
ports               : [a213268a-3bc1-4988-80d8-7bc18f84818d]

We can see 2 pieces of important information:

  1. acls → there are 2 UUIDs: [a2a890ed-68b4-46f2-9d72-a6f8bc82e242, e408d3fb-9869-435a-ab9a-e56186f2cc02]

  2. ports → there is 1 UUID: [a213268a-3bc1-4988-80d8-7bc18f84818d]

The port UUID corresponds to the UUID of the logical switch port entry for default.vlan530_external_virt-launcher-external-vm-x7gnx, which confirms the port group is influencing our VM.

Looking at the 2 access control list (ACL) objects, we can see:

ovn-nbctl list ACL a2a890ed-68b4-46f2-9d72-a6f8bc82e242

This ACL has a match rule ip4.dst == 140.82.113.0/24 that allows traffic to that range, stemming from a NetworkPolicy called egressallow-gh in the external namespace.

_uuid               : a2a890ed-68b4-46f2-9d72-a6f8bc82e242
action              : allow-related
direction           : from-lport
external_ids        : {direction=Egress, gress-index="0", ip-block-index="0", "k8s.ovn.org/id"="br-ex-localnet-network-controller:NetworkPolicy:external:egressallow-gh:Egress:0:None:0", "k8s.ovn.org/name"="external:egressallow-gh", "k8s.ovn.org/owner-controller"=br-ex-localnet-network-controller, "k8s.ovn.org/owner-type"=NetworkPolicy, port-policy-protocol=None}
label               : 0
log                 : false
match               : "ip4.dst == 140.82.113.0/24 && inport == @a6086215553377573259"
meter               : acl-logging
name                : "NP:external:egressallow-gh:Egress:0"
options             : {apply-after-lb="true"}
priority            : 1001
sample_est          : []
sample_new          : []
severity            : []
tier                : 2

ovn-nbctl list ACL e408d3fb-9869-435a-ab9a-e56186f2cc02

The second ACL has a match rule ip4.dst == 8.8.8.8/32 that allows traffic to that address, stemming from the same NetworkPolicy, egressallow-gh, in the external namespace.

_uuid               : e408d3fb-9869-435a-ab9a-e56186f2cc02
action              : allow-related
direction           : from-lport
external_ids        : {direction=Egress, gress-index="0", ip-block-index="1", "k8s.ovn.org/id"="br-ex-localnet-network-controller:NetworkPolicy:external:egressallow-gh:Egress:0:None:1", "k8s.ovn.org/name"="external:egressallow-gh", "k8s.ovn.org/owner-controller"=br-ex-localnet-network-controller, "k8s.ovn.org/owner-type"=NetworkPolicy, port-policy-protocol=None}
label               : 0
log                 : false
match               : "ip4.dst == 8.8.8.8/32 && inport == @a6086215553377573259"
meter               : acl-logging
name                : "NP:external:egressallow-gh:Egress:0"
options             : {apply-after-lb="true"}
priority            : 1001
sample_est          : []
sample_new          : []
severity            : []
tier                : 2

Looking back at the Lab Setup Network Policies, the MultiNetworkPolicy ipBlock entries correspond to the match strings in the ACLs.

We also see that the MultiNetworkPolicy only applies to default/vlan530 and pods with the internet: "true" label.

Based on that information, we know that our policy is blocking all external traffic by default and only allowing access to 140.82.113.0/24 and 8.8.8.8/32.
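Pulling the ACL evidence together, egressallow-gh can be reconstructed approximately as follows. This is a sketch inferred from the ACL external_ids (gress-index 0 with ip-block-index 0 and 1 suggests a single egress rule with two ipBlocks) and the lab setup details above; it is not the literal manifest:

```yaml
# Approximate reconstruction of the egressallow-gh policy.
apiVersion: k8s.cni.cncf.io/v1beta1
kind: MultiNetworkPolicy
metadata:
  name: egressallow-gh
  namespace: external
  annotations:
    k8s.v1.cni.cncf.io/policy-for: default/vlan530
spec:
  podSelector:
    matchLabels:
      internet: "true"   # per the lab setup, only labeled pods get egress
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 140.82.113.0/24   # matches the first ACL
    - ipBlock:
        cidr: 8.8.8.8/32        # matches the second ACL
```

Allowing the failing destination from the first trace would mean adding its range (for example, 140.82.112.0/24) as another ipBlock in this rule.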