
CCv0 | kata-remote | Issues with running conformance on kata-remote runtime class #5779

sharath-srikanth-chellappa opened this issue Oct 5, 2023 · 0 comments
Labels: question (Requires an answer)

I am trying to run the e2e conformance tests as part of my setup on the kata-remote runtime class. I notice that the YAML files for the test pods are very different from those generated for the kata or kata-cc runtime classes. I understand that these differences may arise because parameters in configuration-remote.toml differ from those in the kata and kata-cc configurations, which raises the question of whether the tests are even supported on kata-remote.

To compare the two configuration files:

diff kata-config-toml/configuration-remote.toml configuration-clh-snp.toml
1,2c1,2
< # Copyright (c) 2017-2019 Intel Corporation
< # Copyright (c) 2023 IBM Corporation
---
> # Copyright (c) 2019 Ericsson Eurolab Deutschland GmbH
> # Copyright (c) 2021 Adobe Inc.
9c9
< # XXX: Source file: "config/configuration-remote.toml.in"
---
> # XXX: Source file: "config/configuration-clh.toml.in"
14,18c14,23
<
< [hypervisor.remote]
< remote_hypervisor_socket = "/run/peerpod/hypervisor.sock"
< remote_hypervisor_timeout = 600
<
---
> [hypervisor.clh]
> path = "/opt/confidential-containers/bin/cloud-hypervisor-snp"
> igvm = "/opt/confidential-containers/share/kata-containers/kata-containers-igvm.img"
> image = "/opt/confidential-containers/share/kata-containers/kata-containers.img"
>
> # rootfs filesystem type:
> #   - ext4 (default)
> #   - xfs
> #   - erofs
> rootfs_type="ext4"
29c34
< #   - CPU Hotplug
---
> #   - CPU Hotplug
32a38,40
> # Supported TEEs:
> # * Intel TDX
> #
34c42,67
< # confidential_guest = true
---
> confidential_guest = true
>
> # enable SEV SNP VMs.
> # This is not currently used by CLH
> sev_snp_guest = true
>
> # SNP guest policy
> # Based on SEV Secure Nested Paging Firmware ABI Specification section 4.3
> # If it is unspecified or 0, it will default to 0x30000 (i.e. Bit#17 is '1' which is reserved and Bit#16 is '1' which means SMT is allowed).
> # This is not currently used by CLH
> snp_guest_policy=0x30000
>
> # Enable running clh VMM as a non-root user.
> # By default clh VMM run as root. When this is set to true, clh VMM process runs as
> # a non-root random user. See documentation for the limitations of this mode.
> # rootless = true
>
> # disable applying SELinux on the VMM process (default false)
> disable_selinux=false
>
> # disable applying SELinux on the container process
> # If set to false, the type `container_t` is applied to the container process by default.
> # Note: To enable guest SELinux, the guest rootfs must be CentOS that is created and built
> # with `SELINUX=yes`.
> # (default: true)
> disable_guest_selinux=true
35a69,78
> # Path to the firmware.
> # If you want Cloud Hypervisor to use a specific firmware, set its path below.
> # This is option is only used when confidential_guest is enabled.
> #
> # For more information about firmwared that can be used with specific TEEs,
> # please, refer to:
> # * Intel TDX:
> #   - td-shim: https://github.com/confidential-containers/td-shim
> #
> # firmware = ""
40,41c83,89
< # Note: Remote hypervisor is only handling the following annotations
< enable_annotations = ["machine_type", "default_memory", "default_vcpus", "image", "volume_name"]
---
> enable_annotations = ["enable_iommu"]
>
> # List of valid annotations values for the hypervisor
> # Each member of the list is a path pattern as described by glob(3).
> # The default if not set is empty (all annotations rejected.)
> # Your distribution recommends: ["/opt/confidential-containers/bin/cloud-hypervisor-snp"]
> valid_hypervisor_paths = ["/opt/confidential-containers/bin/cloud-hypervisor-snp"]
53,58c101
< # NOTE: kernel_params are not currently passed over in remote hypervisor
< # kernel_params = ""
<
< # Path to the firmware.
< # If you want that qemu uses the default firmware leave this option empty
< firmware = ""
---
> kernel_params = " agent.enable_signature_verification=false "
65c108
< # default_vcpus = 1
---
> default_vcpus = 1
81,94c124
< # NOTICE: on arm platform with gicv2 interrupt controller, set it to 8.
< # default_maxvcpus = 0
<
< # Bridges can be used to hot plug devices.
< # Limitations:
< # * Currently only pci bridges are supported
< # * Until 30 devices per bridge can be hot plugged.
< # * Until 5 PCI bridges can be cold plugged per VM.
< #   This limitation could be a bug in qemu or in the kernel
< # Default number of bridges per SB/VM:
< # unspecified or 0   --> will be set to 1
< # > 1 <= 5           --> will be set to the specified number
< # > 5                --> will be set to 5
< default_bridges = 1
---
> default_maxvcpus = 0
98,100c128,129
< # Note: the remote hypervisor uses the peer pod config to determine the memory of the VM
< # default_memory = 2048
< #
---
> default_memory = 2048
>
104d132
< # Note: the remote hypervisor uses the peer pod config to determine the memory of the VM
106a135,196
> # Default maximum memory in MiB per SB / VM
> # unspecified or == 0           --> will be set to the actual amount of physical RAM
> # > 0 <= amount of physical RAM --> will be set to the specified number
> # > amount of physical RAM      --> will be set to the actual amount of physical RAM
> default_maxmemory = 0
>
> # Shared file system type:
> #   - virtio-fs (default)
> #   - virtio-fs-nydus
> shared_fs = "virtio-fs"
>
> # Path to vhost-user-fs daemon.
> virtio_fs_daemon = "/opt/confidential-containers/libexec/virtiofsd"
>
> # List of valid annotations values for the virtiofs daemon
> # The default if not set is empty (all annotations rejected.)
> # Your distribution recommends: ["/opt/confidential-containers/libexec/virtiofsd"]
> valid_virtio_fs_daemon_paths = ["/opt/confidential-containers/libexec/virtiofsd"]
>
> # Default size of DAX cache in MiB
> virtio_fs_cache_size = 0
>
> # Default size of virtqueues
> virtio_fs_queue_size = 1024
>
> # Extra args for virtiofsd daemon
> #
> # Format example:
> #   ["-o", "arg1=xxx,arg2", "-o", "hello world", "--arg3=yyy"]
> # Examples:
> #   Set virtiofsd log level to debug : ["-o", "log_level=debug"] or ["-d"]
> # see `virtiofsd -h` for possible options.
> virtio_fs_extra_args = ["--thread-pool-size=1", "-o", "announce_submounts"]
>
> # Cache mode:
> #
> #  - never
> #    Metadata, data, and pathname lookup are not cached in guest. They are
> #    always fetched from host and any changes are immediately pushed to host.
> #
> #  - auto
> #    Metadata and pathname lookup cache expires after a configured amount of
> #    time (default is 1 second). Data is cached while the file is open (close
> #    to open consistency).
> #
> #  - always
> #    Metadata, data, and pathname lookup are cached in guest and never expire.
> virtio_fs_cache = "auto"
>
> # Block storage driver to be used for the hypervisor in case the container
> # rootfs is backed by a block device. This is virtio-blk.
> block_device_driver = "virtio-blk"
>
> # Enable huge pages for VM RAM, default false
> # Enabling this will result in the VM memory
> # being allocated using huge pages.
> #enable_hugepages = true
>
> # Disable the 'seccomp' feature from Cloud Hypervisor, default false
> # TODO - to be re-enabled with next CH-SNP release. This is fixed but the fix is not yet released
> # disable_seccomp = true
>
108c198
< # to enable debug output where available. And Debug also enable the hmp socket.
---
> # to enable debug output where available.
128,139c218,290
< #guest_hook_path = "/usr/share/oci/hooks"
<
< # disable applying SELinux on the VMM process (default false)
< disable_selinux=false
<
< # disable applying SELinux on the container process
< # If set to false, the type `container_t` is applied to the container process by default.
< # Note: To enable guest SELinux, the guest rootfs must be CentOS that is created and built
< # with `SELINUX=yes`.
< # (default: true)
< # Note: The remote hypervisor has a different guest, so currently requires this to be disabled
< disable_guest_selinux = true
---
> #guest_hook_path = "/opt/confidential-containers/share/oci/hooks"
> #
> # These options are related to network rate limiter at the VMM level, and are
> # based on the Cloud Hypervisor I/O throttling.  Those are disabled by default
> # and we strongly advise users to refer the Cloud Hypervisor official
> # documentation for a better understanding of its internals:
> # https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/docs/io_throttling.md
> #
> # Bandwidth rate limiter options
> #
> # net_rate_limiter_bw_max_rate controls network I/O bandwidth (size in bits/sec
> # for SB/VM).
> # The same value is used for inbound and outbound bandwidth.
> # Default 0-sized value means unlimited rate.
> #net_rate_limiter_bw_max_rate = 0
> #
> # net_rate_limiter_bw_one_time_burst increases the initial max rate and this
> # initial extra credit does *NOT* affect the overall limit and can be used for
> # an *initial* burst of data.
> # This is *optional* and only takes effect if net_rate_limiter_bw_max_rate is
> # set to a non zero value.
> #net_rate_limiter_bw_one_time_burst = 0
> #
> # Operation rate limiter options
> #
> # net_rate_limiter_ops_max_rate controls network I/O bandwidth (size in ops/sec
> # for SB/VM).
> # The same value is used for inbound and outbound bandwidth.
> # Default 0-sized value means unlimited rate.
> #net_rate_limiter_ops_max_rate = 0
> #
> # net_rate_limiter_ops_one_time_burst increases the initial max rate and this
> # initial extra credit does *NOT* affect the overall limit and can be used for
> # an *initial* burst of data.
> # This is *optional* and only takes effect if net_rate_limiter_bw_max_rate is
> # set to a non zero value.
> #net_rate_limiter_ops_one_time_burst = 0
> #
> # These options are related to disk rate limiter at the VMM level, and are
> # based on the Cloud Hypervisor I/O throttling.  Those are disabled by default
> # and we strongly advise users to refer the Cloud Hypervisor official
> # documentation for a better understanding of its internals:
> # https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/docs/io_throttling.md
> #
> # Bandwidth rate limiter options
> #
> # disk_rate_limiter_bw_max_rate controls disk I/O bandwidth (size in bits/sec
> # for SB/VM).
> # The same value is used for inbound and outbound bandwidth.
> # Default 0-sized value means unlimited rate.
> #disk_rate_limiter_bw_max_rate = 0
> #
> # disk_rate_limiter_bw_one_time_burst increases the initial max rate and this
> # initial extra credit does *NOT* affect the overall limit and can be used for
> # an *initial* burst of data.
> # This is *optional* and only takes effect if disk_rate_limiter_bw_max_rate is
> # set to a non zero value.
> #disk_rate_limiter_bw_one_time_burst = 0
> #
> # Operation rate limiter options
> #
> # disk_rate_limiter_ops_max_rate controls disk I/O bandwidth (size in ops/sec
> # for SB/VM).
> # The same value is used for inbound and outbound bandwidth.
> # Default 0-sized value means unlimited rate.
> #disk_rate_limiter_ops_max_rate = 0
> #
> # disk_rate_limiter_ops_one_time_burst increases the initial max rate and this
> # initial extra credit does *NOT* affect the overall limit and can be used for
> # an *initial* burst of data.
> # This is *optional* and only takes effect if disk_rate_limiter_bw_max_rate is
> # set to a non zero value.
> #disk_rate_limiter_ops_one_time_burst = 0
168,169c319,320
< # (default: 30)
< #dial_timeout = 30
---
> # (default: 90)
> dial_timeout = 90
193,194c344
< # Note: The remote hypervisor, uses it's own network, so "none" is required
< internetworking_model="none"
---
> internetworking_model="tcfilter"
201d350
< # Note: The remote hypervisor has a different guest, so currently requires this to be set to true
204d352
<
234,235c382
< # Note: The remote hypervisor has a different networking model, which requires true
< disable_new_netns = true
---
> #disable_new_netns = true
252d398
< # Note: the remote hypervisor uses the peer pod config to determine the sandbox size, so requires this to be set to true
254a401,406
> # If specified, sandbox_bind_mounts identifieds host paths to be mounted (ro) into the sandboxes shared path.
> # This is only valid if filesystem sharing is utilized. The provided path(s) will be bindmounted into the shared fs directory.
> # If defaults are utilized, these mounts should be available in the guest at `/run/kata-containers/shared/containers/sandbox-mounts`
> # These will not be exposed to the container workloads, and are only provided for potential guest services.
> sandbox_bind_mounts=[]
>
278d429
< # Note: remote hypervisor has no sharing of emptydir mounts from host to guest
299,300c450
< # Note: The remote hypervisor offloads the pulling on images on the peer pod VM, so requries this to be true
< service_offload = true
---
> service_offload = false

The same set of tests that pass for the kata and kata-cc runtime classes fails with the kata-remote runtime class.

I am not able to pinpoint what in the configuration gives rise to this difference. An example of a test that fails on kata-remote but passes on kata/kata-cc is the "should function for intra-pod communication: udp" test.
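
For reference, a minimal sketch of re-running only that failing case in isolation, assuming the suite is driven with the upstream e2e.test binary (the provider and kubeconfig values here are placeholders for this setup, not anything kata-specific):

# Re-run only the failing networking test; the focus string matches part of the test name.
# -provider and -kubeconfig are placeholders for this particular cluster setup.
./e2e.test -kubeconfig="$HOME/.kube/config" -provider=skeleton \
  -ginkgo.focus="intra-pod communication: udp"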

Some interesting differences that I observed between the YAMLs of the test pods generated for kata and kata-remote are given below:

diff sample/netserver-0-kata.yaml sample/netserver-0-kata-remote.yaml
5,6c5,28
<   creationTimestamp: "2023-10-04T23:07:51Z"
---
>     cni.projectcalico.org/containerID: 9e3a3c5f55b3e686924e3485a4de1b9175ac716dfa55c529bbed93269a89509a
>     cni.projectcalico.org/podIP: 172.28.95.139/32
>     cni.projectcalico.org/podIPs: 172.28.95.139/32
>     k8s.v1.cni.cncf.io/network-status: |-
>       [{
>           "name": "k8s-pod-network",
>           "ips": [
>               "172.28.95.139"
>           ],
>           "default": true,
>           "dns": {}
>       }]
>     k8s.v1.cni.cncf.io/networks-status: |-
>       [{
>           "name": "k8s-pod-network",
>           "ips": [
>               "172.28.95.139"
>           ],
>           "default": true,
>           "dns": {}
>       }]
>   creationTimestamp: "2023-10-05T17:10:09Z"
>   deletionGracePeriodSeconds: 30
>   deletionTimestamp: "2023-10-05T17:15:57Z"
19c41
<     image: registry.k8s.io/e2e-test-images/agnhost:2.43
---
>     image: k8s.gcr.io/e2e-test-images/agnhost:2.39
60,64c82,83
<     katacontainers.io/kata-runtime: "true"
<     kubernetes.io/hostname: aks-nodepool1-90529124-vmss000000
<   overhead:
<     cpu: 250m
<     memory: 160Mi
---
>     kubernetes.io/hostname: hardware0-control-plane
>     node.kubernetes.io/worker: ""
.
.
.
.

I don't quite understand which parameters lead to the introduction of these new annotations and make the behaviour of the test so different. Why are these parameters different between kata/kata-cc and kata-remote? What would be the impact of setting them to be the same?
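
For additional context, a minimal sketch of how I compare the RuntimeClass objects themselves (the object names are assumed to match the runtime class names above), since, as far as I understand, entries such as overhead and the katacontainers.io/kata-runtime node selector in the pod YAML typically come from the RuntimeClass definition rather than from the kata TOML configuration:

# Dump the RuntimeClass definitions (handler, overhead, scheduling) for all three classes.
kubectl get runtimeclass kata kata-cc kata-remote -o yaml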
