Skip to content
This repository was archived by the owner on May 12, 2021. It is now read-only.
This repository was archived by the owner on May 12, 2021. It is now read-only.

Performance related I/O issues #195

@raravena80

Description

@raravena80

Description of problem

We are seeing less than desirable numbers when running the Cassandra stress test on a 2 node Scylla cluster running on the same machine. The baseline is the same cluster but running on straight Docker containers.

We are also seeing the same behavior if using Storage Driver: devicemapper instead of Storage Driver: overlay2

More details here: http://blog.serverbooter.com/post/scylla-in-kata-containers/

Expected result

Better performance number. There's an overhead by running in VM, but the numbers show up to 80% degradation on reads for the test. Something like around like 30% degradation would be nice. Or we can take advantage of VMs talking directly to the hardware it would be great, to possible better performance(?)

Actual result

Details here:

http://blog.serverbooter.com/post/scylla-in-kata-containers/


Output of kata-collect-data.sh

Meta details

Running kata-collect-data.sh version 0.1.0 (commit bc83bf052de7157c3e36afa24d7f0f54e91b5ddf) at 2018-04-10.13:55:38.980643542+0000.


Runtime is /usr/local/bin/kata-runtime.

kata-env

Output of "/usr/local/bin/kata-runtime kata-env":

[Meta]
  Version = "1.0.10"

[Runtime]
  Debug = false
  [Runtime.Version]
    Semver = "0.1.0"
    Commit = "bc83bf052de7157c3e36afa24d7f0f54e91b5ddf"
    OCI = "1.0.1"
  [Runtime.Config]
    Path = "/usr/share/defaults/kata-containers/configuration.toml"

[Hypervisor]
  MachineType = "pc"
  Version = "QEMU emulator version 2.7.1(2.7.1+git.d4a337fe91-11.cc), Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers"
  Path = "/usr/bin/qemu-lite-system-x86_64"
  Debug = false
  BlockDeviceDriver = "virtio-scsi"

[Image]
  Path = "/usr/share/kata-containers/kata-containers-2018-04-06-03:29:24.440175437+0000-2b2b68d"

[Kernel]
  Path = "/usr/share/clear-containers/vmlinuz-4.14.22-86.container"
  Parameters = "agent.log=debug"

[Initrd]
  Path = ""

[Proxy]
  Type = "kataProxy"
  Version = "kata-proxy version 0.0.1-de18929631091cb1f079b2e6e1caa3e4625f4930"
  Path = "/usr/libexec/kata-containers/kata-proxy"
  Debug = false

[Shim]
  Type = "kataShim"
  Version = "kata-shim version 0.0.1-2bdea2e41109b6b71c600899e1bf5794b125200d"
  Path = "/usr/libexec/kata-containers/kata-shim"
  Debug = false

[Agent]
  Type = "kata"
  Version = "<<unknown>>"

[Host]
  Kernel = "4.4.122-mainline-rev1"
  Architecture = "amd64"
  VMContainerCapable = true
  [Host.Distro]
    Name = "Ubuntu"
    Version = "16.04"
  [Host.CPU]
    Vendor = "GenuineIntel"
    Model = "Intel(R) Atom(TM) CPU  C2550  @ 2.40GHz"

Runtime config files

Runtime default config files

/etc/kata-containers/configuration.toml
/usr/share/defaults/kata-containers/configuration.toml

Runtime config file contents

Config file /etc/kata-containers/configuration.toml not found
Output of "cat "/usr/share/defaults/kata-containers/configuration.toml"":

# XXX: WARNING: this file is auto-generated.
# XXX:
# XXX: Source file: "cli/config/configuration.toml.in"
# XXX: Project:
# XXX:   Name: Kata Containers
# XXX:   Type: kata

[hypervisor.qemu]
#path = "/usr/bin/qemu-system-x86_64"
path = "/usr/bin/qemu-lite-system-x86_64"
kernel = "/usr/share/kata-containers/vmlinuz.container"
# initrd = "/usr/share/kata-containers/kata-containers-initrd.img"
image = "/usr/share/kata-containers/kata-containers.img"
machine_type = "pc"

# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc.
#
# WARNING: - any parameter specified here will take priority over the default
# parameter value of the same name used to start the virtual machine.
# Do not set values here unless you understand the impact of doing so as you
# may stop the virtual machine from booting.
# To see the list of default parameters, enable hypervisor debug, create a
# container and look for 'default-kernel-parameters' log entries.
kernel_params = " agent.log=debug"

# Path to the firmware.
# If you want that qemu uses the default firmware leave this option empty
firmware = ""

# Machine accelerators
# comma-separated list of machine accelerators to pass to the hypervisor.
# For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
machine_accelerators=""

# Default number of vCPUs per POD/VM:
# unspecified or 0                --> will be set to 1
# < 0                             --> will be set to the actual number of physical cores
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores
default_vcpus = 1


# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Until 30 devices per bridge can be hot plugged.
# * Until 5 PCI bridges can be cold plugged per VM.
#   This limitation could be a bug in qemu or in the kernel
# Default number of bridges per POD/VM:
# unspecified or 0   --> will be set to 1
# > 1 <= 5           --> will be set to the specified number
# > 5                --> will be set to 5
default_bridges = 1

# Default memory size in MiB for POD/VM.
# If unspecified then it will be set 2048 MiB.
#default_memory = 2048

# Disable block device from being used for a container's rootfs.
# In case of a storage driver like devicemapper where a container's
# root file system is backed by a block device, the block device is passed
# directly to the hypervisor for performance reasons.
# This flag prevents the block device from being passed to the hypervisor,
# 9pfs is used instead to pass the rootfs.
disable_block_device_use = false

# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. This is either virtio-scsi or
# virtio-blk.
block_device_driver = "virtio-scsi"

# Enable iothreads (data-plane) to be used. This causes IO to be
# handled in a separate IO thread. This is currently only implemented
# for SCSI.
#
enable_iothreads = false

# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
#enable_mem_prealloc = true

# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically
# result in memory pre allocation
#enable_hugepages = true

# Enable swap of vm memory. Default false.
# The behaviour is undefined if mem_prealloc is also set to true
#enable_swap = true

# This option changes the default hypervisor and kernel parameters
# to enable debug output where available. This extra output is added
# to the proxy logs, but only when proxy debug is also enabled.
#
# Default false
enable_debug = true

# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
#
#disable_nesting_checks = true

[proxy.kata]
path = "/usr/libexec/kata-containers/kata-proxy"

# If enabled, proxy messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[shim.kata]
path = "/usr/libexec/kata-containers/kata-shim"

# If enabled, shim messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[agent.kata]
# There is no field for this section. The goal is only to be able to
# specify which type of agent the user wants to use.

[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
#enable_debug = true
#
# Internetworking model
# Determines how the VM should be connected to the
# the container network interface
# Options:
#
#   - bridged
#     Uses a linux bridge to interconnect the container interface to
#     the VM. Works for most cases except macvlan and ipvlan.
#
#   - macvtap
#     Used when the Container network interface can be bridged using
#     macvtap.
internetworking_model="macvtap"

Agent

version:

kata-agent version 0.0.1-c77567216cf0b9c7df4851592b25956b6f21dce1

Logfiles

Runtime logs

No recent runtime problems found in system journal.

Proxy logs

No recent proxy problems found in system journal.

Shim logs

No recent shim problems found in system journal.


Container manager details

Have docker

Docker

Output of "docker version":

Client:
 Version:	18.03.0-ce
 API version:	1.37
 Go version:	go1.9.4
 Git commit:	0520e24
 Built:	Wed Mar 21 23:10:01 2018
 OS/Arch:	linux/amd64
 Experimental:	false
 Orchestrator:	swarm

Server:
 Engine:
  Version:	18.03.0-ce
  API version:	1.37 (minimum version 1.12)
  Go version:	go1.9.4
  Git commit:	0520e24
  Built:	Wed Mar 21 23:08:31 2018
  OS/Arch:	linux/amd64
  Experimental:	false

Output of "docker info":

Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 18.03.0-ce
Storage Driver: devicemapper
 Pool Name: docker-thinpool
 Pool Blocksize: 524.3kB
 Base Device Size: 10.74GB
 Backing Filesystem: ext4
 Udev Sync Supported: true
 Data Space Used: 1.152GB
 Data Space Total: 47.5GB
 Data Space Available: 46.34GB
 Metadata Space Used: 626.7kB
 Metadata Space Total: 499.1MB
 Metadata Space Available: 498.5MB
 Thin Pool Minimum Free Space: 4.75GB
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.110 (2015-10-30)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: kata-runtime runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfd04396dc68220d1cecbe686a6cc3aa5ce3667c
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.122-mainline-rev1
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.752GiB
Name: katatests
ID: FVTG:DPRW:YDNR:MBR7:YPWY:3BQC:POWA:OOVQ:KHO3:CC7B:WBB5:Z6H4
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 33
 Goroutines: 42
 System Time: 2018-04-10T13:55:40.01964918Z
 EventsListeners: 0
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Output of "systemctl show docker":

Type=simple
Restart=on-failure
NotifyAccess=none
RestartUSec=100ms
TimeoutStartUSec=infinity
TimeoutStopUSec=1min 30s
RuntimeMaxUSec=infinity
WatchdogUSec=0
WatchdogTimestamp=Tue 2018-04-10 02:59:36 UTC
WatchdogTimestampMonotonic=540000287
FailureAction=none
PermissionsStartOnly=no
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
MainPID=5206
ControlPID=0
FileDescriptorStoreMax=0
NFileDescriptorStore=0
StatusErrno=0
Result=success
ExecMainStartTimestamp=Tue 2018-04-10 02:59:36 UTC
ExecMainStartTimestampMonotonic=540000226
ExecMainExitTimestampMonotonic=0
ExecMainPID=5206
ExecMainCode=0
ExecMainStatus=0
ExecStart={ path=/usr/bin/dockerd ; argv[]=/usr/bin/dockerd -D --default-runtime runc --add-runtime kata-runtime=/usr/local/bin/kata-runtime ; ignore_errors=no ; start_time=[Tue 2018-04-10 02:59:36 UTC] ; stop_time=[n/a] ; pid=5206 ; code=(null) ; status=0/0 }
ExecReload={ path=/bin/kill ; argv[]=/bin/kill -s HUP $MAINPID ; ignore_errors=no ; start_time=[n/a] ; stop_time=[n/a] ; pid=0 ; code=(null) ; status=0/0 }
Slice=system.slice
ControlGroup=/system.slice/docker.service
MemoryCurrent=3222347776
CPUUsageNSec=8685366929353
TasksCurrent=111
Delegate=yes
CPUAccounting=no
CPUShares=18446744073709551615
StartupCPUShares=18446744073709551615
CPUQuotaPerSecUSec=infinity
BlockIOAccounting=no
BlockIOWeight=18446744073709551615
StartupBlockIOWeight=18446744073709551615
MemoryAccounting=no
MemoryLimit=18446744073709551615
DevicePolicy=auto
TasksAccounting=no
TasksMax=18446744073709551615
UMask=0022
LimitCPU=18446744073709551615
LimitCPUSoft=18446744073709551615
LimitFSIZE=18446744073709551615
LimitFSIZESoft=18446744073709551615
LimitDATA=18446744073709551615
LimitDATASoft=18446744073709551615
LimitSTACK=18446744073709551615
LimitSTACKSoft=8388608
LimitCORE=18446744073709551615
LimitCORESoft=18446744073709551615
LimitRSS=18446744073709551615
LimitRSSSoft=18446744073709551615
LimitNOFILE=1048576
LimitNOFILESoft=1048576
LimitAS=18446744073709551615
LimitASSoft=18446744073709551615
LimitNPROC=18446744073709551615
LimitNPROCSoft=18446744073709551615
LimitMEMLOCK=65536
LimitMEMLOCKSoft=65536
LimitLOCKS=18446744073709551615
LimitLOCKSSoft=18446744073709551615
LimitSIGPENDING=31712
LimitSIGPENDINGSoft=31712
LimitMSGQUEUE=819200
LimitMSGQUEUESoft=819200
LimitNICE=0
LimitNICESoft=0
LimitRTPRIO=0
LimitRTPRIOSoft=0
LimitRTTIME=18446744073709551615
LimitRTTIMESoft=18446744073709551615
OOMScoreAdjust=0
Nice=0
IOScheduling=0
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SyslogLevel=6
SyslogFacility=3
SecureBits=0
CapabilityBoundingSet=18446744073709551615
AmbientCapabilities=0
MountFlags=0
PrivateTmp=no
PrivateNetwork=no
PrivateDevices=no
ProtectHome=no
ProtectSystem=no
SameProcessGroup=no
UtmpMode=init
IgnoreSIGPIPE=yes
NoNewPrivileges=no
SystemCallErrorNumber=0
RuntimeDirectoryMode=0755
KillMode=process
KillSignal=15
SendSIGKILL=yes
SendSIGHUP=no
Id=docker.service
Names=docker.service
Requires=docker.socket sysinit.target system.slice
Wants=network-online.target
WantedBy=multi-user.target
ConsistsOf=docker.socket
Conflicts=shutdown.target
Before=multi-user.target shutdown.target
After=firewalld.service network-online.target docker.socket sysinit.target systemd-journald.socket system.slice basic.target
TriggeredBy=docker.socket
Documentation=https://docs.docker.com
Description=Docker Application Container Engine
LoadState=loaded
ActiveState=active
SubState=running
FragmentPath=/lib/systemd/system/docker.service
DropInPaths=/etc/systemd/system/docker.service.d/kata-containers.conf
UnitFileState=enabled
UnitFilePreset=enabled
StateChangeTimestamp=Tue 2018-04-10 02:59:36 UTC
StateChangeTimestampMonotonic=540000289
InactiveExitTimestamp=Tue 2018-04-10 02:59:36 UTC
InactiveExitTimestampMonotonic=540000289
ActiveEnterTimestamp=Tue 2018-04-10 02:59:36 UTC
ActiveEnterTimestampMonotonic=540000289
ActiveExitTimestamp=Tue 2018-04-10 02:58:06 UTC
ActiveExitTimestampMonotonic=450505725
InactiveEnterTimestamp=Tue 2018-04-10 02:58:07 UTC
InactiveEnterTimestampMonotonic=451515435
CanStart=yes
CanStop=yes
CanReload=yes
CanIsolate=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnFailureJobMode=replace
IgnoreOnIsolate=no
NeedDaemonReload=no
JobTimeoutUSec=infinity
JobTimeoutAction=none
ConditionResult=yes
AssertResult=yes
ConditionTimestamp=Tue 2018-04-10 02:59:36 UTC
ConditionTimestampMonotonic=539968286
AssertTimestamp=Tue 2018-04-10 02:59:36 UTC
AssertTimestampMonotonic=539968287
Transient=no
StartLimitInterval=60000000
StartLimitBurst=3
StartLimitAction=none

No kubectl


Packages

Have dpkg
Output of "dpkg -l|egrep "(cc-oci-runtimecc-runtimerunv|kata-proxy|kata-runtime|kata-shim|kata-containers-image|linux-container|qemu-)"":

ii  linux-container                4.14.22-86                            amd64        linux kernel optimised for container-like workloads.
ii  qemu-block-extra:amd64         1:2.5+dfsg-5ubuntu10.24               amd64        extra block backend modules for qemu-system and qemu-utils
ii  qemu-lite                      2.7.1+git.d4a337fe91-11               amd64        linux kernel optimised for container-like workloads.
ii  qemu-slof                      20151103+dfsg-1ubuntu1.1              all          Slimline Open Firmware -- QEMU PowerPC version
ii  qemu-system                    1:2.5+dfsg-5ubuntu10.24               amd64        QEMU full system emulation binaries
ii  qemu-system-arm                1:2.5+dfsg-5ubuntu10.24               amd64        QEMU full system emulation binaries (arm)
ii  qemu-system-common             1:2.5+dfsg-5ubuntu10.24               amd64        QEMU full system emulation binaries (common files)
ii  qemu-system-mips               1:2.5+dfsg-5ubuntu10.24               amd64        QEMU full system emulation binaries (mips)
ii  qemu-system-misc               1:2.5+dfsg-5ubuntu10.24               amd64        QEMU full system emulation binaries (miscelaneous)
ii  qemu-system-ppc                1:2.5+dfsg-5ubuntu10.24               amd64        QEMU full system emulation binaries (ppc)
ii  qemu-system-sparc              1:2.5+dfsg-5ubuntu10.24               amd64        QEMU full system emulation binaries (sparc)
ii  qemu-system-x86                1:2.5+dfsg-5ubuntu10.24               amd64        QEMU full system emulation binaries (x86)
ii  qemu-user                      1:2.5+dfsg-5ubuntu10.24               amd64        QEMU user mode emulation binaries
ii  qemu-user-binfmt               1:2.5+dfsg-5ubuntu10.24               amd64        QEMU user mode binfmt registration for qemu-user
ii  qemu-utils                     1:2.5+dfsg-5ubuntu10.24               amd64        QEMU utilities
No `rpm`

---

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions