When running the Epicli container on Windows you might get errors like the ones below when trying to run the `apply` command:
Azure:
```
INFO cli.src.terraform.TerraformCommand - Error: Error reading queue properties for AzureRM Storage Account "cluster": queues.Client#GetServiceProperties: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: error response cannot be parsed: "\ufeff<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>AuthenticationFailed</Code><Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.\nRequestId:cba2935f-1003-006f-071d-db55f6000000\nTime:2020-02-04T05:38:45.4268197Z</Message><AuthenticationErrorDetail>Request date header too old: 'Fri, 31 Jan 2020 12:28:37 GMT'</AuthenticationErrorDetail></Error>" error: invalid character 'ï' looking for beginning of value
```
AWS:
```
ERROR epicli - An error occurred (AuthFailure) when calling the DescribeImages operation: AWS was not able to validate the provided access credentials
```
These issues might occur when the host machine you are running the Epicli container on was put to sleep or hibernated for an extended period of time. Hyper-V might have issues syncing the time between the container and the host after it wakes up or is resumed. You can confirm this by checking the date and time in your container by running:

```shell
date
```
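You can also compare the container clock with the host clock from the Windows side by running `date` through `docker exec`; a minimal sketch, assuming the container is named `epicli` (check the actual name with `docker ps`):

```shell
# Show the current date/time inside the running container ("epicli" is an example name)
docker exec -it epicli date
```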
If the times are out of sync, restarting the container will resolve the issue. If you do not want to restart the container, you can also run the following two commands from an elevated PowerShell prompt to force a time sync while the container is running:
```shell
Get-VMIntegrationService -VMName DockerDesktopVM -Name "Time Synchronization" | Disable-VMIntegrationService
Get-VMIntegrationService -VMName DockerDesktopVM -Name "Time Synchronization" | Enable-VMIntegrationService
```
Common:
When a public key is created by `ssh-keygen`, it is sometimes necessary to convert it to UTF-8 encoding. Otherwise an error like the following occurs:

```
ERROR epicli - 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
```
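A common cause (an assumption here, not confirmed in every case) is that the key file was saved as UTF-16 with a byte-order mark when it was generated or redirected on Windows; the `0xff` byte is the start of that BOM. A minimal sketch of checking and converting the file, assuming the key lives at `~/.ssh/id_rsa.pub`:

```shell
# Inspect the current encoding of the public key (example path)
file ~/.ssh/id_rsa.pub

# Convert the key from UTF-16 to UTF-8 and replace the original file
iconv -f UTF-16 -t UTF-8 ~/.ssh/id_rsa.pub > ~/.ssh/id_rsa.pub.utf8
mv ~/.ssh/id_rsa.pub.utf8 ~/.ssh/id_rsa.pub
```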
When running the Ansible automation, a verification script called `kafka_producer_consumer.py` creates a topic, produces messages and consumes messages. If the script fails for whatever reason, the Ansible verification reports it as an error. An example of such an issue:

```
ERROR org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 1 larger than available brokers: 0.
```
This error says that a replication factor of 1 is being requested but there are no available brokers ('0'), which means the Kafka broker(s) are no longer running. Kafka starts, attempts to establish its connections, and if it cannot, it shuts down and logs the reason. So when the verification script runs (it runs on each broker), it cannot find a local broker.

Take a look at syslog/dmesg and run `sudo systemctl status kafka`. Most likely the cause is related to security (TLS/SSL) and/or networking, but it can also be incorrect settings in the config file `/opt/kafka/config/server.properties`. Correct the problem and rerun the automation.
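A quick set of checks to run on each broker host is sketched below; it assumes Kafka runs as the `kafka` systemd unit and uses the config path mentioned above (log locations may differ between installations):

```shell
# Is the broker still running, and why did it stop?
sudo systemctl status kafka
sudo journalctl -u kafka -n 200 --no-pager

# Kernel-level problems (OOM killer, disk errors) show up here
sudo dmesg | tail -n 50

# Review listener, security and replication settings
sudo grep -E 'listeners|ssl|security' /opt/kafka/config/server.properties
```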
Epicli uses Ansible to configure the machines in a cluster. Several tasks in Epiphany rely on the `ansible_default_ipv4` variable. In some specific configurations (mostly on-prem), this variable might be resolved incorrectly. Typical cases are:
- more than one network interface per machine,
- changes in hardware configuration (adding, removing or renaming a network interface),
- missing, wrong or duplicated default routing configuration.
When `ansible_default_ipv4` is not equal to the machine's IP address used in the inventory, the installation fails with a relevant error message.
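To see which address Ansible actually resolves, you can query the fact directly; a minimal sketch, assuming `ansible` is available and the generated inventory path is substituted for the placeholder:

```shell
# Print ansible_default_ipv4 for every host in the inventory (placeholder path)
ansible all -i <path-to-generated-inventory> -m setup -a "filter=ansible_default_ipv4"
```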
This means that the machine's default routing configuration needs to be modified so that the default route uses the same network interface (and IP address) as the one used in the inventory file. Refer to your operating system's documentation for more details about routing configuration.
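A sketch of inspecting and adjusting the default route on the affected machine, with placeholder gateway and interface names; note that a change made with `ip route` does not survive a reboot and must also be applied in the distribution's network configuration:

```shell
# Which interface currently owns the default route, and which addresses are assigned?
ip route show default
ip -br addr

# Point the default route at the interface whose address is used in the inventory (placeholders)
sudo ip route replace default via <gateway-ip> dev <inventory-interface>
```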