Options to omit labels in K8s pod logs (#1260)
alesnovak-s1 authored May 14, 2024
1 parent 089af0f commit ecf9915
Showing 13 changed files with 653 additions and 124 deletions.
36 changes: 18 additions & 18 deletions .github/workflows/reusable-agent-build-linux-packages-new.yml
@@ -172,28 +172,28 @@ jobs:
needs:
- build_packages

- runs-on: ubuntu-22.04
+ runs-on: ${{ matrix.test_target.runner }}
strategy:
fail-fast: false
matrix:
test_target:
- - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "ubuntu2204", "remote-machine-type": "ec2" }
- - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "ubuntu2004", "remote-machine-type": "ec2" }
- - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "ubuntu1804", "remote-machine-type": "ec2" }
- - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "ubuntu1604", "remote-machine-type": "ec2" }
- - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "ubuntu1404", "remote-machine-type": "docker" }
- - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "debian10", "remote-machine-type": "ec2" }
- - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "debian11", "remote-machine-type": "docker" }
- - { "package_type": "rpm", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "centos8", "remote-machine-type": "docker" }
- - { "package_type": "rpm", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "centos7", "remote-machine-type": "ec2" }
- - { "package_type": "rpm", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "centos6", "remote-machine-type": "docker" }
- - { "package_type": "rpm", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "amazonlinux2", "remote-machine-type": "ec2" }
- - { "package_type": "deb", "builder": "aio-aarch64", "arch": "arm64", "distro-name": "ubuntu1404", "remote-machine-type": "docker" }
- - { "package_type": "rpm", "builder": "aio-aarch64", "arch": "arm64", "distro-name": "centos7", "remote-machine-type": "docker" }
- # - { "package_type": "deb", "builder": "non-aio", "arch": "x86_64", "distro-name": "ubuntu1404", "remote-machine-type": "docker" }
- - { "package_type": "deb", "builder": "non-aio", "arch": "x86_64", "distro-name": "ubuntu2204", "remote-machine-type": "docker" }
- # - { "package_type": "rpm", "builder": "non-aio", "arch": "x86_64", "distro-name": "centos7", "remote-machine-type": "docker" }
- - { "package_type": "rpm", "builder": "non-aio", "arch": "x86_64", "distro-name": "amazonlinux2", "remote-machine-type": "docker" }
+ - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "ubuntu2204", "remote-machine-type": "ec2", runner: "ubuntu-22.04" }
+ - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "ubuntu2004", "remote-machine-type": "ec2", runner: "ubuntu-22.04" }
+ - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "ubuntu1804", "remote-machine-type": "ec2", runner: "ubuntu-22.04" }
+ - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "ubuntu1604", "remote-machine-type": "ec2", runner: "ubuntu-22.04" }
+ - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "ubuntu1404", "remote-machine-type": "docker", runner: "ubuntu-22.04" }
+ - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "debian10", "remote-machine-type": "ec2", runner: "ubuntu-22.04" }
+ - { "package_type": "deb", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "debian11", "remote-machine-type": "docker", runner: "ubuntu-22.04" }
+ - { "package_type": "rpm", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "centos8", "remote-machine-type": "docker", runner: "ubuntu-22.04" }
+ - { "package_type": "rpm", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "centos7", "remote-machine-type": "ec2", runner: "ubuntu-22.04" }
+ - { "package_type": "rpm", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "centos6", "remote-machine-type": "docker", runner: "ubuntu-22.04" }
+ - { "package_type": "rpm", "builder": "aio-x86_64", "arch": "x86_64", "distro-name": "amazonlinux2", "remote-machine-type": "ec2", runner: "ubuntu-22.04" }
+ - { "package_type": "deb", "builder": "aio-aarch64", "arch": "arm64", "distro-name": "ubuntu1404", "remote-machine-type": "docker", runner: "aws-aarch64" }
+ - { "package_type": "rpm", "builder": "aio-aarch64", "arch": "arm64", "distro-name": "centos7", "remote-machine-type": "docker", runner: "aws-aarch64" }
+ # - { "package_type": "deb", "builder": "non-aio", "arch": "x86_64", "distro-name": "ubuntu1404", "remote-machine-type": "ec2", runner: "ubuntu-22.04" }
+ - { "package_type": "deb", "builder": "non-aio", "arch": "x86_64", "distro-name": "ubuntu2204", "remote-machine-type": "docker", runner: "ubuntu-22.04" }
+ # - { "package_type": "rpm", "builder": "non-aio", "arch": "x86_64", "distro-name": "centos7", "remote-machine-type": "ec2", runner: "ubuntu-22.04" }
+ - { "package_type": "rpm", "builder": "non-aio", "arch": "x86_64", "distro-name": "amazonlinux2", "remote-machine-type": "docker", runner: "ubuntu-22.04" }
steps:
- name: Checkout repository
uses: actions/checkout@8ade135a41bc03ea155e62e844d188df1ea18608 # v4
6 changes: 4 additions & 2 deletions docs/monitors/kubernetes_monitor.md
@@ -37,7 +37,7 @@ Besides the default API key stored in the scalyr/scalyr-api-key secret, the user
#### Namespace level API keys
```log.config.scalyr.com/teams.{team_number}.secret: {secret_name}```

- Overrides the default API key for all pods in the namespace.
+ Overrides the default API key for all pods in the namespace.
The `secret_name` is the name of the secret (stored in the same namespace) holding the Scalyr API key.
The `team_number` is an arbitrary unique number.

@@ -67,7 +67,7 @@ data:
#### Simple visual example of Secret key annotation priority
- > [!NOTE]
+ > [!NOTE]
> When no annotation is present for either the namespace or pod, the default secret _scalyr/scalyr-api-key_ is used.
![Annotation Priority](kubernetes_monitor_annotations_priority.png)
@@ -391,6 +391,8 @@ cluster.
| `k8s_kubelet_host_ip` | Optional (defaults to None). Defines the host IP address for the Kubelet API. If None, the Kubernetes API will be queried for it |
| `k8s_kubelet_api_url_template` | Optional (defaults to https://${host_ip}:10250). Defines the port and protocol to use when talking to the kubelet API. Allowed template variables are `node_name` and `host_ip`. |
| `k8s_sidecar_mode` | Optional, (defaults to False). If true, then logs will only be collected for containers running in the same Pod as the agent. This is used in situations requiring very high throughput. |
| `k8s_label_include_globs` | Optional, (defaults to ['*']). Specifies a list of K8s labels to be added to logs. |
| `k8s_label_exclude_globs` | Optional, (defaults to []). Specifies a list of K8s labels to be ignored and not added to logs. |
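
Per the `__is_label_allowed` method added in this commit, the two options combine as follows: a label is kept when its name matches at least one include glob and no exclude glob. A minimal standalone sketch of that rule using Python's `fnmatch`; the helper names `is_label_allowed` and `filter_labels` and the sample labels are illustrative assumptions, not part of the agent:

```python
import fnmatch


def is_label_allowed(label_name, include_globs, exclude_globs):
    """Keep a label if it matches any include glob and no exclude glob."""
    included = any(fnmatch.fnmatch(label_name, g) for g in include_globs)
    excluded = any(fnmatch.fnmatch(label_name, g) for g in exclude_globs)
    return included and not excluded


def filter_labels(labels, include_globs=("*",), exclude_globs=()):
    """Return only the labels allowed by the include/exclude globs."""
    return {
        name: value
        for name, value in labels.items()
        if is_label_allowed(name, include_globs, exclude_globs)
    }


# Hypothetical pod labels, for illustration only.
pod_labels = {
    "app": "web",
    "pod-template-hash": "5d4f8c7b9",
    "team.example.com/owner": "payments",
}

# With the default include glob "*", only the exclude list removes labels.
kept = filter_labels(pod_labels, exclude_globs=["pod-template-*"])
```

Because exclusion is checked after inclusion, a label that matches both lists is dropped, and an empty include list drops every label.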

<a name="metrics"></a>
## Metrics Reference
264 changes: 262 additions & 2 deletions scalyr_agent/builtin_monitors/kubernetes_monitor.py
@@ -441,6 +441,24 @@
# 'Optional (defaults to False). If true, stdout/stderr logs will contain docker timestamps at the beginning of the line\n',
# convert_to=bool, default=False)

define_config_option(
    __monitor__,
    "k8s_label_include_globs",
    "Optional, (defaults to ['*']). Specifies a list of K8s labels to be added to logs.",
    convert_to=ArrayOfStrings,
    default=["*"],
    env_aware=True,
)

define_config_option(
    __monitor__,
    "k8s_label_exclude_globs",
    "Optional, (defaults to []). Specifies a list of K8s labels to be ignored and not added to logs.",
    convert_to=ArrayOfStrings,
    default=[],
    env_aware=True,
)

define_metric(
__monitor__,
"docker.net.rx_bytes",
@@ -3232,6 +3250,15 @@ def __get_base_attributes(self):

return attributes

def __is_label_allowed(self, label_name):
    return any(
        fnmatch.fnmatch(label_name, glob)
        for glob in self._config.get("k8s_label_include_globs")
    ) and not any(
        fnmatch.fnmatch(label_name, glob)
        for glob in self._config.get("k8s_label_exclude_globs")
    )

def __get_log_config_for_container(self, cid, info, k8s_cache, base_attributes):
# type: (str, dict, KubernetesCache, JsonObject) -> List[Dict]

@@ -3300,7 +3327,8 @@ def __get_log_config_for_container(self, cid, info, k8s_cache, base_attributes):
container_attributes["k8s_node"] = pod.node_name

  for label, value in six.iteritems(pod.labels):
-     container_attributes[label] = value
+     if self.__is_label_allowed(label):
+         container_attributes[label] = value

if "parser" in pod.labels:
parser = pod.labels["parser"]
@@ -3321,7 +3349,11 @@ def __get_log_config_for_container(self, cid, info, k8s_cache, base_attributes):
# field `_k8s_ck`
container_attributes["_k8s_dn"] = controller.name
# `_k8s_dl` is translated to `k8s-labels`
- container_attributes["_k8s_dl"] = controller.flat_labels
+ container_attributes["_k8s_dl"] = ",".join(
+     f"{label}={value}"
+     for label, value in controller.labels.items()
+     if self.__is_label_allowed(label)
+ )
# `_k8s_ck` is translated into the field key for
# `_k8s_dn`. Here are some examples: `k8s-deployment`,
# `k8s-daemon-set`, `k8s-stateful-set`, etc. If the
@@ -3639,6 +3671,232 @@ class KubernetesMonitor(
* lineGroupers (not supported at all)
* path (the path is always fixed for k8s container logs)
### Configuring multiple accounts per container
Besides the default API key stored in the scalyr/scalyr-api-key secret, the user can specify API keys for namespaces, pods and containers using annotations.
#### Namespace level API keys
```log.config.scalyr.com/teams.{team_number}.secret: {secret_name}```
Overrides the default API key for all pods in the namespace.
The `secret_name` is the name of the secret (stored in the same namespace) holding the Scalyr API key.
The `team_number` is an arbitrary unique number.
#### Pod level API keys
```log.config.scalyr.com/teams.{team_number}.secret: {secret_name}```
Overrides the default API key and the namespace API key for all containers in the pod.
The `secret_name` is the name of the secret (stored in the same namespace as the pod) holding the Scalyr API key.
The `team_number` is an arbitrary unique number.
#### Container level API keys
```log.config.scalyr.com/{container_name}.teams.{team_number}.secret: {secret_name}```
Overrides the default API key, the namespace API key and the pod API keys for the container named by `container_name`.
The `secret_name` is the name of the secret (stored in the same namespace as the pod) holding the Scalyr API key.
The `team_number` is an arbitrary unique number.
#### API Key Secret structure
```yaml
apiVersion: v1
kind: Secret
data:
  scalyr-api-key: <b64 encoded API key>
```
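
Kubernetes expects values under a Secret's `data` field to be base64-encoded. A small sketch of producing and checking such a value in Python; the helper names are illustrative and the key string is a placeholder, not a real API key:

```python
import base64


def encode_secret_value(api_key):
    """Base64-encode an API key for use under a Secret's `data` field."""
    return base64.b64encode(api_key.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded):
    """Reverse of encode_secret_value, handy for checking an existing Secret."""
    return base64.b64decode(encoded).decode("utf-8")


# Placeholder value for illustration only.
encoded = encode_secret_value("my-api-key")
```

The resulting string is what goes in place of `<b64 encoded API key>` above.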
#### Simple visual example of Secret key annotation priority
> [!NOTE]
> When no annotation is present for either the namespace or pod, the default secret _scalyr/scalyr-api-key_ is used.
![Annotation Priority](kubernetes_monitor_annotations_priority.png)
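
The priority described above (container annotations over pod annotations, pod over namespace, namespace over the agent default) can be sketched as a lookup that returns the first level with any keys defined. This only illustrates the documented behavior and is not the agent's implementation; `team_secrets`, `resolve_secrets` and the way annotation keys are parsed are assumptions:

```python
import re

ANNOTATION_PREFIX = "log.config.scalyr.com/"
TEAM_KEY = re.compile(r"^teams\.(\d+)\.secret$")


def team_secrets(annotations, container_name=None):
    """Collect secret names from `teams.{n}.secret` annotations, ordered by n.

    With container_name set, only `{container_name}.teams.{n}.secret`
    annotations are considered.
    """
    prefix = ANNOTATION_PREFIX
    if container_name is not None:
        prefix += container_name + "."
    found = []
    for key, secret in annotations.items():
        if not key.startswith(prefix):
            continue
        match = TEAM_KEY.match(key[len(prefix):])
        if match:
            found.append((int(match.group(1)), secret))
    return [secret for _, secret in sorted(found)]


def resolve_secrets(container_name, pod_annotations, namespace_annotations,
                    default_secret="scalyr/scalyr-api-key"):
    """Return the secrets for a container; the most specific level wins."""
    return (
        team_secrets(pod_annotations, container_name)
        or team_secrets(pod_annotations)
        or team_secrets(namespace_annotations)
        or [default_secret]
    )


# Hypothetical annotations, for illustration only.
pod_annotations = {
    "log.config.scalyr.com/teams.1.secret": "secret-pod",
    "log.config.scalyr.com/c1.teams.1.secret": "secret-c1",
}
namespace_annotations = {"log.config.scalyr.com/teams.1.secret": "secret-ns"}

for_c1 = resolve_secrets("c1", pod_annotations, namespace_annotations)
for_c2 = resolve_secrets("c2", pod_annotations, namespace_annotations)
```

Here `c1` has its own container-level annotation, while `c2` falls back to the pod-level one.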
#### Example:
#### Configuration:
##### Default API key for the Scalyr Agent
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: scalyr-api-key
  namespace: scalyr
data:
  scalyr-api-key: <b64 encoded SCALYR_API_KEY_WRITE_TEAM_1>
```
##### Workload Namespaces
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: workload-namespace-1
  annotations:
    log.config.scalyr.com/teams.1.secret: scalyr-api-key-team-2
---
apiVersion: v1
kind: Namespace
metadata:
  name: workload-namespace-2
```
##### API keys in the workload-namespace-1 Namespace
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: scalyr-api-key-team-2
  namespace: workload-namespace-1
data:
  scalyr-api-key: <b64 encoded SCALYR_API_KEY_WRITE_TEAM_2>
---
apiVersion: v1
kind: Secret
metadata:
  name: scalyr-api-key-team-3
data:
  scalyr-api-key: <b64 encoded SCALYR_API_KEY_WRITE_TEAM_3>
---
apiVersion: v1
kind: Secret
metadata:
  name: scalyr-api-key-team-4
data:
  scalyr-api-key: <b64 encoded SCALYR_API_KEY_WRITE_TEAM_4>
---
apiVersion: v1
kind: Secret
metadata:
  name: scalyr-api-key-team-5
data:
  scalyr-api-key: <b64 encoded SCALYR_API_KEY_WRITE_TEAM_5>
---
apiVersion: v1
kind: Secret
metadata:
  name: scalyr-api-key-team-6
data:
  scalyr-api-key: <b64 encoded SCALYR_API_KEY_WRITE_TEAM_6>
---
apiVersion: v1
kind: Secret
metadata:
  name: scalyr-api-key-team-7
data:
  scalyr-api-key: <b64 encoded SCALYR_API_KEY_WRITE_TEAM_7>
```
##### Workload Pods in the workload-namespace-1 Namespace
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: multi-account-test
  name: workload-pod-1
  namespace: workload-namespace-1
  annotations:
    log.config.scalyr.com/teams.1.secret: "scalyr-api-key-team-3"
    log.config.scalyr.com/teams.5.secret: "scalyr-api-key-team-4"
    log.config.scalyr.com/workload-pod-1-container-1.teams.1.secret: "scalyr-api-key-team-5"
    log.config.scalyr.com/workload-pod-1-container-2.teams.1.secret: "scalyr-api-key-team-6"
    log.config.scalyr.com/workload-pod-1-container-2.teams.2.secret: "scalyr-api-key-team-7"
spec:
  containers:
  - name: workload-pod-1-container-1
    image: busybox
    command:
    - /bin/sh
    - -c
    - while true; do echo workload-pod-1-container-1; sleep 1; done
  - name: workload-pod-1-container-2
    image: busybox
    command:
    - /bin/sh
    - -c
    - while true; do echo workload-pod-1-container-2; sleep 1; done
  - name: workload-pod-1-container-3
    image: busybox
    command:
    - /bin/sh
    - -c
    - while true; do echo workload-pod-1-container-3; sleep 1; done
---
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: multi-account-test
  name: workload-pod-2
  namespace: workload-namespace-1
spec:
  containers:
  - name: workload-pod-2-container-1
    image: busybox
    command:
    - /bin/sh
    - -c
    - while true; do echo workload-pod-2-container-1; sleep 1; done
```
##### Workload Pod in workload-namespace-2 Namespace
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: multi-account-test
  name: workload-pod-3
  namespace: workload-namespace-2
spec:
  containers:
  - name: workload-pod-3-container-1
    image: busybox
    command:
    - /bin/sh
    - -c
    - while true; do echo workload-pod-3-container-1; sleep 1; done
```
#### Ingested data:
| Container Name | API keys used to ingest logs | Note |
| --- |----------------------------------------------------------|-----------------------------|
| workload-pod-1-container-1 | SCALYR_API_KEY_WRITE_TEAM_4 | Container specific api keys |
| workload-pod-1-container-2 | SCALYR_API_KEY_WRITE_TEAM_5, SCALYR_API_KEY_WRITE_TEAM_6 | Container specific api keys |
| workload-pod-1-container-3 | SCALYR_API_KEY_WRITE_TEAM_3, SCALYR_API_KEY_WRITE_TEAM_4 | Pod default api keys |
| workload-pod-2-container-1 | SCALYR_API_KEY_WRITE_TEAM_2 | Namespace default api key |
| workload-pod-3-container-1 | SCALYR_API_KEY_WRITE_TEAM_1 | Agent default api key |
#### Querying the data:
```bash
scalyr_readlog_token=SCALYR_API_KEY_WRITE_TEAM_1 scalyr query 'app="multi-account-test"' --columns=message
# workload-pod-3-container-1
scalyr_readlog_token=SCALYR_API_KEY_WRITE_TEAM_2 scalyr query 'app="multi-account-test"' --columns=message
# workload-pod-2-container-1
scalyr_readlog_token=SCALYR_API_KEY_WRITE_TEAM_3 scalyr query 'app="multi-account-test"' --columns=message
# workload-pod-1-container-3
scalyr_readlog_token=SCALYR_API_KEY_WRITE_TEAM_4 scalyr query 'app="multi-account-test"' --columns=message
# workload-pod-1-container-3
# workload-pod-1-container-1
scalyr_readlog_token=SCALYR_API_KEY_WRITE_TEAM_5 scalyr query 'app="multi-account-test"' --columns=message
# workload-pod-1-container-2
scalyr_readlog_token=SCALYR_API_KEY_WRITE_TEAM_6 scalyr query 'app="multi-account-test"' --columns=message
# workload-pod-1-container-2
```
### Excluding Logs
Containers and pods can be specifically included/excluded from having their logs collected and
@@ -3653,6 +3911,8 @@ class KubernetesMonitor(
log.config.scalyr.com/include: false
In the edge case where a short-lived container's metadata is no longer available via the K8s API but some of its logs are found, those logs are collected according to the `k8s_include_all_containers` flag.
By default the agent monitors the logs of all pods/containers, and you have to manually exclude
pods/containers you don't want. You can also set `k8s_include_all_containers: false` in the
kubernetes_monitor monitor config section of `agent.d/docker.json`, in which case all containers are
6 changes: 1 addition & 5 deletions scalyr_agent/monitor_utils/k8s.py
@@ -560,11 +560,7 @@ def __init__(
self.access_time = None
self.parent_name = parent_name
self.parent_kind = parent_kind
- flat_labels = []
- for key, value in six.iteritems(labels):
-     flat_labels.append("%s=%s" % (key, value))
-
- self.flat_labels = ",".join(flat_labels)
+ self.labels = labels


class ApiQueryOptions(object):
Empty file added scripts/__init__.py
Empty file.
Empty file added scripts/cicd/__init__.py
Empty file.