Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add orchestrator for supporting multiple processes #208

Closed

Conversation

dineshg13
Copy link
Member

@dineshg13 dineshg13 commented Jun 25, 2023

changes necessary to resolve #197 .
This PR moves the logic from main, and creates a new orchestrator pkg for handling multiple processes.

@dineshg13 dineshg13 marked this pull request as ready for review June 28, 2023 01:16
@dineshg13 dineshg13 requested a review from a team June 28, 2023 01:16
@dineshg13
Copy link
Member Author

@edeNFed This PR is working on baremetal and as sidecar for a pod. But when i run same as a deployment in k8s, am getting error below error. Can you please help with same.

{"level":"info","ts":1687914965.2464294,"caller":"cli/main.go:60","msg":"starting Go OpenTelemetry Agent ..."}
{"level":"info","ts":1687914965.2465029,"caller":"cli/main.go:62","msg":"Establishing connection to OTLP receiver ..."}
{"level":"info","ts":1687914967.2483935,"caller":"orchestrator/orchestrator.go:151","msg":"No go process not found yet, trying again in 2 seconds"}
{"level":"info","ts":1687914969.2487526,"caller":"orchestrator/orchestrator.go:151","msg":"No go process not found yet, trying again in 2 seconds"}
{"level":"info","ts":1687914971.24835,"caller":"orchestrator/orchestrator.go:151","msg":"No go process not found yet, trying again in 2 seconds"}
{"level":"info","ts":1687914973.247508,"caller":"orchestrator/orchestrator.go:151","msg":"No go process not found yet, trying again in 2 seconds"}
{"level":"info","ts":1687914975.2487097,"caller":"orchestrator/orchestrator.go:97","msg":"Add auto instrumetors","pid":2247,"serviceName":"sample-app"}
{"level":"info","ts":1687914975.250355,"caller":"ptrace/ptrace_linux.go:102","msg":"retry attaching thread","tid":2247,"retryCount":1,"limit":10,"error":"no such process"}
{"level":"info","ts":1687914975.2504175,"caller":"ptrace/ptrace_linux.go:102","msg":"retry attaching thread","tid":2248,"retryCount":1,"limit":10,"error":"no such process"}
{"level":"info","ts":1687914975.2504244,"caller":"ptrace/ptrace_linux.go:102","msg":"retry attaching thread","tid":2249,"retryCount":1,"limit":10,"error":"no such process"}
{"level":"info","ts":1687914975.2504296,"caller":"ptrace/ptrace_linux.go:102","msg":"retry attaching thread","tid":2250,"retryCount":1,"limit":10,"error":"no such process"}
{"level":"info","ts":1687914975.2504346,"caller":"ptrace/ptrace_linux.go:102","msg":"retry attaching thread","tid":2251,"retryCount":1,"limit":10,"error":"no such process"}
{"level":"info","ts":1687914975.25044,"caller":"ptrace/ptrace_linux.go:102","msg":"retry attaching thread","tid":2252,"retryCount":1,"limit":10,"error":"no such process"}
{"level":"info","ts":1687914975.250455,"caller":"ptrace/ptrace_linux.go:102","msg":"retry attaching thread","tid":2247,"retryCount":2,"limit":10,"error":"no such process"

cli/main.go Outdated Show resolved Hide resolved
cli/main.go Outdated Show resolved Hide resolved
pkg/process/args.go Outdated Show resolved Hide resolved
pkg/instrumentors/manager.go Outdated Show resolved Hide resolved
pkg/opentelemetry/controller.go Outdated Show resolved Hide resolved
pkg/process/discover.go Outdated Show resolved Hide resolved
@damemi
Copy link
Contributor

damemi commented Jun 29, 2023

@edeNFed This PR is working on baremetal and as sidecar for a pod. But when i run same as a deployment in k8s, am getting error below error. Can you please help with same.

@dineshg13 could you share the deployment yaml you're trying?

@dineshg13
Copy link
Member Author

dineshg13 commented Jun 29, 2023

@damemi I have used our test case by converting the sidecar to a deployment. I have built the docker images and loaded them to the kind cluster. I have loaded the images into kind cluster and created below two pods.

auto-instrumentation - pod

apiVersion: v1
kind: Pod
metadata:
  name: "auto-instrumentation"
  namespace: default
  labels:
    app: "auto-instrumentation"
spec:
  containers:
    - name: auto-instrumentation
      image: otel-go-instrumentation
      imagePullPolicy: IfNotPresent
      env:
        - name: OTEL_GO_AUTO_TARGET_EXE
          value: /sample-app/main
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: "http://test-opentelemetry-collector:4317"
        - name: OTEL_SERVICE_NAME
          value: "sample-app"
        - name: OTEL_PROPAGATORS
          value: "tracecontext,baggage"
      resources: {}
      securityContext:
        runAsUser: 0
        runAsGroup: 0
        capabilities:
          add:
            - SYS_PTRACE
        privileged: true
      volumeMounts:
        - mountPath: /sys/kernel/debug
          name: kernel-debug
        - mountPath: /proc
          name: hostproc
  volumes:
    - name: kernel-debug
      hostPath:
        path: /sys/kernel/debug
    - name: hostproc
      hostPath:
        path: /proc
  restartPolicy: Always
---

our app

apiVersion: v1
kind: Pod
metadata:
  name: "sample-app"
  namespace: default
  labels:
    app: "sample-app"
spec:
  containers:
    - name: sample-app
      image: sample-app
      imagePullPolicy: IfNotPresent
      command: ["/bin/sh", "-c"]
      args: ["/sample-app/main"]
      ports:
        - containerPort: 8080
          name: http
  restartPolicy: Always
---

@damemi
Copy link
Contributor

damemi commented Jun 29, 2023

@dineshg13 ah thanks, I wasn't sure what you meant by that. So they are running in separate pods

It looks like it does find the sample app process, but the thread error says no such process. When it's a sidecar, the 2 containers can see the running processes with shareProcessNamespace. I don't know how process namespaces work between different pods, but maybe that's it

I would also recommend adding an e2e test for this use case

@edeNFed
Copy link
Contributor

edeNFed commented Jul 2, 2023

@dineshg13 please try to use hostPID: true instead of mounting the /proc directory

@dineshg13
Copy link
Member Author

Thanks @edeNFed i was to get this working with hostPID change. Can you please review the PR .

@damemi i prefer to add an e2e test in a follow-up PR. This PR is already big and backward compatible with existing functionality. The current test framework has only k8s sidecar, I want to refactor the tests to support all 3 paths viz sidecar, deployment & bare-metal. Let me know your thoughts.

@dineshg13 dineshg13 requested review from edeNFed and damemi July 4, 2023 11:36
@edeNFed
Copy link
Contributor

edeNFed commented Jul 5, 2023

Looks good @dineshg13
I think we should merge #211 before this one, so we will avoid overwriting different BPF structures when instrumenting multiple processes.

cli/main.go Outdated Show resolved Hide resolved
@damemi
Copy link
Contributor

damemi commented Jul 5, 2023

@damemi i prefer to add an e2e test in a follow-up PR. This PR is already big and backward compatible with existing functionality. The current test framework has only k8s sidecar, I want to refactor the tests to support all 3 paths viz sidecar, deployment & bare-metal. Let me know your thoughts.

I think it would be a lot better to have some testing for the new functionality before merging. You should be able to use the workflow matrix to deploy a different set of manifests for the sample app + instrumentation. I'm happy to help if you have any questions

@dineshg13
Copy link
Member Author

@damemi i added bare-metal tests to the workflow. Currently the tests are failing on service-name. This might require some discussion, I have added this agenda to next SIG meeting.

@damemi
Copy link
Contributor

damemi commented Jul 10, 2023

@dineshg13 thanks, the bare metal test is a good idea and probably wouldn't fit with the kind tests since the setup is so different.

If it helps, I put together a commit damemi@9ef0f4d based on your branch that shows what I mean for the deployment tests. Feel free to pick this commit to this PR or I can push it to your branch if you'd like (not sure I have permission). The tests pass in a test PR damemi#10

Right now, I think this would be good enough to show everything working in the new feature. The fixtures can be shared between deployment+sidecar modes because we expect the output to be identical. But if we add more configurability (like service name), we should look at restructuring the fixtures. This can be part of a broader refactor for the e2e framework.

CHANGELOG.md Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
cli/main.go Outdated Show resolved Hide resolved
instrumentation.go Outdated Show resolved Hide resolved
internal/pkg/orchestrator/orchestrator.go Outdated Show resolved Hide resolved
internal/pkg/orchestrator/service.go Show resolved Hide resolved
internal/pkg/orchestrator/service.go Show resolved Hide resolved
internal/pkg/orchestrator/service.go Show resolved Hide resolved
Comment on lines +67 to +71
log.Logger.V(0).Info("Establishing connection to OTLP receiver ...")
otlpTraceClient := otlptracegrpc.NewClient(
otlptracegrpc.WithDialOption(grpc.WithUserAgent(autoinstUserAgent)),
)
traceExporter, err := otlptrace.New(ctx, otlpTraceClient)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be passed as an option instead?

It seems like this could continue to be setup in the controller. It could be changed in a targeted follow up PR instead.

I think changing the exporter setup in this PR is going to bloat this PR. Especially since this doesn't seem like the appropriate way to expose the option to the user.

Copy link
Member Author

@dineshg13 dineshg13 Oct 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, we can do this in the followup PR. The intent was we can create mock exporter that we write unit tests against. This is a step in that direction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just leave this in the Controller and make the change to the trace exporter setup in its own PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to use same exporter for monitoring all the processes. If we do inside the controller, we will be creating on grpc connection for each process.

internal/pkg/instrumentors/runner.go Outdated Show resolved Hide resolved
instrumentation.go Outdated Show resolved Hide resolved
@dineshg13
Copy link
Member Author

There are too many breaking changes. It has become harder to update the PR. Closing the effort.

@dineshg13 dineshg13 closed this Oct 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Auto Instrumentation - Monitor multiple processes
6 participants