Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open Telemetry Transport #1229

Open
wants to merge 55 commits into
base: master
Choose a base branch
from
Open

Open Telemetry Transport #1229

wants to merge 55 commits into from

Conversation

ansoniS1
Copy link

@ansoniS1 ansoniS1 commented Jan 4, 2024

Experimental OTLP Transport. Allows output to an Open Telemetry Collector.

@ansoniS1 ansoniS1 added the WIP label Jan 4, 2024
Copy link

github-actions bot commented Jan 4, 2024

Test Results

     20 files  ±  0       20 suites  ±0   32m 9s ⏱️ -5s
1 492 tests +  6  1 472 ✔️ +  6    20 💤 ±0  0 ±0 
7 131 runs  +30  6 906 ✔️ +30  225 💤 ±0  0 ±0 

Results for commit d627f14. ± Comparison against base commit 72a06bc.

♻️ This comment has been updated with latest results.

Copy link

codecov bot commented Jan 25, 2024

Codecov Report

Attention: Patch coverage is 27.11443% with 293 lines in your changes are missing coverage. Please review.

Project coverage is 81.16%. Comparing base (72a06bc) to head (d627f14).
Report is 4 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1229      +/-   ##
==========================================
- Coverage   81.72%   81.16%   -0.55%     
==========================================
  Files         176      178       +2     
  Lines       42965    43308     +343     
  Branches     4782     4812      +30     
==========================================
+ Hits        35109    35149      +40     
- Misses       6625     6917     +292     
- Partials     1231     1242      +11     
Files Coverage Δ
tests/unit/configuration_test.py 98.30% <100.00%> (+0.02%) ⬆️
tests/unit/scalyr_client_test.py 99.42% <100.00%> (ø)
scalyr_agent/agent_main.py 57.70% <0.00%> (-0.31%) ⬇️
scalyr_agent/configuration.py 94.68% <95.35%> (+0.02%) ⬆️
scalyr_agent/scalyr_client.py 80.43% <50.00%> (+0.62%) ⬆️
scalyr_agent/copying_manager/worker.py 77.95% <40.00%> (-0.41%) ⬇️
scalyr_agent/util.py 81.05% <72.73%> (-0.75%) ⬇️
scalyr_agent/client_auth.py 0.00% <0.00%> (ø)
scalyr_agent/otlp_client.py 0.00% <0.00%> (ø)

... and 5 files with indirect coverage changes

@ansoniS1 ansoniS1 removed the WIP label Jan 25, 2024
@ansoniS1 ansoniS1 requested a review from alesnovak-s1 January 26, 2024 00:23
@alesnovak-s1
Copy link
Collaborator

Cover the code with unit tests and implement an integration test that will use the OTLP endpoint.

Copy link
Collaborator

@alesnovak-s1 alesnovak-s1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made few comments and will continue going through the otlp code.

return True

# Implement https://datatracker.ietf.org/doc/html/rfc6749#section-4.4 flow
class OAuth2(object):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a python library for OAuth2 to avoid having an in-house implementation?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python2 requirement paints us into some really weird corners like having to be on older versions of code that opens up security risks. This seems like the path of least resistance. I originally used the OpenTelemetry Python SDK, but had to back out of that because of no Python2 support.

return self.__get_config().get_string("auth")

@property
def client_id(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move all oauth2 configuration keys under a new oauth2 key:
oauth2:
->> client_id:
->> client_secret:
....

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with oauth_ prefix instead which goes with the current pattern used for kubernetes keys (k8s_)

@@ -354,6 +355,7 @@ def __init__(self, configuration, worker_config_entry, session_id, is_daemon=Fal

# The current pending AddEventsTask. We will retry the contained AddEventsRequest several times.
self.__pending_add_events_task = None
self.__receive_response_status = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this private field used?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in self.__pending_add_events_task.__receive_response_status

Trying to resolve this error (it doesn't):

2024-01-26 17:17:51.891Z ERROR [core] [scalyr_agent.scalyr_logging:707] Failed while attempting to scan and transmit logs :stack_trace:
  Traceback (most recent call last):
    File "/Users/anthonyj/_Scalyr/scalyr-agent-2/scalyr_agent/copying_manager/worker.py", line 500, in run
      "Repeatedly failed to parse response due to exception.  Dropping events",
  AttributeError: 'AddEventsTask' object has no attribute '_CopyingManagerWorkerSession__receive_response_status'

I defined the attribute in hopes that this was just a sequencing issue in the way the scalyr_client works, but this didn't fix it. Happens when we start to hit publishing delays. Easiest way for me to hit it is to Debug Agent;Pause Execution for X minutes;Resume Agent

Since this is experimental, I'm comfortable shipping this bug to the customer so they can provide feedback on the feature.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the second one on the line 358. I believe that's a mistake.

@@ -354,6 +355,7 @@ def __init__(self, configuration, worker_config_entry, session_id, is_daemon=Fal

# The current pending AddEventsTask. We will retry the contained AddEventsRequest several times.
self.__pending_add_events_task = None
self.__receive_response_status = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the second one on the line 358. I believe that's a mistake.

body=PB2AnyValue(string_value=event.message),
attributes=_encode_attributes(event.attrs))

def __get_user_agent(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole method is copy pasted from scalyr_client.py.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. This was intentional to avoid touching scalyr_client.py. I went ahead and moved this method and the wrap method to util.py

…n2. concurrent.futures imported only for PY3 (#1230)"

This reverts commit 5fe0df4.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants