Skip to content

Conversation

xoaryaa
Copy link
Contributor

@xoaryaa xoaryaa commented Sep 2, 2025

Related Issues

Proposed Changes:

Feature

  • Add optional request_headers: dict[str, str] = {} to LinkContentFetcher.
  • Per-request header precedence:
    1. httpx client defaults
    2. component defaults (REQUEST_HEADERS)
    3. request_headers (new)
    4. rotating User-Agent (rotation wins; keeps UA rotation behavior intact)
  • Apply the same merge logic in both sync and async code paths.

Resilience

  • If http2=True but h2 is not installed, log a warning and gracefully fall back to HTTP/1.1 (avoid raising ImportError).

Docs

  • Update class docstring with header precedence and a short usage example.

How did you test it?

  • Unit tests (in test/components/fetchers/test_link_content_fetcher.py):
    • TestLinkContentFetcher_CustomHeaders::test_request_headers_sync_merging_and_ua_override
    • TestLinkContentFetcherAsync_CustomHeaders::test_request_headers_async_merging_and_ua_override
    • Confirmed that:
      • custom headers are sent,
      • client-level headers merge correctly,
      • rotating User-Agent overrides any UA supplied via request_headers.
  • Ran existing tests locally (-m "not integration"): all green.
  • Ran pre-commit hooks locally and fixed reported issues.

Notes for the reviewer

  • The change is additive and backward compatible; default behavior is unchanged unless request_headers is provided.
  • Happy to split the drive-by typo fixes (codespell) into a separate PR if you prefer—kept here to keep CI green.

Checklist

@xoaryaa xoaryaa requested review from a team as code owners September 2, 2025 14:09
@xoaryaa xoaryaa requested review from dfokina and julian-risch and removed request for a team September 2, 2025 14:09
@github-actions github-actions bot added topic:CI type:documentation Improvements on the docs labels Sep 2, 2025
@coveralls
Copy link
Collaborator

Pull Request Test Coverage Report for Build 17424233833

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 23 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.02%) to 92.093%

Files with Coverage Reduction New Missed Lines %
components/fetchers/link_content.py 23 86.44%
Totals Coverage Status
Change from base Build 17402313102: 0.02%
Covered Lines: 12939
Relevant Lines: 14050

💛 - Coveralls

@sjrl
Copy link
Contributor

sjrl commented Sep 3, 2025

hey @xoaryaa thanks for opening a new PR! Could you remove all changes made to non-code files?

@sjrl sjrl requested review from sjrl and removed request for dfokina and julian-risch September 3, 2025 08:32
@sjrl
Copy link
Contributor

sjrl commented Sep 3, 2025

A few other notes:

@xoaryaa
Copy link
Contributor Author

xoaryaa commented Sep 3, 2025 via email

@xoaryaa
Copy link
Contributor Author

xoaryaa commented Sep 4, 2025

Opened a new clean pr

@sjrl
Copy link
Contributor

sjrl commented Sep 4, 2025

Closing and moving discussion to the new PR #9760

@sjrl sjrl closed this Sep 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic:CI type:documentation Improvements on the docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make REQUEST_HEADERS in LinkContentFetcher customizable
3 participants