Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for special characters in path #225

Merged
merged 5 commits into from
Feb 24, 2025

Conversation

byte-bandit
Copy link
Contributor

This change adds support for parsing URL encoded characters, ensuring regex matches down the pipeline can be executed against the unescaped, raw value, matching the contents of the generated swagger documentation.

Copy link

@llamapreview llamapreview bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto Pull Request Review from LlamaPReview

1. Overview

1.1 PR Summary

  • Business value and requirements alignment: This PR introduces support for special characters in path parameters, aligning with the business requirement to handle URLs as documented in the generated swagger documentation.
  • Key components modified: The chirouter package, specifically the PathToURLValues function.
  • Impact assessment: The changes affect how path parameters are handled, which could have system-wide implications on URL matching and data processing.
  • System dependencies and integration impacts: Downstream components relying on path parameters must now handle URL-encoded values, which could break existing regex matches or assumptions about the format of path parameters.

1.2 Architecture Changes

  • System design modifications: None significant.
  • Component interactions: The changes affect the interaction between the chirouter package and downstream components that process path parameters.
  • Integration points: The PathToURLValues function is a key integration point, as it is used in the routing process to decode path parameters.

2. Detailed Technical Analysis

2.1 Code Logic Deep-Dive

Core Logic Changes

  • chirouter/path_decoder.go - PathToURLValues function
    • Submitted PR Code:
    func PathToURLValues(r *http.Request) (url.Values, error) {
      if routeCtx := chi.RouteContext(r.Context()); routeCtx != nil {
        params := make(url.Values, len(routeCtx.URLParams.Keys))

        for i, key := range routeCtx.URLParams.Keys {
          value := routeCtx.URLParams.Values[i]
          value, err := url.PathUnescape(value)
          if err != nil {
            return nil, fmt.Errorf("PathToUrlValues: %w", err)
          }
          params[key] = []string{value}
        }

        return params, nil
      }

      return nil, nil
    }
  • Analysis:
    • The current logic adds support for URL decoding in path parameters. This is a significant change that could have system-wide implications.
    • The error handling for URL decoding is not comprehensive. It only checks for url.PathUnescape errors, but it does not handle cases where the decoded value exceeds the maximum allowed length or contains invalid characters.
    • The cross-component impact is that downstream components relying on path parameters must now handle URL-encoded values. This could break existing regex matches or assumptions about the format of path parameters.
    • Business logic considerations: The application's behavior may change if it relies on specific path parameter formats.
  • LlamaPReview Suggested Improvements:
    func PathToURLValues(r *http.Request) (url.Values, error) {
      if routeCtx := chi.RouteContext(r.Context()); routeCtx != nil {
        params := make(url.Values, len(routeCtx.URLParams.Keys))

        for i, key := range routeCtx.URLParams.Keys {
          value := routeCtx.URLParams.Values[i]
          decoded, err := url.PathUnescape(value)
          if err != nil {
            return nil, fmt.Errorf("PathToUrlValues: %w", err)
          }
          if len(decoded) > maxAllowedLength { // Assuming maxAllowedLength is defined elsewhere
            return nil, fmt.Errorf("PathToUrlValues: decoded value too long")
          }
          if !isValidPathParam(decoded) { // Assuming isValidPathParam is defined elsewhere
            return nil, fmt.Errorf("PathToUrlValues: invalid character in path parameter")
          }
          params[key] = []string{decoded}
        }

        return params, nil
      }

      return nil, nil
    }
  • Improvement rationale:

    • The improved code adds checks for the length and validity of the decoded path parameters. This ensures that the application does not crash or behave unexpectedly due to malformed or overly long URL-encoded values.
    • The technical benefits include improved error handling and robustness against invalid input.
    • The business value is that the application can now handle a wider range of input without crashing or producing incorrect output.
    • The risk assessment is that without these checks, the application could crash or produce incorrect output, leading to data loss or security vulnerabilities.
  • chirouter/example_test.go - TestPathDecoding function

    • Submitted PR Code:
    func TestPathDecoding(t *testing.T) {
      router := chi.NewRouter()
      router.Handle("/foo/{path1}/bar/{path2}", http.HandlerFunc(func(_ http.ResponseWriter, r *http.Request) {
        var in myRequest

        _ = dec.Decode(r, &in, nil)

        fmt.Printf("%+v\n", in)
      }))

      // Serving example URL.
      w := httptest.NewRecorder()
      req, _ := http.NewRequest(http.MethodPost, `/foo/a%2Fbc/bar/123?query1=321`,
        bytes.NewBufferString("formData1=true&formData2=def"))

      req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
      req.Header.Set("X-Header-1", "1.23")

      router.ServeHTTP(w, req)
      // Output:
      // {Query1:321 Path1:a/bc Path2:123 Header1:1.23 FormData1:true FormData2:def}
    }
  • Analysis:
    • The current test case demonstrates the URL decoding functionality. However, it only tests a single scenario with a specific URL.
    • The cross-component impact is that this test case ensures that the PathToURLValues function works as expected in the context of the application.
    • Business logic considerations: The test case ensures that the application behaves as expected when handling URL-encoded path parameters.
  • LlamaPReview Suggested Improvements:
    • No code changes are suggested for this test case. However, the initial PR review missed the opportunity to suggest adding more test cases to cover edge scenarios, such as malformed or overly long URL-encoded values.
  • Improvement rationale:
    • Adding more test cases would improve the test coverage of the URL decoding functionality, ensuring that the application behaves as expected in a wider range of scenarios.
    • The technical benefits include improved confidence in the application's behavior and reduced risk of unexpected behavior in production.
    • The business value is that the application can now be tested more thoroughly, reducing the risk of bugs and ensuring that it meets business requirements.
    • The risk assessment is that without these test cases, the application could behave unexpectedly in production, leading to data loss or security vulnerabilities.

2.2 Implementation Quality

  • Code organization and structure: The changes are well-organized and follow the existing code structure.
  • Design patterns usage: No significant design patterns are used in this PR.
  • Error handling approach: The error handling in the PathToURLValues function is not comprehensive, as discussed in the code logic deep-dive section.
  • Resource management: Not applicable in this PR.

3. Critical Findings

3.1 Potential Issues

  • 🔴 Critical Issues

    • Incompatibility with existing regex matches: Downstream components may have regex matches that assume path parameters are not URL-encoded. These matches may now fail, leading to unexpected behavior.
      • Impact: The application may not behave as expected, leading to incorrect data processing or user experience issues.
      • Recommendation: Review all regex matches in downstream components to ensure they can handle URL-encoded values.
    • Potential performance impact: URL decoding is a CPU-intensive operation. If not handled efficiently, it could lead to performance degradation, especially in high-traffic scenarios.
      • Impact: The application may become slower, leading to reduced user experience or increased costs.
      • Recommendation: Benchmark the URL decoding process to ensure it does not introduce significant performance overhead.
  • 🟡 Warnings

    • Edge cases: The PR does not mention handling of edge cases, such as malformed or overly long URL-encoded values.
      • Potential risks: The application may crash or behave unexpectedly due to malformed or overly long URL-encoded values.
      • Suggested improvements: Add checks for the length and validity of the decoded path parameters, as suggested in the code logic deep-dive section.
    • Security considerations: While not immediately apparent, changes in URL handling could potentially introduce security vulnerabilities, such as bypassing certain protections or enabling new attack vectors.
      • Potential risks: The application may become vulnerable to attacks, leading to data loss or security breaches.
      • Suggested improvements: Conduct a security review of the changes to ensure they do not introduce new vulnerabilities.

3.2 Code Quality Concerns

  • Maintainability aspects: The changes are well-documented and follow the existing coding style.
  • Readability issues: No significant readability issues were found in this PR.
  • Performance bottlenecks: As discussed in the potential issues section, URL decoding could potentially introduce performance bottlenecks.

4. Security Assessment

  • Authentication/Authorization impacts: None significant.
  • Data handling concerns: The changes affect how data is passed between components. Inconsistent handling of URL-encoded values could lead to data discrepancies.
  • Input validation: The PathToURLValues function does not validate the input values, as discussed in the code logic deep-dive section.
  • Security best practices: Not applicable in this PR.
  • Potential security risks: As discussed in the potential issues section, changes in URL handling could potentially introduce security vulnerabilities.
  • Mitigation strategies: Conduct a security review of the changes to ensure they do not introduce new vulnerabilities.
  • Security testing requirements: Conduct security testing to ensure the changes do not introduce new vulnerabilities.

5. Testing Strategy

5.1 Test Coverage

  • Unit test analysis: The PR includes a test case for the PathToURLValues function. However, it only tests a single scenario with a specific URL.
  • Integration test requirements: Validate the changes in an integrated environment to catch any unexpected behavior.
  • Edge cases coverage: As discussed in the potential issues section, edge cases are not covered in the current test suite.

5.2 Test Recommendations

Suggested Test Cases

  func TestPathDecodingMalformedURL(t *testing.T) {
    // Test with a malformed URL-encoded value
  }

  func TestPathDecodingOverlyLongURL(t *testing.T) {
    // Test with an overly long URL-encoded value
  }
  • Coverage improvements: Add more test cases to cover edge scenarios, such as malformed or overly long URL-encoded values.
  • Performance testing needs: As discussed in the potential issues section, consider including load testing in the review process.

6. Documentation & Maintenance

  • Documentation updates needed: Update the documentation to reflect the changes in URL handling.
  • Long-term maintenance considerations: The changes introduce a new behavior that must be maintained and tested in the long term.
  • Technical debt and monitoring requirements: No significant technical debt or monitoring requirements were found in this PR.

7. Deployment & Operations

  • Deployment impact and strategy: The changes do not affect the deployment process.
  • Key operational considerations: The changes do not introduce new operational considerations.

8. Summary & Recommendations

8.1 Key Action Items

  1. Review regex matches: Thoroughly review all regex matches in downstream components to ensure they can handle URL-encoded values.
  2. Benchmark URL decoding: Benchmark the URL decoding process to ensure it does not introduce significant performance overhead.
  3. Add edge case tests: Add more test cases to cover edge scenarios, such as malformed or overly long URL-encoded values.
  4. Conduct security review: Conduct a security review of the changes to ensure they do not introduce new vulnerabilities.

8.2 Future Considerations

  • Technical evolution path: The application should evolve to support more advanced URL handling, such as support for internationalized domain names (IDNs) and internationalized resource identifiers (IRIs).
  • Business capability evolution: The business may require additional support for special characters in path parameters, such as support for emojis or other special characters.
  • System integration impacts: The changes may have system-wide implications on how URLs are handled and parsed. Downstream components must be updated to handle URL-encoded values.

💡 Help Shape LlamaPReview
How's this review format working for you? Vote in our Github Discussion Polls to help us improve your review experience!

Copy link

codecov bot commented Feb 24, 2025

Codecov Report

Attention: Patch coverage is 50.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 75.12%. Comparing base (15a0c23) to head (b7dedce).
Report is 18 commits behind head on master.

Files with missing lines Patch % Lines
chirouter/path_decoder.go 50.00% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #225      +/-   ##
==========================================
- Coverage   76.01%   75.12%   -0.89%     
==========================================
  Files          30       30              
  Lines        1730     2629     +899     
==========================================
+ Hits         1315     1975     +660     
- Misses        298      528     +230     
- Partials      117      126       +9     
Flag Coverage Δ
unittests 75.12% <50.00%> (-0.89%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@vearutop
Copy link
Member

Please remove //nolint:unparam // Matches request.DecoderFactory.SetDecoderFunc., since now the error may be returned.

@byte-bandit byte-bandit requested a review from vearutop February 24, 2025 13:36
@vearutop vearutop merged commit 59d2fa3 into swaggest:master Feb 24, 2025
8 of 11 checks passed
@vearutop
Copy link
Member

Thank you!

@vearutop
Copy link
Member

v0.2.73 tagged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants