-
Notifications
You must be signed in to change notification settings - Fork 4
Jenkins job for validation summary clickhouse table #54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| currentBuild.description = "Import: ${env.DATE_TO_USE}" | ||
| } | ||
| dir('pixel-schema') { | ||
| checkout([$class: 'GitSCM', branches: [[name: "sam/dashboard"]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jenkins job checks out developer feature branch
High Severity
The pixel-schema repository checkout uses branch sam/dashboard which appears to be a personal feature branch rather than an official branch. All other repos in this Jenkinsfile correctly use main or develop. This means the production Jenkins job would be pulling from an unstable development branch instead of the main codebase.
Please tell me if this was useful or not with a 👍 or 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix this just before merging so as not to break the current job.
| ) | ||
| ENGINE = ReplicatedReplacingMergeTree() | ||
| PARTITION BY (date) | ||
| ORDER BY (agent, prefix, params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ClickHouse table ORDER BY causes data deduplication
Medium Severity
The ReplicatedReplacingMergeTree table has ORDER BY (agent, prefix, params) but stores distinct pixel, pixel_id, and version values in separate columns. Since these columns aren't in the ORDER BY, rows with different pixels that share the same agent, prefix, and params will be deduplicated during ClickHouse merges, causing silent data loss including loss of freq aggregations.
Please tell me if this was useful or not with a 👍 or 👎.
| userRemoteConfigs: [[url: 'https://github.com/duckduckgo/duckduckgo-privacy-extension.git']]]) | ||
| } | ||
| dir('pixel-schema') { | ||
| sh "/bin/bash scripts/detailedValidation.sh ../duckduckgo-privacy-extension/pixel-definitions/ ${env.DATE_TO_USE}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Command injection via unvalidated DATE parameter
High Severity
The DATE parameter is used directly in shell commands without validation. A malicious value like 2024-01-01; malicious_command would be interpreted by the shell, executing arbitrary commands on the Jenkins agent. The parameter description suggests YYYY-MM-DD format but doesn't enforce it, allowing command injection via ${env.DATE_TO_USE} in sh steps.
Please tell me if this was useful or not with a 👍 or 👎.
Additional Locations (2)
cab33c6 to
cb05cab
Compare
| `; | ||
| const result = spawnSync('ddg-ro-ch', ['-h', 'clickhouse', '--query', query]); | ||
| console.log(result.stderr.toString()); | ||
| console.log(result.stdout.toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing exit code check allows silent query failures
High Severity
The spawnSync call to execute the ClickHouse query doesn't check result.status or result.error. When the query fails (network issues, permission errors, syntax errors), the script logs output but exits with code 0. The shell script's set -e won't catch this, so validation continues with a missing or empty CSV file, producing incorrect or empty results in the dashboard table.
| } | ||
| const [prefix, pixelMatch] = matchPixel(pixel, this.#compiledPixels); | ||
| this.#currentPixelState.prefix = prefix; | ||
| this.#currentPixelState.pixelMatch = pixelMatch || ''; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cb05cab to
43aa502
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
|
|
||
| SELECT date, agent, version, pixel_id, pixel, params_fixed AS params, COUNT(*) AS freq | ||
| FROM metrics.pixels | ||
| WHERE date = '${day}' AND |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQL injection risk in date parameter interpolation
Medium Severity
The day parameter from user input is directly interpolated into the SQL query string at WHERE date = '${day}' without sanitization or parameterization. While the Jenkins job is internal and the Clickhouse client appears read-only (ddg-ro-ch), a user with Jenkins access could provide a malicious DATE parameter to manipulate the query (e.g., 2024-01-01' OR '1'='1).


Adds a Jenkins job that imports pixel data from Clickhouse, runs validation to generate rows for a validation summary table. This can be used for dashboards to measure pixel traffic and validation status, as well as identify issues with specific pixels.
Note
Medium Risk
Introduces new CI automation that reads/writes ClickHouse and changes validation output behavior, so failures or query/ingest issues could impact reporting and job stability. Core validation logic changes are small but the new data pipeline increases operational risk.
Overview
Adds a new scheduled, parameterized Jenkins pipeline (
Jenkinsfile.dashboard) that checks out each client repo’s pixel definitions and runs a newscripts/detailedValidation.shworkflow per platform/date.The workflow fetches aggregated pixels for a specific day from ClickHouse (
fetch_pixels_for_day.mjs), runs validation while emitting per-input-row JSONL output (validate_live_pixel.mjsviaDETAILED_VALIDATION_OUTPUT_FILE/DISABLE_RESULT_SAVING), and inserts those rows into a new ClickHouse table defined inlive_validation_scripts/validation_table.sql.Improves report usability by adding Grafana deep links to pixel names in Asana owner subtasks (
asana_reports.mjs) and by extendingLivePixelsValidator.validatePixel()to always include matchedprefix(andpixelMatch) in returned state.Written by Cursor Bugbot for commit 43aa502. This will update automatically on new commits. Configure here.