perf: derive last_job_host_summary from query instead of denormalized FK#16332
perf: derive last_job_host_summary from query instead of denormalized FK#16332benthomasson wants to merge 13 commits intoansible:develfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughHost last-job data denormalization removed: code now derives latest host summaries from JobHostSummary via annotations, Subquery/OuterRef, and cached lookups. Serializers, views, inventory logic, managers, events, and signals updated; JobHostSummary gained latest_for_host / latest_job_for_host helpers. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant View
participant Annotator
participant Serializer
participant JobHostSummary
participant Database
Client->>View: request host(s) or dashboard
View->>Annotator: apply _annotate_host_latest_summary(qs) (Subquery/OuterRef)
Annotator->>Database: run subquery for latest JobHostSummary per host
Database-->>Annotator: annotated host queryset (_latest_* fields)
View->>Serializer: serialize host(s) with annotations
Serializer->>Serializer: call _get_last_summary(host)
alt annotations present
Serializer-->>Serializer: build summary from host._latest_* annotations
else annotations absent
Serializer->>JobHostSummary: JobHostSummary.latest_for_host(host.id)
JobHostSummary->>Database: SELECT ... ORDER BY id DESC LIMIT 1
Database-->>JobHostSummary: summary row
JobHostSummary-->>Serializer: summary instance
end
Serializer-->>View: host representation (derived last_job / last_job_host_summary)
View-->>Client: response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip CodeRabbit can generate a title for your PR based on the changes with custom instructions.Set the |
e474510 to
d9b8734
Compare
d9b8734 to
c3950f3
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
awx/api/views/__init__.py (1)
207-215:⚠️ Potential issue | 🟠 MajorUpdate the dashboard failed-host drilldown to use the new latest-summary logic.
failedis now counted from the subquery, but Line 213 still links to?last_job_host_summary__failed=True. Since that FK is no longer maintained, the dashboard count and the drill-down list can diverge. Point the URL at a filter backed by the same derived latest-summary predicate instead.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/api/views/__init__.py` around lines 207 - 215, The dashboard drilldown URL still uses the old FK filter query param "last_job_host_summary__failed=True" causing divergence; update the failures_url to use the same derived predicate used in the view by pointing it to the annotation name created by latest_summary_failed (annotate('_latest_failed') / user_hosts_failed) — i.e. build failures_url as reverse('api:host_list', request=request) + "?_latest_failed=True" so the drilldown filter matches the computed _latest_failed used to compute 'failed'.awx/main/signals.py (1)
257-285:⚠️ Potential issue | 🟠 MajorDon’t bail out on a disappearing latest summary during job deletion.
If
latest_for_host()returns the summary for the job currently being deleted, theJob.DoesNotExistpath exits without falling back to the next surviving summary. That can leaveHost.last_jobempty/stale even when an older summary still exists for the host.Suggested fix
-def _update_host_last_jhs(host): - jhs = JobHostSummary.latest_for_host(host.pk) +def _update_host_last_jhs(host, exclude_job_id=None): + jhs = JobHostSummary.objects.filter(host_id=host.pk).exclude(job_id=exclude_job_id).order_by('-id').first() update_fields = [] - try: - last_job = jhs.job if jhs else None - except Job.DoesNotExist: - # The job (and its summaries) have already been/are currently being - # deleted, so there's no need to update the host w/ a reference to it - return - if host.last_job != last_job: - host.last_job = last_job + last_job_id = jhs.job_id if jhs else None + if host.last_job_id != last_job_id: + host.last_job_id = last_job_id update_fields.append('last_job') if update_fields: host.save(update_fields=update_fields) @@ - _update_host_last_jhs(host) + _update_host_last_jhs(host, exclude_job_id=instance.pk)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/main/signals.py` around lines 257 - 285, _update_host_last_jhs currently bails out when accessing jhs.job raises Job.DoesNotExist, which leaves Host.last_job stale; change the exception handler so that instead of returning you re-query for the next surviving JobHostSummary for that host (i.e., fetch the latest summary excluding summaries whose job PK no longer exists or equals the deleted job), set last_job to that summary's job or None if none found, then proceed with the existing update_fields/save logic; update the handling around JobHostSummary.latest_for_host, the Job.DoesNotExist except block, and Host.last_job assignment to implement this fallback.
🧹 Nitpick comments (1)
awx/main/models/inventory.py (1)
389-393: Add DB support for this latest-summary lookup shape.This correlated subquery now sits on the wrapup path via
update_computed_fields(). It relies on repeatedhost_id + latest idlookups, somain_jobhostsummarywill become the next hotspot as history grows unless there is a matching composite index for that access pattern.As per coding guidelines,
**: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/main/models/inventory.py` around lines 389 - 393, The correlated subquery using JobHostSummary (latest_summary_failed) orders by '-id' filtered by host_id, which will become a hotspot; add a composite DB index on JobHostSummary to support lookups by host then id (e.g. an index on ['host', 'id'] or ['host_id','id'] matching the order_by('-id')) by updating the JobHostSummary model Meta.indexes to include a models.Index(fields=['host','id'], name='ix_jobhostsummary_host_id') and generate/apply the migration so the Subquery lookup (latest_summary_failed) is covered efficiently by the DB.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@awx/api/serializers.py`:
- Around line 1843-1845: The serializer is re-querying
JobHostSummary.latest_for_host multiple times and loses constructed inventory
host behavior; instead, derive last_summary from the already-built
constructed_host_summaries (and set has_active_failures) to avoid repeated DB
calls. Change the code that sets last_job_host_summary/has_active_failures to
first look up the host's summary in constructed_host_summaries (e.g. iterate
res.get('constructed_host_summaries', []) for a summary with host_id == obj.id),
use that summary if found, otherwise call JobHostSummary.latest_for_host once
and cache the result (store it in a local variable or res key) so subsequent
fields reuse it; update references to JobHostSummary.latest_for_host,
constructed_host_summaries, and has_active_failures accordingly.
- Around line 1931-1934: The code currently only clears
ret['last_job_host_summary'] when no derived summary exists, leaving a stale FK
value; change the block that calls JobHostSummary.latest_for_host(obj.id) so
that if last_summary exists you overwrite ret['last_job_host_summary'] with the
derived/serialized summary (e.g., use
JobHostSummarySerializer(last_summary).data or the appropriate representation)
and if not, set it to None; update the code around
JobHostSummary.latest_for_host and the serializer usage to always replace the FK
value returned by super().to_representation(obj).
In `@awx/main/models/jobs.py`:
- Around line 1143-1146: Add a bulk-friendly companion to latest_for_host:
create a classmethod latest_for_hosts on JobHostSummary that returns a Subquery
(or a queryset) suitable for bulk use so callers can annotate/filter hosts in
one query instead of N extra queries; implement e.g. latest_id_subquery =
Subquery(cls.objects.filter(host_id=OuterRef('pk')).order_by('-id').values('id')[:1])
and return that Subquery (or provide a queryset variant that filters cls.objects
by id__in=Subquery(...) for a list of host_ids). Keep latest_for_host unchanged
and name the new helper latest_for_hosts so serializers/endpoints can use it for
bulk endpoints.
---
Outside diff comments:
In `@awx/api/views/__init__.py`:
- Around line 207-215: The dashboard drilldown URL still uses the old FK filter
query param "last_job_host_summary__failed=True" causing divergence; update the
failures_url to use the same derived predicate used in the view by pointing it
to the annotation name created by latest_summary_failed
(annotate('_latest_failed') / user_hosts_failed) — i.e. build failures_url as
reverse('api:host_list', request=request) + "?_latest_failed=True" so the
drilldown filter matches the computed _latest_failed used to compute 'failed'.
In `@awx/main/signals.py`:
- Around line 257-285: _update_host_last_jhs currently bails out when accessing
jhs.job raises Job.DoesNotExist, which leaves Host.last_job stale; change the
exception handler so that instead of returning you re-query for the next
surviving JobHostSummary for that host (i.e., fetch the latest summary excluding
summaries whose job PK no longer exists or equals the deleted job), set last_job
to that summary's job or None if none found, then proceed with the existing
update_fields/save logic; update the handling around
JobHostSummary.latest_for_host, the Job.DoesNotExist except block, and
Host.last_job assignment to implement this fallback.
---
Nitpick comments:
In `@awx/main/models/inventory.py`:
- Around line 389-393: The correlated subquery using JobHostSummary
(latest_summary_failed) orders by '-id' filtered by host_id, which will become a
hotspot; add a composite DB index on JobHostSummary to support lookups by host
then id (e.g. an index on ['host', 'id'] or ['host_id','id'] matching the
order_by('-id')) by updating the JobHostSummary model Meta.indexes to include a
models.Index(fields=['host','id'], name='ix_jobhostsummary_host_id') and
generate/apply the migration so the Subquery lookup (latest_summary_failed) is
covered efficiently by the DB.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 28ae5786-6481-4684-87b9-ca6af2355b1f
📒 Files selected for processing (8)
awx/api/serializers.pyawx/api/views/__init__.pyawx/main/access.pyawx/main/managers.pyawx/main/models/events.pyawx/main/models/inventory.pyawx/main/models/jobs.pyawx/main/signals.py
💤 Files with no reviewable changes (2)
- awx/main/managers.py
- awx/main/access.py
…ompletion The playbook_on_stats wrapup path bulk-updates last_job_host_summary_id on every host touched by a job. In the Q4CY25 scale lab this query had a median execution time of 75 seconds due to index churn on main_host. Replace all reads of the denormalized FK with a new classmethod JobHostSummary.latest_for_host(host_id) that queries for the most recent summary on demand. This eliminates the write-side bulk_update of last_job_host_summary_id entirely. Changes: - Add JobHostSummary.latest_for_host() classmethod - Serializer: use latest_for_host() instead of obj.last_job_host_summary - Dashboard view: use subquery instead of FK traversal for failed hosts - Inventory.update_computed_fields: use subquery for failed host count - events.py: remove last_job_host_summary_id from bulk_update - signals.py: simplify _update_host_last_jhs to only update last_job - access.py/managers.py: remove select_related/defer through the FK The FK field on Host is left in place for now (removal requires a migration) but is no longer written to. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
c3950f3 to
46e600f
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
awx/api/serializers.py (2)
1955-1959:⚠️ Potential issue | 🟠 MajorOverwrite the top-level IDs with the derived values.
super().to_representation(obj)has already serialized the model FKs. This block only clears them when no summary exists, so hosts with a newer derived summary still return stalelast_job_host_summaryand can diverge from the related/summary data you now derive fromlast_summary.Suggested fix
last_summary = self._get_last_summary(obj) - if 'last_job' in ret and not last_summary: - ret['last_job'] = None - if 'last_job_host_summary' in ret and not last_summary: - ret['last_job_host_summary'] = None + if 'last_job' in ret: + ret['last_job'] = last_summary.job_id if last_summary and last_summary.job_id else None + if 'last_job_host_summary' in ret: + ret['last_job_host_summary'] = last_summary.pk if last_summary else None return ret🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/api/serializers.py` around lines 1955 - 1959, The current to_representation flow calls self._get_last_summary(obj) but only clears top-level FK fields when last_summary is missing, leaving stale IDs; update the serializer so after calling last_summary you also overwrite the top-level FK fields (e.g., ret['last_job'] and ret['last_job_host_summary']) with the corresponding values derived from last_summary when it exists, and keep the existing behavior of setting them to None when last_summary is falsy; locate this logic in the to_representation override that calls _get_last_summary(obj) and update ret accordingly.
1848-1853:⚠️ Potential issue | 🟠 MajorKeep constructed hosts on the constructed-summary path.
_get_last_summary()now always callsJobHostSummary.latest_for_host(obj.id). That bypasses the constructed-host behavior still used on Lines 1887-1890, so constructed inventory hosts can return the wronglast_job_host_summary/has_active_failures, and list serialization still pays one fallback query per host.Suggested fix
`@staticmethod` def _get_last_summary(obj): """Cache the latest JobHostSummary on the instance to avoid repeated queries.""" if not hasattr(obj, '_cached_last_summary'): - obj._cached_last_summary = JobHostSummary.latest_for_host(obj.id) + if getattr(getattr(obj, 'inventory', None), 'kind', None) == 'constructed': + obj._cached_last_summary = obj.constructed_host_summaries.order_by('-id').first() + else: + obj._cached_last_summary = JobHostSummary.latest_for_host(obj.id) return obj._cached_last_summary🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/api/serializers.py` around lines 1848 - 1853, The _get_last_summary method currently always calls JobHostSummary.latest_for_host(obj.id), bypassing the earlier "constructed-summary" path used for constructed inventory hosts; update _get_last_summary to first check for and return any constructed/cached summary already attached to the host object (i.e., prefer an existing constructed-summary attribute set by the constructed-host code path such as a prepopulated _cached_last_summary or similarly named constructed-summary attribute), and only fall back to calling JobHostSummary.latest_for_host(obj.id) when no constructed summary exists; keep caching behavior by setting obj._cached_last_summary when you perform the fallback query.
🧹 Nitpick comments (1)
awx/main/models/inventory.py (1)
406-408: Consider cachingfailed_hosts.count()to avoid duplicate query.
failed_hosts.count()is evaluated twice on lines 406 and 408, resulting in two identical database queries.♻️ Suggested optimization
+ failed_hosts_count = failed_hosts.count() computed_fields = { - 'has_active_failures': bool(failed_hosts.count()), + 'has_active_failures': bool(failed_hosts_count), 'total_hosts': total_hosts, - 'hosts_with_active_failures': failed_hosts.count(), + 'hosts_with_active_failures': failed_hosts_count,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/main/models/inventory.py` around lines 406 - 408, The code calls failed_hosts.count() twice which issues duplicate DB queries; compute it once into a local variable (e.g., failed_hosts_count) before building the dict and then use that variable for both 'has_active_failures' (bool(failed_hosts_count)) and 'hosts_with_active_failures' to avoid the extra query. Ensure any existing uses of failed_hosts.count() in the surrounding scope are replaced with the cached variable where appropriate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@awx/api/serializers.py`:
- Around line 1861-1875: The manual reconstruction of d['last_job_host_summary']
and d['last_job'] is dropping keys previously exposed by SUMMARIZABLE_FK_FIELDS;
update the block that builds these dicts (the code that references last_summary
and last_job) to include last_summary.description and for last_job include
description, license_error and canceled_on (and any other keys
SUMMARIZABLE_FK_FIELDS originally returned) so the serializer restores the
original summary_fields contract; ensure the keys match the original names
(e.g., 'description', 'license_error', 'canceled_on') and are populated from
last_summary and last_job objects where present.
---
Duplicate comments:
In `@awx/api/serializers.py`:
- Around line 1955-1959: The current to_representation flow calls
self._get_last_summary(obj) but only clears top-level FK fields when
last_summary is missing, leaving stale IDs; update the serializer so after
calling last_summary you also overwrite the top-level FK fields (e.g.,
ret['last_job'] and ret['last_job_host_summary']) with the corresponding values
derived from last_summary when it exists, and keep the existing behavior of
setting them to None when last_summary is falsy; locate this logic in the
to_representation override that calls _get_last_summary(obj) and update ret
accordingly.
- Around line 1848-1853: The _get_last_summary method currently always calls
JobHostSummary.latest_for_host(obj.id), bypassing the earlier
"constructed-summary" path used for constructed inventory hosts; update
_get_last_summary to first check for and return any constructed/cached summary
already attached to the host object (i.e., prefer an existing
constructed-summary attribute set by the constructed-host code path such as a
prepopulated _cached_last_summary or similarly named constructed-summary
attribute), and only fall back to calling JobHostSummary.latest_for_host(obj.id)
when no constructed summary exists; keep caching behavior by setting
obj._cached_last_summary when you perform the fallback query.
---
Nitpick comments:
In `@awx/main/models/inventory.py`:
- Around line 406-408: The code calls failed_hosts.count() twice which issues
duplicate DB queries; compute it once into a local variable (e.g.,
failed_hosts_count) before building the dict and then use that variable for both
'has_active_failures' (bool(failed_hosts_count)) and
'hosts_with_active_failures' to avoid the extra query. Ensure any existing uses
of failed_hosts.count() in the surrounding scope are replaced with the cached
variable where appropriate.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 45eece19-57b2-4fa6-a566-734cd0141889
📒 Files selected for processing (8)
awx/api/serializers.pyawx/api/views/__init__.pyawx/main/access.pyawx/main/managers.pyawx/main/models/events.pyawx/main/models/inventory.pyawx/main/models/jobs.pyawx/main/signals.py
💤 Files with no reviewable changes (1)
- awx/main/access.py
🚧 Files skipped from review as they are similar to previous changes (4)
- awx/api/views/init.py
- awx/main/models/events.py
- awx/main/models/jobs.py
- awx/main/managers.py
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (3)
awx/api/serializers.py (3)
1884-1904:⚠️ Potential issue | 🟠 MajorRestore the full
summary_fields.last_job*payload.This manual rebuild still drops keys declared in
SUMMARIZABLE_FK_FIELDS:descriptionfor both summaries, pluslicense_errorandcanceled_onforlast_job. That changes the existing response contract.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/api/serializers.py` around lines 1884 - 1904, The manual reconstruction of last_job and last_job_host_summary in the serializer (using last_summary, last_job) is omitting keys defined in SUMMARIZABLE_FK_FIELDS (e.g., description on summaries, and license_error, canceled_on on last_job); restore the full summary_fields payload by copying the original annotated dict when present or by iterating the SUMMARIZABLE_FK_FIELDS entries to populate missing keys instead of hardcoding a subset — (1) when last_summary.job_data exists, ensure you deep-copy all keys from last_summary.job_data into d['last_job'] and d['last_job_host_summary']; (2) when falling back to last_summary.job, include the fields listed in SUMMARIZABLE_FK_FIELDS (description, license_error, canceled_on, etc.) from last_job and last_summary into the rebuilt OrderedDicts so the response contract matches the original summary_fields.
1873-1875:⚠️ Potential issue | 🟠 MajorKeep constructed hosts on the constructed-summary path.
The fallback here always calls
JobHostSummary.latest_for_host(obj.id). For constructed inventories, that bypassesobj.constructed_host_summaries, so nested or non-annotated serializers can return the wronglast_job_host_summaryandhas_active_failures.Suggested fix
else: # Fallback for non-annotated querysets (e.g. nested serializers) - obj._cached_last_summary = JobHostSummary.latest_for_host(obj.id) + if obj.inventory.kind == 'constructed': + obj._cached_last_summary = obj.constructed_host_summaries.order_by('-id').first() + else: + obj._cached_last_summary = JobHostSummary.latest_for_host(obj.id)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/api/serializers.py` around lines 1873 - 1875, The fallback should respect constructed inventories by using the host's constructed_host_summaries when available instead of directly calling JobHostSummary.latest_for_host; update the else branch where obj._cached_last_summary is set to first check for and call obj.constructed_host_summaries.latest_for_host(obj.id) (or equivalent API on constructed_host_summaries) and only fall back to JobHostSummary.latest_for_host(obj.id) if constructed_host_summaries is not present.
1983-1987:⚠️ Potential issue | 🟠 MajorAlways overwrite
last_job_host_summarywith the derived value.This only clears the field when no derived summary exists. When one does exist,
super().to_representation(obj)can still return the stale dormant FK, so the top-level field disagrees withrelatedandsummary_fields.Suggested fix
last_summary = self._get_last_summary(obj) if 'last_job' in ret and not last_summary: ret['last_job'] = None - if 'last_job_host_summary' in ret and not last_summary: - ret['last_job_host_summary'] = None + if 'last_job_host_summary' in ret: + ret['last_job_host_summary'] = last_summary.id if last_summary else None return ret🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/api/serializers.py` around lines 1983 - 1987, The serializer's to_representation is only clearing ret['last_job_host_summary'] when last_summary is missing, leaving a stale FK when a derived summary exists; after calling super().to_representation(obj) and computing last_summary via self._get_last_summary(obj), always overwrite ret['last_job_host_summary'] with the derived value (which may be None) instead of only clearing it — update the assignment in to_representation so ret['last_job_host_summary'] = last_summary (or the appropriate derived field from last_summary) to ensure related/summary_fields and top-level agree.
🧹 Nitpick comments (1)
awx/main/models/inventory.py (1)
389-393: Reuse the failed-host count.This annotated queryset is counted twice a few lines later. Caching it once avoids rerunning the heaviest query in
update_computed_fields().As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."♻️ Proposed change
latest_summary_failed = Subquery( JobHostSummary.objects.filter(host_id=OuterRef('pk')).order_by('-id').values('failed')[:1] ) failed_hosts = active_hosts.annotate(_latest_failed=latest_summary_failed).filter(_latest_failed=True) + failed_hosts_count = failed_hosts.count() active_groups = self.groups @@ computed_fields = { - 'has_active_failures': bool(failed_hosts.count()), + 'has_active_failures': bool(failed_hosts_count), 'total_hosts': total_hosts, - 'hosts_with_active_failures': failed_hosts.count(), + 'hosts_with_active_failures': failed_hosts_count, 'total_groups': active_groups.count(), 'has_inventory_sources': bool(active_inventory_sources.count()), 'total_inventory_sources': active_inventory_sources.count(),Also applies to: 406-408
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/main/models/inventory.py` around lines 389 - 393, The annotated queryset computing failed hosts (latest_summary_failed and failed_hosts) is being built and counted twice in update_computed_fields(); annotate it once and reuse the result to avoid rerunning the heavy query. Change the code to assign the annotated queryset to a variable (keep latest_summary_failed and failed_hosts names), then call failed_hosts.count() once and store that integer (e.g., failed_host_count) for subsequent use rather than rebuilding or recounting the queryset; reuse the cached queryset or cached count in the later code paths (including the other occurrence around lines 406-408) so the heavy Subquery is executed only once.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@awx/api/serializers.py`:
- Around line 1854-1859: The annotated summary shim created as
type('AnnotatedSummary', ...) only defines id, failed, host_name, and job_id but
callers expect a pk attribute (e.g., code that reads last_summary.pk), causing
AttributeError; update the shim in the summary assignment to include 'pk' (set
to obj._latest_summary_id) so pk and id are consistent for consumers like
last_summary.pk.
In `@awx/main/models/inventory.py`:
- Around line 389-393: The latest_summary_failed Subquery currently only matches
JobHostSummary.host_id which misses summaries for constructed hosts stored in
JobHostSummary.constructed_host_id; update the Subquery/QuerySet used by
latest_summary_failed (the JobHostSummary.objects.filter(...) used to annotate
active_hosts in inventory.py) to match either host_id=OuterRef('pk') OR
constructed_host_id=OuterRef('pk') so constructed hosts are considered; ensure
the order_by('-id').values('failed')[:1] logic and the annotate name
latest_summary_failed remain unchanged so has_active_failures and
hosts_with_active_failures reflect constructed-host summaries too.
---
Duplicate comments:
In `@awx/api/serializers.py`:
- Around line 1884-1904: The manual reconstruction of last_job and
last_job_host_summary in the serializer (using last_summary, last_job) is
omitting keys defined in SUMMARIZABLE_FK_FIELDS (e.g., description on summaries,
and license_error, canceled_on on last_job); restore the full summary_fields
payload by copying the original annotated dict when present or by iterating the
SUMMARIZABLE_FK_FIELDS entries to populate missing keys instead of hardcoding a
subset — (1) when last_summary.job_data exists, ensure you deep-copy all keys
from last_summary.job_data into d['last_job'] and d['last_job_host_summary'];
(2) when falling back to last_summary.job, include the fields listed in
SUMMARIZABLE_FK_FIELDS (description, license_error, canceled_on, etc.) from
last_job and last_summary into the rebuilt OrderedDicts so the response contract
matches the original summary_fields.
- Around line 1873-1875: The fallback should respect constructed inventories by
using the host's constructed_host_summaries when available instead of directly
calling JobHostSummary.latest_for_host; update the else branch where
obj._cached_last_summary is set to first check for and call
obj.constructed_host_summaries.latest_for_host(obj.id) (or equivalent API on
constructed_host_summaries) and only fall back to
JobHostSummary.latest_for_host(obj.id) if constructed_host_summaries is not
present.
- Around line 1983-1987: The serializer's to_representation is only clearing
ret['last_job_host_summary'] when last_summary is missing, leaving a stale FK
when a derived summary exists; after calling super().to_representation(obj) and
computing last_summary via self._get_last_summary(obj), always overwrite
ret['last_job_host_summary'] with the derived value (which may be None) instead
of only clearing it — update the assignment in to_representation so
ret['last_job_host_summary'] = last_summary (or the appropriate derived field
from last_summary) to ensure related/summary_fields and top-level agree.
---
Nitpick comments:
In `@awx/main/models/inventory.py`:
- Around line 389-393: The annotated queryset computing failed hosts
(latest_summary_failed and failed_hosts) is being built and counted twice in
update_computed_fields(); annotate it once and reuse the result to avoid
rerunning the heavy query. Change the code to assign the annotated queryset to a
variable (keep latest_summary_failed and failed_hosts names), then call
failed_hosts.count() once and store that integer (e.g., failed_host_count) for
subsequent use rather than rebuilding or recounting the queryset; reuse the
cached queryset or cached count in the later code paths (including the other
occurrence around lines 406-408) so the heavy Subquery is executed only once.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d8a5fd84-31bf-4c7e-b68a-0bac481812eb
📒 Files selected for processing (8)
awx/api/serializers.pyawx/api/views/__init__.pyawx/main/access.pyawx/main/managers.pyawx/main/models/events.pyawx/main/models/inventory.pyawx/main/models/jobs.pyawx/main/signals.py
💤 Files with no reviewable changes (1)
- awx/main/access.py
🚧 Files skipped from review as they are similar to previous changes (1)
- awx/main/models/jobs.py
…ublists - Add 'pk' to AnnotatedSummary dynamic type (fixes AttributeError in get_related) - Add job_template_id and job_template_name to subquery annotations so list views include these fields in summary_fields.last_job (matching detail views) - Traverse job__ FK from JobHostSummary instead of using separate UnifiedJob subquery with OuterRef on another annotation (cleaner SQL, avoids alias issue) - Annotate all host sublist views (InventoryHostsList, GroupHostsList, GroupAllHostsList, InventorySourceHostsList) to prevent N+1 queries Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tale FKs Tests were asserting host.last_job_id and host.last_job_host_summary_id which are no longer updated. Use JobHostSummary.latest_for_host() to derive the same data, matching the new read-time derivation approach. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (3)
awx/api/serializers.py (3)
1991-1995:⚠️ Potential issue | 🟠 MajorAlways overwrite the dormant
last_job_host_summaryfield in the payload.
super().to_representation()still serializesHost.last_job_host_summary, but this FK is no longer maintained. This branch only clears it when no derived summary exists, so hosts with a newer derived summary still return the stale FK value.Suggested fix
last_summary = self._get_last_summary(obj) - if 'last_job' in ret and not last_summary: - ret['last_job'] = None - if 'last_job_host_summary' in ret and not last_summary: - ret['last_job_host_summary'] = None + if 'last_job' in ret: + ret['last_job'] = last_summary.job_id if last_summary and last_summary.job_id else None + if 'last_job_host_summary' in ret: + ret['last_job_host_summary'] = last_summary.pk if last_summary else None return ret🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/api/serializers.py` around lines 1991 - 1995, The payload can return a stale Host.last_job_host_summary FK because to_representation only clears ret['last_job_host_summary'] when last_summary is falsy; change the logic in the serializer's to_representation (after calling super().to_representation()) to always set ret['last_job_host_summary'] = last_summary (or None) using the value returned by self._get_last_summary(obj) instead of only clearing it when last_summary is missing so the derived summary always overwrites the stale FK.
1887-1912:⚠️ Potential issue | 🟠 Major
summary_fields.last_job*dropped existing keys.This manual rebuild no longer emits fields still declared in
SUMMARIZABLE_FK_FIELDS:descriptionforlast_job_host_summary, anddescription,license_error, andcanceled_onforlast_job. That is a response-contract regression for clients readingsummary_fields.Please build these dicts from
SUMMARIZABLE_FK_FIELDSor extend the annotated/fallback data so the previous keys are preserved.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/api/serializers.py` around lines 1887 - 1912, The manual dict construction for last_job_host_summary and last_job drops keys declared in SUMMARIZABLE_FK_FIELDS (e.g., description, license_error, canceled_on); change the logic to build these summary dicts by iterating over SUMMARIZABLE_FK_FIELDS (or using it as a template) and copying values from last_summary.job_data when present or from last_summary.job / last_job in the fallback, rather than hard-coding individual fields; preserve the existing special-case behavior for job_template_id/name (remove them only when None) and ensure the code paths that set d['last_job_host_summary'] and d['last_job'] reuse SUMMARIZABLE_FK_FIELDS to include all previously exposed keys.
1852-1878:⚠️ Potential issue | 🟠 MajorConstructed inventory hosts still lose their latest summary.
_get_last_summary()only reads host-based annotations orJobHostSummary.latest_for_host(obj.id). For constructed hosts, the latest row comes fromconstructed_host_summaries, andawx/api/views/__init__.py:131-149annotates onlyhost_id, so Line 1873 cachesNoneandget_related()/summary_fields/has_active_failuresall go blank for those hosts.Handle
obj.inventory.kind == 'constructed'before the host-based path, or add a matching constructed-host annotation/fallback.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/api/serializers.py` around lines 1852 - 1878, The code in _get_last_summary currently only checks host-based annotations and JobHostSummary.latest_for_host, causing constructed inventory hosts to get obj._cached_last_summary = None; modify _get_last_summary to first check if obj.inventory.kind == 'constructed' and, if so, use the constructed-host annotation (constructed_host_summaries) or a constructed-specific fallback to locate the latest summary (mirroring the existing annotated-summary object shape) and assign it to obj._cached_last_summary; alternatively, detect and use any annotated constructed-summary fields (analogous to _latest_summary_id/_latest_summary_host_name) before falling back to host-based JobHostSummary.latest_for_host so constructed hosts retain summary_fields and has_active_failures.
🧹 Nitpick comments (1)
awx/api/views/__init__.py (1)
131-149: Performance trade-off is acceptable but worth monitoring.The implementation uses 10 separate
Subqueryannotations, each becoming a correlated scalar subquery in SQL. This is a valid trade-off: one query with multiple scalar subqueries avoids the N+1 problem the PR aims to fix (each host requiring separate lookups).For large host lists, monitor query execution time. If it becomes problematic, consider consolidating into a single subquery with
values()returning multiple fields, or usingPrefetchwith annotation on the related manager.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/api/views/__init__.py` around lines 131 - 149, The function _annotate_host_latest_summary currently issues 10 correlated Subquery annotations (models.JobHostSummary via Subquery) which is acceptable but can be costly for very large host lists; add a clear TODO comment above _annotate_host_latest_summary documenting this performance trade-off and instruct future maintainers to monitor query times and memory for large batches, and include guidance to replace the multiple Subquery calls with a single consolidated Subquery (values(...) returning multiple fields or a JSON/tuple) or to use Prefetch with annotation on the related manager (JobHostSummary) if monitoring shows degradation; also add a short note to add a benchmark/test that reproduces large-host query timing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@awx/api/serializers.py`:
- Around line 1991-1995: The payload can return a stale
Host.last_job_host_summary FK because to_representation only clears
ret['last_job_host_summary'] when last_summary is falsy; change the logic in the
serializer's to_representation (after calling super().to_representation()) to
always set ret['last_job_host_summary'] = last_summary (or None) using the value
returned by self._get_last_summary(obj) instead of only clearing it when
last_summary is missing so the derived summary always overwrites the stale FK.
- Around line 1887-1912: The manual dict construction for last_job_host_summary
and last_job drops keys declared in SUMMARIZABLE_FK_FIELDS (e.g., description,
license_error, canceled_on); change the logic to build these summary dicts by
iterating over SUMMARIZABLE_FK_FIELDS (or using it as a template) and copying
values from last_summary.job_data when present or from last_summary.job /
last_job in the fallback, rather than hard-coding individual fields; preserve
the existing special-case behavior for job_template_id/name (remove them only
when None) and ensure the code paths that set d['last_job_host_summary'] and
d['last_job'] reuse SUMMARIZABLE_FK_FIELDS to include all previously exposed
keys.
- Around line 1852-1878: The code in _get_last_summary currently only checks
host-based annotations and JobHostSummary.latest_for_host, causing constructed
inventory hosts to get obj._cached_last_summary = None; modify _get_last_summary
to first check if obj.inventory.kind == 'constructed' and, if so, use the
constructed-host annotation (constructed_host_summaries) or a
constructed-specific fallback to locate the latest summary (mirroring the
existing annotated-summary object shape) and assign it to
obj._cached_last_summary; alternatively, detect and use any annotated
constructed-summary fields (analogous to
_latest_summary_id/_latest_summary_host_name) before falling back to host-based
JobHostSummary.latest_for_host so constructed hosts retain summary_fields and
has_active_failures.
---
Nitpick comments:
In `@awx/api/views/__init__.py`:
- Around line 131-149: The function _annotate_host_latest_summary currently
issues 10 correlated Subquery annotations (models.JobHostSummary via Subquery)
which is acceptable but can be costly for very large host lists; add a clear
TODO comment above _annotate_host_latest_summary documenting this performance
trade-off and instruct future maintainers to monitor query times and memory for
large batches, and include guidance to replace the multiple Subquery calls with
a single consolidated Subquery (values(...) returning multiple fields or a
JSON/tuple) or to use Prefetch with annotation on the related manager
(JobHostSummary) if monitoring shows degradation; also add a short note to add a
benchmark/test that reproduces large-host query timing.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 51cb34c9-11e8-4d82-be1e-22d43505eeee
📒 Files selected for processing (2)
awx/api/serializers.pyawx/api/views/__init__.py
There was a problem hiding this comment.
🧹 Nitpick comments (1)
awx/main/tests/functional/models/test_events.py (1)
74-77: Keep these assertions independent ofJobHostSummary.latest_for_host().Using the new production helper in the assertions weakens the test: if
latest_for_host()regresses, these tests can still stay green because they exercise the same selection path. It also adds an N+1 query loop over hosts. Prefer asserting from a queryset built directly in the test once per case.♻️ Example pattern
- for host in Host.objects.all(): - latest_summary = JobHostSummary.latest_for_host(host.id) + summaries_by_host_id = { + summary.host_id: summary + for summary in JobHostSummary.objects.filter(job=self.job, host_id__isnull=False) + } + for host in Host.objects.all(): + latest_summary = summaries_by_host_id.get(host.id) assert latest_summary is not None assert latest_summary.job_id == self.job.id assert latest_summary.host == hostApply the same idea in the
--limittest and assert missing hosts viasummaries_by_host_id.get(h.id) is None.Also applies to: 109-116
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/main/tests/functional/models/test_events.py` around lines 74 - 77, Replace assertions that call JobHostSummary.latest_for_host(host.id) with assertions based on a queryset assembled in the test (e.g., build summaries_by_host_id = {s.host_id: s for s in JobHostSummary.objects.filter(job=self.job).order_by(...)}), then assert summaries_by_host_id.get(host.id) and compare its job_id/host to expected values; do the same change in the `--limit` test (assert missing hosts with summaries_by_host_id.get(h.id) is None) to avoid using JobHostSummary.latest_for_host() and eliminate the N+1 query pattern.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@awx/main/tests/functional/models/test_events.py`:
- Around line 74-77: Replace assertions that call
JobHostSummary.latest_for_host(host.id) with assertions based on a queryset
assembled in the test (e.g., build summaries_by_host_id = {s.host_id: s for s in
JobHostSummary.objects.filter(job=self.job).order_by(...)}), then assert
summaries_by_host_id.get(host.id) and compare its job_id/host to expected
values; do the same change in the `--limit` test (assert missing hosts with
summaries_by_host_id.get(h.id) is None) to avoid using
JobHostSummary.latest_for_host() and eliminate the N+1 query pattern.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 719e90cf-a77f-4644-ade4-e3833a89c968
📒 Files selected for processing (1)
awx/main/tests/functional/models/test_events.py
The failures_url linked to ?last_job_host_summary__failed=True which filters on the now-stale FK. The dashboard count itself was already fixed to use a subquery annotation. Since DashboardView is deprecated and has_active_failures is a SerializerMethodField (not filterable), remove the failures_url entirely rather than creating a custom filter. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
This comes at the cost of adding a large number of annotations to every host view. Sure, you've made the post-job processing faster by not doing those operations. But it's unclear at this point how much worse the host listings have become. |
awx/api/views/__init__.py
Outdated
| _latest_job_failed=Subquery(latest_summary.values('job__failed')[:1]), | ||
| _latest_job_finished=Subquery(latest_summary.values('job__finished')[:1]), | ||
| _latest_job_template_id=Subquery(latest_summary.values('job__job_template_id')[:1]), | ||
| _latest_job_template_name=Subquery(latest_summary.values('job__job_template__name')[:1]), |
There was a problem hiding this comment.
You're not doing yourself much favors by annotating every field from JobHostSummary. I would keep _latest_summary_id only, and work on some kind of prefetch_related for the objects, as opposed to avoiding loading the object as you're doing. Saving 1 query isn't going to proportionally make much difference, and I'm far far more worried about other deeper problems with the approach.
There was a problem hiding this comment.
Done in db374c1. Exactly as you suggested — now annotates only _latest_summary_id and bulk-fetches the full JobHostSummary objects with select_related('job', 'job__job_template') after pagination. 10 correlated subqueries → 1 subquery + 1 IN query.
Instead of annotating every host queryset with 10 correlated subqueries
(summary + job + job_template fields), annotate only _latest_summary_id
and bulk-fetch the full JobHostSummary objects after pagination via
select_related('job', 'job__job_template').
This reduces the SQL from 10 correlated subqueries to 1 subquery + 1 IN
query, addressing review feedback about annotation overhead on host list
views.
- _annotate_host_latest_summary: only annotates _latest_summary_id
- _prefetch_latest_summaries: bulk-fetches and attaches to host objects
- HostSummaryPrefetchMixin: hooks into list() after pagination
- Serializer uses real JobHostSummary objects (no more AnnotatedSummary)
- to_representation always overwrites stale FK values
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@AlanCoding Good feedback — you're right that 10 correlated subqueries was too heavy. I've refactored in db374c1: Before: 10 correlated subqueries annotated on every host queryset (summary + job + job_template fields extracted individually) After: 1 correlated subquery ( The new approach:
So for a paginated host list, the total query cost is:
The serializer gets real |
| h.last_job_id = job.id | ||
| updated_hosts.add(h) | ||
| if h.id in host_mapping: | ||
| h.last_job_host_summary_id = host_mapping[h.id] |
There was a problem hiding this comment.
It seems that these 2 fields would then need to be removed from the Host model, since they are unused and now inconsistently populated after migration.
Host.last_job
Host.last_job_host_summary
| d = super(HostSerializer, self).get_summary_fields(obj) | ||
| try: | ||
| d['last_job']['job_template_id'] = obj.last_job.job_template.id | ||
| d['last_job']['job_template_name'] = obj.last_job.job_template.name |
There was a problem hiding this comment.
Again, if the fields won't be used again, we want the diff to flow logically from that. That would entail removing:
'last_job': DEFAULT_SUMMARY_FIELDS + ('finished', 'status', 'failed', 'license_error', 'canceled_on'),
'last_job_host_summary': DEFAULT_SUMMARY_FIELDS + ('failed',),Then, it's more like, oh, that was removed, meaning that the base serializer will no longer populate those. That means that HostSerializer now needs to take full custom control of those. Logical.
Even so, this should still attempt to share logic with the base serializer, which means using DEFAULT_SUMMARY_FIELDS in a loop (like the base serializier) plus the additions instead of enumerating all the fields.
|
Let me give a suggestion here. But first I want to acknowledge that there is a real non-obvious tradeoff between the models & views, because obtaining a good query execution hook is challenging. I've hit this difficulty many times, and have placed that custom-prefetch-related hook in the view for these reasons (more often, I've shoved a cache in the serializer context). However, overriding So my suggestion would be this. from django.db import models
from django.db.models import OuterRef, Subquery
from awx.main.models.jobs import JobHostSummary
class HostLatestSummaryQuerySet(models.QuerySet):
"""
Queryset that annotates and bulk-attaches the latest JobHostSummary
at queryset evaluation time.
Important:
- This is NOT streaming-safe.
- Once evaluated, it behaves similarly to prefetch_related(): the primary
queryset result cache must exist so related objects can be attached.
"""
_awx_latest_summary_attached = False
def _clone(self):
clone = super()._clone()
clone._awx_latest_summary_attached = self._awx_latest_summary_attached
return clone
def with_latest_summary_id(self):
latest_summary = JobHostSummary.objects.filter(host_id=OuterRef('pk')).order_by('-id')
return self.annotate(
_latest_summary_id=Subquery(latest_summary.values('id')[:1]),
)
def _fetch_all(self):
super()._fetch_all()
if self._awx_latest_summary_attached or self._result_cache is None:
return
latest_summary_ids = [
host._latest_summary_id
for host in self._result_cache
if getattr(host, '_latest_summary_id', None) is not None
]
summaries_by_id = JobHostSummary.objects.select_related('job').in_bulk(latest_summary_ids)
for host in self._result_cache:
latest_summary_id = getattr(host, '_latest_summary_id', None)
host._latest_summary_cache = summaries_by_id.get(latest_summary_id)
self._awx_latest_summary_attached = True
class HostLatestSummaryManager(models.Manager.from_queryset(HostLatestSummaryQuerySet)):
def get_queryset(self):
return super().get_queryset().with_latest_summary_id()
class Host(models.Model):
# fields...
# Keep your real default manager unchanged.
objects = models.Manager()
# Dedicated manager: querysets from here always have latest summary support.
with_latest_summary_objects = HostLatestSummaryManager()
@property
def latest_summary(self):
if hasattr(self, '_latest_summary_cache'):
return self._latest_summary_cache
summary = (
JobHostSummary.objects.filter(host_id=self.pk)
.order_by('-id')
.select_related('job')
.first()
)
self._latest_summary_cache = summary
return summary
@property
def latest_job(self):
summary = self.latest_summary
if summary is None:
return None
return summary.jobThis is on the model level, and the queryset would need to be replaced or modified on the view level with the custom manager here. In all cases working with a Host object, we would want to use the property |
…mmary Per review feedback, replace the view-level HostSummaryPrefetchMixin with a custom QuerySet that bulk-attaches summaries at evaluation time (like prefetch_related), and a Host.latest_summary property as the single access point. - HostLatestSummaryQuerySet: overrides _fetch_all() to bulk-fetch JobHostSummary objects with select_related after queryset evaluation - HostManager now inherits from the custom queryset via from_queryset() - Host.latest_summary property: uses cache if available, falls back to individual query - Remove _annotate_host_latest_summary, _prefetch_latest_summaries, HostSummaryPrefetchMixin from views — no more list() override needed - Remove last_job/last_job_host_summary from SUMMARIZABLE_FK_FIELDS - Serializer uses obj.latest_summary and DEFAULT_SUMMARY_FIELDS loop Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@AlanCoding Implemented your suggestion in e58710b: Custom QuerySet + Manager pattern:
Removed from views:
Removed from serializer:
The views are now clean — they don't need to know about the summary prefetching at all. The manager handles it transparently. |
|
@AlanCoding One concern with the current implementation: putting Two options to scope it: Option A: Separate manager (your original suggestion had class Host(models.Model):
objects = HostManager() # no annotation
with_latest_summary_objects = HostLatestSummaryManager() # annotatedOption B: Keep one manager, apply in views # In each host-serving view:
def get_queryset(self):
return super().get_queryset().with_latest_summary_id()The Which approach do you prefer? |
- Remove with_latest_summary_id() from HostManager.get_queryset() to avoid applying the correlated subquery to every Host query globally (count, exists, internal relations) - Apply with_latest_summary_id() in get_queryset() of the 6 host-serving views only - Restore license_error and canceled_on to last_job summary fields to avoid breaking API change Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Without this guard, _fetch_all() would set _latest_summary_cache=None on every host in non-annotated querysets (e.g. Host.objects.filter()), masking the per-object fallback query in Host.latest_summary. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Added a guard in The guard checks |
|
Still catching up, but initially confirming this matches my prior intent and still current preference.
I did have: We would only use this in the view, but I recognize this might have the fundamental flaw the the prefetch executes when doing a |
|
This has apparently promising API results. I will put up a PR with fixes for the django-debug-toolbar in a bit. Here are results from the same hosts list, filled with 50+ hosts and 10+ jobs for each host. BeforeAfter |
awx/api/serializers.py
Outdated
| try: | ||
| last_job = last_summary.job | ||
| d['last_job'] = OrderedDict() | ||
| for field in DEFAULT_SUMMARY_FIELDS + ('finished', 'status', 'failed', 'license_error', 'canceled_on'): |
There was a problem hiding this comment.
canceled_on has been flagged as a field added to the API. Better to remove and keep current structure.
awx/api/serializers.py
Outdated
| if last_summary: | ||
| d['last_job_host_summary'] = OrderedDict() | ||
| d['last_job_host_summary']['id'] = last_summary.id | ||
| d['last_job_host_summary']['name'] = last_summary.host_name |
There was a problem hiding this comment.
Also, this "name" field is flagged as an API change. Better to remove for this change.
AlanCoding
left a comment
There was a problem hiding this comment.
I made 2 very minor requests for changing of the returned data structure.
Additionally:
- we should have some tests added
awx/main/tests/functional/models/**to test new methods available fromHost.objects, mainly what might be obvious from the net new code here + address the use of.count()and that it doesn't trigger the_fetch_alllogic - need to run a YOLO to check it, I can do this
Otherwise I think this looks in good shape, and the benefits are pretty solid.
|
Link related fix to Django debug toolbar #16352 |
…summary Per reviewer feedback: these fields were not in the original API contract via SUMMARIZABLE_FK_FIELDS and their addition would be an API change. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Addressed both inline requests in 58047e6:
Regarding removing Also added the Re: tests — happy to add functional tests for |
…mmary Tests cover: - with_latest_summary_id() annotation and most-recent selection - _fetch_all() bulk-attach behavior on annotated querysets - _fetch_all() skips non-annotated querysets (preserves fallback) - .count() and .exists() do NOT trigger _fetch_all - Host.latest_summary cache hits (zero queries) and fallback - Host.latest_job property - select_related on bulk-attached summaries (no N+1) - Chaining preserves annotation - Multiple jobs / partial host coverage Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Added functional tests in 2c33358 (
|
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ben Thomasson <bthomass@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ben Thomasson <bthomass@redhat.com>
|



Summary
playbook_on_statswrapup path bulk-updateslast_job_idandlast_job_host_summary_idon every host touched by a job. In the Q4CY25 scale lab this query had a median execution time of 75 seconds due to index churn onmain_host(90:1 dead-to-live tuple ratio).JobHostSummary.latest_for_host(host_id)classmethod that derives the same data on demand viaJobHostSummary.objects.filter(host_id=host_id).order_by('-id').first()_annotate_host_latest_summary(qs)helper that annotates Host querysets with the latestJobHostSummaryID (single correlated subquery)_prefetch_latest_summaries(hosts)that bulk-fetches fullJobHostSummaryobjects withselect_related('job', 'job__job_template')after pagination — 1 subquery + 1 IN query instead of N+1HostSummaryPrefetchMixinthat hooks intolist()to call the prefetch between pagination and serializationlast_job_idandlast_job_host_summary_idinplaybook_on_statsselect_relatedthrough the stale FKs (eliminates the 11-table JOIN)Query cost for host list views
select_relatedlast_job_id+last_job_host_summary_idon every job completion (75s median at scale)SELECT ... FROM main_jobhostsummary JOIN main_job JOIN main_jobtemplate WHERE id IN (...)per pageFiles changed
awx/main/models/jobs.pyJobHostSummary.latest_for_host()andlatest_job_for_host()classmethodsawx/api/serializers.pyHostSerializer._get_last_summary()with caching + prefetch support,get_summary_fields()override,to_representationoverwrite of stale FK valuesawx/api/views/__init__.py_annotate_host_latest_summary(qs),_prefetch_latest_summaries(hosts),HostSummaryPrefetchMixinapplied to all 6 host-serving views; Dashboard: subquery for failed host count, removed stalefailures_urlawx/main/models/inventory.pyupdate_computed_fields: subquery for failed host countawx/main/models/events.pylast_job_idandlast_job_host_summary_id(the 75s query)awx/main/signals.pysave_host_pks_before_job_deleteandupdate_host_last_job_after_job_deletedsignal handlersawx/main/access.pyselect_relatedthroughlast_jobandlast_job_host_summaryFKsawx/main/managers.py.defer()through the stale FK chainsawx/main/tests/functional/models/test_events.pyJobHostSummary.latest_for_host()Test plan
last_jobandlast_job_host_summaryin related/summary fieldshas_active_failuresis correct on host serializer🤖 Generated with Claude Code
Change Type