Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Average Scroll Depth Metric: imported data #4915

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

RobertJoonas
Copy link
Contributor

@RobertJoonas RobertJoonas commented Dec 17, 2024

Changes

This PR starts making use of two new columns in the imported_pages table. The relevant migrations are already merged and executed:

  1. Redo migration: Add scroll_depth to imported_pages #4926
  2. Average Scroll Depth Metric: another imported schema migration #4932 (pageleave_visitors field)

The objectives of this PR are to:

  • Include the scroll_depth metric in the full CSV export/import feature
  • Include scroll_depth from imported data (if it exists) in stats queries

Returning warnings about imported data not being included in scroll_depth calculation will be implemented in a follow up PR.

Tests

  • Automated tests have been added

Changelog

  • This PR does not make a user-facing change

Documentation

  • This change does not need a documentation update

Dark mode

  • This PR does not change the UI

@RobertJoonas RobertJoonas changed the title Scroll depth full csv [WIP] Scroll depth full csv Dec 17, 2024
@RobertJoonas RobertJoonas marked this pull request as draft December 17, 2024 13:23
@RobertJoonas RobertJoonas changed the title [WIP] Scroll depth full csv [WIP] Average Scroll Depth Metric: imported data Dec 18, 2024
@RobertJoonas RobertJoonas marked this pull request as ready for review December 19, 2024 11:59
@RobertJoonas RobertJoonas changed the title [WIP] Average Scroll Depth Metric: imported data Average Scroll Depth Metric: imported data Dec 19, 2024
Copy link

Preview environment👷🏼‍♀️🏗️
PR-4915

lib/plausible/exports.ex Outdated Show resolved Hide resolved
lib/workers/export_analytics.ex Show resolved Hide resolved
test/plausible/imported/csv_importer_test.exs Outdated Show resolved Hide resolved
Copy link
Contributor

@ukutaht ukutaht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall. Some questions and comments below

)
|> group_by([e], [selected_as(:date), selected_as(:page)])
|> order_by([e], selected_as(:date))
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like both the if and else clauses have the same basic query from events_v2. Maybe there's an opportunity to extract it outside the conditional

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I've extracted a base_q in 7b5eeeb. The select clause however is a bit tricky to extract because we want to select a list. One possible solution is to select_merge which requires the select to be a map, but then we'd still need to convert it back to a list in the end. I figured it's not worth the hustle as the FF is not expected to be there forever.

@@ -345,6 +345,7 @@ defmodule Plausible.Stats.Imported do

defp can_order_by?(query) do
Enum.all?(query.order_by, fn
{:scroll_depth, _} -> false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is ordering by scroll depth prevented?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because since scroll_depth is a "special metric", it's not yet selected into the query yet at this point, and any attempt to order by it will result in an invalid query.

lib/plausible/stats/sql/special_metrics.ex Outdated Show resolved Hide resolved
fragment(
"""
case
when isNotNull(?) AND isNotNull(?) then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not instead return 0 values when scroll depth is missing from either side of the join? Other words - can we avoid this big conditional somehow? There's nothing wrong with the conditonal itself but I find managing raw fragments like this is a pain because of how the params must be passed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fragment is pretty hideous indeed, but I'm afraid there's no way around it if we want the query to explicitly return NULL when there's no data. The main problem here is that zero can be a valid scroll_depth value.

defp select_metric(:scroll_depth, "imported_pages") do
wrap_alias([i], %{
scroll_depth_sum: sum(i.scroll_depth),
total_visitors: sum(i.pageleave_visitors)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any need/benefit to renaming pageleave_visitors to total_visitors here? Could be a bit misleading

Copy link
Contributor Author

@RobertJoonas RobertJoonas Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 Agreed. It's misleading and could be confused with total_visitors used in the calculation of conversion rate. I've renamed it to pageleave_visitors all around - including the native query. 3203218

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants