Commit bcd4e1d

Misc doc updates (Velir#101)
* fix outdated reference to intergration tests
* doc tweaks
* removing reference to google's obfuscated data set which no longer works
* fixing reference to integration_tests folder
1 parent 6607c04 commit bcd4e1d

2 files changed: +9 -21 lines changed


README.md

Lines changed: 8 additions & 20 deletions
@@ -6,11 +6,10 @@ Features include:
 - Flattened models to access common events and event parameters such as `page_view`, `session_start`, and `purchase`
 - Conversion of sharded event tables into a single partitioned table
 - Incremental loading of GA4 data into your staging tables
-- Session and user dimensional models with conversion counts
-- Easy access to query parameters such as GCLID and UTM params
-- Support for custom event parameters & custom user properties
+- Page, session and user dimensional models with conversion counts
+- Simple methods for accessing query parameters (like UTM params) or filtering query parameters (like click IDs)
+- Support for custom event parameters & user properties
 - Mapping from source/medium to default channel grouping
-- Ability to exclude query parameters (like `fbclid`) from page paths
 
 # Models
 
@@ -23,7 +22,7 @@ Features include:
 | stg_ga4__user_properties | Finds the most recent occurance of specified user_properties for each user |
 | stg_ga4__derived_user_properties | Finds the most recent occurance of specific event_params value and assigns them to a user_pseudo_id. Derived user properties are specified as variables (see documentation below) |
 | stg_ga4__derived_session_properties | Finds the most recent occurance of specific event_params or user_properties value and assigns them to a session's session_key. Derived session properties are specified as variables (see documentation below) |
-| stg_ga4__session_conversions_daily | Produces daily counts of conversions per session. The lsit of conversion events to include is configurable (see documentation below) |
+| stg_ga4__session_conversions_daily | Produces daily counts of conversions per session. The list of conversion events to include is configurable (see documentation below) |
 | stg_ga4__sessions_traffic_sources | Finds the first source, medium, campaign, content, paid search term (from UTM tracking), and default channel grouping for each session |
 | dim_ga4__user_pseudo_ids | Dimension table for user devices as indicated by user_pseudo_ids. Contains attributes such as first and last page viewed.|
 | dim_ga4__sessions | Dimension table for sessions which contains useful attributes such as geography, device information, and campaign data |
@@ -70,28 +69,17 @@ packages:
 ```
 ## Required Variables
 
-This package assumes that you have an existing DBT project with a BigQuery profile and a BigQuery GCP instance available with GA4 event data loaded. Source data is located using the following variables which must be set in your `dbt_project.yml` file.
+This package assumes that you have an existing DBT project with a BigQuery profile and a BigQuery GCP instance available with GA4 event data loaded. Source data is defined using the following variables which must be set in `dbt_project.yml`.
 
 ```
 vars:
   ga4:
     project: "your_gcp_project"
     dataset: "your_ga4_dataset"
     start_date: "YYYYMMDD" # Earliest date to load
-    frequency: "daily" # daily|streaming|daily+streaming Match to the type of export configured in GA4; daily+streaming appends today's intraday data to daily data
+    frequency: "daily" # daily|streaming|daily+streaming. See 'Export Frequency' below.
 ```
 
-If you don't have any GA4 data of your own, you can connect to Google's public data set with the following settings:
-
-```
-vars:
-  project: "bigquery-public-data"
-  dataset: "ga4_obfuscated_sample_ecommerce"
-  start_date: "20210120"
-```
-
-More info about the GA4 obfuscated dataset here: https://support.google.com/analytics/answer/10937659?hl=en#zippy=%2Cin-this-article
-
 ## Optional Variables
 
 ### Query Parameter Exclusions
@@ -275,6 +263,6 @@ The easiest option is using OAuth with your Google Account. Summarized instructi
 ```
 gcloud auth application-default login --scopes=https://www.googleapis.com/auth/bigquery,https://www.googleapis.com/auth/iam.test
 ```
-# Integration Testing
+# Unit Testing
 
-This package uses `pytest` as a method of unit testing individual models. More details can be found in the [integration_tests/README.md](integration_tests) folder.
+This package uses `pytest` as a method of unit testing individual models. More details can be found in the [unit_tests/README.md](unit_tests) folder.
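
Note on the Required Variables hunk: as a minimal sketch of the settings it documents, the Python helper below checks a `ga4` vars dictionary before the values are copied into `dbt_project.yml`. The helper name `validate_ga4_vars`, the error messages, and the choice to treat all four keys as mandatory are assumptions made for illustration; the package does not ship this function. Only the key names and the three `frequency` values come from the README text.

```python
from datetime import datetime

# Key names and frequency values are taken from the README's Required Variables block.
REQUIRED_KEYS = ("project", "dataset", "start_date", "frequency")
ALLOWED_FREQUENCIES = {"daily", "streaming", "daily+streaming"}


def _is_yyyymmdd(value):
    """Return True only for an 8-digit string that is a real calendar date."""
    if len(value) != 8 or not value.isdigit():
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True


def validate_ga4_vars(ga4_vars):
    """Hypothetical helper: return a list of problems (empty when none are found)."""
    problems = []

    # Assumption: every key shown in the README block must be present and non-empty.
    for key in REQUIRED_KEYS:
        if not ga4_vars.get(key):
            problems.append(f"missing required variable: {key}")

    start_date = ga4_vars.get("start_date")
    if start_date and not _is_yyyymmdd(str(start_date)):
        problems.append(f"start_date must be a valid YYYYMMDD date: {start_date!r}")

    frequency = ga4_vars.get("frequency")
    if frequency and frequency not in ALLOWED_FREQUENCIES:
        problems.append(f"frequency must be one of {sorted(ALLOWED_FREQUENCIES)}: {frequency!r}")

    return problems


if __name__ == "__main__":
    example = {
        "project": "your_gcp_project",
        "dataset": "your_ga4_dataset",
        "start_date": "20210120",
        "frequency": "daily+streaming",
    }
    print(validate_ga4_vars(example))
```

The function returns messages instead of raising so that every problem in a settings block can be reported in one pass.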

unit_tests/README.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ The dbt-ga4 package treats each model and macro as a 'unit' of code. If we fix t
 
 You'll need to install pytest, pytest-dotenv and create a `.env` file with a `BIGQUERY_PROJECT` key containing the name of your BigQuery project. An 'oauth' connection method is assumed for local development.
 
-Installing pytest & pytest-dotenv can be done using the requirements.txt file. Navigate to the `integration_tests` folder and run
+Installing pytest & pytest-dotenv can be done using the requirements.txt file. Navigate to the `unit_tests` folder and run
 
 ```
 pip install -r requirements.txt
