Pydata (#3)
MaxHalford authored Sep 21, 2024
1 parent 26e5518 commit 8efc987
Showing 2 changed files with 108 additions and 1 deletion.
31 changes: 31 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,31 @@
name: docs

on:
  push:
    branches:
      - main

permissions:
  contents: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure Git Credentials
        run: |
          git config user.name github-actions[bot]
          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
      - uses: actions/setup-python@v5
        with:
          python-version: 3.x
      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
      - uses: actions/cache@v4
        with:
          key: mkdocs-material-${{ env.cache_id }}
          path: .cache
          restore-keys: |
            mkdocs-material-
      - run: pip install mkdocs-material
      - run: mkdocs gh-deploy --force
78 changes: 77 additions & 1 deletion README.md
@@ -77,6 +77,82 @@ Here's how to interpret this explanation:
- From 2019 to 2020, the revenue growth was entirely due to an increase in the revenue per booking. The number of bookings was exactly the same. Therefore, the $20,000 is entirely due to the inner effect (increase in revenue per booking).
- From 2020 to 2021, the revenue growth was entirely due to an increase in the number of bookings. The revenue per booking was exactly the same. Therefore, the $110,000 is entirely due to the mix effect (increase in bookings).
- From 2021 to 2022, there was a $52,500 revenue growth. The revenue per booking went up by $5 and the number of bookings rose by 200, so both effects contribute. The inner effect is $7,500 while the mix effect is $45,000.
=======

![explanation](https://github.com/user-attachments/assets/d93a8b33-929f-4895-87c8-1b60c8d3bb2f)
Let's say you're an analyst at an Airbnb-like company, and you're tasked with analyzing year-over-year revenue growth. You have obtained the following dataset:

```py
>>> import locale
>>> import pandas as pd
>>> _ = locale.setlocale(locale.LC_ALL, 'en_US')
>>> fmt_currency = lambda x: '' if pd.isna(x) else locale.currency(x, grouping=True)[:-3]

>>> revenue = pd.DataFrame.from_dict([
...     {'year': 2019, 'bookings': 1_000, 'revenue_per_booking': 200},
...     {'year': 2020, 'bookings': 1_000, 'revenue_per_booking': 220},
...     {'year': 2021, 'bookings': 1_500, 'revenue_per_booking': 220},
...     {'year': 2022, 'bookings': 1_700, 'revenue_per_booking': 225},
... ])
>>> (
...     revenue
...     .assign(bookings=revenue.bookings.apply('{:,d}'.format))
...     .assign(revenue_per_booking=revenue.revenue_per_booking.apply(fmt_currency))
...     .set_index('year')
... )
      bookings  revenue_per_booking
year
2019     1,000                 $200
2020     1,000                 $220
2021     1,500                 $220
2022     1,700                 $225

```

It's quite straightforward to calculate the revenue for each year, and then to measure the year-over-year growth:

```py
>>> (
...     revenue
...     .assign(revenue=revenue.eval('bookings * revenue_per_booking'))
...     .assign(growth=lambda x: x.revenue.diff())
...     .assign(bookings=revenue.bookings.apply('{:,d}'.format))
...     .assign(revenue_per_booking=revenue.revenue_per_booking.apply(fmt_currency))
...     .assign(revenue=lambda x: x.revenue.apply(fmt_currency))
...     .assign(growth=lambda x: x.growth.apply(fmt_currency))
...     .set_index('year')
... )
      bookings  revenue_per_booking   revenue    growth
year
2019     1,000                 $200  $200,000
2020     1,000                 $220  $220,000   $20,000
2021     1,500                 $220  $330,000  $110,000
2022     1,700                 $225  $382,500   $52,500

```

Growth can be due to two factors: an increase in the number of bookings, or an increase in the revenue per booking. The icanexplain library decomposes the growth into these two factors:

```py
>>> import icanexplain as ice
>>> explainer = ice.SumExplainer(
...     fact='revenue_per_booking',
...     period='year',
...     count='bookings'
... )
>>> explanation = explainer(revenue)
>>> explanation.map(fmt_currency)
        inner       mix
year
2020  $20,000        $0
2021       $0  $110,000
2022   $7,500   $45,000

```

Here's how to interpret this explanation:

- From 2019 to 2020, the revenue growth was entirely due to an increase in the revenue per booking. The number of bookings was exactly the same. Therefore, the $20,000 is entirely due to the inner effect (increase in revenue per booking).
- From 2020 to 2021, the revenue growth was entirely due to an increase in the number of bookings. The revenue per booking was exactly the same. Therefore, the $110,000 is entirely due to the mix effect (increase in bookings).
- From 2021 to 2022, there was a $52,500 revenue growth. The revenue per booking went up by $5 and the number of bookings rose by 200, so both effects contribute. The inner effect is $7,500 while the mix effect is $45,000.
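
The figures in the explainer output can be checked by hand. The sketch below is an assumption, not icanexplain's implementation: it takes the inner effect to be the change in revenue per booking multiplied by the previous year's bookings, and the mix effect to be the change in bookings multiplied by the current year's revenue per booking. These formulas were chosen only because they reproduce the explainer output shown above; the helper name `decompose` is hypothetical.

```python
# Yearly figures copied from the example dataset above.
rows = [
    {"year": 2019, "bookings": 1_000, "revenue_per_booking": 200},
    {"year": 2020, "bookings": 1_000, "revenue_per_booking": 220},
    {"year": 2021, "bookings": 1_500, "revenue_per_booking": 220},
    {"year": 2022, "bookings": 1_700, "revenue_per_booking": 225},
]


def decompose(prev, curr):
    """Split the revenue growth between two years into (inner, mix).

    Assumed formulas, picked to match the output above rather than
    taken from icanexplain's source code.
    """
    # Inner effect: price change applied to last year's booking volume.
    inner = (curr["revenue_per_booking"] - prev["revenue_per_booking"]) * prev["bookings"]
    # Mix effect: volume change valued at this year's price.
    mix = (curr["bookings"] - prev["bookings"]) * curr["revenue_per_booking"]
    return inner, mix


effects = {}
for prev, curr in zip(rows, rows[1:]):
    inner, mix = decompose(prev, curr)
    growth = (
        curr["bookings"] * curr["revenue_per_booking"]
        - prev["bookings"] * prev["revenue_per_booking"]
    )
    # The two effects always add up to the total year-over-year growth.
    assert inner + mix == growth
    effects[curr["year"]] = (inner, mix)
    print(f"{curr['year']}: inner=${inner:,} mix=${mix:,} growth=${growth:,}")
```

Under these assumptions the two effects sum exactly to the growth column for every year, which is what makes the decomposition useful as an explanation of that column.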

![explanation](https://github.com/user-attachments/assets/d93a8b33-929f-4895-87c8-1b60c8d3bb2f)
