Overview ticket: Review Output tables (status 22nd of December) #113

AnneSchoenauer · 2024-01-02T11:40:23Z

This ticket is an overview ticket for the review that Tilman and I are doing for the output tables that @SKruthoff and @ysherstyuk Yana created on the 22nd of December. The output tables can be found here: https://drive.google.com/drive/u/0/folders/1uaYeEIiwAcJkNvG5oFMIVvlqrG5PrR_9.

@Tilmon please add your review here.
@ysherstyuk and @SKruthoff this is only FYI.
@maurolepore and @kalashsinghal there might be tasks coming out of this overview ticket which will then be assigned to you.

I created this ticket in the tiltIndicatorAfter package because I assume that the produced results are correct, but there might be small things that we want to change in the output tables (so more a "view" problem than an actual "code" problem). If we find mistakes that are code related, we will create tickets in the corresponding packages, but I think and hope that this is not the case anyway ;).

AnneSchoenauer · 2024-01-02T11:59:49Z

Anne's review (preliminary):

0. Overall

0.1 Each of the output files has a row_id. What is this row_id about?
0.2 I think we should think about simplifying the output. There is a lot of duplicated information, and it would maybe be great to have only one file similar to this ING one here: https://docs.google.com/spreadsheets/d/1rRUBpeXfj1w-paAX0Cono7RslZX_kzoujYE1E7XdnaM/edit#gid=1842238901 @Tilmon what do you think?
0.3 For the company view: I noticed that for econometric analysis it is super important to have a wide view, not a long one. Could we create both? @Tilmon what do you think? I also think that some plots need the wide view, so maybe it is good to have the view that is most compatible with further analysis? @Tilmon what do you think?
0.4 Do we want to have the Transition Risk Score some where in the data? As a combination from the emission profile and sector profile?

1. Emission profile product level:

1.1 Adding Co2e_lower and Co2e_upper in the output table. This refers to this ticket here. So I think the variables created with the jitter function are already in there but not in the output file. @Tilmon are the names okay for this variable?
1.2 Renaming PCTR_risk_category to emission_profile. @Tilmon is the name okay?

2. Emission profile company level:

2.1 @Tilmon We would need to decide if we do want to have a main_tilt_sector and main_tilt_subsector
2.2 Renaming PCTR_risk_category to emission_profile. @Tilmon is the name okay?
2.3 Renaming PCTR_share to emission_profile_share. @Tilmon is the name okay?
2.4 I know we decided that if a company produces a product that cannot be matched, we exclude it from the analysis. However, if you now look at the emission_profile results, they only hold for the products that we were able to match with ecoinvent. I think that, for transparency, we also need an NA section here. Especially because of this ticket here, we will have an NA in the emission_profile anyway. @Tilmon please let's discuss and then write a ticket here. The same also holds for the sector profiles.
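To make the transparency point in 2.4 concrete, here is a minimal sketch of how the shares change once unmatched products get their own NA category. It is written in plain Python for illustration only and is not the tilt implementation; the function name `profile_shares` and the example categories are assumptions.

```python
from collections import Counter
from fractions import Fraction


def profile_shares(categories):
    """Share of each emission profile category; None (unmatched) is shown as 'NA'."""
    labels = ["NA" if category is None else category for category in categories]
    counts = Counter(labels)
    return {label: Fraction(n, len(labels)) for label, n in counts.items()}


# Hypothetical company with four products, one of which has no ecoinvent match.
products = ["high", "high", "low", None]

# Shares over all products, with the unmatched product reported as NA.
shares_with_na = profile_shares(products)

# Shares over matched products only, as in the current output.
shares_matched_only = profile_shares([c for c in products if c is not None])
```

With the NA category the "high" share is 1/2 instead of 2/3, so a reader can see how much of the company the result actually covers.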

3. Emission profile upstream product level:

3.1 Empty file
3.2 But for sure: Rename 'ISTR_risk_category' into 'emission_upstream_profile' @Tilmon this is a very long name. Let's discuss the naming please!
3.3 Adding GEO column.

4. Emission profile upstream company level:

4.1 Rename 'ISTR_share' into 'emission_upstream_profile_share' @Tilmon this is a very long name. Let's discuss the naming please!
4.2 Rename 'ISTR_risk_category' into 'emission_upstream_profile' @Tilmon this is a very long name. Let's discuss the naming please!

Sector profile product level

Rename 'PSTR_risk_category' into 'sector_profile' @Tilmon Let's discuss the naming please!
Rename 'sector' and 'subsector' to 'sector_scenario' and 'subsector_scenario' @Tilmon this is a very long name. Let's discuss the naming please!
Rename 'profile_ranking' into 'SERT' @Tilmon Let's discuss the naming please!

Sector profile company level

Rename 'PSTR_risk_category' into 'sector_profile' @Tilmon Let's discuss the naming please!
Rename 'PSTR_share' into 'sector_profile_share' @Tilmon Let's discuss the naming please!

Sector profile upstream product level:

Adding GEO column.
I noticed that some of the inputs are actually not inputs but outputs, for example biowaste. Shall we exclude them @Tilmon
I noticed that some ep_products were matched to the same matched_activity_name. The results are therefore counted twice. For example, see company_id 'wamic-gravur-lasertechnik-eu_00000005202642-001'. Is this okay? What do you think @Tilmon

Sector profile upstream company level
[ ] Rename 'ISTR_risk_category' into 'sector_profile_upstream' @Tilmon Let's discuss the naming please!
[ ] Rename 'ISTR_share' into 'sector_profile_upstream_share' @Tilmon Let's discuss the naming please!

Tilmon · 2024-01-03T13:41:26Z

Overall

1. Duplicates: running dplyr::distinct() on the datasets emission_profile_company.csv, emission_profile_product.csv, and emission_profile_upstream_at_company_level.csv shows that all 3 datasets contain duplicate rows. I only tested these 3; all datasets should be tested for duplicates, and duplicates should be avoided. E.g. the companies_id "adolf-wurth-gmbh-co-kg_00000004971238-001" has every row twice in emission_profile_product.csv.
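The duplicate check described above can be sketched without any dependencies. This is a plain-Python illustration under assumptions, not the check that was actually run (that was `dplyr::distinct()`); the helper names and the example rows are hypothetical.

```python
from collections import Counter


def find_duplicate_rows(rows):
    """Return every row that occurs more than once, mapped to its count."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def drop_duplicate_rows(rows):
    """Keep the first occurrence of each row and preserve the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical rows standing in for emission_profile_product.csv.
rows = [
    {"companies_id": "a-001", "ep_product": "screws", "emission_profile": "high"},
    {"companies_id": "a-001", "ep_product": "screws", "emission_profile": "high"},
    {"companies_id": "b-001", "ep_product": "tents", "emission_profile": "low"},
]

duplicates = find_duplicate_rows(rows)
deduplicated = drop_duplicate_rows(rows)
```

For a real file, the rows could be read with `csv.DictReader` and the same two functions applied to each output table in turn.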

2. Re simplification & wide vs long format

I think we should think about simplifying the output. There is a lot of duplicated information, and it would maybe be great to have only one file similar to this ING one here: https://docs.google.com/spreadsheets/d/1rRUBpeXfj1w-paAX0Cono7RslZX_kzoujYE1E7XdnaM/edit#gid=1842238901 @Tilmon what do you think?

@AnneSchoenauer I think the downside of the link you shared is that it would require making separate columns for each benchmark (i.e. scenarios for sector_profile and the other benchmarks for emission_profile). Right now, we have it all in long format instead of wide format. This already relates to your comment here

For the company view: I noticed that for econometric analysis it is super important to have a wide view, not a long one. Could we create both? @Tilmon what do you think? I also think that some plots need the wide view, so maybe it is good to have the view that is most compatible with further analysis? @Tilmon what do you think?

@AnneSchoenauer Providing both to banks might be a bit of an overkill BUT we could provide code to modify datasets to wide format AND join all company-level results together and all product-level results? I assume that's possible and would also solve the "simplifying" question you raised above.
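As a rough idea of what such helper code could look like, here is a plain-Python sketch that pivots long rows to wide and then joins two company-level results on the company id. Everything here is an assumption for illustration: the function names, the column names, and the benchmark and scenario values ("all", "ipr") are hypothetical, and the real helper would more likely be written in R.

```python
def long_to_wide(rows, id_col, key_cols, value_col):
    """Pivot long rows into one row per id, with one value column per key combination."""
    wide = {}
    for row in rows:
        record = wide.setdefault(row[id_col], {id_col: row[id_col]})
        suffix = "_".join(str(row[key]) for key in key_cols)
        record[f"{value_col}_{suffix}"] = row[value_col]
    return list(wide.values())


def left_join(left, right, id_col):
    """Add the columns of `right` to the rows of `left` that share the same id."""
    right_by_id = {row[id_col]: row for row in right}
    return [{**row, **right_by_id.get(row[id_col], {})} for row in left]


# Hypothetical long-format company-level rows.
emission_long = [
    {"companies_id": "a-001", "benchmark": "all", "emission_profile": "high", "emission_profile_share": 0.75},
    {"companies_id": "a-001", "benchmark": "all", "emission_profile": "low", "emission_profile_share": 0.25},
    {"companies_id": "b-001", "benchmark": "all", "emission_profile": "high", "emission_profile_share": 1.0},
]
sector_long = [
    {"companies_id": "a-001", "scenario": "ipr", "sector_profile": "high", "sector_profile_share": 1.0},
]

emission_wide = long_to_wide(
    emission_long, "companies_id", ["benchmark", "emission_profile"], "emission_profile_share"
)
sector_wide = long_to_wide(
    sector_long, "companies_id", ["scenario", "sector_profile"], "sector_profile_share"
)
company_wide = left_join(emission_wide, sector_wide, "companies_id")
```

The wide table gets one column per benchmark and category combination, which is exactly the downside mentioned above: the number of columns grows with every benchmark or scenario that is added.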

3. Transition Risk Score

Do we want to have the Transition Risk Score some where in the data? As a combination from the emission profile and sector profile?

YES

Emission Profile
@AnneSchoenauer both your suggestions are fine for me.

Emission profile company level:
ALL OK FOR ME. Regarding main_sectors: I think our current data don't allow for that, no? It may be good to discuss, but if we aim to use data sources other than Europages in the long term, we probably do not need to invest time in this right now, because another data source will eventually solve the problem?

Emission profile upstream product level:
OK

Renaming in

1. Emission profile upstream company level: OK
Sector profile product level: OK
Sector profile company level: OK
Sector profile upstream company level: OK

Sector profile upstream product level:

Adding GEO column.

OK

I noticed that some of the inputs are actually not inputs but outputs, for example biowaste. Shall we exclude them @Tilmon

@AnneSchoenauer can you share the specific example? Not entirely clear to me from your description :)

I noticed that some ep_products were matched to the same matched_activity_name. The results are therefore counted twice. For example, see company_id 'wamic-gravur-lasertechnik-eu_00000005202642-001'. Is this okay? What do you think @Tilmon

I would say yes, that's OK. The matching is never perfect, always only a proxy. I think it's consistent if we stick to the number of ep_products, even if they are matched to the same ecoinvent product.
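The difference between the two counting choices can be shown with a small plain-Python sketch. It is a simplified illustration under assumptions, not the tilt calculation: the function name `shares`, the column name `risk_category`, and the example products are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def shares(rows, unit_col, category_col):
    """Share of each category, counting every distinct value of `unit_col` once.

    Assumes a unit always maps to the same category; if not, the last row wins.
    """
    category_by_unit = {row[unit_col]: row[category_col] for row in rows}
    counts = Counter(category_by_unit.values())
    total = sum(counts.values())
    return {category: Fraction(n, total) for category, n in counts.items()}


# Hypothetical company: two ep_products matched to the same ecoinvent activity.
rows = [
    {"ep_product": "laser engraving", "matched_activity_name": "engraving service", "risk_category": "high"},
    {"ep_product": "laser marking", "matched_activity_name": "engraving service", "risk_category": "high"},
    {"ep_product": "signs", "matched_activity_name": "sign production", "risk_category": "low"},
]

# Counting per ep_product, as proposed above: the shared activity counts twice.
by_product = shares(rows, "ep_product", "risk_category")

# Counting per distinct matched activity: the shared activity counts once.
by_activity = shares(rows, "matched_activity_name", "risk_category")
```

Counting per ep_product gives "high" a share of 2/3, while counting per distinct activity gives 1/2, so the choice does change the company-level result.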

AnneSchoenauer · 2024-01-04T08:17:12Z

With regard to an example of biowaste please see here. In general the problem exists as we have a Life CYCLE assessment, i.e. also downstream and not only upstream information. What do you think?

One example is the company ihab-serour_00000005050260-001 (you can see it in sector_profile_upstream_at_product_level), which produces "coffee bean, green". One "input" product, as we call it, is biowaste. But biowaste is not an input product but rather an output (downstream).

AnneSchoenauer · 2024-01-04T08:38:39Z

@Tilmon I created a separate ticket for this to discuss it elsewhere and to be able to close this ticket here.

AnneSchoenauer · 2024-01-04T09:37:47Z

All tickets have been created, so I am closing this issue.

AnneSchoenauer added this to the Enhancement of tilt methodology milestone Jan 2, 2024

AnneSchoenauer assigned AnneSchoenauer and Tilmon Jan 2, 2024

AnneSchoenauer mentioned this issue Jan 3, 2024

Revise Output File Views #116

Closed

AnneSchoenauer added the documentation Improvements or additions to documentation label Jan 4, 2024

AnneSchoenauer closed this as completed Jan 4, 2024
