use ISO 20275 data from GLEIF #32

petri · 2017-02-07T11:07:24Z

See https://www.gleif.org/en. There's a lot of data that would help improve the legal affix database of cleanco.

psolin · 2019-01-26T02:00:55Z

The ELF Code List definitely has more abbreviations: https://www.gleif.org/en/about-lei/code-lists

I am just not sure what the US/UK equivalents are for the forms in some of the languages. There may also be more obvious ones that we have missed. I will keep a note of this.

petri · 2020-04-26T10:51:11Z

I suspected someone might have done this by now, and sure enough: https://pypi.org/project/iso-20275 .

Since 2017 there has been an ISO standard for this: ISO 20275, ‘Financial services – Entity legal forms (ELF)’.

psolin · 2020-04-26T12:51:07Z

Cleanco was still built to ID entity types in strings, so I think it’s fine to move towards incorporating this package. It was only a matter of time before the data was standardized and put into a Python package. Moving away from being solely US/UK based and towards an international standard is the best direction for this package.

If incorporated, it would fix most of our open issues as well. I’ll look into doing this.

petri · 2020-04-26T15:17:43Z

For getting the base name without legal term affixes, the unique terms list from the ISO standard should probably be patched in here:
https://github.com/psolin/cleanco/blob/master/cleanco/clean.py#L25-L29
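
A minimal sketch of what base name deduction with such a terms list could look like. This is not the code in clean.py; `TERMS` and `strip_legal_suffix` are hypothetical names, and the short term list is only a stand-in for the unique terms that would come from the ISO 20275 data.

```python
import re

# Hypothetical sample; a real list would be the unique terms from the ISO data.
TERMS = ["ltd", "inc", "gmbh", "oy", "b.v.", "s.a."]


def strip_legal_suffix(name, terms=TERMS):
    """Remove one trailing legal-form term from a business name, if present."""
    # Try longer terms first so a longer term is not shadowed by a shorter one.
    for term in sorted(terms, key=len, reverse=True):
        # The term must follow whitespace or a comma, so "Zinc" keeps its "inc".
        pattern = r"[\s,]+" + re.escape(term) + r"\.?\s*$"
        stripped, count = re.subn(pattern, "", name, flags=re.IGNORECASE)
        if count:
            return stripped
    return name


print(strip_legal_suffix("Acme GmbH"))           # Acme
print(strip_legal_suffix("Foo Holdings, Inc."))  # Foo Holdings
```

The sketch only handles terms at the end of the name; prefixes and terms in the middle of a name would need separate handling.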

petri · 2020-04-26T15:21:59Z

This could be broken into two or three different tickets:

one for base name deduction,
one for country deduction, and
one for legal entity detection.

psolin · 2020-04-26T15:22:58Z

Just to give you an idea of where this is going: I am counting 1,180 unique business entity affixes in this package, compared to our 202. These are the classifiers (properties) that they use as well:

['alpha2', 'alpha2_2', 'country', 'creation_date', 'elf', 'jurisdiction', 'local_abbreviations', 'local_name', 'modification', 'modification_date', 'reason', 'status', 'transliterated_abbreviations', 'transliterated_name']
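
For illustration, a count like that could be produced roughly as follows. This is a sketch, not the package's own API: it assumes each entry is available as a plain dict keyed by the properties listed above, and that multiple abbreviations in one field are separated by semicolons. The sample records are made up for the example and are not actual rows from the code list.

```python
# Made-up records using two of the properties listed above.
records = [
    {"country": "Netherlands",
     "local_abbreviations": "B.V.;BV",
     "transliterated_abbreviations": "B.V.;BV"},
    {"country": "Finland",
     "local_abbreviations": "Oy",
     "transliterated_abbreviations": ""},
]


def unique_affixes(records, sep=";"):
    """Collect the distinct, lowercased abbreviations across all records."""
    affixes = set()
    for record in records:
        for field in ("local_abbreviations", "transliterated_abbreviations"):
            # Assumption: several abbreviations share one field, split by `sep`.
            for part in (record.get(field) or "").split(sep):
                part = part.strip().lower()
                if part:
                    affixes.add(part)
    return affixes


print(len(unique_affixes(records)))  # 3
```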

petri · 2020-05-05T04:56:45Z

Given that we now better understand the differences between iso20275 data and cleanco termdata, it seems to me we need a decision on data strategy. The current PR gets rid of cleanco termdata in favour of iso20275. But in hindsight it seems to me that iso20275 should instead be used as the primary, but not the exclusive, source.

On the other hand, both iso20275 and cleanco also need a mechanism by which users can supply their own legal form data if needed. It would make sense for both packages to use the same mechanisms and formats.
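
One possible shape for such a mechanism, sketched under the assumption that every source (ISO data, cleanco termdata, user data) can be expressed as a mapping from a key such as a country code to a list of terms. `merge_terms` and the sample mappings are hypothetical and do not reflect the actual format of either package.

```python
def merge_terms(*sources):
    """Union the terms from several sources, key by key.

    Sources are merged in the order given; no source removes terms
    contributed by another, so the primary source is never exclusive.
    """
    merged = {}
    for source in sources:
        for key, terms in source.items():
            merged.setdefault(key, set()).update(t.lower() for t in terms)
    return merged


# Illustrative data only.
iso_terms = {"NL": ["stichting"]}
cleanco_terms = {"NL": ["Stichting", "B.V."], "JP": ["Y.K."]}
user_terms = {"NL": ["st."]}

merged = merge_terms(iso_terms, cleanco_terms, user_terms)
print(sorted(merged["NL"]))  # ['b.v.', 'st.', 'stichting']
```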

Thoughts?

FBnil · 2022-08-16T20:43:27Z

Replying to your "Thoughts?":
At first I was happy: for example, the Netherlands has all the forms that are included in cleanco.
But the Japanese data does not have the romaji versions (Y.K., which termdata will have if a pull request is accepted), only the kanji versions (有, which is only the first character of 有限会社; I don't know whether it is normally written out in full, but in the Chinese data it is written out).

https://en.wikipedia.org/wiki/Y%C5%ABgen_gaisha

And even Dutch is incomplete; for example, "Foundation":
"V44D","Netherlands","NL","","","stichting","Dutch","nl","stichting","","","2017-11-30","ACTV","","",""
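
To make the gap visible, the row above can be parsed with the standard csv module. The short field names below are assumptions inferred from the positions of the values, not the official column headers.

```python
import csv
import io

ROW = ('"V44D","Netherlands","NL","","","stichting","Dutch","nl",'
       '"stichting","","","2017-11-30","ACTV","","",""')

# Assumed, shortened column names; the official CSV header may differ.
FIELDS = [
    "elf", "country", "country_code", "jurisdiction", "subdivision_code",
    "local_name", "language", "language_code", "transliterated_name",
    "local_abbreviations", "transliterated_abbreviations",
    "creation_date", "status", "modification", "modification_date", "reason",
]

values = next(csv.reader(io.StringIO(ROW)))
record = dict(zip(FIELDS, values))

print(record["local_name"])                  # stichting
print(repr(record["local_abbreviations"]))   # ''
```

Under that reading of the columns, both abbreviation fields are empty for this entry, so only the full word "stichting" is available.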

Looking it up, it seems that "st." is the official abbreviation, along with "fdn" (and, less commonly, "fndn." or "fou."). In practice, though, the word is written out in full, because hey, you want to state clearly that you are a foundation.

Thus, my conclusion is that there is still no good list, and I agree with @petri that maybe both lists need to be usable. Or at least we could merge the differences into a new version of iso20275 that includes the many missing entries termdata does have, and then use that as a master list.

In practice, that means we need to fix the bug that makes custom_basename() unusable in its current state, and let users add their own settings easily, without jumping through hoops.

petri added the question label Feb 8, 2017

petri added the enhancement label Apr 26, 2020

petri added this to the ISO 20275 milestone Apr 26, 2020

petri changed the title ~~idea: parse data from GLEIF~~ parse ISO 20275 data from GLEIF Apr 26, 2020

petri removed the question label Apr 26, 2020

petri changed the title ~~parse ISO 20275 data from GLEIF~~ use ISO 20275 data from GLEIF Apr 26, 2020

petri added ISO 20275 Re-evaluate when ISO std. support lands and removed enhancement labels Jan 30, 2021
