use ISO 20275 data from GLEIF #32
Comments
The ELF Code List definitely has more abbreviations: https://www.gleif.org/en/about-lei/code-lists. I am just not sure what the US/UK equivalents are for some of the languages. However, there may be more obvious ones that have been missed. I will keep a note of this.
I suspected someone might have done this by now, and sure enough: https://pypi.org/project/iso-20275. Since 2017 there has been an ISO standard, 20275, ‘Financial Services – Entity Legal Forms (ELF)’.
Cleanco was still built to identify entity types in strings, so I think it’s fine to move towards incorporating this package. It was only a matter of time before the data was standardized and put into a Python package. Moving away from being solely US/UK based and towards an international standard is the best direction for this package. If incorporated, it would also fix most of our open issues. I’ll look into doing this.
For getting the base name without legal term affixes, the unique terms list from the ISO standard should probably be patched in here: |
This could be broken into two or three different tickets;
Just to give you an idea of where this is going: I am counting 1,180 unique business entity affixes in this package, compared with our 202. These are the classifiers (properties) that they use as well: ['alpha2', 'alpha2_2', 'country', 'creation_date', 'elf', 'jurisdiction', 'local_abbreviations', 'local_name', 'modification', 'modification_date', 'reason', 'status', 'transliterated_abbreviations', 'transliterated_name']
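For anyone who wants to reproduce a count like this, here is a minimal sketch of how unique affixes could be tallied. It does not call the iso-20275 package; the rows, the ELF codes and the `;` separator inside `local_abbreviations` are assumptions made purely for illustration, and `unique_affixes` is a hypothetical helper.

```python
# Hypothetical rows shaped like the properties listed above.
# The ELF codes and abbreviations are placeholders, not real ELF data.
records = [
    {"elf": "AAA1", "local_abbreviations": "B.V.;BV"},
    {"elf": "AAA2", "local_abbreviations": "GmbH"},
    {"elf": "AAA3", "local_abbreviations": "bv; Ltd"},
    {"elf": "AAA4", "local_abbreviations": ""},
]


def unique_affixes(rows, field="local_abbreviations", sep=";"):
    """Collect the distinct, case-insensitive affixes found in `field`.

    Assumes several abbreviations may share one field, separated by `sep`.
    """
    seen = set()
    for row in rows:
        for raw in (row.get(field) or "").split(sep):
            term = raw.strip()
            if term:  # skip empty fields and stray separators
                seen.add(term.casefold())
    return seen


affixes = unique_affixes(records)
print(len(affixes), sorted(affixes))
```

Whether "B.V." and "BV" should count as one affix or two is a normalization decision this sketch deliberately leaves open; it only folds case.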
Given that we now better understand the differences between iso20275 data and cleanco termdata, it seems to me we need a decision on data strategy. The current PR gets rid of cleanco termdata in favour of iso20275. But in hindsight it seems to me that iso20275 should instead be used as a primary, but not exclusive, source. On the other hand, both iso20275 and cleanco also need a mechanism by which users can supply their own legal form data if needed. It would make sense if both packages used the same mechanisms and formats. Thoughts?
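To make the "primary but not exclusive" idea concrete, here is a rough sketch of layering three sources. The function name `build_termdata`, the `{country: set_of_terms}` format and the sample terms are all hypothetical; neither package is known to expose anything like this.

```python
def build_termdata(iso_terms, legacy_terms, user_terms=None):
    """Union per-country term sets from several sources.

    Each argument is assumed to map a country key to a set of legal-form
    terms. Sources are applied in order, so ISO data forms the base and
    legacy and user data only ever add to it.
    """
    merged = {}
    for source in (iso_terms, legacy_terms, user_terms or {}):
        for country, terms in source.items():
            merged.setdefault(country, set()).update(terms)
    return merged


# Illustrative data only; not taken from either package.
iso = {"nl": {"b.v.", "n.v."}}
legacy = {"nl": {"stichting"}, "jp": {"y.k."}}
user = {"nl": {"st."}}

termdata = build_termdata(iso, legacy, user)
print(sorted(termdata["nl"]))
```

A plain union cannot express "the user wants to remove a term", so a real mechanism would probably also need some way to exclude entries.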
Replying to your "Thoughts?": https://en.wikipedia.org/wiki/Y%C5%ABgen_gaisha. And even Dutch is incomplete; for example, "Foundation": looking it up, it seems that "st." is the official abbreviation, along with "fdn" (and, less commonly, "fndn." or "fou."), although in practice the word is written out in full, because you want to state clearly that you are a foundation. My conclusion is that there is still no good list, and I agree with @petri that maybe both lists need to be eligible. Or at least we could merge the differences into a new version of iso20275, including the many missing entries that termdata does have, and then use that as a master list. In practice it means we need to fix the bug where custom_basename() is unusable in its current state and let users add their own settings in an easy way, without jumping through hoops.
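As a sketch of what "easy" could look like from the user's side: a function that takes the name plus whatever terms the user supplies and strips them from the end. `strip_legal_suffixes` is a hypothetical helper written for this comment, not cleanco's custom_basename(), and it only handles trailing terms.

```python
import re


def strip_legal_suffixes(name, terms):
    """Remove trailing legal-form terms (case-insensitive) from `name`.

    `terms` is any iterable of user-supplied strings. A term is only
    stripped when it is preceded by whitespace or a comma, so a name that
    consists solely of a term is left alone.
    """
    ordered = sorted(set(terms), key=len, reverse=True)  # longest first
    pattern = re.compile(
        r"[\s,]+(?:" + "|".join(re.escape(t) for t in ordered) + r")\s*$",
        re.IGNORECASE,
    )
    previous, current = None, name.strip()
    while current != previous:  # repeat to peel off stacked terms
        previous, current = current, pattern.sub("", current)
    return current


my_terms = {"B.V.", "Ltd", "Co.", "Stichting"}
print(strip_legal_suffixes("Acme Holdings B.V.", my_terms))
print(strip_legal_suffixes("Example Co., Ltd", my_terms))
```

Prefix forms such as a leading "Stichting" are not touched here; handling those would need a separate pass.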
See https://www.gleif.org/en. There's a lot of data that would help improve the legal affix database of cleanco.