add more test data (company names) #17
Comments
Hi, I have some company names such as:
Do you think it's a good idea to add these additional terms to termdata.py? |
Could https://opencorporates.com be used for testing? |
@davidheryanto it depends. What countries are those for? |
I have added a companies.csv file to the tests directory, but unfortunately it seems we cannot really use bulk ASCII company names for testing: many international companies use common Anglo-American suffixes such as ltd. or inc. in their corporate names, which results in a lot of failures. If we could get the Unicode versions of the national suffixes (i.e. in native Chinese or Russian characters), that would be useful. But I am not sure whether cleanco even supports that. |
Yes, agree with the Unicode approach. It will be applicable to company names in different countries. The company names I gave are examples of companies in Singapore. |
We now have improved Unicode and non-Latin script support, so better test coverage would make sense too. One option would be to use https://faker.readthedocs.io/en/master/ to generate fake test company names. Manual labour would still be needed to provide the expected base names that cleanco should be able to produce. |
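A rough sketch of how the Faker idea could look, in case it helps the discussion. It generates locale-specific company names and writes them to a CSV template whose expected-base-name column is left blank for manual completion. The helper names (`fake_company_names`, `write_case_template`) and the column names are hypothetical and not part of cleanco; the only Faker calls assumed are `Faker(locale)`, `Faker.seed()` and `fake.company()`.

```python
import csv
import io
from typing import Iterable, Iterator, TextIO


def fake_company_names(locale: str = "en_US", count: int = 5, seed: int = 0) -> Iterator[str]:
    """Yield `count` fake company names for `locale` (hypothetical helper)."""
    # Imported lazily so the CSV helper below works without Faker installed.
    from faker import Faker  # third-party: pip install Faker

    Faker.seed(seed)  # seed the shared generator so runs are reproducible
    fake = Faker(locale)
    for _ in range(count):
        yield fake.company()


def write_case_template(names: Iterable[str], stream: TextIO) -> int:
    """Write (company_name, expected_base_name) rows and return the row count.

    The expected column is deliberately left empty: the base name that
    cleanco should produce still has to be filled in by hand.
    """
    writer = csv.writer(stream)
    writer.writerow(["company_name", "expected_base_name"])
    rows = 0
    for name in names:
        writer.writerow([name, ""])
        rows += 1
    return rows
```

With Faker installed, something like `write_case_template(fake_company_names("ru_RU", count=10), sys.stdout)` would print a template with Russian-locale names; the resulting file could then be reviewed and completed by hand before being added to the tests directory.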
@psolin, would you have any lists of company names that you want to see tested?