add more test data (company names) #17

Open
petri opened this issue Aug 25, 2015 · 6 comments

Comments

@petri
Collaborator

petri commented Aug 25, 2015

@psolin , would you have any lists of company names that you want to see tested?

@davidheryanto

Hi, I have some company names such as:

  • xxx co ltd
  • xxx private limited
  • xxx pte limited
  • xxx co limited

Do you think it's a good idea to add these additional terms to termdata.py?
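
For illustration, here is a minimal sketch of how suffix terms like these could be stripped from a name. The `EXTRA_TERMS` list and the `strip_suffix` helper are hypothetical; they are not taken from cleanco's termdata.py, whose actual structure may differ.

```python
import re

# Hypothetical term list for illustration; not cleanco's actual termdata.py contents.
EXTRA_TERMS = ["co ltd", "private limited", "pte limited", "co limited"]


def strip_suffix(name, terms=EXTRA_TERMS):
    """Remove one trailing legal-form term, trying the longest terms first."""
    cleaned = name.strip()
    for term in sorted(terms, key=len, reverse=True):
        # The term must follow whitespace and sit at the very end of the name;
        # an optional trailing period is allowed, and case is ignored.
        pattern = r"\s+" + re.escape(term) + r"\.?$"
        if re.search(pattern, cleaned, flags=re.IGNORECASE):
            return re.sub(pattern, "", cleaned, flags=re.IGNORECASE).strip()
    return cleaned


print(strip_suffix("Acme Pte Limited"))
print(strip_suffix("Acme Co Ltd"))
```

Requiring whitespace before the term keeps a name such as "Tesco Ltd" from being cut at "co ltd".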

@petri
Collaborator Author

petri commented Jan 11, 2016

Could https://opencorporates.com be used for testing?

@petri petri added this to the Proper Testing milestone Dec 29, 2016
@petri
Collaborator Author

petri commented Jan 3, 2017

@davidheryanto it depends. What countries are those for?

@petri
Collaborator Author

petri commented Jan 3, 2017

I have added a companies.csv file to the tests directory. Unfortunately, it seems we cannot really use bulk ASCII company names for testing, because many international companies use common Anglo-American suffixes such as ltd. or inc. in their corporate names, which results in a lot of failures.

If we could get the Unicode versions of the national suffixes (i.e. in native Chinese or Russian characters), that would be useful. But I am not sure whether cleanco even supports that.
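
To make the idea concrete, here is a rough sketch of matching legal forms written in native script, using only the standard library. The two term lists and the `strip_native_form` helper are hypothetical and say nothing about what cleanco actually supports.

```python
import unicodedata

# Hypothetical native-script legal forms; cleanco's own term data may differ.
NATIVE_SUFFIXES = ["有限公司"]  # Chinese "limited company", written after the name
NATIVE_PREFIXES = ["ООО"]       # Russian LLC abbreviation, written before the name


def strip_native_form(name):
    """Strip one known native-script legal form from the start or end of a name."""
    # NFKC folds compatibility variants (e.g. full-width letters) into one form.
    text = unicodedata.normalize("NFKC", name).strip()
    for suffix in NATIVE_SUFFIXES:
        if text.endswith(suffix):
            return text[: -len(suffix)].strip()
    for prefix in NATIVE_PREFIXES:
        if text.startswith(prefix + " "):
            return text[len(prefix):].strip()
    return text


print(strip_native_form("示例" + NATIVE_SUFFIXES[0]))
print(strip_native_form(NATIVE_PREFIXES[0] + " Пример"))
```

Names without a listed form, such as an ASCII name ending in "Ltd", pass through unchanged, so the national terms can be tested separately from the Anglo-American ones.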

@davidheryanto

Yes, agree with the Unicode approach. It will be applicable to company names in different countries.

The company names I gave are examples of companies in Singapore.

@petri
Collaborator Author

petri commented Apr 26, 2020

We now have improved Unicode and non-Latin script support, so better test coverage would make sense too.

One option would be to use https://faker.readthedocs.io/en/master/ to generate fake test company names. Manual labour would still be needed to provide the expected base names that cleanco should be able to produce.
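
One way to reduce that manual work might be to build each test name from a known base name plus a suffix, so the expected result is known by construction. The sketch below uses the standard library's `random` module rather than Faker, and the word and suffix lists are made up for illustration.

```python
import random

# Made-up building blocks for illustration; a real test set would need realistic data.
BASE_WORDS = ["Northwind", "Bluebird", "Granite", "Lakeside", "Orchid"]
SUFFIXES = ["Co Ltd", "Private Limited", "Pte Limited", "Co Limited"]


def make_cases(count, seed=0):
    """Return (full_name, expected_base_name) pairs, reproducible for a given seed."""
    rng = random.Random(seed)
    cases = []
    for _ in range(count):
        # Two distinct words form the base name; the suffix is appended afterwards.
        base = " ".join(rng.sample(BASE_WORDS, 2))
        cases.append((f"{base} {rng.choice(SUFFIXES)}", base))
    return cases


for full_name, expected in make_cases(3):
    print(f"{full_name!r} -> {expected!r}")
```

Each pair could then be passed to the function under test and compared against the expected base name. Generated names like these only cover the suffix patterns that were put in, so they complement real-world data rather than replace it.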

@petri petri removed the enhancement label Jan 30, 2021