Name-to-Ethnicity

🔎 About:

🌍 What's this name-ethnicity classification?

Name-ethnicity classification is about finding the most likely ethnic origin of a personal name (e.g. Cixin Liu -> Chinese, Rita Papadopulo -> Greek). It can be a useful tool, especially for social science research: when a dataset contains names and other information about people but not their nationalities, existing patterns related to their backgrounds can go unrecognized. This can result in biased research that benefits some groups of people more than others.

🌈 Research:

Classifying names into ethnicities is highly dependent on the dataset, since it might bias models with regard to gender, age, and - of course - race. In our paper Equal accuracy for Andrew and Abubakar—detecting and mitigating bias in name-ethnicity classification algorithms, we seek to identify and compare such biases in different existing name ethnicity/nationality classifiers (including ours).

🖥️ The CLI classifier:

Using the name-ethnicity-classifier repository, you can classify names into their ethnicities locally. You can choose between models trained on different nationality configurations. Since the dataset is private, you can't train models yourself. If you want to classify between specific nationalities, feel free to open an issue, and we might train a model for you. You can also use www.name-to-ethnicity.com to request custom models for free.

☁️ The webapp:

Because the dataset is private, you can't train custom ethnicity classifiers yourself, so we want to offer that option through a non-profit webapp. You will be able to choose the nationalities you need and request a model, which we will automatically train for you. You can then upload names in a .csv file or use the API to classify them.
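To illustrate the .csv upload step, here is a minimal sketch of how a list of names could be cleaned and written to a file before uploading. The function name `build_names_csv` and the single column header `names` are assumptions made for this example, not a documented requirement of the webapp, so check the expected file layout before relying on it.

```python
import csv
import io


def build_names_csv(names, column="names"):
    """Return CSV text: one header row, then one cleaned name per row.

    The header "names" is an assumed default, not a confirmed requirement.
    """
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column])
    for raw in names:
        # Collapse repeated whitespace and trim the ends.
        name = " ".join(raw.split())
        # Skip empty entries and case-insensitive duplicates.
        key = name.casefold()
        if name and key not in seen:
            seen.add(key)
            writer.writerow([name])
    return buffer.getvalue()


if __name__ == "__main__":
    text = build_names_csv(["Cixin Liu", "  Rita   Papadopulo ", "cixin liu", ""])
    # newline="" keeps the line endings exactly as written above.
    with open("names.csv", "w", newline="", encoding="utf-8") as f:
        f.write(text)
```

Deduplicating and trimming names up front keeps the upload small and avoids classifying the same person twice.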

❤️ Consider donating!

If you find our tool useful, please consider donating any amount to help us cover our server and maintenance fees, which we currently pay out of our own pockets. But we're also thrilled if you simply let us know that our tool contributed to your project—whether through a star on GitHub or a quick email!


⚙️ Development:

How to contribute:

I'm currently in the process of rewriting the backend codebase.

You can find open tasks in the 📋 Kanban Board.

Pinned

  1. backend (Public)

    New name-to-ethnicity backend using Flask and SQLAlchemy in the making.

    MDX

  2. frontend (Public)

    Frontend for name-to-ethnicity using React, TS and ChakraUI.

    TypeScript

  3. model-training-service (Public)

    Microservice which asynchronously trains N2E models requested by users.

    Python 1

  4. name-ethnicity-classifier (Public)

    This repository contains a console-interface name-ethnicity classifier

    Python 24 8

  5. nec-experiments (Public)

    This repository contains the experiments for finding the best model for our name-ethnicity-classification task.

    Python 1 1

  6. model-training (Public)

    This repository contains the pipeline for training name-to-ethnicity models.

    Python
