Detect character encoding of datasets #17
Yes, it would be helpful to get this information as part of
This is our perennial issue: since TDS never "touches" the data, it has no way to detect the encoding and store it. This would have to be handled by a TA1 service or the HMI server.
@YohannParis reports an issue with this dataset when trying to profile it, since it is not utf-8 encoded: us-counties-2023.csv
The service errors with:
This can be addressed by dynamically detecting the encoding prior to reading the CSV in pandas. See this notebook for reference on how to do this with chardet.
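For illustration, here is a minimal sketch of what detecting the encoding before the pandas read could look like. It is not the code from the referenced notebook. The helper names `detect_encoding` and `read_csv_detect`, the 100,000-byte sample size, and the UTF-8 fallback are assumptions made for this example. Only `chardet.detect` and the `encoding` argument of `pandas.read_csv` are existing library calls.

```python
import chardet
import pandas as pd


def detect_encoding(path, sample_size=100_000, fallback="utf-8"):
    """Guess a file's encoding from its first `sample_size` bytes (hypothetical helper)."""
    with open(path, "rb") as f:
        raw = f.read(sample_size)
    # chardet.detect returns a dict with "encoding", "confidence" and "language".
    guess = chardet.detect(raw)
    encoding = guess["encoding"]
    # No guess at all, or plain ASCII in the sample: use the fallback instead.
    # UTF-8 is a superset of ASCII, so this is safe for pure-ASCII files and
    # more forgiving if non-ASCII bytes show up after the sampled region.
    if encoding is None or encoding.lower() == "ascii":
        return fallback
    return encoding


def read_csv_detect(path, **kwargs):
    """Read a CSV with pandas using the detected encoding (hypothetical helper)."""
    return pd.read_csv(path, encoding=detect_encoding(path), **kwargs)
```

Something like `df = read_csv_detect("us-counties-2023.csv")` would then replace the plain `pd.read_csv` call. Sampling only the start of the file keeps detection cheap on large datasets, at the cost of possibly missing characters that only appear later. The guess is a heuristic, so logging the detected encoding and its confidence alongside the profile may be worthwhile.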