Translate to Thai #64

Open
43 of 69 tasks
peeraponw opened this issue Mar 30, 2022 · 24 comments

Comments

@peeraponw
Contributor

peeraponw commented Mar 30, 2022

Hi there 👋

Let's translate the course to Thai so that the whole community can benefit from this resource 🌎!

Below are the chapters and files that need translating - let us know here if you'd like to translate any and we'll add your name to the list. Once you're finished, open a pull request and tag this issue by including #issue-number in the description, where issue-number is the number of this issue.

🙋 If you'd like others to help you with the translation, you can also post in our forums or tag @_lewtun on Twitter to gain some visibility.

Chapters

0 - Setup

1 - Transformer models

2 - Using 🤗 Transformers

3 - Fine-tuning a pretrained model

4 - Sharing models and tokenizers

5 - The 🤗 Datasets library

6 - The 🤗 Tokenizers library

7 - Main NLP tasks

8 - How to ask for help

Events

@a-krirk
Contributor

a-krirk commented Mar 30, 2022

I would like to help translate chapter 3

@cstorm125

We'll refactor thai2transformers and get back to you within the next month or so.

wav2vec2 tutorial for Thai
https://github.com/vistec-AI/wav2vec2-large-xlsr-53-th/blob/main/notebooks/wav2vec2_finetuning_tutorial.ipynb

Extractive QA
https://github.com/vistec-AI/thai2transformers/blob/dev/notebooks/train_question_answering_lm_finetuning.ipynb

@pattanun-np

I would like to translate chapter 5

@lewtun
Member

lewtun commented Mar 30, 2022

Hey @a-krirk and @pattanunNP, thank you very much for your offer to help with the translation! I've added your names to the list :)

@cstorm125 we're currently focusing on doing direct translations of the English course content to Thai. If you'd like to contribute new content on training models in Thai, I suggest opening an issue in transformers to see whether our official tutorials could benefit from a translation :)

@peeraponw
Contributor Author

I will take care of chapters 0 and 1 first.

@jomariya23156
Contributor

I'm interested in translating Chapter 4 :)

@lewtun
Member

lewtun commented Mar 31, 2022

Hello, we have two PRs now ready for Thai!

Unfortunately, I don't speak Thai so would someone here be willing to have a quick look at the translations?

@lewtun
Member

lewtun commented Mar 31, 2022

Hey @jomariya23156 thank you for offering to help! I've added your name to the list :)

@ckingkan
Contributor

ckingkan commented Apr 1, 2022

Hi all, I'm happy to help translate chapter 2

@peeraponw
Contributor Author

I'll continue on the rest of chapter 1.

@lewtun
Member

lewtun commented Apr 1, 2022

I've just deployed the first part of the Thai translation 🥳! You can view it here: https://huggingface.co/course/th/chapter1/1

Next week we'll fix the language dropdown so that you can access all the languages from the same place.

@ckingkan thank you for offering to help! I've added your name to the list

@peeraponw
Contributor Author

Just curious: how often should we open a PR? I am afraid @lewtun would be overloaded if we make a PR per section, which means 69 PRs per language. Should it be once every couple of sections, or about one third of a chapter per PR?

@ckingkan
Contributor

ckingkan commented Apr 3, 2022

I think one third would be a good idea.

@lewtun
Member

lewtun commented Apr 4, 2022

Just curious: how often should we open a PR? I am afraid @lewtun would be overloaded if we make a PR per section, which means 69 PRs per language. Should it be once every couple of sections, or about one third of a chapter per PR?

Thanks for raising this issue @peeraponw! Indeed a PR per chapter would be ideal, but I understand that not everyone has the time to do this. So my general suggestion would be:

  • If you're planning to tackle a whole chapter => open a PR for that chapter
  • If you're planning to tackle a few sections of a chapter => open a PR for those sections

For the reviews themselves, I'll need all your help to do a quick check that the translations are OK, so having PRs covering multiple sections will help there too :)

Speaking of PRs, this one from @a-krirk is ready for someone to have a look at: #73

Edit: here's another great PR for review: #83

Edit: here's another PR for review: #85

@lewtun
Member

lewtun commented Apr 4, 2022

One thing you might want to discuss here is how to handle English technical terms like "pretrained model": https://github.com/huggingface/course/pull/73/files#r841659145

I'll leave this decision to you, but here are a few suggestions:

  1. Leave the technical terms in English
  2. Translate the technical terms, but add the English equivalent in parentheses. This is similar to how Wikipedia handles translations e.g. here
  3. Translate the technical terms where it makes sense and add a new "glossary" chapter with the corresponding terms in Thai and English

I think the Italian and Persian translations are going to adopt choice 3, so that might be the best option right now :)

@peeraponw
Contributor Author

One thing you might want to discuss here is how to handle English technical terms like "pretrained model": https://github.com/huggingface/course/pull/73/files#r841659145

I'll leave this decision to you, but here are a few suggestions:

  1. Leave the technical terms in English
  2. Translate the technical terms, but add the English equivalent in parentheses. This is similar to how Wikipedia handles translations e.g. here
  3. Translate the technical terms where it makes sense and add a new "glossary" chapter with the corresponding terms in Thai and English

I think the Italian and Persian translations are going to adopt choice 3, so that might be the best option right now :)

As mentioned in #73, my translation and @a-krirk's translation seem to align with choice 2.

@lewtun
Member

lewtun commented Apr 4, 2022

As mentioned in #73, my translation and @a-krirk's translation seem to align with choice 2.

OK great, let's stick with that. We can always add a glossary later on :)

@meanna
Contributor

meanna commented Apr 25, 2022

I'd like to translate chapter 6.

@lewtun
Member

lewtun commented Apr 26, 2022

I'd like to translate chapter 6.

Fantastic! I've added your name to the list :)

@meanna
Contributor

meanna commented May 1, 2022

I'd like to translate chapter 6.

Fantastic! I've added your name to the list :)

Nice, thank you. I have an exam at uni this week, so I think I can start next week :)

@drniwech

Hi,

Are there any chapters that need to be translated? Let me know if you could use an extra hand.

@peeraponw
Contributor Author

Hi @drniwech ,

Thank you for giving us a hand and reviving this project! Chapters 7 and 8 are still available. Please let me know which one you would like and I will update the status accordingly.

@drniwech

Hi @drniwech ,

Thank you for giving us a hand and reviving this project! Chapters 7 and 8 are still available. Please let me know which one you would like and I will update the status accordingly.

I will work on Chapter 7. Thank you

drniwech added a commit to drniwech/course that referenced this issue Jul 7, 2024
Issue#huggingface#64: Thai Translation of Chapter7.
@drniwech

drniwech commented Jul 7, 2024

Hi,

Could someone please help fix this pull request for Chapter 7? #712
It failed at the "build_pr_document" stage. I am not sure why it failed in Chapter 3, since my PR does not change that chapter. Is there any way to replicate this issue locally?


The doc building for the Thai language was completed successfully.

Generating docs for language th
Building docs for course ../course/chapters//th ../build_dir/course/pr_712/th

Building the MDX files:   0%|          | 0/60 [00:00<?, ?it/s]
Building the MDX files: 100%|██████████| 60/60 [00:00<00:00, 720.13it/s]

However, it failed for the zh-TW language. Not sure what to do.

Generating docs for language zh-TW
Building docs for course ../course/chapters//zh-TW ../build_dir/course/pr_712/zh-TW

Building the MDX files:   0%|          | 0/90 [00:00<?, ?it/s]
Building the MDX files: 100%|██████████| 90/90 [00:00<00:00, 1000.19it/s]

[vite-plugin-svelte] /tmp/tmpomqswhw3/kit/src/routes/chapter3/2/+page.svelte:393:6 Cannot have an {:else} block outside an {#if ...} or {#each ...} block
file: /tmp/tmpomqswhw3/kit/src/routes/chapter3/2/+page.svelte:393:6
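
One way to narrow this down locally, offered as a rough sketch and not as the project's own tooling: the error says an {:else} tag sits outside any {#if ...} or {#each ...} block, so a small script can scan the chapter sources for that pattern before running the full doc build. The sketch below assumes the chapter sources are .mdx files containing Svelte-style block tags; the function name find_stray_else and the directory passed on the command line are hypothetical.

```python
import re
import sys
from pathlib import Path

# Matches Svelte-style block tags such as {#if ...}, {:else}, {/if}, {#each ...}, {/each}.
BLOCK_TAG = re.compile(r"\{([#:/])(\w+)[^}]*\}")


def find_stray_else(text):
    """Return 1-based line numbers of {:else} tags not enclosed by {#if} or {#each}."""
    stack = []   # names of the blocks currently open, innermost last
    stray = []   # line numbers where a stray {:else} was found
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in BLOCK_TAG.finditer(line):
            sigil, name = match.group(1), match.group(2)
            if sigil == "#":
                stack.append(name)
            elif sigil == "/":
                # Only pop when the closing tag matches the innermost open block.
                if stack and stack[-1] == name:
                    stack.pop()
            elif sigil == ":" and name == "else":
                if not stack or stack[-1] not in ("if", "each"):
                    stray.append(lineno)
    return stray


if __name__ == "__main__":
    # Hypothetical usage: python check_else.py chapters/zh-TW
    for root in sys.argv[1:]:
        for path in Path(root).rglob("*.mdx"):
            for lineno in find_stray_else(path.read_text(encoding="utf-8")):
                print(f"{path}:{lineno}: stray {{:else}}")
```

This only checks tag nesting line by line, so it may miss tags split across lines, and the line numbers refer to the source file, not to the generated +page.svelte named in the log.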

9 participants