Bump release (#568)
* Refactor tokenization of targets for transformers v4.22 (#316)

* Refactor tokenization of targets for transformers v4.22

* typo fix (#319)

Line 49: changed _SQuAD_it-text.json_ -> _SQuAD_it-test.json_

* [FR] Many corrections (#318)

* Fix URL to the Pile (#324)

* Fix URL to the Pile

* [RU] ch5  (#317)

* fix: book url (#323)

* zh-CN - Chapters 7, 8, 9 finished (#315)

Co-authored-by: Lewis Tunstall <[email protected]>

* Refactor events (#261)

* Fix whole word masking labels (#326)

* Fix question answering indices (#327)

* Add translation checker (#329)

* [FR] Refactor events (#330)

* Translation Chapter 4 (#325)

* update author list (de) (#331)

* Fix Russian ToC (#332)

* Refactor dataset upload in Chapter 5 / section 5 (#334)

* Fix id2label types (#337)

* Fix keywords in de quiz chapter 3 (#338)

Noticed two `undefined` in the new render, because the `text` key was capitalized.

* Tweak course validator (#340)

* [Italian] Added Ch2/3 and Ch2/4 (#322)

* Completes chapter 1 (#341)

* Create 5.mdx and translate it into Japanese.

* Create 6.mdx and translate it into Japanese.

* done chapters 1.2, 1.3

* Create 4.mdx and translate it into Japanese.

* Slightly modified

* Slightly modified

* Slightly modified

* TF generation fixes (#344)

* Fixes to chapter 7

Co-authored-by: lewtun <[email protected]>

* i18n: ES - translate file chapter2/6.mdx (#346)

* Typo in russian translation (#349)

It should be "Обучающий цикл" not "Обучающий цикла"

* Remove translated string (#350)

* [It] Ch2/5, Ch2/6, Ch2/7 (#353)

* Add FAQ (#354)

* i18n: ES - translate file chapter2/7.mdx (#347)

* [id] Add translation to Bahasa Indonesia for chapter0 & some of chapter1 (#351)

* i18n: ES - chapter2/8.mdx (#352)

* Update 4.mdx based on the advice.

* [de] Translation Chapter 1 (#336)

* Update 1.mdx (#356)

* Update 1.mdx (#357)

* removed original english texts to open pull request

* removed original english texts to open pull request

* removed original english texts to open pull request

* add lines for chap1/4 to 6

* Slightly modified

* modify 2.mdx, 3.mdx

* modify _toctree.yml

* Update pr docs actions (#369)

* Add Python syntax highlighting (#370)

* [FR] Add FAQ and more (#367)


Co-authored-by: Lewis Tunstall <[email protected]>

* [RU] Chapter 6 (1/2) finished (#368)

* Spanish translation of Chapter 5 (#366)



Co-authored-by: Lewis Tunstall <[email protected]>

* Add Japanese translation of chapter 1 / 7 to 10 (#359)

* Adding Portuguese Translation to Chapter3 (#361)

* make style

* Typo in Chapter 2, Section 2 (#364)

Replace "inputs" with "outputs".

* Revert "Update pr docs actions (#369)"

This reverts commit 44f77be.

* Typo (#374)

* Chapter 9 - Italian (#373)

* Fix notebook link (#378)

* docs: feat: chapter2-1 in Korean (#375)

Review by @lewtun 22/11/22
docs: fix: remove commented toc for future contributors

* Migrate Spaces URLs to new domain (#379)

* docs: feat: same links across languages (#380)

Added custom anchors using double square brackets, e.g. [[formatted-anchor]]

* Add video transcripts  (#150)

* docs: fix: Accurate for the origin (English) subtitles (#384)

* docs: i18n: add zh-CN machine translation (#385)

* [FR] Notebooks links (#386)

* Upgrade python version in the workflow (#402)

* Update README.md (#389)

Add a note that the preview does not work on Windows

* translated chapter2_1-3 (#392)

* fixes small typos (#397)

* Add Chap2/4.mdx and 5.mdx (#391)

Co-authored-by: 長澤春希 <[email protected]>

* created new script for converting bilingual captions to monolingual captions (#399)

* Add French YouTube videos transcription (#410)

* docs(zh-cn): Reviewed 56_data-processing-for-masked-language-modeling.srt (#400)

* docs(zh-cn): Reviewed 57_what-is-perplexity.srt (#401)

* reviewed ep.58 (#405)

* reviewed ep.59 (#406)

* docs(zh-cn): Reviewed 60_what-is-the-bleu-metric.srt (#407)

* finished review (#408)

* docs(zh-cn): Reviewed 61_data-processing-for-summarization.srt (#409)

* Fix subtitle - translation data processing (#411)

* [FR] Final PR (#412)

* [ko] Add chapter 8 translation (#417)

* docs(zh-cn): Reviewed 62_what-is-the-rouge-metric.srt (#419)

* finished review

* fixed errors in original english subtitle

* fixed errors (#420)

* docs(zh-cn): Reviewed 63_data-processing-for-causal-language-modeling.srt (#421)

* Update 63_data-processing-for-causal-language-modeling.srt

* finished review

* Update 63_data-processing-for-causal-language-modeling.srt

* docs(zh-cn): Reviewed 65_data-processing-for-question-answering.srt (#423)

* finished review

* finished review

* finished review (#422)

* Add Ko chapter2 2.mdx (#418)

* Add Ko chapter2 2.mdx


Co-authored-by: IL-GU KIM <[email protected]>
Co-authored-by: Yuan <[email protected]>

* update textbook link (#427)

* Visual fixes (#428)

* finish first round review (#429)

* Fix French subtitles + refactor conversion script (#431)

* Fix subtitles and scripts

* Fix subtitle

* Add tokenizer to MLM Trainer (#432)

* Fix FR video descriptions (#433)

* Fix FR video descriptions

* Rename file

* Fix dead GPT model docs link. (#430)

* Translate into Korean: 2-3 (#434)

Co-authored-by: “Ryan” <“[email protected]”>

* Add korean translation of chapter5 (1,2) (#441)

update toctree for chapter 5 (1, 2)
ensure same title for 5-2
add updates from upstream English with custom anchors

Co-Authored-By: Minho Ryu <[email protected]>

Co-authored-by: Meta Learner응용개발팀 류민호 <[email protected]>
Co-authored-by: Minho Ryu <[email protected]>

* Update 3.mdx (#444)

* docs(zh-cn): Reviewed 67_the-post-processing-step-in-question-answering-(tensorflow).srt (#447)

* Update 67_the-post-processing-step-in-question-answering-(tensorflow).srt

* finished review

* docs(zh-cn): Reviewed 66_the-post-processing-step-in-question-answering-(pytorch).srt (#448)

* Update 66_the-post-processing-step-in-question-answering-(pytorch).srt

* finished review

* refined translation

* docs(zh-cn): Reviewed 01_the-pipeline-function.srt (#452)

* finish review

* Update subtitles/zh-CN/01_the-pipeline-function.srt

Co-authored-by: Luke Cheng <[email protected]>

Co-authored-by: Luke Cheng <[email protected]>

* finish review (#453)

* Revise some unnatural translations (#458)

Some unnatural translations have been revised to use expressions more popular with Chinese readers

* Fix chapter 5 links (#461)

* fix small typo (#460)

* Add Ko chapter2 3~8.mdx & Modify Ko chapter2 2.mdx typo (#446)

* Add captions for tasks videos (#464)

* Add captions for tasks videos

* Fix script

* [FR] Add 🤗  Tasks videos (#468)

* Sync Chinese course with upstream

Update the Chinese Course document to
sha:f71cf6c3b4cb235bc75a14416c6e8a57fc3d00a7
sha date: 2023/01/06 00:02:26 UTC+8

* review sync

* Update 3.mdx

* format zh_CN

* format all mdx

* Remove temp folder

* finished review (#449)

* docs(zh-cn): Reviewed 31_navigating-the-model-hub.srt (#451)

* docs(zh-cn): Reviewed No. 08 - What happens inside the pipeline function? (PyTorch) (#454)

* docs(zh-cn): Reviewed 03_what-is-transfer-learning.srt (#457)

* docs(zh-cn): 32_managing-a-repo-on-the-model-hub.srt (#469)

* docs(zh-cn): Reviewed No. 10 - Instantiate a Transformers model (PyTorch) (#472)

* update Chinese translation

Some English sentences have the opposite word order from Chinese, so I arranged them directly in the final Chinese word order. Is that acceptable?

* finish first round review

* finish second round review

* finish second round review

* branch commit

* Update subtitles/zh-CN/10_instantiate-a-transformers-model-(pytorch).srt

Co-authored-by: Luke Cheng <[email protected]>

* Update subtitles/zh-CN/10_instantiate-a-transformers-model-(pytorch).srt

Co-authored-by: Luke Cheng <[email protected]>

---------

Co-authored-by: Luke Cheng <[email protected]>

* docs(zh-cn): 33_the-push-to-hub-api-(pytorch).srt (#473)

* docs(zh-cn): Reviewed 34_the-push-to-hub-api-(tensorflow).srt (#479)

* running python utils/code_formatter.py

* review 05 cn translations

* review 06 cn translations

* Review No.11

* translate no.24

* review 06 cn translations

* review 07 cn translations

* Update 23_what-is-dynamic-padding.srt

* Update 23_what-is-dynamic-padding.srt

* Update 23_what-is-dynamic-padding.srt

* Update subtitles/zh-CN/23_what-is-dynamic-padding.srt

Co-authored-by: Luke Cheng <[email protected]>

* Update subtitles/zh-CN/23_what-is-dynamic-padding.srt

Co-authored-by: Luke Cheng <[email protected]>

* add blank

* Review No. 11, No. 12

* Review No. 13

* Review No. 12

* Review No. 14

* finished review

* optimized translation

* optimized translation

* docs(zh-cn): Reviewed No. 29 - Write your training loop in PyTorch

* Review 15

* Review 16

* Review 17

* Review 18

* Review ch 72 translation

* Update 72 cn translation

* To be reviewed No.42-No.54

* No.11 check-out

* No.12 check-out

* No. 13 14 check-out

* No. 15 16 check-out

* No. 17 18 check-out

* Add note for "token-*"

* Reviewed No.8, 9, 10

* Reviewed No.42

* Review No.43

* finished review

* optimized translation

* finished review

* optimized translation

* Review 44(need refine)

* Review 45(need refine)

* Review No. 46 (need refine)

* Review No.47

* Review No.46

* Review No.45

* Review No.44

* Review No.48

* Review No.49

* Review No.50

* Modify Ko chapter2 8.mdx (#465)

* Add Ko chapter2 2.mdx

* Add Ko chapter2 2.mdx

* Add Ko chapter2 3.mdx & 4.mdx

* Modify Ko chapter2 3.mdx & 4.mdx

* Modify Ko chapter2 3.mdx & 4.mdx

* Modify Ko chapter2 3.mdx & 4.mdx

* Modify _toctree.yml

* Add Ko chapter2 5.mdx

* Modify Ko chapter2 4.mdx

* Add doc-builder step

* Add Ko chapter2 6~8.mdx & Modify Ko chapter2 2.mdx typo

* Modify Ko _toctree.yml

* Modify Ko chapter2 8.mdx & README.md

* Fixed typo (#471)

* fixed subtitle errors (#474)

timestamp: 00:00:26,640 --> 00:00:28,620
modification: notification --> authentication

timestamp: 00:04:21,113 --> 00:04:22,923
modification: of --> or

* Fixed a typo (#475)

* Update 3.mdx (#526)

Fix typo

* [zh-TW] Added chapters 1-9 (#477)

The translation is based on the Simplified Chinese version, converted via OpenCC, with some formatting issues fixed.

* finished review

* Explain why there are more tokens than reviews (#476)

* Explain why there are more tokens than reviews

* Update chapters/en/chapter5/3.mdx

---------

Co-authored-by: lewtun <[email protected]>

* [RU] Subtitles for Chapter 1 of the video course (#489)

* Created a directory for the Russian subtitles.

Created a folder for Russian subtitles for the video course and published a translation of the introductory video from chapter 1.

* Uploaded subtitles for chapter 1

Uploaded subtitles for the remaining videos for chapter 1 of the video course.

* Added subtitles for chapter 2 of the video course

Added SRT subtitle files for the second chapter of the YouTube video course.

* Delete subtitles/ru directory

Removed the old translation. Incorrect timestamping.

* Create 00_welcome-to-the-hugging-face-course.srt

Create a directory and upload a subtitle file for the introductory video of the course.

* Add files via upload

Upload subtitle files for the first chapter of the course.

* Review No.52

* [ru] Added the glossary and translation guide (#490)

* Added the glossary and translation guide

* Fixed casing

* Minor fixes

* Updated glossary

* Glossary update

* Glossary update

* Glossary update

* [ru] Chapters 0 and 1 proofreading, updating and translating missing sections (#491)

* Chapter 0 proofreading

* Chapter 1 Section 1 proofreading
- Added new people from English version;
- Added links to creator's pages;
- Added FAQ translation;

* Chapter 1 Sections 2-5 proofreading

* Chapter 1 Sections 6-9 proofreading

* Final proofreading and added missing quiz section

* Minor spelling corrections

* Review No.51

* Review No.53

* Review No.54

* finished review

* modified translation

* modified translation

* modified subtitle

use the same text that appears in the video

* translated

* Fix typo (#532)

* review chapter4/2

* review chapter4/2

* review chapter4/2

* Review 75

* Review No.20, need review some

* docs(zh-cn): Reviewed Chapter 7/1

* Update 1.mdx

* Review No.22

* Review No.21 (need refinement)

* Review No.30, need review: 26 27 28 30 73 74

* Review 30 (good)

* Review 20

* Review 21 (refine)

* Review 21

* Review 22

* Review 26

* Review 27

* Review 28

* Review 30

* Review 73

* Review 74

* Review 26-28, 42-54, 73-75

* Demo link fixes (#562)

* demo link fixes

* minor demo fix

---------

Co-authored-by: Aravind Kumar <[email protected]>
Co-authored-by: lbourdois <[email protected]>
Co-authored-by: Pavel <[email protected]>
Co-authored-by: buti1021 <[email protected]>
Co-authored-by: 1375626371 <[email protected]>
Co-authored-by: Fabrizio Damicelli <[email protected]>
Co-authored-by: Jesper Dramsch <[email protected]>
Co-authored-by: Acciaro Gennaro Daniele <[email protected]>
Co-authored-by: Caterina Bonan <[email protected]>
Co-authored-by: Haruki Nagasawa <[email protected]>
Co-authored-by: blackdoor571 <[email protected]>
Co-authored-by: Matt <[email protected]>
Co-authored-by: Angel Mendez <[email protected]>
Co-authored-by: Artem Vysotsky <[email protected]>
Co-authored-by: Gusti Adli Anshari <[email protected]>
Co-authored-by: Marcus Fraaß <[email protected]>
Co-authored-by: Christopher Akiki <[email protected]>
Co-authored-by: Mishig <[email protected]>
Co-authored-by: David Gilbertson <[email protected]>
Co-authored-by: Camilo Martínez Burgos <[email protected]>
Co-authored-by: Hiroaki Funayama <[email protected]>
Co-authored-by: Cesar0106 <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: Filippo Broggini <[email protected]>
Co-authored-by: Mishig <[email protected]>
Co-authored-by: Nanachi <[email protected]>
Co-authored-by: Edoardo Abati <[email protected]>
Co-authored-by: Wonhyeong Seo <[email protected]>
Co-authored-by: Luke Cheng <[email protected]>
Co-authored-by: xianbaoqian <[email protected]>
Co-authored-by: Thomas Simonini <[email protected]>
Co-authored-by: Subaru Kimura <[email protected]>
Co-authored-by: Carlos Santos Garcia <[email protected]>
Co-authored-by: 長澤春希 <[email protected]>
Co-authored-by: Yuan <[email protected]>
Co-authored-by: IL-GU KIM <[email protected]>
Co-authored-by: Kim Bo Geum <[email protected]>
Co-authored-by: Bartosz Szmelczynski <[email protected]>
Co-authored-by: Shawn Lee <[email protected]>
Co-authored-by: Naveen Reddy D <[email protected]>
Co-authored-by: rainmaker <[email protected]>
Co-authored-by: “Ryan” <“[email protected]”>
Co-authored-by: Meta Learner응용개발팀 류민호 <[email protected]>
Co-authored-by: Minho Ryu <[email protected]>
Co-authored-by: richardachen <[email protected]>
Co-authored-by: beyondguo <[email protected]>
Co-authored-by: bsenst <[email protected]>
Co-authored-by: 1375626371 <[email protected]>
Co-authored-by: yaoqih <[email protected]>
Co-authored-by: 李洋 <[email protected]>
Co-authored-by: PowerChina <[email protected]>
Co-authored-by: chenglu99 <[email protected]>
Co-authored-by: iCell <[email protected]>
Co-authored-by: Qi Zhang <[email protected]>
Co-authored-by: researcher <[email protected]>
Co-authored-by: simpleAI <[email protected]>
Co-authored-by: FYJNEVERFOLLOWS <[email protected]>
Co-authored-by: zhangchaosd <[email protected]>
Co-authored-by: TK Buristrakul <[email protected]>
Co-authored-by: Carlos Aguayo <[email protected]>
Co-authored-by: ateliershen <[email protected]>
Co-authored-by: Pavel Nesterov <[email protected]>
Co-authored-by: Artyom Boyko <[email protected]>
Co-authored-by: Kirill Milintsevich <[email protected]>
Co-authored-by: jybarnes21 <[email protected]>
Co-authored-by: gxy-gxy <[email protected]>
Co-authored-by: iLeGend <[email protected]>
Co-authored-by: Maria Khalusova <[email protected]>
Showing 264 changed files with 28,757 additions and 2,940 deletions.

2 changes: 1 addition & 1 deletion README.md
@@ -18,7 +18,7 @@ This repo contains the content that's used to create the **[Hugging Face course]
| [Bahasa Indonesia](https://huggingface.co/course/id/chapter1/1) (WIP) | [`chapters/id`](https://github.com/huggingface/course/tree/main/chapters/id) | [@gstdl](https://github.com/gstdl) |
| [Italian](https://huggingface.co/course/it/chapter1/1) (WIP) | [`chapters/it`](https://github.com/huggingface/course/tree/main/chapters/it) | [@CaterinaBi](https://github.com/CaterinaBi), [@ClonedOne](https://github.com/ClonedOne), [@Nolanogenn](https://github.com/Nolanogenn), [@EdAbati](https://github.com/EdAbati), [@gdacciaro](https://github.com/gdacciaro) |
| [Japanese](https://huggingface.co/course/ja/chapter1/1) (WIP) | [`chapters/ja`](https://github.com/huggingface/course/tree/main/chapters/ja) | [@hiromu166](https://github.com/@hiromu166), [@younesbelkada](https://github.com/@younesbelkada), [@HiromuHota](https://github.com/@HiromuHota) |
| [Korean](https://huggingface.co/course/ko/chapter1/1) (WIP) | [`chapters/ko`](https://github.com/huggingface/course/tree/main/chapters/ko) | [@Doohae](https://github.com/Doohae), [@wonhyeongseo](https://github.com/wonhyeongseo), [@dlfrnaos19](https://github.com/dlfrnaos19) |
| [Korean](https://huggingface.co/course/ko/chapter1/1) (WIP) | [`chapters/ko`](https://github.com/huggingface/course/tree/main/chapters/ko) | [@Doohae](https://github.com/Doohae), [@wonhyeongseo](https://github.com/wonhyeongseo), [@dlfrnaos19](https://github.com/dlfrnaos19), [@nsbg](https://github.com/nsbg) |
| [Portuguese](https://huggingface.co/course/pt/chapter1/1) (WIP) | [`chapters/pt`](https://github.com/huggingface/course/tree/main/chapters/pt) | [@johnnv1](https://github.com/johnnv1), [@victorescosta](https://github.com/victorescosta), [@LincolnVS](https://github.com/LincolnVS) |
| [Russian](https://huggingface.co/course/ru/chapter1/1) (WIP) | [`chapters/ru`](https://github.com/huggingface/course/tree/main/chapters/ru) | [@pdumin](https://github.com/pdumin), [@svv73](https://github.com/svv73) |
| [Thai](https://huggingface.co/course/th/chapter1/1) (WIP) | [`chapters/th`](https://github.com/huggingface/course/tree/main/chapters/th) | [@peeraponw](https://github.com/peeraponw), [@a-krirk](https://github.com/a-krirk), [@jomariya23156](https://github.com/jomariya23156), [@ckingkan](https://github.com/ckingkan) |
2 changes: 1 addition & 1 deletion chapters/en/chapter4/3.mdx
@@ -83,7 +83,7 @@ training_args = TrainingArguments(

When you call `trainer.train()`, the `Trainer` will then upload your model to the Hub each time it is saved (here every epoch) in a repository in your namespace. That repository will be named like the output directory you picked (here `bert-finetuned-mrpc`) but you can choose a different name with `hub_model_id = "a_different_name"`.

To upload you model to an organization you are a member of, just pass it with `hub_model_id = "my_organization/my_repo_name"`.
To upload your model to an organization you are a member of, just pass it with `hub_model_id = "my_organization/my_repo_name"`.

Once your training is finished, you should do a final `trainer.push_to_hub()` to upload the last version of your model. It will also generate a model card with all the relevant metadata, reporting the hyperparameters used and the evaluation results! Here is an example of the content you might find in such a model card:
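The repository-naming rule this hunk describes (default name from the output directory, overridable via `hub_model_id`, optionally with an organization namespace) can be sketched as a tiny helper. This is hypothetical illustration code, not part of the `Trainer` API; the namespace and repo names are placeholders:

```python
def resolve_repo_id(output_dir, hub_model_id=None, namespace="my-username"):
    # By default, the Hub repo is named after the output directory and lives
    # under your own namespace. hub_model_id overrides that name, and may
    # carry an organization namespace of its own ("org/repo").
    if hub_model_id is None:
        name = output_dir.rstrip("/").split("/")[-1]
        return f"{namespace}/{name}"
    if "/" in hub_model_id:
        return hub_model_id
    return f"{namespace}/{hub_model_id}"


print(resolve_repo_id("bert-finetuned-mrpc"))
# my-username/bert-finetuned-mrpc
print(resolve_repo_id("bert-finetuned-mrpc", hub_model_id="my_organization/my_repo_name"))
# my_organization/my_repo_name
```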

2 changes: 1 addition & 1 deletion chapters/en/chapter5/3.mdx
@@ -387,7 +387,7 @@ ArrowInvalid: Column 1 named condition expected length 1463 but got length 1000

Oh no! That didn't work! Why not? Looking at the error message will give us a clue: there is a mismatch in the lengths of one of the columns, one being of length 1,463 and the other of length 1,000. If you've looked at the `Dataset.map()` [documentation](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset.map), you may recall that it's the number of samples passed to the function that we are mapping; here those 1,000 examples gave 1,463 new features, resulting in a shape error.

The problem is that we're trying to mix two different datasets of different sizes: the `drug_dataset` columns will have a certain number of examples (the 1,000 in our error), but the `tokenized_dataset` we are building will have more (the 1,463 in the error message). That doesn't work for a `Dataset`, so we need to either remove the columns from the old dataset or make them the same size as they are in the new dataset. We can do the former with the `remove_columns` argument:
The problem is that we're trying to mix two different datasets of different sizes: the `drug_dataset` columns will have a certain number of examples (the 1,000 in our error), but the `tokenized_dataset` we are building will have more (the 1,463 in the error message; it is more than 1,000 because we are tokenizing long reviews into more than one example by using `return_overflowing_tokens=True`). That doesn't work for a `Dataset`, so we need to either remove the columns from the old dataset or make them the same size as they are in the new dataset. We can do the former with the `remove_columns` argument:

```py
tokenized_dataset = drug_dataset.map(
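The length mismatch this hunk explains can be reproduced in miniature with plain Python. The `chunk` helper below is a stand-in for what `return_overflowing_tokens=True` does, not the tokenizer's actual implementation: each input row may expand into several output rows, so the new column ends up longer than the old ones.

```python
def chunk(text, size=5):
    # Stand-in for return_overflowing_tokens=True: one long input
    # is cut into several fixed-size pieces.
    return [text[i:i + size] for i in range(0, len(text), size)]


reviews = ["short", "a much longer review"]      # 2 input rows
pieces = [p for r in reviews for p in chunk(r)]  # each row yields 1+ pieces
print(len(reviews), len(pieces))
# 2 5  -> the old columns (length 2) can no longer sit alongside the new
# column (length 5), which is the same shape mismatch as the ArrowInvalid
# error, hence the need to drop the old columns with remove_columns
```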
2 changes: 1 addition & 1 deletion chapters/en/chapter6/3.mdx
@@ -109,7 +109,7 @@ We can see that the tokenizer's special tokens `[CLS]` and `[SEP]` are mapped to

<Tip>

The notion of what a word is is complicated. For instance, does "I'll" (a contraction of "I will") count as one or two words? It actually depends on the tokenizer and the pre-tokenization operation it applies. Some tokenizers just split on spaces, so they will consider this as one word. Others use punctuation on top of spaces, so will consider it two words.
The notion of what a word is complicated. For instance, does "I'll" (a contraction of "I will") count as one or two words? It actually depends on the tokenizer and the pre-tokenization operation it applies. Some tokenizers just split on spaces, so they will consider this as one word. Others use punctuation on top of spaces, so will consider it two words.

✏️ **Try it out!** Create a tokenizer from the `bert-base-cased` and `roberta-base` checkpoints and tokenize "81s" with them. What do you observe? What are the word IDs?

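The space-versus-punctuation pre-tokenization difference mentioned in the tip can be sketched without any tokenizer library. The regex here is an illustrative stand-in, not what any particular tokenizer actually does:

```python
import re

text = "I'll"
space_split = text.split()                       # whitespace-only pre-tokenization
punct_split = re.findall(r"\w+|[^\w\s]", text)   # punctuation becomes its own piece
print(space_split)  # ["I'll"] -> one word
print(punct_split)  # ['I', "'", 'll'] -> several pieces
```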
4 changes: 2 additions & 2 deletions chapters/en/chapter7/3.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ This process of fine-tuning a pretrained language model on in-domain data is usu

By the end of this section you'll have a [masked language model](https://huggingface.co/huggingface-course/distilbert-base-uncased-finetuned-imdb?text=This+is+a+great+%5BMASK%5D.) on the Hub that can autocomplete sentences as shown below:

<iframe src="https://course-demos-distilbert-base-uncased-finetune-7400b54.hf.space" frameBorder="0" height="300" title="Gradio app" class="block dark:hidden container p-0 flex-grow space-iframe" allow="accelerometer; ambient-light-sensor; autoplay; battery; camera; document-domain; encrypted-media; fullscreen; geolocation; gyroscope; layout-animations; legacy-image-formats; magnetometer; microphone; midi; oversized-images; payment; picture-in-picture; publickey-credentials-get; sync-xhr; usb; vr ; wake-lock; xr-spatial-tracking" sandbox="allow-forms allow-modals allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-downloads"></iframe>
<iframe src="https://course-demos-distilbert-base-uncased-finetuned-imdb.hf.space" frameBorder="0" height="300" title="Gradio app" class="block dark:hidden container p-0 flex-grow space-iframe" allow="accelerometer; ambient-light-sensor; autoplay; battery; camera; document-domain; encrypted-media; fullscreen; geolocation; gyroscope; layout-animations; legacy-image-formats; magnetometer; microphone; midi; oversized-images; payment; picture-in-picture; publickey-credentials-get; sync-xhr; usb; vr ; wake-lock; xr-spatial-tracking" sandbox="allow-forms allow-modals allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-downloads"></iframe>

Let's dive in!

@@ -1035,7 +1035,7 @@ Neat -- our model has clearly adapted its weights to predict words that are more

<Youtube id="0Oxphw4Q9fo"/>

This wraps up our first experiment with training a language model. In [section 6](/course/chapter7/section6) you'll learn how to train an auto-regressive model like GPT-2 from scratch; head over there if you'd like to see how you can pretrain your very own Transformer model!
This wraps up our first experiment with training a language model. In [section 6](/course/en/chapter7/section6) you'll learn how to train an auto-regressive model like GPT-2 from scratch; head over there if you'd like to see how you can pretrain your very own Transformer model!

<Tip>

8 changes: 1 addition & 7 deletions chapters/en/chapter9/5.mdx
@@ -23,19 +23,13 @@ import gradio as gr
title = "GPT-J-6B"
description = "Gradio Demo for GPT-J 6B, a transformer model trained using Ben Wang's Mesh Transformer JAX. 'GPT-J' refers to the class of model, while '6B' represents the number of trainable parameters. To use it, simply add your text, or click one of the examples to load them. Read more at the links below."
article = "<p style='text-align: center'><a href='https://github.com/kingoflolz/mesh-transformer-jax' target='_blank'>GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model</a></p>"
examples = [
["The tower is 324 metres (1,063 ft) tall,"],
["The Moon's orbit around Earth has"],
["The smooth Borealis basin in the Northern Hemisphere covers 40%"],
]

gr.Interface.load(
"huggingface/EleutherAI/gpt-j-6B",
inputs=gr.Textbox(lines=5, label="Input Text"),
title=title,
description=description,
article=article,
examples=examples,
enable_queue=True,
).launch()
```

2 changes: 1 addition & 1 deletion chapters/en/chapter9/7.mdx
@@ -231,6 +231,6 @@ with gr.Blocks() as block:
block.launch()
```

<iframe src="https://course-demos-blocks-update-component-properti-833c723.hf.space" frameBorder="0" height="300" title="Gradio app" class="container p-0 flex-grow space-iframe" allow="accelerometer; ambient-light-sensor; autoplay; battery; camera; document-domain; encrypted-media; fullscreen; geolocation; gyroscope; layout-animations; legacy-image-formats; magnetometer; microphone; midi; oversized-images; payment; picture-in-picture; publickey-credentials-get; sync-xhr; usb; vr ; wake-lock; xr-spatial-tracking" sandbox="allow-forms allow-modals allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-downloads"></iframe>
<iframe src="https://course-demos-blocks-update-component-properties.hf.space" frameBorder="0" height="300" title="Gradio app" class="container p-0 flex-grow space-iframe" allow="accelerometer; ambient-light-sensor; autoplay; battery; camera; document-domain; encrypted-media; fullscreen; geolocation; gyroscope; layout-animations; legacy-image-formats; magnetometer; microphone; midi; oversized-images; payment; picture-in-picture; publickey-credentials-get; sync-xhr; usb; vr ; wake-lock; xr-spatial-tracking" sandbox="allow-forms allow-modals allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-downloads"></iframe>

We just explored all the core concepts of `Blocks`! Just like with `Interfaces`, you can create cool demos that can be shared by using `share=True` in the `launch()` method or deployed on [Hugging Face Spaces](https://huggingface.co/spaces).
2 changes: 1 addition & 1 deletion chapters/en/events/3.mdx
@@ -6,4 +6,4 @@ You can find all the demos that the community created under the [`Gradio-Blocks`

**Natural language to SQL**

<iframe src="https://huggingface.co/spaces/Gradio-Blocks/Words_To_SQL/+" frameBorder="0" height="640" title="Gradio app" class="container p-0 flex-grow space-iframe" allow="accelerometer; ambient-light-sensor; autoplay; battery; camera; document-domain; encrypted-media; fullscreen; geolocation; gyroscope; layout-animations; legacy-image-formats; magnetometer; microphone; midi; oversized-images; payment; picture-in-picture; publickey-credentials-get; sync-xhr; usb; vr ; wake-lock; xr-spatial-tracking" sandbox="allow-forms allow-modals allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-downloads"></iframe>
<iframe src="https://huggingface.co/spaces/Curranj/Words_To_SQL" frameBorder="0" height="640" title="Gradio app" class="container p-0 flex-grow space-iframe" allow="accelerometer; ambient-light-sensor; autoplay; battery; camera; document-domain; encrypted-media; fullscreen; geolocation; gyroscope; layout-animations; legacy-image-formats; magnetometer; microphone; midi; oversized-images; payment; picture-in-picture; publickey-credentials-get; sync-xhr; usb; vr ; wake-lock; xr-spatial-tracking" sandbox="allow-forms allow-modals allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-downloads"></iframe>
2 changes: 1 addition & 1 deletion chapters/it/chapter2/2.mdx
@@ -257,7 +257,7 @@ outputs = model(inputs)
```
{/if}

Ora, se osserviamo la forma dei nostri input, la dimensionalità sarà molto più bassa: la model head prende in input i vettori ad alta dimensionalità che abbiamo visto prima e produce vettori contenenti due valori (uno per etichetta):
Ora, se osserviamo la forma dei nostri output, la dimensionalità sarà molto più bassa: la model head prende in input i vettori ad alta dimensionalità che abbiamo visto prima e produce vettori contenenti due valori (uno per etichetta):

```python
print(outputs.logits.shape)
10 changes: 5 additions & 5 deletions chapters/ko/chapter2/8.mdx
@@ -88,16 +88,16 @@
<Question
choices={[
{
text: "A component of the base Transformer network that redirects tensors to their correct layers",
text: "기본 Transformer 네트워크의 요소로, 텐서를 적합한 레이어로 리디렉션합니다.",
explain: "오답입니다! 그런 요소는 없습니다."
},
{
text: "Also known as the self-attention mechanism, it adapts the representation of a token according to the other tokens of the sequence",
explain: "Incorrect! The self-attention layer does contain attention \"heads,\" but these are not adaptation heads."
text: "셀프 어텐션 메커니즘이라고도 부르며, 시퀀스 내 다른 토큰에 따라 토큰의 표현을 조정합니다.",
explain: "오답입니다! 셀프 어텐션 레이어는 어텐션 \"헤드,\"를 포함하고 있지만 어텐션 헤드가 적응 헤드는 아닙니다."
},
{
text: "An additional component, usually made up of one or a few layers, to convert the transformer predictions to a task-specific output",
explain: "That's right. Adaptation heads, also known simply as heads, come up in different forms: language modeling heads, question answering heads, sequence classification heads... ",
text: "하나 또는 여러 개의 레이어로 이루어진 추가적인 요소로 트랜스포머의 예측 결과를 task-specific한 출력으로 변환합니다.",
explain: "정답입니다. 헤드라고 알려진 적응 헤드는 언어 모델링 헤드, 질의 응답 헤드, 순차 분류 헤드 등과 같이 다양한 형태로 나타납니다.",
correct: true
}
]}
47 changes: 47 additions & 0 deletions chapters/ru/TRANSLATING.txt
@@ -0,0 +1,47 @@
1. We use the formal "you" (i.e. "вы" instead of "ты") to keep the neutral tone.
However, don't make the text too formal to keep it more engaging.

2. Don't translate industry-accepted acronyms. e.g. TPU or GPU.

3. The Russian language accepts English words especially in modern contexts more than
many other languages (i.e. Anglicisms). Check for the correct usage of terms in
computer science and commonly used terms in other publications.

4. Russian word order is often different from English. If after translating a sentence
it sounds unnatural try to change the word or clause order to make it more natural.

5. Beware of "false friends" in Russian and English translations. Translators are trained
for years to specifically avoid false English friends and avoid anglicised translations.
e.g. "точность" is "accuracy", but "carefulness" is "аккуратность". For more examples refer to:
http://falsefriends.ru/ffslovar.htm

6. Keep voice active and consistent. Don't overdo it but try to avoid a passive voice.

7. Refer and contribute to the glossary frequently to stay on top of the latest
choices we make. This minimizes the amount of editing that is required.

8. Keep POV consistent.

9. Smaller sentences are better sentences. Apply with nuance.

10. If translating a technical word, keep the choice of Russian translation consistent.
This does not apply for non-technical choices, as in those cases variety actually
helps keep the text engaging.

11. This is merely a translation. Don't add any technical/contextual information
not present in the original text. Also don't leave stuff out. The creative
choices in composing this information were the original authors' to make.
Our creative choices are in doing a quality translation.

12. Be exact when choosing equivalents for technical words. Package is package.
Library is library. Don't mix and match. Also, since both "batch" and "package"
can be translated as "пакет", use "батч" for "batch" and "пакет" for "package" to
avoid ambiguity.

13. Library names are kept in the original forms, e.g. "🤗 Datasets", however,
the word dataset in a sentence gets a translation to "датасет".

14. As a style choice prefer the imperative over constructions with auxiliary words
to avoid unnecessary verbosity and addressing of the reader, which seems
unnatural in Russian. e.g. "см. главу X" - "See chapter X" instead of
"Вы можете найти это в главе X" - "You can see this in chapter X".

Commit ae1b02d