Model principles #12

Open
shyzzz521 opened this issue Oct 18, 2023 · 5 comments

Comments

@shyzzz521

Hello, if you have time, could you add schematic diagrams or written descriptions of how each module works? Thanks!

@shyzzz521
Author

Could you also share the training data, or at least a description of it?

@yaoxiaoyuan
Owner

For the underlying principles, see the following papers:

  1. Attention Is All You Need
  2. Language Models are Unsupervised Multitask Learners
  3. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

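The training data comes from a mix of open-source datasets and data I crawled myself. Because of copyright and privacy concerns, I cannot release it in full. You can search for the open datasets released alongside related papers.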

@shyzzz521
Author

shyzzz521 commented Oct 18, 2023 via email

@yaoxiaoyuan
Owner

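Any format works, as long as the inputs and outputs are sensible and you write the code to parse the data. See the example example_train_seq2seq.py: the data format used there is one JSON object per record, with the input in the src field and the output in the trg field.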
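
To illustrate the format described above, here is a minimal sketch of reading such records. It assumes one JSON object per line (a JSON Lines layout), which the thread does not state explicitly. The function name read_seq2seq_pairs and the sample sentences are hypothetical and are not taken from example_train_seq2seq.py; only the src and trg field names come from the comment above.

```python
import json
from typing import Iterable, Iterator, Tuple


def read_seq2seq_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (src, trg) pairs from JSON strings, one record per line.

    Hypothetical helper for illustration; not the repository's own loader.
    """
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        if "src" not in record or "trg" not in record:
            raise ValueError(f"line {line_no}: expected 'src' and 'trg' fields")
        yield record["src"], record["trg"]


# Made-up sample records standing in for the lines of a training file.
sample = [
    json.dumps({"src": "hello", "trg": "bonjour"}),
    "",
    json.dumps({"src": "thanks", "trg": "merci"}),
]
pairs = list(read_seq2seq_pairs(sample))
print(pairs)
```

With a real file, the same function could be called on an open file handle, for example read_seq2seq_pairs(open(path, encoding="utf-8")), where path is whatever file you prepared.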

@shyzzz521
Author

OK, thanks.
