Model principles #12
Comments
Could you share the training data, or at least a description of it? |
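For the underlying principles, see the following papers:
The training data comes from open-source datasets plus data I crawled myself, and it cannot be fully released because of copyright and privacy concerns. You can search for the open data resources published with related papers. |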
Thank you very much. Could you tell me the training data format for the tag task? ☺
---- Original message ----
| From | ***@***.***> |
| Date | October 18, 2023, 20:23 |
| To | ***@***.***> |
| Cc | ***@***.***>***@***.***> |
| Subject | Re: [yaoxiaoyuan/mimix] Model principles (Issue #12) |
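For the underlying principles, see the following papers:
Attention Is All You Need
Language Models are Unsupervised Multitask Learners
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
The training data comes from open-source datasets plus data I crawled myself, and it cannot be fully released because of copyright and privacy concerns. You can search for the open data resources published with related papers.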
|
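Any format works, as long as the inputs and outputs are sensible and you write the code that parses the data. See the example example_train_seq2seq.py: the data it uses has one JSON object per sample, with the input in the src field and the output in the trg field. |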
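To make the format above concrete, here is a minimal sketch of writing and reading such a file. It assumes "one JSON per sample" means one JSON object per line. The helper names `write_demo_file` and `read_pairs`, the file name `demo.jsonl`, and the sample sentences are made up for illustration; this is not the loader that mimix or example_train_seq2seq.py actually uses.

```python
import json
import os
import tempfile


def write_demo_file(path):
    """Write two made-up samples, one JSON object per line (assumed layout)."""
    samples = [
        {"src": "hello world", "trg": "bonjour le monde"},
        {"src": "good morning", "trg": "bonjour"},
    ]
    with open(path, "w", encoding="utf-8") as f:
        for sample in samples:
            # ensure_ascii=False keeps non-ASCII text (e.g. Chinese) readable.
            f.write(json.dumps(sample, ensure_ascii=False) + "\n")


def read_pairs(path):
    """Yield (src, trg) tuples from a file with one JSON object per line."""
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            if "src" not in record or "trg" not in record:
                raise ValueError(f"line {line_no}: missing 'src' or 'trg'")
            yield record["src"], record["trg"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        demo_path = os.path.join(tmp_dir, "demo.jsonl")  # hypothetical file name
        write_demo_file(demo_path)
        for src, trg in read_pairs(demo_path):
            print(src, "->", trg)
```

For a different task only the contents of src and trg would change. What goes into them for a tagging task is up to your own parsing code, as the reply says.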
OK, thanks. |
Hello, if you have time, could you add diagrams or descriptions of how each module works? Thanks!