Codec downstream task support: TTS #5763
This pull request is now in conflict :(
Thanks for the great work! I've left some comments in the PR. At the current stage, the most important thing seems to be deciding how the JSON format for multi-modal data will be organized. Could you give us some examples and explain the design concept?
# previously can become incompatible. New tokens can be
# added - there are enough slots

special_tokens = [
Do we expect users to add their own tokens?
@ftshijt Thanks for the review. The code is very much a work in progress at the moment. I will respond to and clarify many of the concerns once the framework is complete. For now I have only responded to a small part of the comments :)
for more information, see https://pre-commit.ci
Many thanks for the implementation. I left a few minor comments; once they are fixed, we can merge the current effort~
import logging
import os
import sys
import json

logging.basicConfig(
    format="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
    level=os.environ.get("LOGLEVEL", "INFO").upper(),
    stream=sys.stdout,
)
logger = logging.getLogger("find all example list based on train_jsons")
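For context, the logging setup above reads the log level from the `LOGLEVEL` environment variable and falls back to `INFO`. The following is a minimal sketch of that behavior, not code from this PR. The helper names `resolve_level` and `configure_logging` and the logger name `example` are hypothetical. `force=True` (Python 3.8+) is an addition here so the configuration can be re-applied.

```python
import logging
import os
import sys


def resolve_level(default="INFO"):
    """Return the level name from LOGLEVEL (hypothetical helper), upper-cased."""
    return os.environ.get("LOGLEVEL", default).upper()


def configure_logging():
    """Configure the root logger the same way as the snippet under review."""
    logging.basicConfig(
        format="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
        level=resolve_level(),  # basicConfig accepts level names as strings
        stream=sys.stdout,
        force=True,  # assumption: replace any handlers installed earlier
    )
    return logging.getLogger("example")
```

Running a script with `LOGLEVEL=debug` set would then enable debug messages without any code change, since the value is upper-cased before being passed to `basicConfig`.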
- import logging
- import os
- import sys
- import json
- logging.basicConfig(
-     format="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
-     datefmt="%Y-%m-%d %H:%M:%S",
-     level=os.environ.get("LOGLEVEL", "INFO").upper(),
-     stream=sys.stdout,
- )
- logger = logging.getLogger("find all example list based on train_jsons")
+ import os
+ import sys
+ import json
What?
Support a downstream task for the codec project, specifically TTS.
Model architecture: VALL-E (https://arxiv.org/abs/2301.02111)