Total CDR3 sequences of TCRdb #1

Open
BioLaoXu opened this issue Oct 26, 2020 · 1 comment

Comments

@BioLaoXu

TCRdb is a very powerful CDR3 database, with nearly 300 million CDR3 sequences. For our research, we need the data set for machine learning training, so we counted the records available in the database (via POST requests and crawling), but found only a little over 40 million. This seems odd. Is some of the data still unpublished? Thanks.

@aqzas
Contributor

aqzas commented Oct 26, 2020

Hi, @xf78
Thanks for using TCRdb. The database does contain 300 million sequences, and all the analyses and statistics are based on them. However, considering the server load, we did not make all the sequences available for download (only the top 1000 for each sample).
