Effect of "Assembly" from pooled bins #85

Open
Somebodyatthdoor opened this issue Sep 13, 2022 · 2 comments

@Somebodyatthdoor

Hi,

I have a slightly unusual problem that stems from a collaborator's misunderstanding of how das_tool works. Could you give your opinion on what effect their method might have had on the results? We have run several analyses on the bins output by das_tool, and it would be a shame to discard them if the effect of the mistake is minimal. If there could be a negative effect, though, we would rather know.

Method:

  1. Bins were created from multiple samples using four different pipelines.
  2. These bins were checked for quality using CheckM, and any bin with >5% contamination or <80% completeness was discarded.
  3. The remaining bins, which originated from multiple assemblies, were all used as input for das_tool. Instead of an assembly, the fasta file passed to option -c was a concatenation of all of the bin fasta files.
  4. The das_tool output bins were then dereplicated using dRep.
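
For reference, the filtering rule in step 2 can be sketched as below. This is only an illustration of the thresholds described above, not the collaborator's actual script; the `BinQuality` record, the `passes_filter` function, and the bin names are hypothetical, and the values are assumed to be percentages as reported by the quality check.

```python
from dataclasses import dataclass


@dataclass
class BinQuality:
    """Hypothetical record holding quality estimates for one bin (in percent)."""
    name: str
    completeness: float
    contamination: float


def passes_filter(b: BinQuality,
                  min_completeness: float = 80.0,
                  max_contamination: float = 5.0) -> bool:
    """Keep a bin unless it has >5% contamination or <80% completeness."""
    return (b.completeness >= min_completeness
            and b.contamination <= max_contamination)


# Made-up example values, purely for illustration.
bins = [
    BinQuality("bin_a", completeness=95.2, contamination=1.1),
    BinQuality("bin_b", completeness=78.0, contamination=0.5),
    BinQuality("bin_c", completeness=90.0, contamination=7.3),
]
kept = [b.name for b in bins if passes_filter(b)]
```

Only `bin_a` survives here: `bin_b` falls below the completeness threshold and `bin_c` exceeds the contamination threshold.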

I am aware that this is not the usual way of running das_tool, which is designed to take an assembly as the input fasta. However, I can't work out from the documentation whether doing this would cause any actual harm. We did see an improvement in the bins after running das_tool (see image below).

image

Thanks for your help,
Laura

@cmks
Owner

cmks commented Sep 15, 2022

Hi Laura,

Based on what you've described, I don't see a big issue with your approach. You might have gotten more high-quality bins if you had skipped the filtering in step 2 and filtered only at the end, during your dereplication step. DAS Tool works better on the full set of bins and can 'decontaminate' bins in certain cases. Step 3 is not a problem, because DAS Tool can implicitly handle multiple assemblies, as long as all contigs/bins have unique identifiers across assemblies/binning-pipelines.
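
One way to check the unique-identifier condition on a pooled fasta is sketched below. It is not part of DAS Tool and makes its own assumptions: the helper names `read_fasta` and `conflicting_ids` are hypothetical, the contig ID is taken to be the first whitespace-delimited token of the header, and an ID that repeats with an identical sequence is treated as the same contig binned by more than one pipeline rather than as a collision. Only IDs attached to different sequences are reported.

```python
import hashlib
import io
from collections import defaultdict


def read_fasta(handle):
    """Yield (contig_id, sequence) pairs from an open fasta file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            # Assumption: the ID is the first token of the header line.
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def conflicting_ids(handle):
    """Return sorted contig IDs that occur with more than one distinct sequence."""
    digests = defaultdict(set)
    for name, seq in read_fasta(handle):
        digests[name].add(hashlib.md5(seq.upper().encode()).hexdigest())
    return sorted(name for name, seen in digests.items() if len(seen) > 1)


# Tiny in-memory example; for a real file, pass open("pooled_bins.fasta")
# instead (the file name is a placeholder).
example = io.StringIO(
    ">contig_1 from pipeline A\nACGT\nAC\n"
    ">contig_2\nGGG\n"
    ">contig_1 from pipeline B\nACGTAC\n"
    ">contig_2\nTTT\n"
)
conflicts = conflicting_ids(example)
```

In this example `contig_1` appears twice with the same sequence and is not flagged, while `contig_2` carries two different sequences and is reported. An empty result suggests that repeated IDs, if any, refer to the same contig.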

I hope this is helpful.

Cheers,
Christian

@Somebodyatthdoor
Author

Somebodyatthdoor commented Sep 15, 2022 via email
