Error #7

Open
pjmaughan43 opened this issue Nov 17, 2018 · 1 comment


@pjmaughan43

I'm using the Anaconda installation of HG-CoLoR but am running into the following error, and I'm hoping you can point me in the right direction.

Here's the slurm output:
[Fri Nov 16 15:45:54 MST 2018] Correcting the short reads
[Fri Nov 16 18:28:37 MST 2018] Removing short reads containing weak k-mers
[Fri Nov 16 20:52:45 MST 2018] Building the graph
[Fri Nov 16 23:12:53 MST 2018] Aligning the short reads on the long reads
[Fri Nov 16 23:18:18 MST 2018] Removing short alignments
Traceback (most recent call last):
  File "/fslgroup/fslg_pws_module/compute/software/.conda/envs/hg-color_v1.0.0/bin/filterOutShortAlignments.py", line 36, in <module>
    out.write(finalString)
NameError: name 'out' is not defined
[Fri Nov 16 23:18:18 MST 2018] Generating the corrected long reads
[Fri Nov 16 23:18:19 MST 2018] Removing temporary files
[Fri Nov 16 23:19:11 MST 2018] Exiting

Here is the HG-CoLoR.stdout:

----- QuorUM -----
----- revseq -----
----- KMC -----
1st stage: 174.623s
2nd stage: 942.759s
Total : 1117.38s
Tmp size : 81150MB

Stats:
No. of k-mers below min. threshold : 635141104
No. of k-mers above max. threshold : 0
No. of unique k-mers : 2185558665
No. of unique counted k-mers : 1550417561
Total no. of k-mers : 87948736520
Total no. of reads : 542495294
Total no. of super-k-mers : 3458687932
----- KMC_tools -----
----- KMC_dump -----
----- PgSAgen_hgcolor -----
Reading reads set
reads count: 1452761050
all reads length: 92976707200
reads length is constant
maxReadLength: 64
symbolsCount: 4
symbols: ACGT

Found 0 duplicates.
Start overlapping.
6820830 reads left after 63 overlap
6080059 reads left after 62 overlap
5530567 reads left after 61 overlap
5111566 reads left after 60 overlap
4764051 reads left after 59 overlap
4458165 reads left after 58 overlap
4199852 reads left after 57 overlap
3972326 reads left after 56 overlap
3763990 reads left after 55 overlap
3584244 reads left after 54 overlap
3420892 reads left after 53 overlap
3267275 reads left after 52 overlap
3131134 reads left after 51 overlap
3004685 reads left after 50 overlap
2884016 reads left after 49 overlap
2775758 reads left after 48 overlap
2672523 reads left after 47 overlap
2574105 reads left after 46 overlap
2482159 reads left after 45 overlap
2396981 reads left after 44 overlap
2313961 reads left after 43 overlap
2236406 reads left after 42 overlap
2163594 reads left after 41 overlap
2090936 reads left after 40 overlap
2024386 reads left after 39 overlap
1961520 reads left after 38 overlap
1899004 reads left after 37 overlap
1840888 reads left after 36 overlap
1786350 reads left after 35 overlap
1731747 reads left after 34 overlap
1680180 reads left after 33 overlap
1631172 reads left after 32 overlap
1582092 reads left after 31 overlap
1535704 reads left after 30 overlap
1490125 reads left after 29 overlap
1445786 reads left after 28 overlap
1403406 reads left after 27 overlap
1361982 reads left after 26 overlap
1321812 reads left after 25 overlap
1281887 reads left after 24 overlap
1242301 reads left after 23 overlap
1205276 reads left after 22 overlap
1170604 reads left after 21 overlap
1136651 reads left after 20 overlap
1103473 reads left after 19 overlap
1071540 reads left after 18 overlap
1040447 reads left after 17 overlap
1008606 reads left after 16 overlap
975248 reads left after 15 overlap
937730 reads left after 14 overlap
889391 reads left after 13 overlap
816493 reads left after 12 overlap
701226 reads left after 11 overlap
538223 reads left after 10 overlap
372007 reads left after 9 overlap
244437 reads left after 8 overlap
159908 reads left after 7 overlap
103350 reads left after 6 overlap
66897 reads left after 5 overlap
41576 reads left after 4 overlap
28021 reads left after 3 overlap
18225 reads left after 2 overlap
14822 reads left after 1 overlap
14822 pseudo-genome components
Overlapping done in 954060 msec

1579325618 bytes after overlapping
14822 pseudo-genome components
0 single reads
Pseudogenome assembled in 1193560 msec

Found 62521749 reads containing duplicate 11-mers in 16510 msec!
SA creation start.
Written 2147483644 bytes
Written 2147483644 bytes
Written 2022335952 bytes
SAIS generation time 615460 msec!
SA generation time 1112850 msec!
4194305 elements in SA lookup
SA LUT generation time 23110 msec!
Written 1579325682 bytes
Written 13074849459 bytes
Written 7896628100 bytes
Written 16777220 bytes

Thanks in advance.

@morispi
Owner

morispi commented Jan 10, 2019

Hi,

Sorry for taking so long to answer.

From what I can see, the problem comes from the script that filters out alignments of the short reads (SR) to the long reads (LR) that are too short.
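A `NameError` like the one in the traceback generally means the name was never bound before the line that uses it, for instance when an output handle is only created on a code path that did not run. The contents of `filterOutShortAlignments.py` are not shown in this thread, so the following is only a hypothetical sketch of a length filter in which the output handle is always defined. The function name `filter_short_alignments`, the `min_len` parameter, and the use of the SAM `SEQ` field length as a stand-in for alignment length are all assumptions, not the script's actual logic.

```python
import io


def filter_short_alignments(sam_in, out, min_len):
    """Copy SAM header lines and records whose SEQ has at least min_len bases.

    `out` is passed in by the caller, so it is always bound before any write.
    Returns the number of alignment records kept.
    """
    kept = 0
    for line in sam_in:
        if line.startswith("@"):       # header line: copy through unchanged
            out.write(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:           # SAM records have 11 mandatory fields
            continue
        if len(fields[9]) >= min_len:  # field 10 (index 9) is SEQ
            out.write(line)
            kept += 1
    return kept


# Tiny in-memory demo with made-up records (not data from this run).
demo_sam = io.StringIO(
    "@HD\tVN:1.0\n"
    "sr1\t0\tlr1\t1\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\n"
    "sr2\t0\tlr1\t5\t60\t3M\t*\t0\t0\tACG\tIII\n"
)
demo_out = io.StringIO()
demo_kept = filter_short_alignments(demo_sam, demo_out, min_len=5)
```

Passing the handle in as an argument (or opening it unconditionally before the loop) keeps the write from ever running against an unbound name, whatever the input looks like.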

There might be various solutions to your issue, so if you still have the temporary files of your run, could you please send me the following:

  • The code of the bin/filterOutShortAlignments.py script (I don't maintain the Anaconda version myself, so it might be outdated)
  • The first few lines of the SR/LR alignments file, SR_on_LR.sam, which should be in the tmp directory you chose (HG_pid if you didn't choose one)
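If it helps, the first few lines of a large alignments file can be pulled out without loading the whole file. This is a minimal sketch; the helper name `head_lines` is hypothetical, and the path in the usage comment is a placeholder that depends on the tmp directory of your run.

```python
from itertools import islice


def head_lines(path, n=5):
    """Return the first n lines of a text file without reading the whole file."""
    with open(path, "r") as handle:
        return list(islice(handle, n))


# Example usage (placeholder path, adjust to your own tmp directory):
# print("".join(head_lines("HG_pid/SR_on_LR.sam", n=5)))
```

`islice` stops after `n` lines, so this stays fast even if the SAM file is many gigabytes.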

Pierre
