UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 30 #105

Open
sckinta opened this issue Apr 23, 2018 · 5 comments

sckinta commented Apr 23, 2018

Hi,

I have run the pipeline successfully before, but recently I started getting an error at the ataqc step on all my runs. No QC summary (json/html) file is created in the qc directory. I am wondering whether anyone else has the same issue.

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 30: ordinal not in range(128)

Fatal error: /mnt/isilon/sfgi/programs/atac_dnase_pipelines/atac.bds, line 1612, pos 2. Task/s failed.

Below is server information that may help with debugging.


$ uname -a
Linux l-0-01 3.10.0-514.2.2.el7.x86_64 #1 SMP Tue Dec 6 23:06:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ conda env list
# conda environments:
#
bds_atac                 /mnt/isilon/sfgi/programs/miniconda3/envs/bds_atac
bds_atac_py3             /mnt/isilon/sfgi/programs/miniconda3/envs/bds_atac_py3
root                  *  /mnt/isilon/sfgi/programs/miniconda3

$ source activate bds_atac
(bds_atac) 

$ which conda
/mnt/isilon/sfgi/programs/miniconda3/envs/bds_atac/bin/conda

$ conda list
# packages in environment at /mnt/isilon/sfgi/programs/miniconda3/envs/bds_atac:
#
argcomplete               1.0.0                    py27_1  
argh                      0.26.2                   py27_0    bioconda
bcftools                  1.4                           0    bioconda
bedtools                  2.26.0                        0    bioconda
bioconductor-biocgenerics 0.18.0                 r3.2.2_0    bioconda
bioconductor-biocparallel 1.4.3                  r3.2.2_0    bioconda
bioconductor-biostrings   2.38.4                        0    bioconda
bioconductor-genomeinfodb 1.6.3                         0    bioconda
bioconductor-genomicranges 1.22.4                        0    bioconda
bioconductor-iranges      2.4.8                         0    bioconda
bioconductor-rsamtools    1.22.0                 r3.2.2_1    bioconda
bioconductor-s4vectors    0.8.11                        0    bioconda
bioconductor-xvector      0.10.0                        1    bioconda
bioconductor-zlibbioc     1.16.0                 r3.2.2_1    bioconda
biopython                 1.67                np110py27_0  
boost                     1.57.0                        4  
bowtie                    1.1.2                    py27_2    bioconda
bowtie2                   2.2.6                    py27_0    bioconda
bx-python                 0.7.3               np110py27_1    bioconda
bzip2                     1.0.6                         3  
cairo                     1.14.8                        0  
curl                      7.52.1                        0  
cutadapt                  1.9.1                    py27_0    bioconda
cycler                    0.10.0                   py27_0  
cython                    0.25.2                   py27_0  
expat                     2.1.0                         0  
fastqc                    0.11.5                        1    bioconda
fisher                    0.1.4                    py27_0    bioconda
fontconfig                2.12.1                        3  
freetype                  2.5.5                         2  
gffutils                  0.8.7.1                  py27_1    bioconda
ghostscript               9.16                          0    asmeurer
glib                      2.43.0                        2    asmeurer
gnuplot                   4.6.0                         1    bioconda
graphviz                  2.38.0                        4    anaconda
gsl                       1.16                          1    asmeurer
harfbuzz                  0.9.35                        6    asmeurer
htslib                    1.4                           0    bioconda
icu                       54.1                          0  
java-jdk                  8.0.92                        1    bioconda
jbig                      2.1                           0  
jinja2                    2.9.6                    py27_0  
jpeg                      9b                            0  
libffi                    3.0.13                        3    asmeurer
libgcc                    4.8.5                         1    asmeurer
libgfortran               3.0.0                         1  
libiconv                  1.14                          0  
libpng                    1.6.27                        0  
libtiff                   4.0.6                         3  
libtool                   2.4.2                         0    asmeurer
libxml2                   2.9.4                         0  
macs2                     2.1.0                         0    bioconda
markupsafe                0.23                     py27_2  
matplotlib                1.5.1               np110py27_0  
metaseq                   0.5.6                    py27_0    bioconda
mkl                       11.3.3                        0  
mysql                     5.5.24                        0  
ncurses                   5.9                           5    asmeurer
nose                      1.3.7                    py27_1  
numpy                     1.10.2                   py27_0  
openblas                  0.2.14                        4  
openssl                   1.0.2k                        1  
pandas                    0.18.0              np110py27_0  
pango                     1.36.8                        3    asmeurer
pcre                      8.39                          1  
perl-threaded             5.22.0                       10    bioconda
picard                    1.126                         4    bioconda
pigz                      2.3                           0  
pip                       9.0.1                    py27_1  
pixman                    0.34.0                        0  
preseq                    2.0.2                 gsl1.16_0    bioconda
pybedtools                0.6.9                    py27_0    bcbio
pycairo                   1.10.0                   py27_0  
pyfaidx                   0.4.7.1                  py27_0    bioconda
pyparsing                 2.1.4                    py27_0  
pyqt                      4.10.4                   py27_0    asmeurer
pysam                     0.8.2.1                  py27_0    bcbio
python                    2.7.13                        0  
python-dateutil           2.2                      py27_0    asmeurer
python-levenshtein        0.12.0                   py27_1    bioconda
pytz                      2017.2                   py27_0  
pyyaml                    3.12                     py27_0  
qt                        4.8.5                         0    asmeurer
r                         3.2.2                         0    asmeurer
r-base                    3.2.2                         0    asmeurer
r-bitops                  1.0_6                  r3.2.2_1    asmeurer
r-boot                    1.3_17                 r3.2.2_0    asmeurer
r-catools                 1.17.1                 r3.2.2_2    asmeurer
r-class                   7.3_14                 r3.2.2_0    asmeurer
r-cluster                 2.0.3                  r3.2.2_0    asmeurer
r-codetools               0.2_14                 r3.2.2_0    asmeurer
r-foreign                 0.8_66                 r3.2.2_0    asmeurer
r-futile.logger           1.4.1                  r3.2.2_0    bioconda
r-futile.options          1.0.0                  r3.2.2_0    bioconda
r-kernsmooth              2.23_15                r3.2.2_0    asmeurer
r-lambda.r                1.1.7                  r3.2.2_0    bioconda
r-lattice                 0.20_33                r3.2.2_0    asmeurer
r-mass                    7.3_44                 r3.2.2_0    asmeurer
r-matrix                  1.2_2                  r3.2.2_0    asmeurer
r-mgcv                    1.8_7                  r3.2.2_0    asmeurer
r-nlme                    3.1_122                r3.2.2_0    asmeurer
r-nnet                    7.3_11                 r3.2.2_0    asmeurer
r-recommended             3.2.2                  r3.2.2_0    asmeurer
r-rpart                   4.1_10                 r3.2.2_0    asmeurer
r-snow                    0.4_1                  r3.2.2_0    bioconda
r-snowfall                1.84_6.1               r3.2.2_0    bioconda
r-spatial                 7.3_11                 r3.2.2_0    asmeurer
r-spp                     1.13                   r3.2.2_0    bioconda
r-survival                2.38_3                 r3.2.2_0    asmeurer
readline                  6.2                           2  
sambamba                  0.6.5                         0    bioconda
samtools                  1.2                           2    bioconda
scikit-learn              0.17.1              np110py27_2  
scipy                     0.17.0              np110py27_4  
setuptools                27.2.0                   py27_0  
simplejson                3.10.0                   py27_0  
sip                       4.15.5                   py27_0    asmeurer
six                       1.10.0                   py27_0  
sqlite                    3.13.0                        0  
system                    5.8                           2  
tk                        8.5.18                        0  
trim-galore               0.4.1                         0    bioconda
ucsc-bedclip              332                           0    bioconda
ucsc-bedgraphtobigwig     323                           0    daler
ucsc-bedtobigbed          323                           0    daler
ucsc-bigwigaverageoverbed 332                           0    bioconda
ucsc-bigwiginfo           332                           0    bioconda
ucsc-fetchchromsizes      323                           0    daler
ucsc-twobittofa           332                           0    bioconda
ucsc-wigtobigwig          323                           0    daler
wheel                     0.29.0                   py27_0  
xz                        5.2.2                         1  
yaml                      0.1.6                         0  
zlib                      1.2.8 

PS: our cluster server has experienced several updates since my last successful run. I do not know what caused the problem here.

Thank you,
Chun

@vervacity
Collaborator

Hi, could you post the tail end (or all) of the BDS log file? (It should end in *.log.) This looks like a Unicode error raised while the ATAQC module is writing the HTML output, but it would help to know exactly which function in the module fails. It is likely due to the server update, but it would be good to fix this for anyone else who runs into it in similar server environments.

@sckinta
Author

sckinta commented Apr 24, 2018

Hi. Thank you for the quick reply.

Here are the last 20 lines of bds.log.

$ tail -n 20 bds.log
		  File "/mnt/isilon/sfgi/programs/atac_dnase_pipelines/ataqc/run_ataqc.py", line 1598, in <module>
		    main()
		  File "/mnt/isilon/sfgi/programs/atac_dnase_pipelines/ataqc/run_ataqc.py", line 1454, in main
		    raw_peak_summ, raw_peak_dist = get_region_size_metrics(PEAKS)
		  File "/mnt/isilon/sfgi/programs/atac_dnase_pipelines/ataqc/run_ataqc.py", line 752, in get_region_size_metrics
		    ax.set_title('Peak width distribution for {0}'.format(filename))
		  File "/mnt/isilon/sfgi/programs/miniconda3/envs/bds_atac/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 172, in set_title
		    title.set_text(label)
		  File "/mnt/isilon/sfgi/programs/miniconda3/envs/bds_atac/lib/python2.7/site-packages/matplotlib/text.py", line 1206, in set_text
		    self._text = '%s' % (s,)
		UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 30: ordinal not in range(128)

Fatal error: /mnt/isilon/sfgi/programs/atac_dnase_pipelines/atac.bds, line 1612, pos 2. Task/s failed.
atac.bds, line 82 :	main()
atac.bds, line 85 :	void main() { // atac pipeline starts here
atac.bds, line 109 :		ataqc()
atac.bds, line 1601 :	void ataqc() {
atac.bds, line 1612 :		wait

Creating checkpoint file: Config or command line option disabled checkpoint file creation, nothing done.

@vervacity
Collaborator

Thanks for the log, that was helpful! The failure occurs where the filename is formatted into the plot title, so I suspect there is a nonstandard (non-ASCII) character in your prefix, which would cause the ASCII/Unicode issue. This is something our code should handle (rather than the user), and I will put in a fix for it, but it will take a few days for me to get to it properly. In the meantime, you are welcome to try changing your input prefix to see whether that resolves the problem; otherwise, stay tuned on this issue for the fix. Thanks for bringing it to our attention!

@sckinta
Author

sckinta commented Apr 25, 2018

Thank you! I think you are right about the nonstandard character: the library is named Naïve rather than naive.
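For illustration, here is a minimal sketch of why a name such as Naïve can trigger this error, assuming the filename reaches the title formatting as a UTF-8 byte string. Python 2 implicitly decodes byte strings as ASCII when they are mixed with unicode strings, and the UTF-8 encoding of "ï" starts with byte 0xc3. The file name and the `to_text` helper below are hypothetical and not part of run_ataqc.py; the snippet is written for Python 3, where the ASCII decode has to be requested explicitly to reproduce the failure.

```python
# -*- coding: utf-8 -*-

def to_text(value, encoding="utf-8"):
    """Return value as text, decoding byte strings explicitly (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Hypothetical peak file name containing a non-ASCII character, as raw UTF-8 bytes.
raw_name = "Naïve_rep1.narrowPeak".encode("utf-8")

# Decoding as ASCII fails on the first byte of the encoded "ï" (0xc3).
try:
    raw_name.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as err:
    ascii_error = str(err)

# Decoding explicitly as UTF-8 before formatting avoids the implicit ASCII step.
title = "Peak width distribution for {0}".format(to_text(raw_name))

print(ascii_error)
print(title)
```

Renaming the library to plain ASCII (naive) sidesteps the problem without touching the code, which is the workaround suggested above.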

But now I have a new issue, which happens with all the libraries I am working on. The error is ValueError: all the input array dimensions except for the concatenation axis must match exactly

It looks like "_nx.concatenate()" in one of the Python files caused the problem. I cannot figure out why I did not see this error in my earlier successful runs. Do you have a way to fix this problem too?

Here is the more detailed standard error from run_atacseq.py

--------------------Stderr--------------------
Picked up _JAVA_OPTIONS: -Xms256M -Xmx45G -XX:ParallelGCThreads=1 -Djava.io.tmpdir=/mnt/isilon/sfgi/suc1/tmp
ERROR	2018-04-24 10:24:41	ProcessExecutor	Warning messages:
ERROR	2018-04-24 10:24:41	ProcessExecutor	1: In arrows(metrics$GC, metrics$NORMALIZED_COVERAGE - metrics$ERROR_BAR_WIDTH,  :
ERROR	2018-04-24 10:24:41	ProcessExecutor	  zero-length arrow is of indeterminate angle and so skipped
ERROR	2018-04-24 10:24:41	ProcessExecutor	2: In arrows(metrics$GC, metrics$NORMALIZED_COVERAGE - metrics$ERROR_BAR_WIDTH,  :
ERROR	2018-04-24 10:24:41	ProcessExecutor	  zero-length arrow is of indeterminate angle and so skipped
ERROR	2018-04-24 10:24:41	ProcessExecutor	3: In arrows(metrics$GC, metrics$NORMALIZED_COVERAGE - metrics$ERROR_BAR_WIDTH,  :
ERROR	2018-04-24 10:24:41	ProcessExecutor	  zero-length arrow is of indeterminate angle and so skipped
ERROR	2018-04-24 10:24:41	ProcessExecutor	4: In arrows(metrics$GC, metrics$NORMALIZED_COVERAGE - metrics$ERROR_BAR_WIDTH,  :
ERROR	2018-04-24 10:24:41	ProcessExecutor	  zero-length arrow is of indeterminate angle and so skipped
ERROR	2018-04-24 10:24:41	ProcessExecutor	5: In arrows(metrics$GC, metrics$NORMALIZED_COVERAGE - metrics$ERROR_BAR_WIDTH,  :
ERROR	2018-04-24 10:24:41	ProcessExecutor	  zero-length arrow is of indeterminate angle and so skipped
ERROR	2018-04-24 10:24:41	ProcessExecutor	6: In arrows(metrics$GC, metrics$NORMALIZED_COVERAGE - metrics$ERROR_BAR_WIDTH,  :
ERROR	2018-04-24 10:24:41	ProcessExecutor	  zero-length arrow is of indeterminate angle and so skipped
Picked up _JAVA_OPTIONS: -Xms256M -Xmx45G -XX:ParallelGCThreads=1 -Djava.io.tmpdir=/mnt/isilon/sfgi/suc1/tmp
[bam_sort_core] merging from 20 files...
[bam_sort_core] merging from 17 files...
Picked up _JAVA_OPTIONS: -Xms256M -Xmx45G -XX:ParallelGCThreads=1 -Djava.io.tmpdir=/mnt/isilon/sfgi/suc1/tmp
Picked up _JAVA_OPTIONS: -Xms256M -Xmx45G -XX:ParallelGCThreads=1 -Djava.io.tmpdir=/mnt/isilon/sfgi/suc1/tmp
processing chromosomes
Traceback (most recent call last):
  File "/mnt/isilon/sfgi/programs/atac_dnase_pipelines/ataqc/run_ataqc.py", line 1598, in <module>
    main()
  File "/mnt/isilon/sfgi/programs/atac_dnase_pipelines/ataqc/run_ataqc.py", line 1460, in main
    ROADMAP_META, OUTPUT_PREFIX)
  File "/mnt/isilon/sfgi/programs/atac_dnase_pipelines/ataqc/run_ataqc.py", line 905, in compare_to_roadmap
    sample_mean0_col)
  File "/mnt/isilon/sfgi/programs/miniconda3/envs/bds_atac/lib/python2.7/site-packages/scipy/stats/stats.py", line 3310, in spearmanr
    rs = np.corrcoef(ar, br, rowvar=axisout)
  File "/mnt/isilon/sfgi/programs/miniconda3/envs/bds_atac/lib/python2.7/site-packages/numpy/lib/function_base.py", line 2145, in corrcoef
    c = cov(x, y, rowvar)
  File "/mnt/isilon/sfgi/programs/miniconda3/envs/bds_atac/lib/python2.7/site-packages/numpy/lib/function_base.py", line 2024, in cov
    X = np.vstack((X, y))
  File "/mnt/isilon/sfgi/programs/miniconda3/envs/bds_atac/lib/python2.7/site-packages/numpy/core/shape_base.py", line 230, in vstack
    return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly

@vervacity
Collaborator

Interesting. It could be related to a different output format from a different version of the UCSC tools. Can you provide the top of the *signal file in the qc folder if you have it?
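As a rough diagnostic sketch: the traceback shows np.vstack failing inside the correlation call, which is what happens when the two columns passed to it have different lengths. The snippet below reproduces that failure with made-up values and adds a guard that reports the mismatch before stacking. The `check_same_length` helper and the example values are hypothetical; how run_ataqc.py actually builds its columns is not shown in this thread.

```python
import numpy as np

def check_same_length(sample_col, reference_col):
    """Raise a descriptive error if the two columns differ in shape (hypothetical helper)."""
    sample = np.asarray(sample_col, dtype=float)
    reference = np.asarray(reference_col, dtype=float)
    if sample.shape != reference.shape:
        raise ValueError(
            "column shapes differ: sample {0}, reference {1}".format(
                sample.shape, reference.shape
            )
        )
    return sample, reference

# Made-up values: the sample column has one more entry than the reference column.
sample_mean0 = [0.1, 0.4, 0.3]
roadmap_mean0 = [0.2, 0.5]

# Stacking rows of unequal length raises ValueError, as in the traceback above.
try:
    np.vstack((sample_mean0, roadmap_mean0))
    stacking_failed = False
except ValueError:
    stacking_failed = True

# The guard fails earlier and says which column is longer.
try:
    check_same_length(sample_mean0, roadmap_mean0)
    guard_message = None
except ValueError as err:
    guard_message = str(err)

print(stacking_failed)
print(guard_message)
```

If the sample signal file has a different number of rows than the reference, a check like this would point to it directly, which is consistent with (but does not confirm) the output-format explanation.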
