This repository has been archived by the owner on Jul 15, 2019. It is now read-only.

HDFS file path containing ':' (colon) throws an exception #4

Open
stevegy opened this issue Jan 3, 2016 · 1 comment

Comments


stevegy commented Jan 3, 2016

I downloaded the whole source code and built it successfully. When I tried to run a crawl test:
bin/crawl urls/ TestCrawl/ http://localhost:8983/solr/nutch 2
I ran into this URI path name issue.
hadoop.log.zip

I have attached the log file. It seems the HDFS file path special-characters issue is still there.

2016-01-03 13:27:08,405 INFO fetcher.Fetcher - Fetcher: starting at 2016-01-03 13:27:08
2016-01-03 13:27:08,405 INFO fetcher.Fetcher - Fetcher: segment: TestCrawl/segments/drwxr-xr-xnn4nstevennstaffnn136nJannn3n13:24n20160103090925
2016-01-03 13:27:08,406 INFO fetcher.Fetcher - Fetcher Timelimit set for : 1451809628406
2016-01-03 13:27:08,631 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-01-03 13:27:08,677 ERROR fetcher.Fetcher - Fetcher: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: drwxr-xr-xnn4nstevennstaffnn136nJannn3n13:24n20160103090925
at org.apache.hadoop.fs.Path.initialize(Path.java:148)
at org.apache.hadoop.fs.Path.&lt;init&gt;(Path.java:126)
at org.apache.hadoop.fs.Path.&lt;init&gt;(Path.java:50)
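
The segment name in the log above looks like an entire directory-listing line rather than just the timestamped segment directory, and Hadoop's Path constructor rejects it: java.net.URI treats everything before the ':' in "13:24" as a URI scheme, which leaves a relative path in an absolute URI. A minimal sketch that appears to reproduce the same exception, assuming org.apache.hadoop.fs.Path from hadoop-common is on the classpath:

import org.apache.hadoop.fs.Path;

public class ColonPathRepro {
    public static void main(String[] args) {
        // The mangled segment name from the attached hadoop.log: since the
        // string contains ':' but no '/', Path parses everything before the
        // colon as a scheme and the remainder as a relative path.
        String segment = "drwxr-xr-xnn4nstevennstaffnn136nJannn3n13:24n20160103090925";

        // Expected: IllegalArgumentException wrapping
        // java.net.URISyntaxException: Relative path in absolute URI: ...
        Path p = new Path(segment);
        System.out.println(p);
    }
}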

Contributor

petarR commented Jan 18, 2016

Hi,

You could try the fix given in this thread.

Or simply use the following command to start the crawl:
runtime/local/bin/nutch crawl urls/ -solr http://localhost:8983/solr/ -dir TestCrawl -depth 3 -topN 50
