# Issues running H2O on Hadoop

Kevin Normoyle edited this page Sep 29, 2013 · 10 revisions
## Not pointing to the jar correctly

If the path handed to `hadoop jar` doesn't point at an actual driver jar (note `h2o_downloaded` here, versus `h2o-downloaded` in the working script further down), the job fails immediately:

```
Exception in thread "main" java.io.IOException: Error opening job jar: ../../h2o_downloaded/hadoop/h2odriver-cdh3.jar
    at org.apache.hadoop.util.RunJar.main(RunJar.java:124)
Caused by: java.util.zip.ZipException: error in opening zip file
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:127)
    at java.util.jar.JarFile.<init>(JarFile.java:136)
    at java.util.jar.JarFile.<init>(JarFile.java:73)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:122)
```
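A tiny guard in the launch script catches this before Hadoop does. A sketch (the helper name and example path are illustrative, not part of H2O):

```shell
# check_jar: verify the driver jar exists before calling "hadoop jar".
# (Illustrative helper; not part of H2O.)
check_jar() {
  if [ -f "$1" ]; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# usage in the launch script (path is illustrative):
#   check_jar ../../h2o-downloaded/hadoop/h2odriver-cdh3.jar || exit 1
```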
## Wrong jobtracker port used (50030)

Port 50030 is the JobTracker's web UI, not its RPC port; `-jt` needs the RPC port (8021 on this CDH3 cluster). Pointing the driver at the HTTP port makes the RPC client die with an `EOFException`:

```
Determining driver host interface for mapper->driver callback...
    [Possible callback IP address: 192.168.1.180]
    [Possible callback IP address: 127.0.0.1]
Using mapper->driver callback IP address and port: 192.168.1.180:46479
(You can override these with -driverif and -driverport.)
Driver program compiled with MapReduce V1 (Classic)
Memory Settings:
    mapred.child.java.opts:      -Xms8g -Xmx8g
    mapred.map.child.java.opts:  -Xms8g -Xmx8g
    Extra memory percent:        10
    mapreduce.map.memory.mb:     9011
13/09/28 01:07:15 ERROR security.UserGroupInformation: PriviledgedActionException as:kevin (auth:SIMPLE) cause:java.io.IOException: Call to /192.168.1.176:50030 failed on local exception: java.io.EOFException
ERROR: Call to /192.168.1.176:50030 failed on local exception: java.io.EOFException
java.io.IOException: Call to /192.168.1.176:50030 failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1187)
    at org.apache.hadoop.ipc.Client.call(Client.java:1155)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
    at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:511)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:496)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:479)
    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:539)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
    at org.apache.hadoop.mapreduce.Job.connect(Job.java:537)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:525)
    at water.hadoop.h2odriver.run2(h2odriver.java:747)
    at water.hadoop.h2odriver.run(h2odriver.java:808)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at water.hadoop.h2odriver.main(h2odriver.java:830)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:858)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:767)
```
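One way to avoid guessing the port is to read `mapred.job.tracker` out of the cluster's `mapred-site.xml`. A naive sketch (the helper name is made up, and it assumes the `<name>` and `<value>` elements sit on adjacent lines, as they usually do):

```shell
# jt_address: print the mapred.job.tracker value (host:rpc-port) from a
# mapred-site.xml. Naive parse: assumes <name> and <value> are on
# adjacent lines.
jt_address() {
  grep -A1 '<name>mapred.job.tracker</name>' "$1" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# e.g.  hadoop jar ... -jt "$(jt_address /etc/hadoop/conf/mapred-site.xml)" ...
```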
## Output directory already exists

Rerunning the driver with the same `-output` fails if the directory is still there from the last run:

```
13/09/28 01:32:50 ERROR security.UserGroupInformation: PriviledgedActionException as:kevin (auth:SIMPLE) cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfsOutputDirName already exists
ERROR: Output directory hdfsOutputDirName already exists
```

The directory is created under `/user/<username>` and has to be deleted with `hadoop dfs` before the next run:

```
$ hadoop dfs -ls
drwxrwxrwx   - kevin supergroup          0 2013-08-26 17:00 /user/kevin/.Trash
drwx------   - kevin supergroup          0 2013-09-28 01:32 /user/kevin/.staging
drwxrwxrwx   - kevin supergroup          0 2013-09-28 01:26 /user/kevin/hdfsOutputDirName
$ hadoop dfs -rmr /user/kevin/hdfsOutputDirName
Moved to trash: hdfs://mr-0x6.0xdata.loc:8020/user/kevin/hdfsOutputDirName
```
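An alternative to deleting is to pick a fresh output name for every run. A sketch using a timestamp suffix (the naming scheme is purely illustrative; H2O doesn't require any particular name):

```shell
# outdir_name: append a timestamp so each run gets a unique -output dir.
# (Illustrative naming scheme, not anything H2O mandates.)
outdir_name() {
  echo "hdfsOutputDirName_$(date +%Y%m%d_%H%M%S)"
}

# e.g.  hadoop jar ... -output "$(outdir_name)" ...
```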
## Correct operation

my_hh.sh has the following:

```sh
CDH3_JOBTRACKER=192.168.1.176:8021
CDH3_NODES=3
H2O_HADOOP=../../h2o-downloaded/hadoop
H2O_JAR=../../h2o-downloaded/h2o.jar

hadoop dfs -rmr /user/kevin/hdfsOutputDirName
# h2o-one-node is a file created with ip:port of one node to talk to
hadoop jar $H2O_HADOOP/h2odriver_cdh3.jar water.hadoop.h2odriver -jt $CDH3_JOBTRACKER -libjars $H2O_JAR -mapperXmx 8g -nodes 3 -output hdfsOutputDirName -notify h2o-one-node
```
```
$ ./my_hh.sh
rmr: cannot remove /user/kevin/hdfsOutputDirName: No such file or directory.
Determining driver host interface for mapper->driver callback...
    [Possible callback IP address: 192.168.1.180]
    [Possible callback IP address: 127.0.0.1]
Using mapper->driver callback IP address and port: 192.168.1.180:34724
(You can override these with -driverif and -driverport.)
Driver program compiled with MapReduce V1 (Classic)
Memory Settings:
    mapred.child.java.opts:      -Xms8g -Xmx8g
    mapred.map.child.java.opts:  -Xms8g -Xmx8g
    Extra memory percent:        10
    mapreduce.map.memory.mb:     9011
Job name 'H2O_43559' submitted
JobTracker job ID is 'job_201309210226_0034'
Waiting for H2O cluster to come up...
H2O node 192.168.1.176:54321 requested flatfile
H2O node 192.168.1.178:54321 requested flatfile
H2O node 192.168.1.177:54321 requested flatfile
Sending flatfiles to nodes...
    [Sending flatfile to node 192.168.1.176:54321]
    [Sending flatfile to node 192.168.1.178:54321]
    [Sending flatfile to node 192.168.1.177:54321]
H2O node 192.168.1.178:54321 reports H2O cluster size 1
H2O node 192.168.1.177:54321 reports H2O cluster size 1
H2O node 192.168.1.176:54321 reports H2O cluster size 1
H2O node 192.168.1.177:54321 reports H2O cluster size 3
H2O cluster (3 nodes) is up
(Note: Use the -disown option to exit the driver after cluster formation)
(Press Ctrl-C to kill the cluster)
Blocking until the H2O cluster shuts down...
H2O node 192.168.1.178:54321 reports H2O cluster size 3
H2O node 192.168.1.176:54321 reports H2O cluster size 3
```
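As the comment in my_hh.sh notes, the `-notify` file (`h2o-one-node` here) ends up holding the ip:port of one H2O node to talk to. A sketch of picking it apart for later use (the helper name is made up):

```shell
# h2o_node: split the ip:port that the driver wrote to the -notify file.
# (Illustrative helper; the file format comes from the -notify option.)
h2o_node() {
  IFS=: read -r ip port < "$1"
  echo "ip=$ip port=$port"
}

# e.g.  h2o_node h2o-one-node
```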
## MapR: user directory must exist

On MapR, job submission needs to be able to create and write under the submitting user's directory (the user is 0xdiag here); if it can't, submission fails with a permission error. Fixed by creating the directory first:

```
2013-09-28 15:27:04,7712 ERROR Client fs/client/fileclient/cc/client.cc:1176 Thread: 140432888858368 TraverseAndCreateDirs failed, could not create 0xdiag in Fid 2059.16.2
2013-09-28 15:27:04,7808 ERROR JniCommon fs/client/fileclient/cc/jni_common.cc:1775 Thread: 140432888858368 mkdirs failed for /user/0xdiag/hdfsOutputDirName, error 13
ERROR: Error: Permission denied(13), file: hdfsOutputDirName
java.io.IOException: Error: Permission denied(13), file: hdfsOutputDirName
    at com.mapr.fs.MapRFileSystem.makeDir(MapRFileSystem.java:633)
    at com.mapr.fs.MapRFileSystem.mkdirs(MapRFileSystem.java:646)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1173)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:949)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:885)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:885)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:573)
    at water.hadoop.h2odriver.run2(h2odriver.java:747)
    at water.hadoop.h2odriver.run(h2odriver.java:808)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at water.hadoop.h2odriver.main(h2odriver.java:830)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
```
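A sketch of the fix, to be run by someone with rights to create under `/user` (that `hadoop fs` is the right client on this cluster is an assumption; the helper only prints the commands so they can be reviewed before running):

```shell
# mapr_user_dir_cmds: print the commands an admin would run to create the
# user directory up front. Assumption: "hadoop fs" is the right client here.
mapr_user_dir_cmds() {
  printf '%s\n' \
    "hadoop fs -mkdir /user/$1" \
    "hadoop fs -chown $1 /user/$1"
}
```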
## CDH4 command changes

Evidently the command-line tools have changed on cdh4: `hadoop dfs` is deprecated in favor of `hdfs dfs`, and `-rmr` becomes `-rm -r`. `hadoop job` is also deprecated; use `mapred job` instead, for instance `mapred job -list` and `mapred job -kill <job-id>`.
## Deprecation warnings on CDH4

H2O on cdh4 prints a wall of deprecation warnings. Are these from H2O? The `conf.Configuration` logger in each line shows they actually come from Hadoop itself, which on CDH4 warns whenever deprecated MRv1 property names are set:
```
Determining driver host interface for mapper->driver callback...
    [Possible callback IP address: 10.0.4.1]
    [Possible callback IP address: 192.168.1.162]
    [Possible callback IP address: 127.0.0.1]
Using mapper->driver callback IP address and port: 192.168.1.162:38064
(You can override these with -driverif and -driverport.)
Driver program compiled with MapReduce V1 (Classic)
Memory Settings:
    mapred.child.java.opts:      -Xms20g -Xmx20g
    mapred.map.child.java.opts:  -Xms20g -Xmx20g
    Extra memory percent:        10
    mapreduce.map.memory.mb:     22528
13/09/29 12:00:53 WARN conf.Configuration: mapred.map.child.java.opts is deprecated. Instead, use mapreduce.map.java.opts
13/09/29 12:00:53 WARN conf.Configuration: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
13/09/29 12:00:53 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
13/09/29 12:00:53 WARN conf.Configuration: mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
13/09/29 12:00:53 WARN conf.Configuration: mapred.job.reuse.jvm.num.tasks is deprecated. Instead, use mapreduce.job.jvm.numtasks
13/09/29 12:00:55 WARN conf.Configuration: mapred.job.classpath.files is deprecated. Instead, use mapreduce.job.classpath.files
13/09/29 12:00:55 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
13/09/29 12:00:55 WARN conf.Configuration: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
13/09/29 12:00:55 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
13/09/29 12:00:55 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
13/09/29 12:00:55 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
13/09/29 12:00:55 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
13/09/29 12:00:55 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
13/09/29 12:00:55 WARN conf.Configuration: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
13/09/29 12:00:55 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
13/09/29 12:00:55 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/09/29 12:00:55 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
13/09/29 12:00:55 WARN conf.Configuration: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
13/09/29 12:00:55 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
13/09/29 12:00:55 WARN conf.Configuration: mapred.job.map.memory.mb is deprecated. Instead, use mapreduce.map.memory.mb
13/09/29 12:00:55 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
```
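The warnings are harmless, but if the chatter is unwanted it can usually be silenced via log4j. Treat this as a sketch: the exact logger name varies by Hadoop version (newer releases use a dedicated `org.apache.hadoop.conf.Configuration.deprecation` logger), so check your cluster's version:

```properties
# Quiet Hadoop's configuration-deprecation warnings.
# Assumption: this logger name matches your Hadoop version; not verified on CDH4.
log4j.logger.org.apache.hadoop.conf.Configuration=ERROR
```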