Skip to content
This repository has been archived by the owner on Mar 31, 2022. It is now read-only.

Streaming fails #15

Open
elad opened this issue Mar 9, 2015 · 5 comments
Open

Streaming fails #15

elad opened this issue Mar 9, 2015 · 5 comments

Comments

@elad
Copy link

elad commented Mar 9, 2015

Hello,

I'm seeing a problem with hdfs2cass when I try to load Avro files on HDFS to Cassandra (2.1.2) using CQL.

I've attached logs (below) for both hdfs2cass and Cassandra. Note that for the latter, an "Unknown type 72" exception is thrown during deserialization.

If I modify the code to use the Thrift protocol (in ExternalSSTableLoaderClient) instead, loading is fine.

Any ideas on what it is I'm doing wrong, or what should change to make this work?

Thanks!

Hdfs2cass:

3486 [pool-16-thread-1] INFO org.apache.cassandra.streaming.StreamResultFuture  - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Executing streaming plan for Bulk Load
15/03/09 17:17:02 INFO streaming.StreamSession: [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Starting streaming to /127.0.0.1
3487 [StreamConnectionEstablisher:1] INFO org.apache.cassandra.streaming.StreamSession  - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Starting streaming to /127.0.0.1
15/03/09 17:17:02 INFO streaming.StreamResultFuture: [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6 ID#0] Prepare completed. Receiving 0 files(0 bytes), sending 39 files(42428 bytes)
3493 [StreamConnectionEstablisher:1] INFO org.apache.cassandra.streaming.StreamResultFuture  - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6 ID#0] Prepare completed. Receiving 0 files(0 bytes), sending 39 files(42428 bytes)
15/03/09 17:17:02 INFO thrift.ProgressIndicator: Session to 127.0.0.1 created
3497 [StreamConnectionEstablisher:1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - Session to 127.0.0.1 created
15/03/09 17:17:02 INFO streaming.StreamCoordinator: [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6, ID#0] Beginning stream session with /127.0.0.1
3498 [StreamConnectionEstablisher:1] INFO org.apache.cassandra.streaming.StreamCoordinator  - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6, ID#0] Beginning stream session with /127.0.0.1
15/03/09 17:17:02 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 1/39 (2%)] [total: 2% - 0MB/s (avg: 0MB/s)]
3501 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 1/39 (2%)] [total: 2% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:02 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 2/39 (5%)] [total: 5% - 0MB/s (avg: 0MB/s)]
3504 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 2/39 (5%)] [total: 5% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:02 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 3/39 (7%)] [total: 7% - 0MB/s (avg: 0MB/s)]
3506 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 3/39 (7%)] [total: 7% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:02 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 4/39 (10%)] [total: 10% - 2147483647MB/s (avg: 0MB/s)]
3507 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 4/39 (10%)] [total: 10% - 2147483647MB/s (avg: 0MB/s)]
15/03/09 17:17:02 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 5/39 (12%)] [total: 12% - 2147483647MB/s (avg: 0MB/s)]
3508 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 5/39 (12%)] [total: 12% - 2147483647MB/s (avg: 0MB/s)]
15/03/09 17:17:02 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 6/39 (15%)] [total: 15% - 2147483647MB/s (avg: 0MB/s)]
3509 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 6/39 (15%)] [total: 15% - 2147483647MB/s (avg: 0MB/s)]
15/03/09 17:17:02 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 7/39 (17%)] [total: 17% - 0MB/s (avg: 0MB/s)]
3510 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 7/39 (17%)] [total: 17% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:02 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 8/39 (20%)] [total: 20% - 2147483647MB/s (avg: 0MB/s)]
3511 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 8/39 (20%)] [total: 20% - 2147483647MB/s (avg: 0MB/s)]
15/03/09 17:17:02 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 9/39 (23%)] [total: 23% - 2147483647MB/s (avg: 0MB/s)]
3512 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 9/39 (23%)] [total: 23% - 2147483647MB/s (avg: 0MB/s)]
15/03/09 17:17:02 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 10/39 (25%)] [total: 25% - 0MB/s (avg: 0MB/s)]
3513 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 10/39 (25%)] [total: 25% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:03 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 11/39 (28%)] [total: 28% - 0MB/s (avg: 0MB/s)]
3514 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 11/39 (28%)] [total: 28% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:03 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 12/39 (30%)] [total: 30% - 0MB/s (avg: 0MB/s)]
3515 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 12/39 (30%)] [total: 30% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:03 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 13/39 (33%)] [total: 33% - 0MB/s (avg: 0MB/s)]
3517 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 13/39 (33%)] [total: 33% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:03 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 14/39 (35%)] [total: 35% - 0MB/s (avg: 0MB/s)]
3518 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 14/39 (35%)] [total: 35% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:03 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 15/39 (38%)] [total: 38% - 2147483647MB/s (avg: 0MB/s)]
3519 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 15/39 (38%)] [total: 38% - 2147483647MB/s (avg: 0MB/s)]
15/03/09 17:17:03 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 16/39 (41%)] [total: 41% - 2147483647MB/s (avg: 0MB/s)]
3520 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 16/39 (41%)] [total: 41% - 2147483647MB/s (avg: 0MB/s)]
15/03/09 17:17:03 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 17/39 (43%)] [total: 43% - 0MB/s (avg: 0MB/s)]
3521 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 17/39 (43%)] [total: 43% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:03 INFO streaming.StreamResultFuture: [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Session with /127.0.0.1 is complete
3522 [STREAM-IN-/127.0.0.1] INFO org.apache.cassandra.streaming.StreamResultFuture  - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Session with /127.0.0.1 is complete
15/03/09 17:17:03 INFO thrift.ProgressIndicator: Stream to 127.0.0.1 failed.
3523 [STREAM-IN-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - Stream to 127.0.0.1 failed.
15/03/09 17:17:03 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 18/39 (46%)] [total: 46% - 0MB/s (avg: 0MB/s)]
3523 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 18/39 (46%)] [total: 46% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:03 WARN streaming.StreamResultFuture: [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Stream failed
3524 [STREAM-IN-/127.0.0.1] WARN org.apache.cassandra.streaming.StreamResultFuture  - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Stream failed
15/03/09 17:17:03 INFO cql.CrunchCqlBulkRecordWriter: SSTables built. Now starting streaming
3524 [pool-16-thread-1] INFO com.spotify.hdfs2cass.cassandra.cql.CrunchCqlBulkRecordWriter  - SSTables built. Now starting streaming
15/03/09 17:17:03 INFO cql.CrunchCqlBulkRecordWriter: SSTableWriter wasn't instantiated, no streaming happened.
3525 [pool-16-thread-1] INFO com.spotify.hdfs2cass.cassandra.cql.CrunchCqlBulkRecordWriter  - SSTableWriter wasn't instantiated, no streaming happened.
15/03/09 17:17:03 INFO cql.CrunchCqlBulkRecordWriter: Streaming finished successfully
3525 [pool-16-thread-1] INFO com.spotify.hdfs2cass.cassandra.cql.CrunchCqlBulkRecordWriter  - Streaming finished successfully
15/03/09 17:17:03 INFO mapred.LocalJobRunner: reduce task executor complete.
3525 [Thread-20] INFO org.apache.hadoop.mapred.LocalJobRunner  - reduce task executor complete.
15/03/09 17:17:03 INFO thrift.ProgressIndicator: progress: [/127.0.0.1 19/39 (48%)] [total: 48% - 0MB/s (avg: 0MB/s)]
3525 [STREAM-OUT-/127.0.0.1] INFO com.spotify.hdfs2cass.cassandra.thrift.ProgressIndicator  - progress: [/127.0.0.1 19/39 (48%)] [total: 48% - 0MB/s (avg: 0MB/s)]
15/03/09 17:17:03 WARN mapred.LocalJobRunner: job_local102719284_0001
java.lang.Exception: java.lang.RuntimeException: Streaming to the following hosts failed: [/127.0.0.1]
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.lang.RuntimeException: Streaming to the following hosts failed: [/127.0.0.1]
    at com.spotify.hdfs2cass.cassandra.cql.CrunchCqlBulkRecordWriter.close(CrunchCqlBulkRecordWriter.java:174)
    at com.spotify.hdfs2cass.cassandra.cql.CrunchCqlBulkRecordWriter.close(CrunchCqlBulkRecordWriter.java:155)
    at org.apache.crunch.io.CrunchOutputs.close(CrunchOutputs.java:138)
    at org.apache.crunch.impl.mr.run.CrunchTaskContext.cleanup(CrunchTaskContext.java:72)
    at org.apache.crunch.impl.mr.run.CrunchReducer.cleanup(CrunchReducer.java:64)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
    at hdfs2cass.com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
    at hdfs2cass.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
    at hdfs2cass.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
    at hdfs2cass.com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
    at com.spotify.hdfs2cass.cassandra.cql.CrunchCqlBulkRecordWriter.close(CrunchCqlBulkRecordWriter.java:172)
    ... 13 more
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
    at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
    at hdfs2cass.com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
    at hdfs2cass.com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
    at hdfs2cass.com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
    at hdfs2cass.com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
    at hdfs2cass.com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
    at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
    at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
    at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:382)
    at org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:587)
    at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:442)
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251)
    ... 1 more
3526 [Thread-20] WARN org.apache.hadoop.mapred.LocalJobRunner  - job_local102719284_0001
java.lang.Exception: java.lang.RuntimeException: Streaming to the following hosts failed: [/127.0.0.1]
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.lang.RuntimeException: Streaming to the following hosts failed: [/127.0.0.1]
    at com.spotify.hdfs2cass.cassandra.cql.CrunchCqlBulkRecordWriter.close(CrunchCqlBulkRecordWriter.java:174)
    at com.spotify.hdfs2cass.cassandra.cql.CrunchCqlBulkRecordWriter.close(CrunchCqlBulkRecordWriter.java:155)
    at org.apache.crunch.io.CrunchOutputs.close(CrunchOutputs.java:138)
    at org.apache.crunch.impl.mr.run.CrunchTaskContext.cleanup(CrunchTaskContext.java:72)
    at org.apache.crunch.impl.mr.run.CrunchReducer.cleanup(CrunchReducer.java:64)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
    at hdfs2cass.com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
    at hdfs2cass.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
    at hdfs2cass.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
    at hdfs2cass.com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
    at com.spotify.hdfs2cass.cassandra.cql.CrunchCqlBulkRecordWriter.close(CrunchCqlBulkRecordWriter.java:172)
    ... 13 more
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
    at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
    at hdfs2cass.com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
    at hdfs2cass.com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
    at hdfs2cass.com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
    at hdfs2cass.com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
    at hdfs2cass.com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
    at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
    at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
    at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:382)
    at org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:587)
    at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:442)
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251)
    ... 1 more
15/03/09 17:17:03 ERROR streaming.StreamSession: [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Streaming error occurred
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
    at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
    at java.nio.channels.Channels.writeFully(Channels.java:98)
    at java.nio.channels.Channels.access$000(Channels.java:61)
    at java.nio.channels.Channels$1.write(Channels.java:174)
    at java.io.OutputStream.write(OutputStream.java:75)
    at java.nio.channels.Channels$1.write(Channels.java:155)
    at org.apache.cassandra.io.util.DataOutputStreamPlus.write(DataOutputStreamPlus.java:45)
    at org.apache.cassandra.io.util.AbstractDataOutput.writeInt(AbstractDataOutput.java:212)
    at org.apache.cassandra.streaming.messages.FileMessageHeader$FileMessageHeaderSerializer.serialize(FileMessageHeader.java:129)
    at org.apache.cassandra.streaming.messages.FileMessageHeader$FileMessageHeaderSerializer.serialize(FileMessageHeader.java:120)
    at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:48)
    at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:40)
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:346)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:326)
    at java.lang.Thread.run(Thread.java:745)
3526 [STREAM-OUT-/127.0.0.1] ERROR org.apache.cassandra.streaming.StreamSession  - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Streaming error occurred
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
    at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
    at java.nio.channels.Channels.writeFully(Channels.java:98)
    at java.nio.channels.Channels.access$000(Channels.java:61)
    at java.nio.channels.Channels$1.write(Channels.java:174)
    at java.io.OutputStream.write(OutputStream.java:75)
    at java.nio.channels.Channels$1.write(Channels.java:155)
    at org.apache.cassandra.io.util.DataOutputStreamPlus.write(DataOutputStreamPlus.java:45)
    at org.apache.cassandra.io.util.AbstractDataOutput.writeInt(AbstractDataOutput.java:212)
    at org.apache.cassandra.streaming.messages.FileMessageHeader$FileMessageHeaderSerializer.serialize(FileMessageHeader.java:129)
    at org.apache.cassandra.streaming.messages.FileMessageHeader$FileMessageHeaderSerializer.serialize(FileMessageHeader.java:120)
    at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:48)
    at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:40)
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:346)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:326)
    at java.lang.Thread.run(Thread.java:745)
15/03/09 17:17:03 ERROR streaming.StreamSession: [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Streaming error occurred
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
    at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:346)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:326)
    at java.lang.Thread.run(Thread.java:745)
3528 [STREAM-OUT-/127.0.0.1] ERROR org.apache.cassandra.streaming.StreamSession  - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Streaming error occurred
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
    at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:346)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:326)
    at java.lang.Thread.run(Thread.java:745)

[...a few more of the above Broken pipe exceptions...]

1 job failure(s) occurred:
com.spotify.hdfs2cass.Hdfs2Cass: Avro(/user/elad.efrat/single.avro)+S0+S1+GBK+ungroup+cql:... ID=1 (1/1)(1): Job failed!

Logs from Cassandra:

INFO  [STREAM-INIT-/127.0.0.1:51695] 2015-03-09 17:17:02,982 StreamResultFuture.java:109 - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6 ID#0] Creating new streaming plan for Bulk Load
INFO  [STREAM-INIT-/127.0.0.1:51695] 2015-03-09 17:17:02,990 StreamResultFuture.java:116 - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6, ID#0] Received streaming plan for Bulk Load
INFO  [STREAM-INIT-/127.0.0.1:51694] 2015-03-09 17:17:02,991 StreamResultFuture.java:116 - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6, ID#0] Received streaming plan for Bulk Load
INFO  [STREAM-IN-/127.0.0.1] 2015-03-09 17:17:02,997 StreamResultFuture.java:166 - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6 ID#0] Prepare completed. Receiving 39 files(42428 bytes), sending 0 files(0 bytes)
WARN  [STREAM-IN-/127.0.0.1] 2015-03-09 17:17:03,006 StreamSession.java:592 - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Retrying for following error
java.io.IOException: CF b9dda069-4d0a-37c3-b0e8-8c4cf88a4a9f was dropped during streaming
    at org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:71) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:48) [apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:38) [apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55) [apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245) [apache-cassandra-2.1.2.jar:2.1.2]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
ERROR [STREAM-IN-/127.0.0.1] 2015-03-09 17:17:03,006 StreamSession.java:472 - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Streaming error occurred
java.lang.IllegalArgumentException: Unknown type 72
    at org.apache.cassandra.streaming.messages.StreamMessage$Type.get(StreamMessage.java:89) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
INFO  [STREAM-IN-/127.0.0.1] 2015-03-09 17:17:03,010 StreamResultFuture.java:180 - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Session with /127.0.0.1 is complete
WARN  [STREAM-IN-/127.0.0.1] 2015-03-09 17:17:03,011 StreamResultFuture.java:207 - [Stream #56d5a8f2-c66f-11e4-a038-e3a1922f08b6] Stream failed
@agarwalpranaya
Copy link

I am seeing the same problem with cassandra 2.1.0-2 while adding new node. Any solution?

@rzvoncek
Copy link
Contributor

Hello,

We have not yet tried hdfs2cass with Cassandra 2.1. Therefore I can't be of much help at this time.

The 'unknown type' error probably comes from the difference in streaming protocol between 2.0.11 (depended on by hdfs2cass) and 2.1 (used by your Cassandra instance). I'd suggest bumping Cassandra version hdfs2cass depends on to whatever your cluster is on. Chances are it will just work.

Moreover, I'm still puzzled by this:

java.io.IOException: CF b9dda069-4d0a-37c3-b0e8-8c4cf88a4a9f was dropped during streaming

Does this mean the table got dropped? Why? Is this a cause, or just an effect of some other errors?

@elad
Copy link
Author

elad commented Apr 5, 2015

I'd suggest bumping Cassandra version hdfs2cass depends on to whatever your cluster is on. Chances are it will just work.

It doesn't. :/

It looks like CASSANDRA-8924 might be related and helpful in resolving this issue, I'll look into it when I get a chance.

@meyarivan
Copy link

Patched version of hdfs2cass (CASSANDRA-8924, Cassandra 2.1.6) at https://github.com/meyarivan/hdfs2cass might be helpful

@elad
Copy link
Author

elad commented Jun 16, 2015

I can confirm this fixes streaming to Cassandra 2.1.6, thanks!

@meyarivan could you please create a pull request so that there's a better chance @rzvoncek or someone else merges these changes?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants