This topic is just for a discussion about the stream flow, so we users can better understand how this plugin handles streams with ClickHouse.
Consider an amount of data around 10GB. Creating a read stream and then streaming it is quite easy with node-clickhouse. About that:
Can I assume that this flow does not overload the ClickHouse database?
Looking at the documentation, what is the difference between these two, and is there any difference in terms of performance?
A. Insert with stream
```js
const writableStream = ch.query('INSERT INTO table FORMAT CSV', (err, result) => {})
```
B. Insert large data (without callback)
```js
const clickhouseStream = ch.query('INSERT INTO table FORMAT TSV')
tsvStream.pipe(clickhouseStream)
```
I read the ClickHouse docs. This setting makes things go right when it is well set. How can I use insert_quorum to make stream writes faster, considering a single server (without replicas)?
With the node-clickhouse WriteStream, does my code have to take care of garbage collection, i.e. must I use pause/resume/drain myself?
For large files, the streams should also support failure handling and pause/resume in case of connection problems or other network errors. It is not clear whether this package handles those checkpoints.
In real-world production, I found it fragile to hold a write stream open for large or long-running insertions. So I wrote a wrapper based on @apla/node-clickhouse that supports: 1. retry on failure; 2. restoring data segments after a process crash; 3. a single write process in Node cluster mode. Hope that will be helpful: https://www.npmjs.com/package/clickhouse-cargo
Topic open to community of this module