Missing Documentation for Creating a Data Pump with MetPX Sarracenia to Subscribe Data to S3 Bucket #1379
First, make sure you have the Python driver for the S3 protocol:

```
fractal% sr3 features

Status:     feature:       python imports:         Description:
Installed   amqp           amqp                    can connect to rabbitmq brokers
Absent      azurestorage   azure-storage-blob      cannot connect natively to Azure Storage accounts
Installed   appdirs        appdirs                 place configuration and state files appropriately for platform (windows/mac/linux)
Installed   filetypes      magic                   able to set content headers
Absent      ftppoll        dateparser,pytz         not able to poll with ftp
Installed   humanize       humanize,humanfriendly  numbers that are easier for humans to read
Installed   jsonlogs       pythonjsonlogger        can write json logs, in addition to text ones
Installed   mqtt           paho.mqtt.client        can connect to mqtt brokers
Installed   process        psutil                  can monitor, start, stop processes: sr3 CLI should basically work
Installed   reassembly     flufl.lock              can reassemble block-segmented files transferred
Installed   redis          redis,redis_lock        can use redis implementations of retry and nodupe
Installed   retry          jsonpickle              can write messages to local queues to retry failed publishes/sends/downloads
Installed   s3             boto3                   able to connect natively to S3-compatible locations (AWS S3, Minio, etc.)
Installed   sftp           paramiko                can use sftp or ssh based services
Installed   vip            netifaces               able to use the vip option for high availability clustering
Installed   watch          watchdog                watch directories
Installed   xattr          xattr                   will store file metadata in extended attributes

state dir: /home/peter/.cache/sr3
config dir: /home/peter/.config/sr3
fractal%
```
You see the `s3` line? It means S3 support is enabled, because the `boto3` Python module is installed.
Step 2: put the credentials for the S3 bucket you want to write to in `~/.config/sr3/credentials.conf`.
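For illustration only, a credentials entry might look roughly like the sketch below. The exact S3 fields and URL syntax supported by your sr3 version are described in sr3_credentials(7); the scheme, keys, and endpoint here are placeholder assumptions, not verified syntax:

```
# ~/.config/sr3/credentials.conf -- hypothetical S3 entry.
# Access key and secret key embedded in the endpoint URL;
# consult sr3_credentials(7) for the exact format.
s3://AKIAEXAMPLEKEY:ExampleSecretKey123@s3.amazonaws.com/
```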
Then in a sender configuration.... ~/.config/sr3/upload_to_sr3.conf:
So... sr3 does not do direct third-party transfers. If you are trying to transfer from one upstream and push to an S3 bucket, the data has to traverse the local machine and then be pushed out. To do that, you need:
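The two-stage flow described above could be sketched roughly as follows. This is a hypothetical outline, not a tested configuration: the broker addresses, exchange name, directories, and the `sendTo` URL are all placeholder assumptions.

```
# Stage 1 -- subscriber: download from the upstream pump to a local
# staging directory, then announce the downloaded files on a local broker.
# ~/.config/sr3/subscribe/get_upstream.conf  (hypothetical)
broker amqps://anonymous@upstream.example.org
subtopic observations.#
directory /var/spool/sr3/s3_staging
accept .*
post_broker amqp://user@localhost
post_exchange xs_user

# Stage 2 -- sender: pick the files up from the staging directory and
# push them out to the S3-compatible endpoint.
# ~/.config/sr3/sender/upload_to_s3.conf  (hypothetical)
broker amqp://user@localhost
exchange xs_user
sendTo s3://mybucket@s3.amazonaws.com/
accept .*
```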
Documentation: yeah, the documentation is really lacking here... we need to try out some use cases. The first thing we should do is explain the S3 credential fields in https://metpx.github.io/sarracenia/Reference/sr3_credentials.7.html

Somebody did work out a great example here: https://github.com/MetPX/sr3-examples/tree/main/cloud-publisher-s3 but the problem with it is that the S3 support was re-written afterwards. That example uses an S3 plugin, whereas the information I was giving you is for native support (no plugin needed). We need to update the example. For now, that's the documentation we have.
@tysonkaufmann maybe update the sr3-example to use the built-in driver?
Thank you for your prompt reply. I had already managed to get to the 2nd step before. Just by looking into the code, I was able to spot the difference between previous versions of the S3 data flows, implemented with callback functions, and the new developments. I followed your instructions, but it seems I am missing something obvious:
This is my credentials.conf:
This is the config for the watch pump, which is working:
And this one does not work (no file is sent to the bucket):
In my understanding, the files are already present locally. Am I wrong? And yes, I am ready to help with some S3 use-case documentation once something is working. ;-)
Anyway, that should move things forward a bit... some of the messages you are seeing in the logs would make this easier to diagnose.
Oh... and `delete on` is not good: it will delete the file after download, before the sender can pick it up.
Even clearer would be to put in the sender configuration:
This does the same thing as plain old delete, but it is a bit easier to understand. UPDATE: I had the delete settings inverted.
Hi there,
I'm new to using MetPX Sarracenia and I'm trying to set up a data pump to subscribe to some data and store it in an S3 bucket. However, I've found that the documentation for this specific use case is either incomplete or missing.
Steps I Followed:
Expected Behavior:
I expected to find detailed steps or a guide on how to configure MetPX Sarracenia to subscribe to data and store it in an S3 bucket.
Actual Behavior:
The documentation does not cover this specific scenario, making it difficult for a newcomer like myself to proceed.
Environment:
Additional Information:
I can see that there is an S3 transfer class in the code, but I can't get sr3 to start using it.
Best regards,
Jure