Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sos] Add 'upload' component to upload existing reports and files #3746

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
111 changes: 111 additions & 0 deletions man/en/sos-upload.1
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
.TH UPLOAD 1 "July 2024"

.SH NAME
sos_upload \- Upload files like previously generated sos reports or logs to a policy specific location
.SH SYNOPSIS
.B sos upload FILE [options]
[--case-id id]\fR
[--upload-url url]\fR
[--upload-user user]\fR
[--upload-pass pass]\fR
[--upload-directory dir]\fR
[--upload-method]\fR
[--upload-no-ssl-verify]\fR
[--upload-protocol protocol]\fR
Comment on lines +7 to +14
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcastill Could the --upload-protocol s3 flags be included in this work? Unfortunately, it contains unique flags that made S3 easier to implement at the time.

  [--upload-s3-endpoint endpoint]
  [--upload-s3-region region]
  [--upload-s3-bucket bucket]
  [--upload-s3-access-key access_key]
  [--upload-s3-secret-key secret_key]
  [--upload-s3-object-prefix object_prefix]

The existing flags and how the provided values were used were not well aligned for S3, even though valid for FTP, HTTP, etc. protocols. I didn't want to cause any breakage for existing upload protocols while trying to make them work for all protocols, so S3 ended up with unique flags.

I planned to attempt a refactor at some point (sos v5?) where the original protocols and s3 overlap. For example, allowing synonymous flags:

  • --upload-user ~ --upload-s3-access-key
  • --upload-pass ~ --upload-s3-secret-key
  • --upload-directory ~ --upload-s3-object-prefix
  • --upload-url ~ --upload-s3-endpoint

However, I haven't been able to dedicate the time yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the --upload-protocol s3 flags be included in this work? Unfortunately, it contains unique flags that made S3 easier to implement at the time.

Yes, I'll make sure I include them in the next iteration of this PR.

I planned to attempt a refactor at some point (sos v5?) where the original protocols and s3 overlap. For example, allowing synonymous flags:

--upload-user ~ --upload-s3-access-key
--upload-pass ~ --upload-s3-secret-key
--upload-directory ~ --upload-s3-object-prefix
--upload-url ~ --upload-s3-endpoint

However, I haven't been able to dedicate the time yet.

Let me know if I can help. My original idea was to have this PR as a starting point and then move stuff out of the generic policy and the OS-specific ones in a second PR, but that was rejected, so I'm working on the full change now. As soon as I finish with that, we can start working on the refactor of S3 it that's OK with you. In fact we need to do some work with S3 uploads for the RH customer portal, so we could do both things in parallel.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if I can help. My original idea was to have this PR as a starting point and then move stuff out of the generic policy and the OS-specific ones in a second PR, but that was rejected, so I'm working on the full change now. As soon as I finish with that, we can start working on the refactor of S3 it that's OK with you. In fact we need to do some work with S3 uploads for the RH customer portal, so we could do both things in parallel.

When you have a branch published for public view and somewhat functional let me know. I'll branch off of it and start migrating the s3 portions in then submit a PR targeting your branch for you to review.


As for the s3 refactoring, we can look into it and I'd be more than happy to try and make some time. I believe a few lend themselves easily, or at least I don't immediately recall any issues with using them, like URL, user, and password. One I do recall bringing up some questions is the --upload-directory. For example, should this be only the prefixes inside the bucket? Or should it split the directory like {bucket}/{prefix} on only the first slash? There may have been others, but I would have to review the LinuxPolicy.get_upload_xxxx() functions and internal self._vars again.

Without some "group think" I decided not to implement something I (or others) may have been unhappy with later but stuck with unless making breaking changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be less to refactor than I first thought as I haven't reviewed the code in almost a year. I guess ended up implementing some of it already. Hope I'm still happy with my choices after a year 😄

def get_upload_s3_bucket(self):
"""Helper function to determine if we should use the policy default
upload bucket or one provided by the user
:returns: The S3 bucket to use for upload
:rtype: ``str``
"""
if self.upload_url and self.upload_url.startswith('s3://'):
bucket_and_prefix = self.upload_url[5:].split('/', 1)
self.upload_s3_bucket = bucket_and_prefix[0]
if len(bucket_and_prefix) > 1:
self.upload_s3_object_prefix = bucket_and_prefix[1]
if not self.upload_s3_bucket:
self.prompt_for_upload_s3_bucket()
return self.upload_s3_bucket or self._upload_s3_bucket


.PP
.SH DESCRIPTION
upload is an sos subcommand to upload sos reports, logs, vmcores, or other files to a policy defined remote location, or a user defined one.
.SH REQUIRED ARGUMENTS
.B FILE
.TP
The path to the archive that is to be uploaded.
.SH OPTIONS
.TP
.B \--case-id NUMBER
Specify a case identifier to associate with the archive.
Identifiers may include alphanumeric characters, commas and periods ('.').
.TP
.B \--upload-url URL
If a vendor does not provide a default upload location, or if you would like to upload
the archive to a different location, specify the address here.

An upload protocol MUST be specified in this URL. Currently uploading is supported
for HTTPS, SFTP, and FTP protocols.

If your destination server listens on a non-standard port, specify the listening
port in the URL.
.TP
.B \-\-upload-user USER
If a vendor does not provide a default user for uploading, specify the username here.

If --batch is used and this option is omitted, no username will
be collected and thus uploads will fail if no vendor default is set.

You also have the option of providing this value via the SOSUPLOADUSER environment
variable. If this variable is set, then no username prompt will occur and --batch
may be used provided all other required values (case number, upload password)
are provided.
.TP
.B \-\-upload-pass PASS
Specify the password to use for authentication with the destination server.

If this option is omitted and upload is requested, you will be prompted for one.

If --batch is used, this prompt will not occur, so any uploads are likely to fail unless
this option is used.

Note that this will result in the plaintext string appearing in `ps` output that may
be collected by sos and be in the archive. If a password must be provided by you
for uploading, it is strongly recommended to not use --batch and enter the password
when prompted rather than using this option.
jcastill marked this conversation as resolved.
Show resolved Hide resolved

You also have the option of providing this value via the SOSUPLOADPASSWORD environment
variable. If this variable is set, then no password prompt will occur and --batch may
be used provided all other required values (case number, upload user) are provided.
.TP
.B \--upload-directory DIR
Specify a directory to upload to, if one is not specified by a vendor default location
or if your destination server does not allow writes to '/'.
.TP
.B \--upload-method METHOD
Specify the HTTP method to use for uploading to the provided --upload-url. Valid
values are 'auto' (default), 'put', or 'post'. The use of 'auto' will default to
the method required by the policy-default upload location, if one exists.

This option has no effect on upload protocols other than HTTPS.
.TP
.B \--upload-no-ssl-verify
Disable SSL verification for HTTPS uploads. This may be used to allow uploading
to locations that have self-signed certificates, or certificates that are otherwise
untrusted by the local system.

Default behavior is to perform SSL verification against all upload locations.
.TP
.B \--upload-protocol PROTO
Manually specify the protocol to use for uploading to the target \fBupload-url\fR.

Normally this is determined via the upload address, assuming that the protocol is part
of the address provided, e.g. 'https://example.com'. By using this option, sos will skip
the protocol check and use the method defined for the specified PROTO.

For RHEL systems, setting this option to \fBsftp\fR will skip the initial attempt to
upload to the Red Hat Customer Portal, and only attempt an upload to Red Hat's SFTP server,
which is typically used as a fallback target.

Valid values for PROTO are: 'auto' (default), 'https', 'ftp', 'sftp'.


.SH SEE ALSO
.BR sos (1)
.BR sos-report (1)
.BR sos-clean (1)
.BR sos.conf (5)
.BR sos-collect (1)

.SH MAINTAINER
.nf
Maintained on GitHub at https://github.com/sosreport/sos
.fi
.SH AUTHORS & CONTRIBUTORS
See \fBAUTHORS\fR file in the package documentation.
3 changes: 2 additions & 1 deletion setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,8 @@
('share/man/man1', ['man/en/sosreport.1', 'man/en/sos-report.1',
'man/en/sos.1', 'man/en/sos-collect.1',
'man/en/sos-collector.1', 'man/en/sos-clean.1',
'man/en/sos-mask.1', 'man/en/sos-help.1']),
'man/en/sos-mask.1', 'man/en/sos-help.1',
'man/en/sos-upload.1']),
('share/man/man5', ['man/en/sos.conf.5']),
('share/licenses/sos', ['LICENSE']),
('share/doc/sos', ['AUTHORS', 'README.md']),
Expand Down
4 changes: 3 additions & 1 deletion sos/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -54,10 +54,12 @@ def __init__(self, args):
import sos.report
import sos.cleaner
import sos.help
import sos.upload
self._components = {
'report': (sos.report.SoSReport, ['rep']),
'clean': (sos.cleaner.SoSCleaner, ['cleaner', 'mask']),
'help': (sos.help.SoSHelper, [])
'help': (sos.help.SoSHelper, []),
'upload': (sos.upload.SoSUpload, [])
}
# some distros do not want pexpect as a default dep, so try to load
# collector here, and if it fails add an entry that implies it is at
Expand Down
8 changes: 0 additions & 8 deletions sos/cleaner/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,6 @@
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime
from pwd import getpwuid
from textwrap import fill

import sos.cleaner.preppers

Expand Down Expand Up @@ -177,13 +176,6 @@ def log_info(self, msg, caller=None):
def log_error(self, msg, caller=None):
self.soslog.error(self._fmt_log_msg(msg, caller))

def _fmt_msg(self, msg):
width = 80
_fmt = ''
for line in msg.splitlines():
_fmt = _fmt + fill(line, width, replace_whitespace=False) + '\n'
return _fmt

@classmethod
def display_help(cls, section):
section.set_title("SoS Cleaner Detailed Help")
Expand Down
8 changes: 0 additions & 8 deletions sos/collector/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,6 @@
from getpass import getpass
from pathlib import Path
from shlex import quote
from textwrap import fill
from sos.cleaner import SoSCleaner
from sos.collector.sosnode import SosNode
from sos.options import ClusterOption, str_to_bool
Expand Down Expand Up @@ -712,13 +711,6 @@ def _get_archive_path(self):
compr = 'gz'
return self.tmpdir + '/' + self.arc_name + '.tar.' + compr

def _fmt_msg(self, msg):
width = 80
_fmt = ''
for line in msg.splitlines():
_fmt = _fmt + fill(line, width, replace_whitespace=False) + '\n'
return _fmt

def _load_group_config(self):
"""
Attempts to load the host group specified on the command line.
Expand Down
8 changes: 8 additions & 0 deletions sos/component.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
import sys
import time

from textwrap import fill
from argparse import SUPPRESS
from datetime import datetime
from getpass import getpass
Expand Down Expand Up @@ -458,6 +459,13 @@ def _setup_logging(self):
def get_temp_file(self):
return self.tempfile_util.new()

def _fmt_msg(self, msg):
width = 80
_fmt = ''
for line in msg.splitlines():
_fmt = _fmt + fill(line, width, replace_whitespace=False) + '\n'
return _fmt


class SoSMetadata():
"""This class is used to record metadata from a sos execution that will
Expand Down
6 changes: 4 additions & 2 deletions sos/help/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,8 @@ def get_obj_for_topic(self):
'collector': 'SoSCollector',
'collector.transports': 'RemoteTransport',
'collector.clusters': 'Cluster',
'policies': 'Policy'
'policies': 'Policy',
'upload': 'SoSUpload'
}

cls = None
Expand Down Expand Up @@ -206,7 +207,8 @@ def display_self_help(self):
'report.plugins.$plugin': 'Information on a specific $plugin',
'clean': 'Detailed help on the clean command',
'collect': 'Detailed help on the collect command',
'policies': 'How sos operates on different distributions'
'upload': 'Detailed help on the upload command',
'policies': 'How sos operates on different distributions',
}

for sect, value in sections.items():
Expand Down
7 changes: 3 additions & 4 deletions sos/policies/distros/redhat.py
Original file line number Diff line number Diff line change
Expand Up @@ -298,7 +298,7 @@ def get_upload_url(self):
self.ui_log.info("No case id provided, uploading to SFTP")
return RH_SFTP_HOST
rh_case_api = "/support/v1/cases/%s/attachments"
return RH_API_HOST + rh_case_api % self.case_id
return RH_API_HOST + rh_case_api % self.commons['cmdlineopts'].case_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? AFAIK self.case_id might be blank (and common's case_id set) only in scenario "case id not in cmdline, batch not in cmdline" - should not upload query for case_id, then? (or am I wrong here with my assumption)?

(that concern is valid for sure (while I can be wrong on its impact to this code change):

# python3 bin/sos upload /var/tmp/sosreport-pmoravec-rhel8-012345678-2024-08-13-gbiatgg.tar.xz

sos upload (version 4.7.2)
This utility is used to upload files to a policy-default location.

The archive to be uploaded may contain data considered sensitive and its content
should be reviewed by the originating organization before being passed to any
third party.

No configuration changes will be made to the system running this utility.


Press ENTER to continue, or CTRL-C to quit


Attempting to upload file /var/tmp/sosreport-pmoravec-rhel8-012345678-2024-08-13-gbiatgg.tar.xz to case 
No case id provided, uploading to SFTP
No case id provided, uploading to SFTP
Attempting upload to Red Hat Secure FTP
..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the things we talked about internally when I first started playing with 'upload'. If you remember, the issue was that without this change, we were getting 'None' on the case_id and it was failing to build the url, and so failed to upload. I have the feeling that I've done something wrong on the upload side and I'm not passing the case_id correctly.
My hope is that more experienced eyes, or at least fresher, can tell me where I'm failing.


def _get_upload_https_auth(self):
str_auth = f"Bearer {self._device_token}"
Expand Down Expand Up @@ -441,16 +441,15 @@ def check_file_too_big(self, archive):
f"{convert_bytes(self._max_size_request)} "
" via sos http upload. \n")
)
return RH_SFTP_HOST
return RH_API_HOST
self.upload_url = RH_SFTP_HOST

def upload_archive(self, archive):
"""Override the base upload_archive to provide for automatic failover
from RHCP failures to the public RH dropbox
"""
try:
if self.get_upload_url().startswith(RH_API_HOST):
self.upload_url = self.check_file_too_big(archive)
self.check_file_too_big(archive)
jcastill marked this conversation as resolved.
Show resolved Hide resolved
uploaded = super().upload_archive(archive)
except Exception as e:
uploaded = False
Expand Down
Loading
Loading