-
Notifications
You must be signed in to change notification settings - Fork 437
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: Check of knowledge document exists #1192
base: main
Are you sure you want to change the base?
Conversation
Resolves instructlab#1188 Signed-off-by: Christian Kadner <[email protected]>
Signed-off-by: Christian Kadner <[email protected]>
Signed-off-by: Christian Kadner <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this is worth the complexity since it only covers some specific scenarios and not the case in general.
"https://raw.githubusercontent.com/" | ||
) | ||
for file_path in patterns: | ||
if "*" in file_path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this only works if no glob is used on a github repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. I agree it's usefulness is limited
for file_path in patterns: | ||
if "*" in file_path: | ||
continue | ||
url = f"{repo}/{commit}/{file_path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will probably fail if the user supplies a proper git URL ending in .git
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, I looked at our summit bounty submissions and thought we could start with something simple, but agree this is a narrow use case
commit = parsed["document"].get("commit") | ||
patterns = parsed["document"].get("patterns") | ||
|
||
line = self.get_yaml_line( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could end up reporting no yq
command twice for the same yaml file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right and then we would fix our CI setup ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is not on the CI where we already know yq is installed. It is on the user's workstation where they run this script that they may not have yq installed.
@@ -20,6 +20,9 @@ | |||
from referencing.exceptions import NoSuchResource | |||
from referencing.jsonschema import DRAFT202012 | |||
from referencing.typing import URI | |||
from urllib.request import Request, urlopen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a 3rd party library, then we probably also need to add it to requirements.txt, no? If it is not, then these statements should be moved under Standard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope, built in
@@ -253,12 +247,65 @@ def check(self) -> int: | |||
message="The file must be non-empty", | |||
) | |||
|
|||
if "document" in parsed: | |||
repo = parsed["document"].get("repo") | |||
commit = parsed["document"].get("commit") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to validate the commit value is a valid SHA value. This is probably best in the schema were we can use a regex pattern.
https://json-schema.org/understanding-json-schema/reference/string#regexp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened instructlab/schema#30 for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, that is probably the most/more cost effective change here
Add a primitive check to linter to verify knowledge document exists with the specified commit sha
FYI @bjhargrave -- I thought I could do this with 5 lines of Python code. Turned into a few lines more. I mark it a draft as this still needs more thought 😊
Resolves #1188