Proposal: Change `document` to `source` #661
Comments
Capturing the same comment here too: instructlab/instructlab#776 (comment). PS: this issue is just a proposal, in the hope that a discussion will ensue about the priority of this work. It's totally reasonable to just ignore it if others don't see it as a high-priority issue, or if the change will take a lot of effort to get through before opening and the team does not have the cycles to implement it 😀
This issue should probably be in the https://github.com/instruct-lab/schema/ repo.
@anik120 I think we are past the point when this change could be made. Perhaps we can close this issue?
Capturing a discussion with @shivchander:
I was writing up a test case for the lmdk cli to test knowledge workflow, but the way that I laid out my qna.yaml is as follows:
Essentially, the `seed_example` question/answers I have there are from the overarching project websites https://operatorframework.io/, https://olm.operatorframework.io/ and https://sdk.operatorframework.io/, and the documents I have in https://github.com/anik120/knowledge-doc-test are README.mds from the components' GitHub repositories. In other words, the `seed_example` question/answers do not actually come from the documents hosted in `document.repo`.

The way I laid things out, the `seed_examples` are a "product pitch/summary description" and `document.repo` contains all the docs I want the model to learn about.

Shiv tells me that that's the wrong way of thinking about it: the key `document` should really be `source`, and `seed_examples` are examples of questions/answers that the model can answer once it's been trained on the docs hosted in `document.repo`.

Eureka moment: even after learning* how the taxonomy interacts with the model, I was thinking about the structure of my `qna.yaml` the wrong way. It's likely that other users will also confuse the taxonomy/model interactions and lay out their `qna.yaml` files the wrong way, leading to PR submissions that likely won't improve model quality.

*Only a little while ago, i.e. fresh info still being processed by my brain.
Proposed fix: Change document to source
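
To make the proposal concrete, here is a rough sketch of what the rename could look like on a `qna.yaml` that has already been parsed into a Python dict. The helper name `rename_document_to_source` is hypothetical and not part of any InstructLab tooling, the seed example text is a placeholder, and only the keys mentioned above (`seed_examples`, `document`, `repo`) are assumed to exist.

```python
from copy import deepcopy


def rename_document_to_source(qna: dict) -> dict:
    """Return a copy of a parsed qna.yaml mapping with `document` renamed to `source`.

    Hypothetical helper for illustration only; the input is left unmodified.
    """
    migrated = deepcopy(qna)
    if "document" in migrated:
        if "source" in migrated:
            # Refuse to silently overwrite an existing `source` key.
            raise ValueError("mapping already has a 'source' key")
        migrated["source"] = migrated.pop("document")
    return migrated


# Placeholder content standing in for a parsed qna.yaml (assumed layout).
qna = {
    "seed_examples": [
        {
            "question": "A question that the docs in the repo can answer",
            "answer": "An answer taken from those docs",
        },
    ],
    "document": {"repo": "https://github.com/anik120/knowledge-doc-test"},
}

migrated = rename_document_to_source(qna)
print(sorted(migrated))
```

With the key named `source`, the file reads as "these are the sources the seed examples are drawn from", which is the mental model described above.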
cc: @xukai92 @abhi1092 @aldopareja