Unable to trigger pipeline with historical Git commit after material url change #11961

Open
xiazihang opened this issue Sep 6, 2023 · 3 comments

xiazihang commented Sep 6, 2023

Issue Type
  • Feature enhancement
Summary

Hi team, our project is migrating to a new VCS platform, which requires changing the material URLs in our GoCD pipeline configuration. After we changed the material URL to point to the new system, we noticed that part of the commit history was no longer available when triggering the pipeline. I understand from @chadlwilson that this is due to how GoCD persists material metadata. Is there any idea or plan to improve this? It seems like a fairly common situation that other users might encounter too.

Steps to Reproduce
  1. Find a test repo with enough commit history
  2. Have your GoCD pipeline's git material point to the test repo
  3. Update the name of the test repo
  4. Update your GoCD pipeline's git material to point to the new url
  5. Trigger this pipeline with options
  6. Try to search some legacy commits

You might not be able to replicate this issue if your branch doesn't have enough historical commits.

chadlwilson changed the title from "GoCD Git material lost commits data when change the material's url" to "Unable to trigger pipeline his historical Git commit after material url change" on Sep 7, 2023
chadlwilson changed the title from "Unable to trigger pipeline his historical Git commit after material url change" to "Unable to trigger pipeline with historical Git commit after material url change" on Sep 7, 2023
Member

chadlwilson commented Oct 31, 2023

@arvindsv do you have any opinions/context on this?

It feels that there should be a way to override this, or if the GoCD awareness/storage of the 'material revision' is critical, to force the GoCD server to find/load a historical commit that it does not currently know about.

I'm sure it's a minefield, but this is a pretty nasty problem. The bigger picture problem is the fingerprint and material identity system not allowing you to indicate 'this url is logically identical to this other URL, just a different repository manager' but that one seems a bit intractable to me 😅

Member

arvindsv commented Nov 5, 2023

Yes, a bit of a minefield. It'll be a little hard to think through the effects of making this change. The fingerprint derived from the config information at runtime needs to match the one in the DB.
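
To make the constraint concrete, here is a minimal sketch of why a URL change produces a "new" material. It assumes the fingerprint is a hash over the material's configured attributes; the `FingerprintSketch` class, the attribute string layout, and the example URLs are all hypothetical and are not GoCD's actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative only: a stand-in for config-derived material identity, not GoCD code.
final class FingerprintSketch {

    // Assumption: identity is a SHA-256 hash over the configured URL and branch.
    static String fingerprint(String url, String branch) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(("url=" + url + "|branch=" + branch).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String oldUrl = fingerprint("https://old-host.example/team/repo.git", "main");
        String newUrl = fingerprint("https://new-host.example/team/repo.git", "main");
        String withSlash = fingerprint("https://old-host.example/team/repo.git/", "main");

        // Same repository contents, but the config differs, so the identity differs.
        System.out.println("old vs new host match: " + oldUrl.equals(newUrl));
        // Even a trailing slash is enough to produce a different identity.
        System.out.println("old vs trailing slash match: " + oldUrl.equals(withSlash));
    }
}
```

Under this assumption, any history stored against the old fingerprint is invisible to a lookup that uses the new one, even though both URLs refer to the same commits.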

One challenge I see is that, from a user experience perspective, the change in config will likely be done first (as in Step 4, mentioned above). At this point, the material will be added to the DB with a different fingerprint. There could also be material updates which happen and find commits, which will be inserted into the DB against that material.

Now, we need to "merge" those two materials in the DB:

  1. We can't set both materials to the same fingerprint. Even if we could, that would probably cause other issues, since "single material = single fingerprint" will be a base expectation.

  2. We could delete the new material and change the old material's URL to that of the new. As long as the code that is generating the fingerprint from the config is aware of something like this, it could be possible. But, if we do this, then if the URL is changed again to that of the old one, it'll cause problems. Now, the old URL will look like a new material, which maps to the fingerprint of an existing material in the DB.

  3. With point 2 above, even if we do make the merge happen, there could be commits against that old material, with builds against that commit.

The main problem I see is that the config can be changed outside of GoCD APIs, in a sense. If we had a single point / API call that provided this functionality, we could provide an option while that change is being made, to rewrite the URL in the DB or something.

What are you thinking? Do you see a way to do this safely, without causing a lot of problems?

PS: Remember that this "functionality" is something we use and recommend for people who sometimes have trouble with the VSM ("add a slash at the end of the URL"). :/

@chadlwilson
Member

> PS: Remember that this "functionality" is something we use and recommend for people who sometimes have trouble with the VSM ("add a slash at the end of the URL"). :/

Indeed :-)

> What are you thinking? Do you see a way to do this safely, without causing a lot of problems?

I'm not really thinking too hard in any direction, but I certainly wasn't thinking of going down some fingerprint merging path. Just too difficult and messy as you allude to.

I was wondering if something simpler might unblock things. At root, the issue seems to be that when a material is added, we do not go back and insert material revisions (MODIFICATIONS) for the entire history, so we don't know about anything other than the latest revision.

Option 1
What if the user interaction with the 'trigger with options' dialog to search for/retrieve a specific material revision could get GoCD to trigger the relevant SCM and populate the MODIFICATIONS from $queriedRevision -> HEAD?

I haven't thought it through well at all, but essentially

  • the material_search API would ask the material for a latest-revisions-since = $queriedRevision~1 (yes, this is git-specific, haven't quite thought this through properly to find a way without requiring SCM plugin API change)
    public String search(Request request, Response response) throws IOException {
        String pipelineName = request.queryParams("pipeline_name");
        String fingerprint = request.queryParams("fingerprint");
        String searchText = request.queryParamOrDefault("search_text", "");
        HttpLocalizedOperationResult result = new HttpLocalizedOperationResult();
        List<MatchedRevision> matchedRevisions = materialService.searchRevisions(pipelineName, fingerprint, searchText, currentUsername(), result);
        if (result.isSuccessful()) {
            return writerForTopLevelArray(request, response, outputListWriter -> MatchedRevisionRepresenter.toJSON(outputListWriter, matchedRevisions));
        } else {
            return renderHTTPOperationResult(result, request, response);
        }
    }
  • if it is not found in the database, some logic would allow for the material to be directly queried (not for everything though, perhaps only for things that are exact revisions, if possible to ask the material!?) expanding this with some kind of second-level search
    MaterialConfig materialConfig = goConfigService.materialForPipelineWithFingerprint(pipelineName, fingerprint);
    return materialRepository.findRevisionsMatching(materialConfig, searchString);
  • if the material finds that the revision exists in the branch/refspec's history (with some limits going backwards from HEAD, perhaps 1000 revisions or something) it'd back-populate the modifications via
    public void saveModifications(MaterialInstance materialInstance, List<Modification> newChanges) {
  • The revision should now be able to be retrieved into the UI and selected for triggering
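
The steps above could be sketched roughly as follows. This is a self-contained toy model, not GoCD code: `BackfillSearchSketch`, the in-memory `knownRevisions` set standing in for the MODIFICATIONS table, the `branchHistory` list standing in for the SCM, and the `maxLookback` limit are all hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of Option 1: search the DB first, then fall back to the SCM
// for an exact revision and back-populate everything from that revision to HEAD.
final class BackfillSearchSketch {
    // Stand-in for the SCM: commits on the tracked branch, oldest first.
    private final List<String> branchHistory;
    // Stand-in for the MODIFICATIONS table: revisions the server already knows.
    private final Set<String> knownRevisions = new LinkedHashSet<>();
    // Assumed cap on how far back from HEAD the second-level search may look.
    private final int maxLookback;

    BackfillSearchSketch(List<String> branchHistory, int maxLookback) {
        this.branchHistory = List.copyOf(branchHistory);
        this.maxLookback = maxLookback;
        // A freshly added material only knows about the latest revision.
        if (!this.branchHistory.isEmpty()) {
            knownRevisions.add(this.branchHistory.get(this.branchHistory.size() - 1));
        }
    }

    List<String> search(String revision) {
        // First level: what the server already has stored.
        if (knownRevisions.contains(revision)) {
            return List.of(revision);
        }
        // Second level: ask the "SCM" for an exact revision within the lookback window.
        int index = branchHistory.indexOf(revision);
        int oldestAllowed = Math.max(0, branchHistory.size() - maxLookback);
        if (index < oldestAllowed) {
            return List.of(); // unknown revision (index is -1) or too far back
        }
        // Back-populate queriedRevision..HEAD so the revision becomes selectable.
        knownRevisions.addAll(branchHistory.subList(index, branchHistory.size()));
        return List.of(revision);
    }

    Set<String> knownRevisions() {
        return Collections.unmodifiableSet(knownRevisions);
    }

    public static void main(String[] args) {
        BackfillSearchSketch material =
                new BackfillSearchSketch(List.of("c1", "c2", "c3", "c4", "c5", "c6"), 4);
        System.out.println(material.search("c4"));      // found via the SCM fallback
        System.out.println(material.knownRevisions());  // now includes c4 through c6
        System.out.println(material.search("c1"));      // outside the lookback window
    }
}
```

The sketch only handles exact revision matches in the fallback path, in line with the caveat above that free-text search against the SCM may not be feasible for every material type.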

This wouldn't achieve everything one might want to achieve re: fan-in and such, but I think it'd at least allow you to trigger a build with a commit from history - either for a brand new material, OR one where the URL has changed to a new underlying material.

User experience would possibly suck too.

Option 2
Allow the UI "trigger with options" to be triggered without the commit being known to the history, and resolved "later" when the material is asked to check out at a specific revision. I think you'd still need to populate the material's history to avoid other weirdness though (I remember you telling me that these modifications are somehow used in the fan-in logic... eugh), so this perhaps seems messier.

Option 3
Wild idea - when inserting new materials, always get N revisions (as well as the latest) from the history - or allow the user to somehow indicate this. Yeah, one would be guessing whether you need it, and there is probably no # of revisions that'd be good enough without catching people out.
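
As a rough illustration of Option 3, the selection of what to record for a newly seen material might look like the sketch below. `SeedHistorySketch`, `initialModifications`, and the parameter `n` are hypothetical and only model the "latest plus N earlier revisions" idea; nothing here reflects how GoCD currently inserts materials.

```java
import java.util.List;

// Hypothetical sketch of Option 3: seed a new material with HEAD plus up to n earlier commits.
final class SeedHistorySketch {

    // branchHistory is assumed to be ordered oldest first, with HEAD as the last element.
    static List<String> initialModifications(List<String> branchHistory, int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must not be negative");
        }
        int from = Math.max(0, branchHistory.size() - 1 - n);
        return List.copyOf(branchHistory.subList(from, branchHistory.size()));
    }

    public static void main(String[] args) {
        List<String> history = List.of("c1", "c2", "c3", "c4", "c5", "c6");
        // n = 0 records only the latest revision.
        System.out.println(initialModifications(history, 0));
        // n = 2 records HEAD and the two commits before it.
        System.out.println(initialModifications(history, 2));
    }
}
```

Whatever value of `n` is chosen, a commit one step older than the window remains unreachable, which is the guessing problem described above.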


I've no idea what consequences there would be for any of these - would probably need testing :-)
