Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: adds a status check for NCs #2716

Closed
wants to merge 1 commit into from
Closed

Conversation

abel2-code
Copy link

Reason for Change:
This is part of my intern project to help improve error messages.

Issue Fixed:
N/A

Requirements:

Notes:

@abel2-code abel2-code requested a review from a team as a code owner April 29, 2024 19:01
@abel2-code abel2-code requested a review from rbtr April 29, 2024 19:01
@@ -135,6 +137,21 @@ type containerstatus struct {
VfpUpdateComplete bool // True when VFP programming is completed for the NC
}

type multiContainerStatus map[string]containerstatus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we have to create a type for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. This does not need to be a type

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable is actually not needed as you are simply duplicating 'service.state.ContainerStatus'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I somewhat disagree with the sentiment in this thread. This should be an error implementation if anything, but the problem as it currently stands is that GetUnsuccessfulStatusErrors can return "" making it a poor error (since in its typical usage err != nil, but it is, in fact, nil).

So the idea of having some function determine whether or not there were errors is a good one, but it should actually produce an error or nil--not a string.

It should read something like this:

err := anyFailed(service.state.ContainerStatus)
if err != nil {
       // create the response but use `err.Error()`
}

You could have an error type like this to better deal with this:

type MultiContainerError map[string]v1alpha1.NCStatus

func (m *MultiContainerError) Error() string {
       out := bytes.NewBufferString("multiple NCs failed: ")
       for ncid, status := range m {
              fmt.Fprintf(out, "%s (%s) ")
       }
       return out.String()
}

@@ -135,6 +137,21 @@ type containerstatus struct {
VfpUpdateComplete bool // True when VFP programming is completed for the NC
}

type multiContainerStatus map[string]containerstatus

func (mcs *multiContainerStatus) GetUnsuccessfulStatusErrors() string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of making this a struct method, can we just make it a func that accepts `map[string]containerstatus instead?

Suggested change
func (mcs *multiContainerStatus) GetUnsuccessfulStatusErrors() string {
func GetUnsuccessfulStatusErrors(mcs map[string]containerstatus) string {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1/2) also not a fan of this function returning a string, maybe just return a list of NCs that doesn't have the NCUpdateSuccess status, and then we can do the processing in the caller?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of the method is the create the CNI Error message so it is ok to return the string here directly.

@@ -116,6 +116,23 @@ func (service *HTTPRestService) RequestIPConfigHandler(w http.ResponseWriter, r
return
}

// Check the status of the NC
ncStatuses := multiContainerStatus(service.state.ContainerStatus)
if unsuccessfulStatuses := ncStatuses.GetUnsuccessfulStatusErrors(); unsuccessfulStatuses != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(2/2) and then we can just check for the length of the list returned.

reserveResp := &cns.IPConfigResponse{
Response: cns.Response{
ReturnCode: types.FailedToAllocateIPConfig,
Message: unsuccessfulStatuses,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All you are trying to do is get the NC Status and create an error message from it. So change the name of the method to 'GetNcStatusErrorMessage'

Comment on lines +142 to +153
func (mcs *multiContainerStatus) GetUnsuccessfulStatusErrors() string {
var unsuccessfulStatuses []string
for ncID := range *mcs {
ncStatus := (*mcs)[ncID].CreateNetworkContainerRequest.NCStatus
if ncStatus != cns.NCUpdateSuccess {
unsuccessfulStatus := fmt.Sprintf("Expected status for NC %s to be %s but got %s", ncID, string(cns.NCUpdateSuccess), string(ncStatus))
unsuccessfulStatuses = append(unsuccessfulStatuses, unsuccessfulStatus)
}
}

return strings.Join(unsuccessfulStatuses, "\n")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (mcs *multiContainerStatus) GetUnsuccessfulStatusErrors() string {
var unsuccessfulStatuses []string
for ncID := range *mcs {
ncStatus := (*mcs)[ncID].CreateNetworkContainerRequest.NCStatus
if ncStatus != cns.NCUpdateSuccess {
unsuccessfulStatus := fmt.Sprintf("Expected status for NC %s to be %s but got %s", ncID, string(cns.NCUpdateSuccess), string(ncStatus))
unsuccessfulStatuses = append(unsuccessfulStatuses, unsuccessfulStatus)
}
}
return strings.Join(unsuccessfulStatuses, "\n")
}
func GetNcStatusErrorMessages(mcs map[string]containerstatus) string {
var unsuccessfulStatuses []string
for ncID, containerStatus := range mcs {
ncStatus := containerStatus.CreateNetworkContainerRequest.NCStatus
if ncStatus != cns.NCUpdateSuccess {
unsuccessfulStatuses = append(unsuccessfulStatuses, fmt.Sprintf("Expected status for NC %s to be %s but got %s. ", ncID, string(cns.NCUpdateSuccess), string(ncStatus)))
}
}
return strings.Join(unsuccessfulStatuses, "\n")
}

Comment on lines +119 to +134
// Check the status of the NC
ncStatuses := multiContainerStatus(service.state.ContainerStatus)
if unsuccessfulStatuses := ncStatuses.GetUnsuccessfulStatusErrors(); unsuccessfulStatuses != "" {
// If the status is anything other than success, we send a response back with the actual status and NC ID.
reserveResp := &cns.IPConfigResponse{
Response: cns.Response{
ReturnCode: types.FailedToAllocateIPConfig,
Message: unsuccessfulStatuses,
},
}

w.Header().Set(cnsReturnCode, reserveResp.Response.ReturnCode.String())
err = service.Listener.Encode(w, &reserveResp)
logger.ResponseEx(service.Name+operationName, ipconfigRequest, reserveResp, reserveResp.Response.ReturnCode, err)
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By checking the NCStatus early you are blocking the call to the service. I don't think it is the right behavior here. This will block IP allocations from the IPs that are available on the node itself and are getting reclaimed while the NC status is SubnetFull or something else. We should not be determining the response state from the state of the NC. Basically it should be more of a helper method to add to the response message on line 153.

for ncID := range *mcs {
ncStatus := (*mcs)[ncID].CreateNetworkContainerRequest.NCStatus
if ncStatus != cns.NCUpdateSuccess {
unsuccessfulStatus := fmt.Sprintf("Expected status for NC %s to be %s but got %s", ncID, string(cns.NCUpdateSuccess), string(ncStatus))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good output for a test, but not for an error. Generally, it should be concise and lower-case. As in my other comment, I think something like this would be more typical:

multiple NCs failed: uuid1 (status1), uuid2 (status2), uuid3 (status3)

It's implied that the expectation was success, so logging that just consumes log disk space. Though that is perfectly fine (and desirable!) to mention in go test output.

}
}

return strings.Join(unsuccessfulStatuses, "\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it's unusual to join by newlines for errors since the log lines themselves are newline separated

Copy link

This pull request is stale because it has been open for 2 weeks with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the stale Stale due to inactivity. label May 17, 2024
Copy link

Pull request closed due to inactivity.

@github-actions github-actions bot closed this May 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Stale due to inactivity.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants