Bug Report
Description
When using the Istio Gateway API integration with networking.istio.io/service-type: ClusterIP on a Gateway resource, the aws-load-balancer-controller's Service reconciler panics with a nil pointer dereference. The panic is recovered, so it is log noise only (ALBs and NLBs are unaffected), but it fires on every reconcile loop.
What happened
Istiod creates a ClusterIP-typed Service (e.g. dev-gateway-external-istio) for a Gateway annotated with networking.istio.io/service-type: ClusterIP. This suppresses NLB provisioning, with the intent that an external ALB is provisioned separately via a Kubernetes Ingress resource targeting the Gateway pods directly (target-type: ip).
The Service reconciler picks up this ClusterIP service and calls buildModel, which correctly returns lb == nil. The if lb == nil cleanup branch runs and completes successfully (no-op since there is no finalizer). However, the reconciler then falls through to reconcileLoadBalancerResources(ctx, svc, stack, lb, backendSGRequired) with lb still nil. Inside that function, lb.DNSName().Resolve(ctx) dereferences the nil pointer and panics.
Observed log
{
"level": "error",
"controller": "service",
"name": "dev-gateway-external-istio",
"namespace": "istio-ingress",
"error": "panic: runtime error: invalid memory address or nil pointer dereference [recovered]"
}
Root cause
In controllers/service/service_controller.go, the reconcile() function is missing a return nil after the successful cleanup path when lb == nil:
// controllers/service/service_controller.go
func (r *serviceReconciler) reconcile(ctx context.Context, req reconcile.Request) error {
...
if lb == nil {
cleanupLoadBalancerFn := func() {
err = r.cleanupLoadBalancerResources(ctx, svc, stack)
}
r.metricsCollector.ObserveControllerReconcileLatency(controllerName, "cleanup_load_balancer", cleanupLoadBalancerFn)
if err != nil {
return ctrlerrors.NewErrorWithMetrics(controllerName, "cleanup_load_balancer_error", err, r.metricsCollector)
}
// BUG: missing `return nil` here — falls through to reconcileLoadBalancerResources with lb == nil
}
return r.reconcileLoadBalancerResources(ctx, svc, stack, lb, backendSGRequired)
}
The nil dereference occurs at this line inside reconcileLoadBalancerResources:
lbDNS, err = lb.DNSName().Resolve(ctx) // panic: lb is nil
Proposed fix
Add a return nil after the successful cleanup when lb == nil:
if lb == nil {
cleanupLoadBalancerFn := func() {
err = r.cleanupLoadBalancerResources(ctx, svc, stack)
}
r.metricsCollector.ObserveControllerReconcileLatency(controllerName, "cleanup_load_balancer", cleanupLoadBalancerFn)
if err != nil {
return ctrlerrors.NewErrorWithMetrics(controllerName, "cleanup_load_balancer_error", err, r.metricsCollector)
}
return nil // <-- fix: do not fall through to reconcileLoadBalancerResources when there is no LB
}
return r.reconcileLoadBalancerResources(ctx, svc, stack, lb, backendSGRequired)
Steps to reproduce
- Install the aws-load-balancer-controller (tested on v2.17.1; bug present on current
main).
- Install Istio with Gateway API support.
- Create a Gateway resource with
networking.istio.io/service-type: ClusterIP in the annotations:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
name: my-external-gateway
namespace: istio-ingress
annotations:
networking.istio.io/service-type: ClusterIP
spec:
gatewayClassName: istio
listeners:
- name: https
port: 443
protocol: HTTPS
...
- Observe repeated panic logs from the Service reconciler for the
<gateway-name>-istio ClusterIP service.
Environment
- aws-load-balancer-controller version: v2.17.1 (also confirmed unfixed on current
main / v3.2.2)
- Kubernetes version: 1.32
- Istio version: 1.24.x
- Gateway API version: v1.2.x
Impact
The panic is recovered by the runtime.HandleReconcileError wrapper, so actual load balancer provisioning is unaffected. ALBs (provisioned via Ingress) and NLBs (provisioned via LoadBalancer-typed Services) continue to function correctly. The issue produces repeated error log noise on every reconcile interval for the affected service.
Bug Report
Description
When using the Istio Gateway API integration with
networking.istio.io/service-type: ClusterIPon a Gateway resource, the aws-load-balancer-controller's Service reconciler panics with a nil pointer dereference. The panic is recovered, so it is log noise only (ALBs and NLBs are unaffected), but it fires on every reconcile loop.What happened
Istiod creates a
ClusterIP-typed Service (e.g.dev-gateway-external-istio) for a Gateway annotated withnetworking.istio.io/service-type: ClusterIP. This suppresses NLB provisioning, with the intent that an external ALB is provisioned separately via a Kubernetes Ingress resource targeting the Gateway pods directly (target-type: ip).The Service reconciler picks up this ClusterIP service and calls
buildModel, which correctly returnslb == nil. Theif lb == nilcleanup branch runs and completes successfully (no-op since there is no finalizer). However, the reconciler then falls through toreconcileLoadBalancerResources(ctx, svc, stack, lb, backendSGRequired)withlbstillnil. Inside that function,lb.DNSName().Resolve(ctx)dereferences the nil pointer and panics.Observed log
Root cause
In
controllers/service/service_controller.go, thereconcile()function is missing areturn nilafter the successful cleanup path whenlb == nil:The nil dereference occurs at this line inside
reconcileLoadBalancerResources:Proposed fix
Add a
return nilafter the successful cleanup whenlb == nil:Steps to reproduce
main).networking.istio.io/service-type: ClusterIPin the annotations:<gateway-name>-istioClusterIP service.Environment
main/ v3.2.2)Impact
The panic is recovered by the
runtime.HandleReconcileErrorwrapper, so actual load balancer provisioning is unaffected. ALBs (provisioned via Ingress) and NLBs (provisioned via LoadBalancer-typed Services) continue to function correctly. The issue produces repeated error log noise on every reconcile interval for the affected service.