Envoy Proxy for Terminating Gateway fails to configure dynamic cluster #20439

Open
cyclops23 opened this issue Apr 18, 2024 · 0 comments

Nomad version

Nomad v1.7.6
BuildDate 2024-03-12T07:27:36Z
Revision 594fedbfbc4f0e532b65e8a69b28ff9403eb822e

Consul version

Consul v1.18.1
Revision 98cb473c
Build Date 2024-03-26T21:59:08Z

Operating system and Environment details

Linux ip-XX-XX-XXX-XXX 5.15.0-1056-aws #61~20.04.1-Ubuntu SMP Wed Mar 13 17:45:04 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

Issue

I'm attempting to set up a terminating gateway for DynamoDB. The Envoy proxy starts successfully, but the dynamic cluster representing the terminating gateway service is never added.

In the Consul / Nomad UIs everything looks good:

[Two screenshots of the Consul and Nomad UIs, taken 2024-04-18 at 12:03:21 and 12:03:49]

However, the external service is not accessible through the service mesh.

Reproduction steps

  1. Create an external service in Consul
  2. Create a terminating gateway job in Nomad

I've uploaded the relevant configuration files to https://github.com/cyclops23/nomad-bug-tgw

Expected Result

The external service should be accessible through the terminating gateway (or some meaningful error message should be provided if there is a problem with the configuration).

I expect the dynamic cluster representing the external service to be added to Envoy, as in this example:

[2024-04-18 11:42:34.877][1][info][upstream] [source/common/upstream/cluster_manager_impl.cc:222] cm init: initializing cds
[2024-04-18 11:42:34.879][1][info][main] [source/server/server.cc:934] starting main dispatch loop
[2024-04-18 11:42:34.884][1][info][upstream] [source/common/upstream/cds_api_helper.cc:32] cds: add 1 cluster(s), remove 0 cluster(s)
[2024-04-18 11:42:34.919][1][info][upstream] [source/common/upstream/cds_api_helper.cc:71] cds: added/updated 1 cluster(s), skipped 0 unmodified cluster(s)
[2024-04-18 11:42:34.921][1][info][upstream] [source/common/upstream/cluster_manager_impl.cc:226] cm init: all clusters initialized

Actual Result

Requests to the external service are routed to the terminating gateway and fail.

Inspecting the Envoy logs shows that the cluster for the gateway is never added via xDS:

[info] cm init: initializing cds
[info] starting main dispatch loop
[debug] [Tags: \"ConnectionId\":\"0\"] connected
[debug] [Tags: \"ConnectionId\":\"0\"] connected
[debug] [Tags: \"ConnectionId\":\"0\"] attaching to next stream
[debug] [Tags: \"ConnectionId\":\"0\"] creating stream
[debug] [Tags: \"ConnectionId\":\"0\",\"StreamId\":\"10539488469085721247\"] pool ready
[debug] [Tags: \"ConnectionId\":\"0\",\"StreamId\":\"10539488469085721247\"] upstream headers complete: end_stream=false
[debug] async http request response headers (end_stream=false):\n':status', '200'\n'content-type', 'application/grpc'\n
[debug] Received DeltaDiscoveryResponse for type.googleapis.com/envoy.config.cluster.v3.Cluster at version 
[info] cds: add 0 cluster(s), remove 0 cluster(s)
[info] cds: added/updated 0 cluster(s), skipped 0 unmodified cluster(s)
[debug] maybe finish initialize state: 4
[debug] maybe finish initialize primary init clusters empty: true
[debug] maybe finish initialize secondary init clusters empty: true
[debug] maybe finish initialize cds api ready: true
[info] cm init: all clusters initialized
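
For anyone comparing the two logs above, here is a minimal sketch that pulls the CDS counts out of an Envoy log. The function name `cds_updates` is hypothetical. It assumes only the `cds: add N cluster(s), remove M cluster(s)` line format shown in the excerpts above.

```shell
# Read an Envoy log on stdin and print "added removed" for every
# "cds: add N cluster(s), remove M cluster(s)" line it contains.
# The "cds: added/updated ..." summary lines are skipped on purpose.
cds_updates() {
  sed -n 's/.*cds: add \([0-9][0-9]*\) cluster(s), remove \([0-9][0-9]*\) cluster(s).*/\1 \2/p'
}
```

Piping the gateway task's log through `cds_updates` should print `1 0` for a log like the expected one and `0 0` for the failing case reported here.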

Additional config / debug info

# consul config read -kind terminating-gateway -name ext-dynamodb-tgw
{
    "Kind": "terminating-gateway",
    "Name": "ext-dynamodb-tgw",
    "Services": [
        {
            "Name": "ext-dynamodb",
            "CAFile": "/etc/ssl/certs/Amazon_Root_CA_1.pem",
            "SNI": "dynamodb.us-east-1.amazonaws.com"
        }
    ],
    "CreateIndex": 438709,
    "ModifyIndex": 438709
}
# curl -s -H "X-Consul-Token:${CONSUL_HTTP_TOKEN}" "${CONSUL_HTTP_ADDR}/v1/catalog/service/ext-dynamodb-tgw" | jq  '.[] | { ServiceKind, ServiceName, ServiceID }'
{
  "ServiceKind": "terminating-gateway",
  "ServiceName": "ext-dynamodb-tgw",
  "ServiceID": "_nomad-task-f0b1b6d5-ef0f-ec7c-14c6-3112685453aa-group-ext-dynamodb-tgw-ext-dynamodb-tgw-connect-terminating-ext-dynamodb-tgw"
}
# cat .envoy_bootstrap.cmd
connect envoy -grpc-addr unix://alloc/tmp/consul_grpc.sock -http-addr 127.0.0.1:8501 -admin-bind 127.0.0.2:19000 -address 127.0.0.1:19100 -proxy-id _nomad-task-f0b1b6d5-ef0f-ec7c-14c6-3112685453aa-group-ext-dynamodb-tgw-ext-dynamodb-tgw-connect-terminating-ext-dynamodb-tgw -bootstrap -gateway terminating -token <REDACTED> -grpc-ca-file /opt/consul/tls/ca.pem -ca-file /opt/consul/tls/ca.pem -client-cert /opt/nomad/tls/cert.pem -client-key /opt/nomad/tls/private-key.pem
# cat .envoy_bootstrap.env
[
    "LANG=C.UTF-8",
    "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin",
    "HOME=/root",
    "LOGNAME=root",
    "USER=root",
    "SHELL=/bin/sh",
    "INVOCATION_ID=92dfa72f396a42d69b4c3d62c526fcc5",
    "JOURNAL_STREAM=8:27232",
    "CONSUL_HTTP_SSL=true",
    "CONSUL_HTTP_SSL_VERIFY=false",
    "NOMAD_ALLOC_ID=f0b1b6d5-ef0f-ec7c-14c6-3112685453aa",
    "NOMAD_SHORT_ALLOC_ID=f0b1b6d5",
    "NOMAD_ALLOC_NAME=ext-dynamodb-tgw.ext-dynamodb-tgw[0]",
    "NOMAD_GROUP_NAME=ext-dynamodb-tgw",
    "NOMAD_JOB_NAME=ext-dynamodb-tgw",
    "NOMAD_JOB_ID=ext-dynamodb-tgw",
    "NOMAD_NAMESPACE=default",
    "NOMAD_REGION=global"
]
# cat envoy_bootstrap.json
{
  "admin": {
    "access_log": [
      {
        "name": "Consul Listener Filter Log",
        "typedConfig": {
          "@type": "type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog",
          "logFormat": {
            ...
          }
        }
      }
    ],
    "address": {
      "socket_address": {
        "address": "127.0.0.2",
        "port_value": 19000
      }
    }
  },
  "node": {
    "cluster": "terminating-gateway",
    "id": "_nomad-task-f0b1b6d5-ef0f-ec7c-14c6-3112685453aa-group-ext-dynamodb-tgw-ext-dynamodb-tgw-connect-terminating-ext-dynamodb-tgw",
    "metadata": {
      "namespace": "default",
      "partition": "default"
    }
  },
  "layered_runtime": {
    "layers": [
      {
        "name": "base",
        "static_layer": {
          "re2.max_program_size.error_level": 1048576
        }
      }
    ]
  },
  "static_resources": {
    "clusters": [
      {
        "name": "local_agent",
        "ignore_health_on_host_removal": false,
        "connect_timeout": "1s",
        "type": "STATIC",
        "transport_socket": {
          "name": "tls",
          "typed_config": {
            "@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext",
            "common_tls_context": {
              "validation_context": {
                "trusted_ca": {
                  "inline_string": "-----BEGIN CERTIFICATE-----\n<REDACTED\n-----END CERTIFICATE-----\n"
                }
              }
            }
          }
        },
        "typed_extension_protocol_options": {
          "envoy.extensions.upstreams.http.v3.HttpProtocolOptions": {
            "@type": "type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions",
            "explicit_http_config": {
              "http2_protocol_options": {}
            }
          }
        },
        "loadAssignment": {
          "clusterName": "local_agent",
          "endpoints": [
            {
              "lbEndpoints": [
                {
                  "endpoint": {
                    "address": {
                      "pipe": {
                        "path": "alloc/tmp/consul_grpc.sock"
                      }
                    }
                  }
                }
              ]
            }
          ]
        }
      },
      {
        "name": "self_admin",
        "ignore_health_on_host_removal": false,
        "connect_timeout": "5s",
        "type": "STATIC",
        "typed_extension_protocol_options": {
          "envoy.extensions.upstreams.http.v3.HttpProtocolOptions": {
            "@type": "type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions",
            "explicit_http_config": {
              "http_protocol_options": {}
            }
          }
        },
        "loadAssignment": {
          "clusterName": "self_admin",
          "endpoints": [
            {
              "lbEndpoints": [
                {
                  "endpoint": {
                    "address": {
                      "socket_address": {
                        "address": "127.0.0.2",
                        "port_value": 19000
                      }
                    }
                  }
                }
              ]
            }
          ]
        }
      }
    ],
    "listeners": [
      {
        "name": "envoy_prometheus_metrics_listener",
        "address": {
          "socket_address": {
            "address": "127.0.0.1",
            "port_value": 9102
          }
        },
        "filter_chains": [
          {
            "filters": [
              {
                "name": "envoy.filters.network.http_connection_manager",
                "typedConfig": {
                  "@type": "type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager",
                  "stat_prefix": "envoy_prometheus_metrics",
                  "codec_type": "HTTP1",
                  "route_config": {
                    "name": "self_admin_route",
                    "virtual_hosts": [
                      {
                        "name": "self_admin",
                        "domains": [
                          "*"
                        ],
                        "routes": [
                          {
                            "match": {
                              "path": "/metrics"
                            },
                            "route": {
                              "cluster": "self_admin",
                              "prefix_rewrite": "/stats/prometheus"
                            }
                          },
                          {
                            "match": {
                              "prefix": "/"
                            },
                            "direct_response": {
                              "status": 404
                            }
                          }
                        ]
                      }
                    ]
                  },
                  "http_filters": [
                    {
                      "name": "envoy.filters.http.router",
                      "typedConfig": {
                        "@type": "type.googleapis.com/envoy.extensions.filters.http.router.v3.Router"
                      }
                    }
                  ]
                }
              }
            ]
          }
        ]
      }
    ]
  },
  "stats_config": {
    "stats_tags": [
      ...
    ],
    "use_all_default_tags": true
  },
  "dynamic_resources": {
    "lds_config": {
      "ads": {},
      "initial_fetch_timeout": "0s",
      "resource_api_version": "V3"
    },
    "cds_config": {
      "ads": {},
      "initial_fetch_timeout": "0s",
      "resource_api_version": "V3"
    },
    "ads_config": {
      "api_type": "DELTA_GRPC",
      "transport_api_version": "V3",
      "grpc_services": {
        "initial_metadata": [
          {
            "key": "x-consul-token",
            "value": "<REDACTED>"
          }
        ],
        "envoy_grpc": {
          "cluster_name": "local_agent"
        }
      }
    }
  }
}
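
The same thing can be cross-checked against Envoy's admin interface, which the bootstrap above binds to 127.0.0.2:19000. This is a rough sketch, not a verified procedure. The helper names `cluster_names` and `envoy_cluster_names` are hypothetical. It assumes the plain-text `/clusters` admin output uses `::`-separated fields with the cluster name first.

```shell
# Reduce Envoy's plain-text /clusters admin output (read from stdin) to the
# unique cluster names, i.e. the first "::"-separated field of each line.
cluster_names() {
  awk -F'::' 'NF > 1 { print $1 }' | sort -u
}

# Hypothetical wrapper: fetch /clusters from the admin listener and list the
# names. The default address is the -admin-bind value from .envoy_bootstrap.cmd.
envoy_cluster_names() {
  curl -s "http://${1:-127.0.0.2:19000}/clusters" | cluster_names
}
```

If `envoy_cluster_names` lists only the static `local_agent` and `self_admin` clusters, that would agree with the log showing no cluster delivered via CDS. The admin address may only be reachable from inside the allocation's network namespace, so the command likely needs to run there.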

From the agent where the terminating gateway is running:

# curl -s http://127.0.0.0:8500/v1/agent/services | jq '.["_nomad-task-f0b1b6d5-ef0f-ec7c-14c6-3112685453aa-group-ext-dynamodb-tgw-ext-dynamodb-tgw-connect-terminating-ext-dynamodb-tgw"]'
{
  "Kind": "terminating-gateway",
  "ID": "_nomad-task-f0b1b6d5-ef0f-ec7c-14c6-3112685453aa-group-ext-dynamodb-tgw-ext-dynamodb-tgw-connect-terminating-ext-dynamodb-tgw",
  "Service": "ext-dynamodb-tgw",
  "Tags": [],
  "Meta": {
    "external-source": "nomad"
  },
  "Port": 28117,
  "Address": "<REDACTED>",
  "TaggedAddresses": {
    "lan_ipv4": {
      "Address": "<REDACTED>",
      "Port": 28117
    },
    "wan_ipv4": {
      "Address": "<REDACTED>",
      "Port": 28117
    }
  },
  "Weights": {
    "Passing": 1,
    "Warning": 1
  },
  "EnableTagOverride": false,
  "Proxy": {
    "Config": {
      "component_log_level": "upstream:trace,http:trace,router:trace,config:trace",
      "connect_timeout_ms": 5000,
      "envoy_gateway_bind_addresses": {
        "default": {
          "Address": "0.0.0.0",
          "Port": 28117
        }
      },
      "envoy_gateway_no_default_bind": true,
      "envoy_prometheus_bind_addr": "127.0.0.1:9102",
      "log_level": "debug",
      "protocol": "tcp"
    },
    "MeshGateway": {},
    "Expose": {},
    "AccessLogs": {
      "Enabled": true
    }
  },
  "Datacenter": "aws-us-east-1"
}

Please let me know if there are additional debugging steps you can suggest, or if you need more information on the issue.

@tgross tgross added this to Needs Triage in Nomad - Community Issues Triage via automation Apr 18, 2024