Pulumi Automation API throws Grpc.Core.RpcException #126

Open
klyse opened this issue Apr 5, 2023 · 6 comments
Assignees
Labels
kind/bug Some behavior is incorrect or out of spec

Comments

klyse commented Apr 5, 2023

What happened?

I have an ASP.NET Core 7 project that uses the inline Pulumi Automation API.

Example:

var stack = await GetStackAsync(vm, cancellationToken);
await stack.CancelAsync(token);
await stack.RefreshAsync(cancellationToken: token);
await stack.UpAsync(cancellationToken: token);

I use Sentry for error tracking. I'm getting a lot of errors (16k in the last 24h) that I only partially understand. They seem benign.

Grpc.Core.RpcException: Status(StatusCode="Unavailable", Detail="Error starting gRPC call. HttpRequestException: Connection refused (127.0.0.1:37649) SocketException: Connection refused", DebugException="System.Net.Http.HttpRequestException: Connection refused (127.0.0.1:37649)
 ---> System.Net.Sockets.SocketException (111): Connection refused
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|281_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.AddHttp2ConnectionAsync(QueueItem queueItem)
   at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.HttpConnectionWaiter`1.WaitForConnectionAsync(Boolean async, CancellationToken requestCancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout)")
  ?, in async Task GrpcEngine.LogAsync(LogRequest request)
  ?, in async Task EngineLogger.LogAsync(LogSeverity severity, string message, Resource resource, int? streamId, bool? ephemeral)

The errors are not shown in the console, and they are not caught by a try/catch block either. My understanding is that in

private class InlineLanguageHost : IAsyncDisposable
{
    private readonly TaskCompletionSource<int> _portTcs = new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);
    private readonly CancellationToken _cancelToken;
    private readonly IHost _host;
    private readonly CancellationTokenRegistration _portRegistration;

    public InlineLanguageHost(
        PulumiFn program,
        ILogger? logger,
        CancellationToken cancellationToken)
    {
        this._cancelToken = cancellationToken;
        this._host = Host.CreateDefaultBuilder()
            .ConfigureWebHostDefaults(webBuilder =>
            {
                webBuilder
                    .ConfigureKestrel(kestrelOptions =>
                    {
                        kestrelOptions.Listen(IPAddress.Loopback, 0, listenOptions =>
                        {
                            listenOptions.Protocols = HttpProtocols.Http2;
                        });
                    })
                    .ConfigureAppConfiguration((context, config) =>
                    {
                        // clear so we don't read appsettings.json
                        // note that we also won't read environment variables for config
                        config.Sources.Clear();
                    })
                    .ConfigureLogging(loggingBuilder =>
                    {
                        // disable default logging
                        loggingBuilder.ClearProviders();
                    })
                    .ConfigureServices(services =>
                    {
                        // to be injected into LanguageRuntimeService
                        var callerContext = new LanguageRuntimeService.CallerContext(program, logger, cancellationToken);
                        services.AddSingleton(callerContext);
                        services.AddGrpc(grpcOptions =>
                        {
                            grpcOptions.MaxReceiveMessageSize = LanguageRuntimeService.MaxRpcMesageSize;
                            grpcOptions.MaxSendMessageSize = LanguageRuntimeService.MaxRpcMesageSize;
                        });
                    })
                    .Configure(app =>
                    {
                        app.UseRouting();
                        app.UseEndpoints(endpoints =>
                        {
                            endpoints.MapGrpcService<LanguageRuntimeService>();
                        });
                    });
            })
            .Build();

        // before starting the host, set up this callback to tell us what port was selected
        this._portRegistration = this._host.Services.GetRequiredService<IHostApplicationLifetime>().ApplicationStarted.Register(() =>
        {
            try
            {
                var serverFeatures = this._host.Services.GetRequiredService<IServer>().Features;
                var addresses = serverFeatures.Get<IServerAddressesFeature>()!.Addresses.ToList();
                Debug.Assert(addresses.Count == 1, "Server should only be listening on one address");
                var uri = new Uri(addresses[0]);
                this._portTcs.TrySetResult(uri.Port);
            }
            catch (Exception ex)
            {
                this._portTcs.TrySetException(ex);
            }
        });
    }

    public Task StartAsync()
        => this._host.StartAsync(this._cancelToken);

    public Task<int> GetPortAsync()
        => this._portTcs.Task;

    public bool TryGetExceptionInfo([NotNullWhen(true)] out ExceptionDispatchInfo? info)
    {
        var callerContext = this._host.Services.GetRequiredService<LanguageRuntimeService.CallerContext>();
        if (callerContext.ExceptionDispatchInfo is null)
        {
            info = null;
            return false;
        }

        info = callerContext.ExceptionDispatchInfo;
        return true;
    }

    public async ValueTask DisposeAsync()
    {
        this._portRegistration.Unregister();
        await this._host.StopAsync(this._cancelToken).ConfigureAwait(false);
        this._host.Dispose();
    }
}
static void ApplyUpdateOptions(UpdateOptions options, List<string> args)
you create a gRPC host which lets the Go runtime connect to the C# runtime. Is my assumption correct? Then, when Pulumi's UpAsync has completed, the server is destroyed before the client, which generates this exception.

Expected Behavior

No error is generated when Pulumi UpAsync completes successfully.

Steps to reproduce

Create a LocalWorkspace and then execute UpAsync

var stackArgs = new InlineProgramArgs("instance", stackId, program);
stackArgs.ProjectSettings!.Backend = new ProjectBackend
{
	Url = "backend"
};

var stack = await LocalWorkspace.CreateOrSelectStackAsync(stackArgs, cancellationToken);

Catching the error is the tricky part. I can only see it being caught in Sentry. Maybe there is a different API one can consume to catch those unhandled exceptions.
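
One place to look, assuming these failures come from fire-and-forget tasks whose exceptions are never awaited (the fact that a try/catch around UpAsync does not see them points that way, but it is not confirmed in this issue), is the standard .NET TaskScheduler.UnobservedTaskException event. The sketch below is not part of the Pulumi API; the UnobservedTaskLogger class and its Seen queue are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not a Pulumi type): records exceptions from tasks
// that faulted without anyone awaiting or otherwise observing them.
public static class UnobservedTaskLogger
{
    // Exceptions collected so far; a real application would log these instead.
    public static readonly ConcurrentQueue<Exception> Seen = new ConcurrentQueue<Exception>();

    private static int _installed;

    // Subscribes to the process-wide event once; later calls are no-ops.
    public static void Install()
    {
        if (Interlocked.Exchange(ref _installed, 1) != 0)
        {
            return;
        }

        TaskScheduler.UnobservedTaskException += (sender, e) =>
        {
            // e.Exception is an AggregateException; unwrap it to the real causes.
            foreach (var inner in e.Exception.Flatten().InnerExceptions)
            {
                Seen.Enqueue(inner);
            }

            // Mark the exception as observed so it is not escalated further.
            e.SetObserved();
        };
    }
}
```

Calling UnobservedTaskLogger.Install() once at startup would be enough. The event is raised only when the faulted task is finalized by the garbage collector, so entries can appear well after the operation that caused them.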

Output of pulumi about

pulumi about
CLI
Version      3.60.0
Go Version   go1.20.2
Go Compiler  gc

Host
OS       darwin
Version  13.3
Arch     arm64

Backend
Name           pulumi.com
URL            https://app.pulumi.com/xxx
User           klyse
Organizations  xxx

Pulumi locates its logs in xxx by default
warning: Failed to read project: no Pulumi.yaml project file found (searching upwards from xxx). If you have not created a project yet, use `pulumi new` to do so: no project file found
warning: Failed to get information about the current stack: no Pulumi.yaml project file found (searching upwards from xxx). If you have not created a project yet, use `pulumi new` to do so: no project file found
warning: A new version of Pulumi is available. To upgrade from version '3.60.0' to '3.61.0', run
   $ brew update && brew upgrade pulumi
or visit https://pulumi.com/docs/reference/install/ for manual instructions and release notes.

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

klyse added the kind/bug (Some behavior is incorrect or out of spec) and needs-triage (Needs attention from the triage team) labels Apr 5, 2023
Contributor

dixler commented Apr 5, 2023

Thanks for filing this. @Frassle as secondary, do you mind looking into this?

Frassle self-assigned this Apr 5, 2023
Member

Frassle commented Apr 6, 2023

I think what's happening is that the dotnet inline program isn't waiting for all logs to be written before telling the engine it's safe to shut down. That appears to be the case when the inline program throws an exception; it's less clear why you'd be seeing this for programs that don't error.
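
The general shape of "wait for all logs before signalling shutdown" can be sketched as follows. This is a generic pattern under stated assumptions, not the Pulumi SDK's implementation and not necessarily the fix: PendingLogTracker, Track, and DrainAsync are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: remembers in-flight log calls so the caller can
// await them all before reporting that the program has finished.
public sealed class PendingLogTracker
{
    private readonly object _gate = new object();
    private readonly List<Task> _pending = new List<Task>();

    // Register a log call that was started but not awaited.
    public void Track(Task logTask)
    {
        lock (_gate)
        {
            _pending.Add(logTask);
        }
    }

    // Completes once every tracked task has finished, including tasks
    // that were added while an earlier batch was still being awaited.
    public async Task DrainAsync()
    {
        while (true)
        {
            Task[] snapshot;
            lock (_gate)
            {
                snapshot = _pending.ToArray();
                _pending.Clear();
            }

            if (snapshot.Length == 0)
            {
                return;
            }

            // Awaiting here also observes any exception thrown by a log call.
            await Task.WhenAll(snapshot).ConfigureAwait(false);
        }
    }
}
```

With something like this, the inline host would await DrainAsync() before replying to the engine, so that no log request is sent after the engine's gRPC endpoint has gone away.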

Author

klyse commented Apr 6, 2023

I think this is happening for every operation we run, successful or unsuccessful, and maybe even for RefreshAsync. Otherwise I'm unsure how we could accumulate 16k errors in 24h.

Member

Frassle commented Apr 6, 2023

Yeah, will keep looking. It's definitely a problem for errored stacks, so I'll get that fixed and then see if I can repro this for non-error stacks as well.

dixler removed the needs-triage (Needs attention from the triage team) label Apr 6, 2023
Author

klyse commented May 31, 2023

Do you have any updates on this, @Frassle?

Member

Frassle commented May 31, 2023

No, sorry. I'm not confident #127 is the right fix for this, and it fell off my work list. I do mean to come back to it, but it's probably going to be a couple of weeks still while we work on some other things with deadlines.
