Error managing cluster: Failure updating scheduler state when checking-in: Exception while reading from stream #2314

adrianmoraret-mdsol · 2024-03-29T18:40:04Z

Describe the bug
Getting the following error: "Error managing cluster: Failure updating scheduler state when checking-in: Exception while reading from stream"

StackTrace:

Quartz.JobPersistenceException: Failure updating scheduler state when checking-in: Exception while reading from stream
 ---> Npgsql.NpgsqlException (0x80004005): Exception while reading from stream
 ---> System.TimeoutException: Timeout during reading attempt
   at Npgsql.Internal.NpgsqlConnector.<ReadMessage>g__ReadMessageLong|211_0(NpgsqlConnector connector, Boolean async, DataRowLoadingMode dataRowLoadingMode, Boolean readingNotifications, Boolean isReadingPrependedMessage)
   at Npgsql.NpgsqlDataReader.NextResult(Boolean async, Boolean isConsuming, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteNonQuery(Boolean async, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.StdAdoDelegate.UpdateSchedulerState(ConnectionAndTransactionHolder conn, String instanceName, DateTimeOffset checkInTime, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.JobStoreSupport.ClusterCheckIn(ConnectionAndTransactionHolder conn, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at Quartz.Impl.AdoJobStore.JobStoreSupport.ClusterCheckIn(ConnectionAndTransactionHolder conn, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.JobStoreSupport.DoCheckin(Guid requestorId, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.JobStoreSupport.DoCheckin(Guid requestorId, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.ClusterManager.Manage() [See nested exception: Npgsql.NpgsqlException (0x80004005): Exception while reading from stream
 ---> System.TimeoutException: Timeout during reading attempt
   at Npgsql.Internal.NpgsqlConnector.<ReadMessage>g__ReadMessageLong|211_0(NpgsqlConnector connector, Boolean async, DataRowLoadingMode dataRowLoadingMode, Boolean readingNotifications, Boolean isReadingPrependedMessage)
   at Npgsql.NpgsqlDataReader.NextResult(Boolean async, Boolean isConsuming, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteNonQuery(Boolean async, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.StdAdoDelegate.UpdateSchedulerState(ConnectionAndTransactionHolder conn, String instanceName, DateTimeOffset checkInTime, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.JobStoreSupport.ClusterCheckIn(ConnectionAndTransactionHolder conn, CancellationToken cancellationToken)]

Version used

3.8.1

To Reproduce

Unable to replicate the issue.

Expected behavior

Quartz should recover from this failed state.

Additional context
Quartz configuration

  services.AddQuartz(opt =>
  {
      opt.UsePersistentStore(storeOptions =>
      {
          storeOptions.UseProperties = true;
          storeOptions.UseClustering();
          storeOptions.UsePostgres(connectionString);
          storeOptions.UseNewtonsoftJsonSerializer();
      });
  });
  services.AddQuartzHostedService(options => options.WaitForJobsToComplete = true);

lahma · 2024-04-02T08:58:01Z

So does Quartz fail to recover, or is this just a logged error from a passing problem?

adrianmoraret-mdsol · 2024-04-02T14:55:15Z

Yes. Quartz did not recover from this. Not sure if there is some additional configuration that I am missing.
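
For reference, a minimal sketch of the configuration knobs that seem relevant to a timed-out cluster check-in, based on the setup in the original report. The helper name `AddClusteredQuartz` and all numeric values are illustrative assumptions, not recommendations, and nothing in this thread establishes that changing them would have prevented the stall described here. It assumes the `ClusterOptions` overload of `UseClustering` and the `CommandTimeout` / `KeepAlive` properties of `NpgsqlConnectionStringBuilder`.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Npgsql;
using Quartz;

public static class QuartzClusterSetup
{
    // Hypothetical helper: the name and every numeric value below are illustrative only.
    public static void AddClusteredQuartz(IServiceCollection services, string connectionString)
    {
        // Adjust Npgsql-level timeouts on top of the existing connection string.
        var npgsql = new NpgsqlConnectionStringBuilder(connectionString)
        {
            CommandTimeout = 60, // seconds to wait for a command before timing out
            KeepAlive = 30       // seconds of inactivity before a keepalive query is sent
        };

        services.AddQuartz(opt =>
        {
            opt.UsePersistentStore(storeOptions =>
            {
                storeOptions.UseProperties = true;
                storeOptions.UseClustering(cluster =>
                {
                    // How often this instance checks in with the cluster.
                    cluster.CheckinInterval = TimeSpan.FromSeconds(10);
                    // How late a check-in may be before the instance is considered failed.
                    cluster.CheckinMisfireThreshold = TimeSpan.FromSeconds(20);
                });
                storeOptions.UsePostgres(npgsql.ConnectionString);
                storeOptions.UseNewtonsoftJsonSerializer();
            });
        });
        services.AddQuartzHostedService(options => options.WaitForJobsToComplete = true);
    }
}
```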

lahma · 2024-04-03T12:09:16Z

Looking at the code, all exceptions should be tolerated and the cluster manager should just continue after that.

adrianmoraret-mdsol · 2024-04-05T17:32:42Z

The job was not running. I looked for exceptions and found that a missing permission was causing the job to fail.
But I noticed that this failure exception had stopped showing up in the logs for the last day (the job was scheduled to run every 30 minutes), which means Quartz was no longer trying to run the job. I fixed the permission, but the job was still not picked up.
I restarted the service and everything worked.

I looked for additional exceptions and the one from this issue is the only one I found.
