Error managing cluster: Failure updating scheduler state when checking-in: Exception while reading from stream #2314

Open
adrianmoraret-mdsol opened this issue Mar 29, 2024 · 4 comments

adrianmoraret-mdsol commented Mar 29, 2024

Describe the bug
Getting the following error: "Error managing cluster: Failure updating scheduler state when checking-in: Exception while reading from stream"

StackTrace:

Quartz.JobPersistenceException: Failure updating scheduler state when checking-in: Exception while reading from stream
 ---> Npgsql.NpgsqlException (0x80004005): Exception while reading from stream
 ---> System.TimeoutException: Timeout during reading attempt
   at Npgsql.Internal.NpgsqlConnector.<ReadMessage>g__ReadMessageLong|211_0(NpgsqlConnector connector, Boolean async, DataRowLoadingMode dataRowLoadingMode, Boolean readingNotifications, Boolean isReadingPrependedMessage)
   at Npgsql.NpgsqlDataReader.NextResult(Boolean async, Boolean isConsuming, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteNonQuery(Boolean async, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.StdAdoDelegate.UpdateSchedulerState(ConnectionAndTransactionHolder conn, String instanceName, DateTimeOffset checkInTime, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.JobStoreSupport.ClusterCheckIn(ConnectionAndTransactionHolder conn, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at Quartz.Impl.AdoJobStore.JobStoreSupport.ClusterCheckIn(ConnectionAndTransactionHolder conn, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.JobStoreSupport.DoCheckin(Guid requestorId, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.JobStoreSupport.DoCheckin(Guid requestorId, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.ClusterManager.Manage() [See nested exception: Npgsql.NpgsqlException (0x80004005): Exception while reading from stream
 ---> System.TimeoutException: Timeout during reading attempt
   at Npgsql.Internal.NpgsqlConnector.<ReadMessage>g__ReadMessageLong|211_0(NpgsqlConnector connector, Boolean async, DataRowLoadingMode dataRowLoadingMode, Boolean readingNotifications, Boolean isReadingPrependedMessage)
   at Npgsql.NpgsqlDataReader.NextResult(Boolean async, Boolean isConsuming, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteNonQuery(Boolean async, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.StdAdoDelegate.UpdateSchedulerState(ConnectionAndTransactionHolder conn, String instanceName, DateTimeOffset checkInTime, CancellationToken cancellationToken)
   at Quartz.Impl.AdoJobStore.JobStoreSupport.ClusterCheckIn(ConnectionAndTransactionHolder conn, CancellationToken cancellationToken)]

Version used

3.8.1

To Reproduce

Unable to replicate the issue.

Expected behavior

Quartz should recover from this failed state.

Additional context
Quartz configuration

  services.AddQuartz(opt =>
  {
      opt.UsePersistentStore(storeOptions =>
      {
          storeOptions.UseProperties = true;
          storeOptions.UseClustering();
          storeOptions.UsePostgres(connectionString);
          storeOptions.UseNewtonsoftJsonSerializer();
      });
  });
  services.AddQuartzHostedService(options => options.WaitForJobsToComplete = true );
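
For reference, the timeout in the stack trace comes from Npgsql while the cluster check-in is reading its response, so the settings most directly involved are the Npgsql command timeout and the Quartz clustering intervals. The variant of the configuration above shown below is only a sketch of where those settings live: the `QuartzSetup` class and all numeric values are assumptions chosen for illustration, and nothing in this thread establishes that changing them resolves the problem.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Npgsql;
using Quartz;

public static class QuartzSetup
{
    // Hypothetical helper; the values below are illustrative, not recommendations.
    public static void AddClusteredQuartz(IServiceCollection services, string connectionString)
    {
        // Adjust Npgsql's command timeout and TCP keepalive (both in seconds).
        var builder = new NpgsqlConnectionStringBuilder(connectionString)
        {
            CommandTimeout = 60,
            KeepAlive = 30
        };

        services.AddQuartz(opt =>
        {
            opt.UsePersistentStore(storeOptions =>
            {
                storeOptions.UseProperties = true;
                // How long the job store waits before retrying after a database failure.
                storeOptions.RetryInterval = TimeSpan.FromSeconds(15);
                storeOptions.UseClustering(cluster =>
                {
                    // How often this node checks in, and how late a check-in may be
                    // before other nodes treat the instance as failed.
                    cluster.CheckinInterval = TimeSpan.FromSeconds(10);
                    cluster.CheckinMisfireThreshold = TimeSpan.FromSeconds(20);
                });
                storeOptions.UsePostgres(builder.ConnectionString);
                storeOptions.UseNewtonsoftJsonSerializer();
            });
        });
        services.AddQuartzHostedService(options => options.WaitForJobsToComplete = true);
    }
}
```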

lahma (Member) commented Apr 2, 2024

So does Quartz not recover, or is it just a logged error and a passing problem?

adrianmoraret-mdsol (Author) commented

Yes. Quartz did not recover from this. Not sure if there is some additional configuration that I am missing.

lahma (Member) commented Apr 3, 2024

Looking at the code, all exceptions should be tolerated and the cluster manager should just continue after that.
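
To illustrate what "tolerated" means here, the following is a minimal sketch of a periodic loop that logs a failed check-in and carries on with the next iteration. It is not the Quartz `ClusterManager` source; `TolerantLoop`, `RunAsync`, and their parameters are hypothetical names used only for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TolerantLoop
{
    // Runs checkIn up to maxIterations times, waiting `interval` between runs.
    // A failed check-in is logged and counted, and the loop continues.
    // Returns the number of failed check-ins.
    public static async Task<int> RunAsync(
        Func<CancellationToken, Task> checkIn,
        TimeSpan interval,
        int maxIterations,
        Action<Exception> logError,
        CancellationToken cancellationToken = default)
    {
        int failures = 0;
        for (int i = 0; i < maxIterations && !cancellationToken.IsCancellationRequested; i++)
        {
            try
            {
                await checkIn(cancellationToken);
            }
            catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
            {
                break; // shutdown was requested, so stop looping
            }
            catch (Exception ex)
            {
                // Any other failure (for example a database timeout) is tolerated.
                failures++;
                logError(ex);
            }

            try
            {
                await Task.Delay(interval, cancellationToken);
            }
            catch (OperationCanceledException)
            {
                break;
            }
        }
        return failures;
    }
}
```

With a loop of this shape, a single timeout shows up as one logged error and the next check-in is still attempted, which is the behaviour described in the comment above.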

adrianmoraret-mdsol (Author) commented

The job was not running. I looked for exceptions and found that a missing permission was causing the job to fail.
However, I noticed that the failure exception had stopped appearing in the logs for the last day (the job was scheduled to run every 30 minutes), which means Quartz was no longer trying to run the job. I fixed the permission, but the job was still not picked up.
I restarted the service and everything worked.

I looked for additional exceptions and the one from this issue is the only one I found.
