Upgrading to 2.48.0 caused spike in AppSync costs #4007

@MrAsterisco

Describe the bug

Our application uses the Amplify SDK to subscribe to custom mutations that our backend sends over AppSync. We recently upgraded the Amplify SDK from version 2.29.0 to version 2.48.0, and this caused a sudden spike in our AppSync costs, which is still growing.

Using the (very limited) debugging options available on the cloud side, we identified a correlation between the sudden increase and the release of the app version in which we updated to 2.48.0.

Specifically, in CloudWatch, we were able to identify an unexpected increase in the following metrics:

  • SubscribeClientError (went from <10/day to 500k/day)
  • SubscribeSuccess (went from 5k/day to 144k/day)
  • ActiveSubscriptions (went from 100k/day to 4M+/day)
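
To put the scale of these changes in perspective, here is a small sketch that computes the growth factor for each metric. The `MetricChange` type is hypothetical and exists only for this illustration; "<10" is treated as 10 and "4M+" as 4M, so the first and last factors are lower bounds.

```swift
// Approximate daily values taken from the list above.
// MetricChange is a hypothetical helper type, not part of any SDK.
struct MetricChange {
    let name: String
    let before: Double
    let after: Double

    // How many times larger the metric is after the upgrade.
    var growthFactor: Double { after / before }
}

let changes = [
    // "<10/day" is treated as 10, so this factor is a lower bound.
    MetricChange(name: "SubscribeClientError", before: 10, after: 500_000),
    MetricChange(name: "SubscribeSuccess", before: 5_000, after: 144_000),
    // "4M+/day" is treated as 4M, so this factor is a lower bound.
    MetricChange(name: "ActiveSubscriptions", before: 100_000, after: 4_000_000),
]

for change in changes {
    print("\(change.name): x\(change.growthFactor)")
}
```

Under these assumptions, successful subscriptions grew roughly 29 times and active subscriptions at least 40 times, while client errors grew by at least four orders of magnitude.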

We are unable to reproduce the problem locally: the subscription seems to work correctly and does not appear to cause any issue. However, after turning on WAF on AppSync and setting up a counter of connections by IP address, we can see that our app is continuously spamming connections to AppSync, which is consistent with what we're seeing in the metrics.

We have CloudWatch enhanced logging enabled for AppSync, but unfortunately the logs are not useful. There are no failures reported.

We were able to reproduce an issue locally only once (it may be unrelated): AppSync was unable to connect and returned the following error:

AppSyncRealTimeResponse(id: nil, payload: Optional(Amplify.JSONValue.object(["errors": Amplify.JSONValue.array([Amplify.JSONValue.object(["message": Amplify.JSONValue.string("Valid authorization header not provided."), "errorType": Amplify.JSONValue.string("UnauthorizedException"), "errorCode": Amplify.JSONValue.number(401.0)])])])), type: AWSAPIPlugin.AppSyncRealTimeResponse.EventType.connectionError
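
For anyone trying to detect this case in their own logs, the payload above can be modelled as plain JSON. The following is a minimal sketch that decodes an equivalent JSON document and checks whether it describes an authorization failure. The types `RealTimeError` and `RealTimeErrorPayload` and the function `isAuthorizationFailure` are hypothetical names for this illustration and are not part of the Amplify SDK; the JSON string is reconstructed from the logged fields.

```swift
import Foundation

// Hypothetical model of one entry in the "errors" array shown above.
struct RealTimeError: Decodable {
    let message: String
    let errorType: String
    let errorCode: Double?   // logged as 401.0, so decoded as a Double
}

// Hypothetical model of the whole payload.
struct RealTimeErrorPayload: Decodable {
    let errors: [RealTimeError]
}

// True when any error in the payload looks like an authorization failure.
func isAuthorizationFailure(_ payload: RealTimeErrorPayload) -> Bool {
    payload.errors.contains { error in
        error.errorType == "UnauthorizedException" || error.errorCode == 401
    }
}

// JSON equivalent of the payload from the log line above.
let json = """
{"errors":[{"message":"Valid authorization header not provided.","errorType":"UnauthorizedException","errorCode":401}]}
"""

do {
    let payload = try JSONDecoder().decode(RealTimeErrorPayload.self, from: Data(json.utf8))
    print("Authorization failure:", isAuthorizationFailure(payload))
} catch {
    print("Could not decode payload:", error)
}
```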

Our app has logic to capture the signal Amplify emits when the refresh token expires, but in this case it was not triggered. We were still able to perform other actions using the access token, but AppSync was not working. Logging out and logging back in with a new session solved the issue. It's also important to note that the SDK did not seem to retry: it failed once and gave up.

As the server-side logs are not useful, we have no idea how to debug this issue further. We are sure it is caused by the Amplify SDK update, but we can't say exactly what's wrong. Based on the AWS documentation, it seems that unauthorized errors are not logged to CloudWatch, which would be consistent with the error we saw locally.

Steps To Reproduce

At the moment, we don't have repro steps.

Expected behavior

  • The SDK should never spam the GraphQL API and cause spikes in costs.
  • If the problem is related to the authentication or authorization, the SDK should try once and then stop.
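
The two expectations above can be expressed as a reconnect policy. This is only a sketch of the behavior we would expect, not a description of how the Amplify SDK actually works: `ConnectionFailure` and `ReconnectPolicy` are hypothetical names, and the limits are arbitrary example values.

```swift
import Foundation

// Hypothetical classification of why a subscription connection failed.
enum ConnectionFailure {
    case unauthorized   // e.g. "UnauthorizedException" / 401
    case transient      // e.g. network drop or timeout
}

// Hypothetical policy: never retry auth failures, and retry transient
// failures a bounded number of times with capped exponential backoff.
struct ReconnectPolicy {
    let maxAttempts: Int
    let baseDelay: TimeInterval
    let maxDelay: TimeInterval

    // Returns the delay before the next attempt, or nil to stop retrying.
    // `attempt` is the number of attempts already made (1 after the first failure).
    func delay(after failure: ConnectionFailure, attempt: Int) -> TimeInterval? {
        switch failure {
        case .unauthorized:
            // Retrying with the same credentials cannot succeed: try once, then stop.
            return nil
        case .transient:
            guard attempt < maxAttempts else { return nil }
            let exponential = baseDelay * pow(2.0, Double(attempt - 1))
            return min(exponential, maxDelay)
        }
    }
}

let policy = ReconnectPolicy(maxAttempts: 8, baseDelay: 1, maxDelay: 30)

for attempt in 1...8 {
    if let delay = policy.delay(after: .transient, attempt: attempt) {
        print("attempt \(attempt): retry in \(delay)s")
    } else {
        print("attempt \(attempt): give up")
    }
}
```

With a policy like this, the number of connection attempts per client is bounded regardless of the failure cause, so a misbehaving session cannot generate unbounded AppSync traffic.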

Amplify Framework Version

2.48.0

Amplify Categories

API

Dependency manager

Swift PM

Swift version

5.9

CLI version

We don't use the Amplify CLI

Xcode version

16.4

Relevant log output

N/A

Is this a regression?

Yes

Regression additional context

Before, we were using version 2.29.0 and everything was working as expected.

Platforms

iOS, macOS

OS Version

Irrelevant, all OS versions are affected

Device

Irrelevant, all devices are affected

Specific to simulators

No response

Additional context

No response

Labels: api (Issues related to the API category), bug (Something isn't working)
