refactor(test): provide means to validate metrics and observations by mkleczek · Pull Request #4671 · PostgREST/postgrest

mkleczek · 2026-02-25T14:46:00Z

DISCLAIMER:
This commit was authored entirely by a human without the assistance of LLMs.

Some helpers are provided for introspecting metrics already (used in JWT cache tests). This change provides facilities to additionally validate emited Observation events.
A new Spec module is also implemented, adding basic tests of schema cache reloading - their main goal is to excercise the new infrastructure.

This is a prerequisite for:
#4672 (and subsequent #4643)
#4668
#4645

steve-chavez · 2026-02-26T15:38:01Z

FYI I had a question about testing the order of the logs on #4668 (comment)

mkleczek · 2026-02-26T16:02:03Z

FYI I had a question about testing the order of the logs on #4668 (comment)

@steve-chavez - #4668 (comment) (copying it here as well, I guess that's a better place to discuss it).

That depends on how observations order is related to debug log entries order. They should be the same but we don't test it in HSpec tests - we only read observations from a Chan.
There is a simple helper function waitFor which reads from the Chan skipping messages until it reads the message meeting provided criteria (or hangs waiting for an observation (or timeout) if none available). I think it is sufficient primitive for most cases.
You can test if required observations happen in right order (but possibly with other observations in between), eg:

    waitFor (2 * sec) "PoolFlushed 2" $ \x -> [ o | o@PoolFlushed <- pure x ]
    waitFor (1 * sec) "SchemaCacheLoadedObs" $ \x -> [ o | o@SchemaCacheLoadedObs{} <- pure x ]

This will skip SchemaCacheQueriedObs in between.

If necessary we can implement non-lenient validation.

test/spec/Feature/MetricsSpec.hs

test/spec/SpecHelper.hs

test/spec/Main.hs

test/spec/SpecHelper.hs

steve-chavez · 2026-02-26T21:27:05Z

test/spec/SpecHelper.hs

+        readUntil = void . untilM (pure . isJust . f) . readChan
+    loc = fold (head (prettySrcLoc . snd <$> getCallStack callStack))
+    -- execute effectful computation until result meets provided condition
+    untilM cond m = fix $ \loop -> m >>= \a -> ifM (cond a) (pure a) loop


Is it the same as untilM? Perhaps we can reuse this package?

Is it the same as untilM? Perhaps we can reuse this package?

Not sure if it is worth pulling in the whole package for a one-liner (we don't want npm story, I guess :-) ). Besides, monad-loops looks dormant...

Not sure if it is worth pulling in the whole package for a one-liner [..] Besides, monad-loops looks dormant...

I think it would be worth it. monad-loops is a complete library for monadic iteration related functions. It looks dormant because I guess it is complete and bug free for years.

Besides, there are other handy functions in that library which could be useful in the future to improve our test infrastructure.

Not sure if it is worth pulling in the whole package for a one-liner [..] Besides, monad-loops looks dormant...

I think it would be worth it. monad-loops is a complete library for monadic iteration related functions. It looks dormant because I guess it is complete and bug free for years.

I double-checked this library and it does not provide quite the function that we need here (ie. loop until the returned value meets provided criteria). I could reimplement the logic in terms of monad-loops untilJust function probably, but bear with me: pulling a library and reimplement larger function to adjust it to this library API, just to replace a one-liner, does not meet the bar to use the library IMHO.

Besides, there are other handy functions in that library which could be useful in the future to improve our test infrastructure.

Quite possibly you're right. But this is out of scope of this PR.

In general, importing a library "just in case it is needed in the future" is not a good idea IMO.

test/spec/SpecHelper.hs

test/spec/Feature/MetricsSpec.hs

test/spec/SpecHelper.hs

mkleczek · 2026-03-02T22:54:38Z

@steve-chavez @taimoorzaeem do you think it is now in mergeable state?

taimoorzaeem · 2026-03-03T01:41:08Z

@steve-chavez @taimoorzaeem do you think it is now in mergeable state?

Almost there!

I am quite sold on new interface for testing metrics and observation. However, the implementation part needs careful review because lots of new stuff there (async exceptions, masking/unmasking are new for me) and it's taking some more time from my side.

taimoorzaeem · 2026-03-06T07:40:52Z

test/spec/SpecHelper.hs

+-- helpers used to produce observation diagnostics in waitForObs
+constrName :: (HasConstructor (Rep a), Generic a)=> a -> Text
+constrName = genericConstrName . from
+
+class HasConstructor f where
+  genericConstrName :: f x -> Text
+
+instance HasConstructor f => HasConstructor (D1 c f) where
+  genericConstrName (M1 x) = genericConstrName x
+
+instance (HasConstructor x, HasConstructor y) => HasConstructor (x :+: y) where
+  genericConstrName (L1 l) = genericConstrName l
+  genericConstrName (R1 r) = genericConstrName r
+
+instance Constructor c => HasConstructor (C1 c f) where
+  genericConstrName = T.pack . conName


Hmm, is this strictly needed? Perhaps maybe we can just use the Show instance?

Hmm, is this strictly needed? Perhaps maybe we can just use the Show instance?

@taimoorzaeem
Yes, it is needed: try to implement Show instance for Observation - I certainly tried that first :)

taimoorzaeem · 2026-03-12T07:15:10Z

test/spec/Main.hs

 main :: IO ()
 main = do
+  poolChan <- newChan
+  -- make sure poolChan is not growing indefinitely
+  -- start a thread that drains the channel
+  -- this is necessary because test cases operate on
+  -- copies so poolChan is never read from
+  void $ forkIO $ forever $ readChan poolChan


Hmm, does this mean we have another thread running for the entire duration of postgrest-test-spec?

Hmm, does this mean we have another thread running for the entire duration of postgrest-test-spec?

Yes. Do you think it is a problem? Haskell threads are green threads so really lightweight.

taimoorzaeem · 2026-03-13T11:17:32Z

Thinking on a broader level, I think maybe we should separate observability tests from our spec test-suite. I have noticed that our spec tests mostly test end-to-end behaviour i.e request and response objects.

Observability testing requires a different setup (threads, timeouts, state checking etc), so maybe we should give it it's own test-suite? Probably also move JwtCacheSpec.hs there?

Pros:

Better separation of concerns which means easier to review changes.

Cons:

It would require some initial setup, wrapper scripts like postgrest-test-observability, coverage setup etc

Not sure if it would be worth it, thoughts from team members?

CC: @steve-chavez @wolfgangwalther @laurenceisla @mkleczek @develop7

mkleczek · 2026-03-13T11:57:25Z

Thinking on a broader level, I think maybe we should separate observability tests from our spec test-suite.

I considered it both when implementing JwtCache tests and when working on this PR.
Decided to keep it in the same suite but in separate Spec modules.

First reason is, as you say:

Cons:

It would require some initial setup, wrapper scripts like postgrest-test-observability, coverage setup etc

Not sure if it would be worth it, thoughts from team members?

IMHO splitting tests into separate Spec modules with proper granularity is enough.

Splitting into two separate suites (each with its own setup) would introduce a lot of redundancy but also lead to constant back-and-forth about the proper placement of a particular test (some tests are cross-cutting: require both e2e verification and state checking).

steve-chavez · 2026-03-13T17:37:30Z

It would require some initial setup, wrapper scripts like postgrest-test-observability, coverage setup etc

Makes perfect sense, also considering that we'll need custom tooling for otel and for sentry.. adding those to postgrest-test-spec would make the Nix expression more complicated.

(coverage setup is not that hard, an example here)

Splitting into two separate suites (each with its own setup) would introduce a lot of redundancy but also lead to constant back-and-forth about the proper placement of a particular test (some tests are cross-cutting: require both e2e verification and state checking).

I think redundancy is fine if it makes the test suites more focused/isolated/reviewable, we already have some redundancy in io and spec schemas (sleep in test/io/fixtures/schema.sql, sleep in test/spec/fixtures/schema.sql) and that has been working well.

mkleczek · 2026-03-13T17:52:58Z

It would require some initial setup, wrapper scripts like postgrest-test-observability, coverage setup etc

Makes perfect sense, also considering that we'll need [custom tooling for otel]

OK, can it be done as a separate refactoring PR?
I don’t have bandwidth for designing new shape of our test suites. Also, ass this is quite major design I also expect it to take quite some time to reach consensus. But once it is done it should be trivial to move some code around - tests using the new APIs are isolated in their own modules.
OTOH there are multiple PRs dependent on the new test APIs.

This and stacked PRs can also wait for someone to contribute the new test suite structure.

Maybe kaizen mindset would be a good idea?

DISCLAIMER: This commit was authored entirely by a human without the assistance of LLMs. Some helpers are provided for introspecting metrics already (used in JWT cache tests). This change provides facilities to additionally validate emited Observation events. A new Spec module is also implemented, adding basic tests of schema cache reloading - their main goal is to excercise the new infrastructure.

taimoorzaeem · 2026-03-14T00:50:06Z

Makes perfect sense, also considering that we'll need custom tooling for otel and for sentry.. adding those to postgrest-test-spec would make the Nix expression more complicated.

(coverage setup is not that hard, an example here)

Yes, agree. 👍

[..] I don’t have bandwidth for designing new shape of our test suites.

I will try to do it. Let's see.

mkleczek · 2026-03-14T06:50:55Z

Makes perfect sense, also considering that we'll need custom tooling for otel and for sentry.. adding those to postgrest-test-spec would make the Nix expression more complicated.
(coverage setup is not that hard, an example here)

Yes, agree. 👍

[..] I don’t have bandwidth for designing new shape of our test suites.

I will try to do it. Let's see.

Cool, marking this one as draft and will rebase on top of the new structure.

taimoorzaeem · 2026-03-23T07:27:49Z

Cool, marking this one as draft and will rebase on top of the new structure.

@mkleczek Now that we have #4732, you may continue with this.

mkleczek force-pushed the refactor/test/provide-observations-testing-facilities branch 3 times, most recently from ab5ae74 to bde5a2b Compare February 25, 2026 16:57

mkleczek mentioned this pull request Feb 26, 2026

add(observation): db pool flushes observation #4668

Open