PostgreSQL Range Types #2008

tomohavvk · 2024-03-19T20:37:35Z

Hollo community, I would like to introduce my vision for PostgreSQL range types implementation in doobie since it was requested in this issue but not implemented yet #936

I hope together we can improve this implementation and release the Range Types support into life

Range types

The following range types are supported, and map to doobie generic class:

the int4range schema type maps to Range[Int]
the int8range schema type maps to Range[Long]
the numrange schema type maps to Range[Float] | Range[Double] | Range[BigDecimal]
the daterange schema type maps to Range[java.time.LocalDate]
the tsrange schema type maps to Range[java.time.LocalDateTime]
the tstzrange schema type maps to Range[java.time.OffsetDateTime]

case class Range[T](lowerBound: Option[T], upperBound: Option[T], edge: Edge)

To control the inclusive and exclusive bounds according to the PostgreSQL specification you need to use a special Edge enumeration when creating a Range:

object Edge {
  case object `(_,_)` extends Edge
  case object `(_,_]` extends Edge
  case object `[_,_)` extends Edge
  case object `[_,_]` extends Edge
  case object `empty` extends Edge
}

In the text form of a range, an inclusive lower bound is represented by '[' while an exclusive lower bound is represented by '('. Likewise, an inclusive upper bound is represented by ']', while an exclusive upper bound is represented by ')'.

Note: the range types mappings are defined in a different object (rangeimplicits). To enable it you must import them explicitly:

import doobie.postgres.rangeimplicits._

To create for example custom implementation of Range[Byte] you can use the public method which declared in the following package doobie.postgres.rangeimplicits:

def rangeMeta[T](sqlRangeType: String)(implicit D: RangeBoundDecoder[T], E: RangeBoundEncoder[T]): Meta[Range[T]]

The main requirements is to need to implement the RangeBoundDecoder[Byte] end RangeBoundEncoder[Byte] which are essentially:

type RangeBoundDecoder[T] = String => T
type RangeBoundEncoder[T] = T => String

For a Range[Byte], the meta and bounds encoder and decoder would appear as follows:

import doobie.postgres.rangeimplicits._
import doobie.postgres.types.Range.{RangeBoundDecoder, RangeBoundEncoder}

implicit val byteBoundEncoder: RangeBoundEncoder[Byte] = _.toString
implicit val byteBoundDecoder: RangeBoundDecoder[Byte] = _.toByte

implicit val byteRangeMeta: Meta[Range[Byte]] = rangeMeta[Byte]("int4range")

...
val int4rangeWithByteBoundsQuery = sql"select '[-128, 127)'::int4range".query[Range[Byte]]

satorg · 2024-03-22T09:09:34Z

Hello @tomohavvk, thank you for your thorough work!
I believe the support for range types will be a fantastic addition to Doobie!

However, I would like to discuss some potential pitfalls I've found out in the PR (if I may).

1. Unintentional implicit conversions

Typedefs RangeBoundEncoder and RangeBoundDecoder followed by implicit function values of those types are nothing but disguised implicit conversions. Therefore if a user imports those implicits via import doobie.postgres.rangeimplicits._, the following behavior gets enabled automatically in the scope (scala 2.13.13 repl):

scala> type RangeBoundEncoder[T] = T => String
type RangeBoundEncoder

scala> implicit val intBoundEncoder: RangeBoundEncoder[Int] = _.toString
val intBoundEncoder: RangeBoundEncoder[Int] = $Lambda/0x000000012853cc00@3caa4d85

scala> def foo(str: String) = println(s"supposed to be a string: $str")
def foo(str: String): Unit

scala> foo(123) // <-- passing an `Int` into a function that takes `String` !
supposed to be a string: 123

I.e. all the provided conversions to and from String become implicit, which is definitely not what most of users mean.

Moreover, those implicits are not really necessary out there, compare please:

version with implicits:

type RangeBoundEncoder[T] = T => String
type RangeBoundEncoder[T] = T => String
implicit val intBoundEncoder: RangeBoundEncoder[Int]     = _.toString
implicit val longBoundEncoder: RangeBoundEncoder[Long]   = _.toString
implicit val floatBoundEncoder: RangeBoundEncoder[Float] = _.toString
implicit val intBoundDecoder: RangeBoundDecoder[Int]     = java.lang.Integer.valueOf(_)
implicit val longBoundDecoder: RangeBoundDecoder[Long]   = java.lang.Long.valueOf(_)
implicit val floatBoundDecoder: RangeBoundDecoder[Float] = java.lang.Float.valueOf(_)

implicit val intRangeMeta: Meta[Range[Int]]     = rangeMeta("int4range")
implicit val longRangeMeta: Meta[Range[Long]]   = rangeMeta("int8range")
implicit val floatRangeMeta: Meta[Range[Float]] = rangeMeta("numrange")

versus a more straightforward version without conversion implicits:

implicit val intRangeMeta: Meta[Range[Int]]     = rangeMeta("int4range")(_.toString, java.lang.Integer.valueOf)
implicit val longRangeMeta: Meta[Range[Long]]   = rangeMeta("int8range")(_.toString, java.lang.Long.valueOf)
implicit val floatRangeMeta: Meta[Range[Float]] = rangeMeta("numrange")(_.toString, java.lang.Float.valueOf)

In other words, the implicit conversions there do not seem simplifying anything (nor improve clarity IMO) and in fact require even more lines of code to put everything together eventually.

2. Inexact type correspondence

Speaking of the numrange type:

implicit val floatRangeMeta: Meta[Range[Float]]                   = rangeMeta("numrange")
implicit val doubleRangeMeta: Meta[Range[Double]]                 = rangeMeta("numrange")
implicit val bigDecimalRangeMeta: Meta[Range[BigDecimal]]         = rangeMeta("numrange")

AFAIK, numrange is a built-in range type for the numeric scalar. So yes, the best corresponding JVM type is BigDecimal indeed. However, providing Meta for ranges of Double and Float by default might not be very good idea – a user may want to create a custom range type for floating-point numbers and then they can run into ambiguous implicit problem in that case (from here):

CREATE TYPE floatrange AS RANGE (subtype = float8, ...);

Then somewhere in user code:

implicit val doubleRangeMeta: Meta[Range[Double]] = rangeMeta("floatrange")

So I would probably suggest to hold on creating too many type shortcuts from the beginning. Ultimately, if there is demand for that from users, it will be easier to add them later (comparing to deprecating and removing stuff in a case of complaints).

3. Over-backticking

It is solely my personal perceptions, but does using a lot of back-ticked identifiers in the code really improves it? I feel it may make code more obscure instead. Especially because not all users like to see ASCII-art in their code. For example, (_,_) – this one might look a bit obscene, don't you think?

I feel that more straightforward and less fancy naming could be more appreciated, e.g.:

object Edge {
  case object InclIncl extends Edge
  case object InclExcl extends Edge
  case object ExclIncl extends Edge
  case object ExclExcl extends Edge
  case object Empty extends Edge
}

Also, the suggested approach allows to construct a Range like this:

Range(Some(1), Some(2), Edge.empty)

Which I am not sure about if it makes a lot of sense.
In other words, modeling of the Range type may deserve additional thought.

tomohavvk · 2024-03-23T13:27:05Z

Hello @satorg, thank you very much for the constructive comments, which have truly improved the implementation. I supported each of your suggestions, and here's what we have at the moment:

1. Unintentional implicit conversions

I completely removed any mentions of RangeBoundEncoder and RangeBoundDecoder. Now, there are no implicits; everything needs to be passed explicitly:

def rangeMeta[T](sqlRangeType: String)(encode: T => String, decode: String => T): Meta[Range[T]]

2. Inexact type correspondence

Done. numrange only maps to Range[BigDecimal]. We can easily extend it with more numeric types in the future(if needed)

3. Over-backticking

It is solely my personal perceptions, but does using a lot of back-ticked identifiers in the code really improves it? I feel it may make code more obscure instead. Especially because not all users like to see ASCII-art in their code. For example, (,) – this one might look a bit obscene, don't you think?

I thought about it :) but it didn't stop me because in my opinion it was the easiest way to understand, when you actually look at those brackets that are in PostgreSQL and you don't need to thinking about one more mapping, however, I agree that such code is rare and can be annoying. Therefore, I gladly rewrote it and now no one will see (_,_). Now it looks like:

object Edge {
  case object ExclExcl extends Edge
  case object ExclIncl extends Edge
  case object InclExcl extends Edge
  case object InclIncl extends Edge
}

4. Empty range

Also, the suggested approach allows to construct a Range like this:
Range(Some(1), Some(2), Edge.empty)
Which I am not sure about if it makes a lot of sense.
In other words, modeling of the Range type may deserve additional thought.

I redesigned the range type with this in mind, now the range hierarchy looks like this:

sealed trait Range[+T]
case object EmptyRange extends Range[Nothing]
case class NonEmptyRange[T](lowerBound: Option[T], upperBound: Option[T], edge: Edge) extends Range[T]

We can still create a range like: NonEmptyRange[Int](None, None, ExclExcl) which can surprise at first glance, but this is a valid case from PostgreSQL's point of view and we need to support it:

    nr     | isempty | lower | upper 
-----------+---------+-------+-------
 (,)       | f       |       |      
 [3,)      | f       |     3 |      
 (,5)      | f       |       |     5
 [1.1,2.2) | f       |   1.1 |   2.2
 empty     | t       |       |      
 [1.7,1.7] | f       |   1.7 |   1.7

If there are any suggestions with improvements, I will be glad to consider them.

jatcwang · 2024-05-15T20:40:12Z

Thanks for the fantastic PR description @tomohavvk and your review @satorg. I'll try and get this merged soon and be part of 1.0-RC6

tomohavvk added 12 commits March 18, 2024 19:10

Base implementation of PostgreSQL range types

a0c7779

Refactor implementation of PostgreSQL range types

ee4e0f1

Implement tests for PostgreSQL range types

93068d7

Refactor tests for PostgreSQL range types

f02c20d

Refactor implementation of PostgreSQL range types

5a6afe3

Implement PostgresRange example app

0d88636

Improve rangeinstances formatting

3c63327

Describe docs for PostgreSQL range types

ecfb2fd

Refactor implementation of PostgreSQL range types

65c175e

Improve docs for PostgreSQL range types

f0fed26

Improve docs for PostgreSQL range types

c5cde07

Final refactoring of PostgreSQL range types

8fa8a69

mergify bot added docs example postgres labels Mar 19, 2024

Extend RangeSuite

7501e0a

Review based improvements for PostgreSQL range types

ecbc2c5

jatcwang added this to the 1.0-RC6 milestone May 15, 2024

tomohavvk and others added 2 commits May 21, 2024 12:08

Merge branch 'refs/heads/main' into range-types

31cb886

add explicit types to make 2.12 typecheck

c2896fc

jatcwang merged commit 8fca074 into tpolecat:main May 26, 2024
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PostgreSQL Range Types #2008

PostgreSQL Range Types #2008

tomohavvk commented Mar 19, 2024

satorg commented Mar 22, 2024 •

edited

tomohavvk commented Mar 23, 2024 •

edited

jatcwang commented May 15, 2024

PostgreSQL Range Types #2008

PostgreSQL Range Types #2008

Conversation

tomohavvk commented Mar 19, 2024

Range types

satorg commented Mar 22, 2024 • edited

1. Unintentional implicit conversions

2. Inexact type correspondence

3. Over-backticking

tomohavvk commented Mar 23, 2024 • edited

1. Unintentional implicit conversions

2. Inexact type correspondence

3. Over-backticking

4. Empty range

jatcwang commented May 15, 2024

satorg commented Mar 22, 2024 •

edited

tomohavvk commented Mar 23, 2024 •

edited