Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PostgreSQL Range Types #2008

Merged
merged 16 commits into from
May 26, 2024
Merged

PostgreSQL Range Types #2008

merged 16 commits into from
May 26, 2024

Conversation

tomohavvk
Copy link
Contributor

Hollo community, I would like to introduce my vision for PostgreSQL range types implementation in doobie since it was requested in this issue but not implemented yet #936

I hope together we can improve this implementation and release the Range Types support into life


Range types

The following range types are supported, and map to doobie generic class:

  • the int4range schema type maps to Range[Int]
  • the int8range schema type maps to Range[Long]
  • the numrange schema type maps to Range[Float] | Range[Double] | Range[BigDecimal]
  • the daterange schema type maps to Range[java.time.LocalDate]
  • the tsrange schema type maps to Range[java.time.LocalDateTime]
  • the tstzrange schema type maps to Range[java.time.OffsetDateTime]
case class Range[T](lowerBound: Option[T], upperBound: Option[T], edge: Edge)

To control the inclusive and exclusive bounds according to the PostgreSQL specification you need to use a special Edge enumeration when creating a Range:

object Edge {
  case object `(_,_)` extends Edge
  case object `(_,_]` extends Edge
  case object `[_,_)` extends Edge
  case object `[_,_]` extends Edge
  case object `empty` extends Edge
}

In the text form of a range, an inclusive lower bound is represented by '[' while an exclusive lower bound is represented by '('. Likewise, an inclusive upper bound is represented by ']', while an exclusive upper bound is represented by ')'.


Note: the range types mappings are defined in a different object (rangeimplicits). To enable it you must import them explicitly:

import doobie.postgres.rangeimplicits._

To create for example custom implementation of Range[Byte] you can use the public method which declared in the following package doobie.postgres.rangeimplicits:

def rangeMeta[T](sqlRangeType: String)(implicit D: RangeBoundDecoder[T], E: RangeBoundEncoder[T]): Meta[Range[T]]

The main requirements is to need to implement the RangeBoundDecoder[Byte] end RangeBoundEncoder[Byte] which are essentially:

type RangeBoundDecoder[T] = String => T
type RangeBoundEncoder[T] = T => String

For a Range[Byte], the meta and bounds encoder and decoder would appear as follows:

import doobie.postgres.rangeimplicits._
import doobie.postgres.types.Range.{RangeBoundDecoder, RangeBoundEncoder}

implicit val byteBoundEncoder: RangeBoundEncoder[Byte] = _.toString
implicit val byteBoundDecoder: RangeBoundDecoder[Byte] = _.toByte

implicit val byteRangeMeta: Meta[Range[Byte]] = rangeMeta[Byte]("int4range")

...
val int4rangeWithByteBoundsQuery = sql"select '[-128, 127)'::int4range".query[Range[Byte]]

@satorg
Copy link

satorg commented Mar 22, 2024

Hello @tomohavvk, thank you for your thorough work!
I believe the support for range types will be a fantastic addition to Doobie!

However, I would like to discuss some potential pitfalls I've found out in the PR (if I may).

1. Unintentional implicit conversions

Typedefs RangeBoundEncoder and RangeBoundDecoder followed by implicit function values of those types are nothing but disguised implicit conversions. Therefore if a user imports those implicits via import doobie.postgres.rangeimplicits._, the following behavior gets enabled automatically in the scope (scala 2.13.13 repl):

scala> type RangeBoundEncoder[T] = T => String
type RangeBoundEncoder

scala> implicit val intBoundEncoder: RangeBoundEncoder[Int] = _.toString
val intBoundEncoder: RangeBoundEncoder[Int] = $Lambda/0x000000012853cc00@3caa4d85

scala> def foo(str: String) = println(s"supposed to be a string: $str")
def foo(str: String): Unit

scala> foo(123) // <-- passing an `Int` into a function that takes `String` !
supposed to be a string: 123

I.e. all the provided conversions to and from String become implicit, which is definitely not what most of users mean.

Moreover, those implicits are not really necessary out there, compare please:

  • version with implicits:
    type RangeBoundEncoder[T] = T => String
    type RangeBoundEncoder[T] = T => String
    implicit val intBoundEncoder: RangeBoundEncoder[Int]     = _.toString
    implicit val longBoundEncoder: RangeBoundEncoder[Long]   = _.toString
    implicit val floatBoundEncoder: RangeBoundEncoder[Float] = _.toString
    implicit val intBoundDecoder: RangeBoundDecoder[Int]     = java.lang.Integer.valueOf(_)
    implicit val longBoundDecoder: RangeBoundDecoder[Long]   = java.lang.Long.valueOf(_)
    implicit val floatBoundDecoder: RangeBoundDecoder[Float] = java.lang.Float.valueOf(_)
    
    implicit val intRangeMeta: Meta[Range[Int]]     = rangeMeta("int4range")
    implicit val longRangeMeta: Meta[Range[Long]]   = rangeMeta("int8range")
    implicit val floatRangeMeta: Meta[Range[Float]] = rangeMeta("numrange")
  • versus a more straightforward version without conversion implicits:
    implicit val intRangeMeta: Meta[Range[Int]]     = rangeMeta("int4range")(_.toString, java.lang.Integer.valueOf)
    implicit val longRangeMeta: Meta[Range[Long]]   = rangeMeta("int8range")(_.toString, java.lang.Long.valueOf)
    implicit val floatRangeMeta: Meta[Range[Float]] = rangeMeta("numrange")(_.toString, java.lang.Float.valueOf)

In other words, the implicit conversions there do not seem simplifying anything (nor improve clarity IMO) and in fact require even more lines of code to put everything together eventually.

2. Inexact type correspondence

Speaking of the numrange type:

implicit val floatRangeMeta: Meta[Range[Float]]                   = rangeMeta("numrange")
implicit val doubleRangeMeta: Meta[Range[Double]]                 = rangeMeta("numrange")
implicit val bigDecimalRangeMeta: Meta[Range[BigDecimal]]         = rangeMeta("numrange")

AFAIK, numrange is a built-in range type for the numeric scalar. So yes, the best corresponding JVM type is BigDecimal indeed. However, providing Meta for ranges of Double and Float by default might not be very good idea – a user may want to create a custom range type for floating-point numbers and then they can run into ambiguous implicit problem in that case (from here):

CREATE TYPE floatrange AS RANGE (subtype = float8, ...);

Then somewhere in user code:

implicit val doubleRangeMeta: Meta[Range[Double]] = rangeMeta("floatrange")

So I would probably suggest to hold on creating too many type shortcuts from the beginning. Ultimately, if there is demand for that from users, it will be easier to add them later (comparing to deprecating and removing stuff in a case of complaints).

3. Over-backticking

It is solely my personal perceptions, but does using a lot of back-ticked identifiers in the code really improves it? I feel it may make code more obscure instead. Especially because not all users like to see ASCII-art in their code. For example, (_,_) – this one might look a bit obscene, don't you think?

I feel that more straightforward and less fancy naming could be more appreciated, e.g.:

object Edge {
  case object InclIncl extends Edge
  case object InclExcl extends Edge
  case object ExclIncl extends Edge
  case object ExclExcl extends Edge
  case object Empty extends Edge
}

Also, the suggested approach allows to construct a Range like this:

Range(Some(1), Some(2), Edge.empty)

Which I am not sure about if it makes a lot of sense.
In other words, modeling of the Range type may deserve additional thought.

@tomohavvk
Copy link
Contributor Author

tomohavvk commented Mar 23, 2024

Hello @satorg, thank you very much for the constructive comments, which have truly improved the implementation. I supported each of your suggestions, and here's what we have at the moment:

1. Unintentional implicit conversions

I completely removed any mentions of RangeBoundEncoder and RangeBoundDecoder. Now, there are no implicits; everything needs to be passed explicitly:

def rangeMeta[T](sqlRangeType: String)(encode: T => String, decode: String => T): Meta[Range[T]]

2. Inexact type correspondence

Done. numrange only maps to Range[BigDecimal]. We can easily extend it with more numeric types in the future(if needed)

3. Over-backticking

It is solely my personal perceptions, but does using a lot of back-ticked identifiers in the code really improves it? I feel it may make code more obscure instead. Especially because not all users like to see ASCII-art in their code. For example, (,) – this one might look a bit obscene, don't you think?

I thought about it :) but it didn't stop me because in my opinion it was the easiest way to understand, when you actually look at those brackets that are in PostgreSQL and you don't need to thinking about one more mapping, however, I agree that such code is rare and can be annoying. Therefore, I gladly rewrote it and now no one will see (_,_). Now it looks like:

object Edge {
  case object ExclExcl extends Edge
  case object ExclIncl extends Edge
  case object InclExcl extends Edge
  case object InclIncl extends Edge
}

4. Empty range

Also, the suggested approach allows to construct a Range like this:
Range(Some(1), Some(2), Edge.empty)
Which I am not sure about if it makes a lot of sense.
In other words, modeling of the Range type may deserve additional thought.

I redesigned the range type with this in mind, now the range hierarchy looks like this:

sealed trait Range[+T]
case object EmptyRange extends Range[Nothing]
case class NonEmptyRange[T](lowerBound: Option[T], upperBound: Option[T], edge: Edge) extends Range[T]

We can still create a range like: NonEmptyRange[Int](None, None, ExclExcl) which can surprise at first glance, but this is a valid case from PostgreSQL's point of view and we need to support it:

    nr     | isempty | lower | upper 
-----------+---------+-------+-------
 (,)       | f       |       |      
 [3,)      | f       |     3 |      
 (,5)      | f       |       |     5
 [1.1,2.2) | f       |   1.1 |   2.2
 empty     | t       |       |      
 [1.7,1.7] | f       |   1.7 |   1.7

If there are any suggestions with improvements, I will be glad to consider them.

@jatcwang
Copy link
Collaborator

Thanks for the fantastic PR description @tomohavvk and your review @satorg. I'll try and get this merged soon and be part of 1.0-RC6

@jatcwang jatcwang added this to the 1.0-RC6 milestone May 15, 2024
@jatcwang jatcwang merged commit 8fca074 into tpolecat:main May 26, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants