Skip to content

Conversation

@leana8959
Copy link
Collaborator

@leana8959 leana8959 commented Jan 15, 2026

Hello,

I made a proof of concept of the exact printer. The idea is that we collect all the trivia that is lost at the leaves, label them with corresponding data and propagate them upwards to the ParsecFieldGrammar. We call this data TriviaTree. This TriviaTree is later propagated to the pretty printer from PrettyFieldGrammar and within each pretty printer instance we can make wiser choice based on what whitespaces is lost and recreate the original output.

I would like to know what you think about this approach!

In 408a8b3 where I accept the format test changes, you can see that some whitespaces have changed. The changes are not in the right order yet, but it proves that I can retrive whitespace data from the printer.

This probably doesn't compose, but is ok serves as a poc
Doesn't have any effect.
see
```bash
cabal test Cabal-tests:parser-tests --test-options="-p /common.cabal/&&/expr/"
```
Once something is parsed, it is non-trivial to annotate it because we need to have that object in scope
For combinators, we don't know the type being parsed, so we don't know how to annotate it.
cabal test Cabal-tests:parser-tests --enable-profiling --ghc-options="-Wno-unused-imports" --test-options="-p /printer.cabal/&&/printer/" -O0
At this point we assert that the trivia passed to VersionRange has the
right shape
@leana8959 leana8959 marked this pull request as draft January 15, 2026 15:24
@leana8959
Copy link
Collaborator Author

The TriviaTree should make the following functions work as intended, beside having an instance of Semigroup and Monoid. It is very similar to the idea of "complement" in the biparser paper.

The idea is to model the parser as of type Text -> (Complement, a) and the printer of type (Maybe Complement, a) -> Text.
Concretely ParsecFieldGrammar is changed to always output a TriviaTree along side the data (TriviaTree, a), and on the otherhand the PrettyFieldGrammar takes an extra argument of type TriviaTree to pretty print.

-- Namespace is a sumtype representing all the data I want to index at the moment, but I'm changing it to a type class. See "annotating generic outputs" below.

-- | Label a trivia with it's associated data "Namespace"
fromNamedTrivia :: Namespace -> Trivia -> TriviaTree

-- | Mark a given tree being associated with an aggregated data "Namespace"
mark :: Namespace -> TriviaTree -> TriviaTree

-- | What's the trivia associated with data "Namespace"
unmark :: Namespace -> TriviaTree -> TriviaTree

However there are some problems I am seeing. I don't think they are covered in the biparser paper since these problems arise due to the non-sequential nature of final stage of cabal parser.

performance

Data are duplicated all over the place as keys to indicate which of the sub triviatree contains trivia that interests us. I know little about how sharing works in Haskell/GHC and I wonder if this will cause memory pressure.

annotating generic outputs

I can't use data as Namespace when I don't know its exact type to put it into a namespace. This is a problem for parser combinators that takes a Parser (TriviaTree, a) and outputs a Parser (f (TriviaTree, a)) where f is a list or nonempty list, etc.
To annotate the outputted trivia tree (e.g. adding a numbering annotation) of the subparser in a combinator, I need to be able to turn its output data into a Namespace so that numbering is associated to the right data. But I can't because I don't know the right constructor to turn some data into aNamespace due to the data being generic.

I suppose one solution among others is to make Namespace a type class where all instances can be turned into a Namespace, and I add that constraint where I need it. Something like https://hackage.haskell.org/package/xmonad-0.18.0/docs/src/XMonad.Core.html#fromMessage

type-safety

The marking and unmarking is very error-prone. Currently I annotate and mark the trees in a way as simple as possible, but there's no way to check at build time if I mark things in the right order or if I double marked something. Mismatch in the parser "marking" order and the printer "unmarking" order will cause the trivia to not be seen and not used.

Copy link
Collaborator

@Bodigrim Bodigrim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a great work, but I personally find it too big to review. There seems to be a lot of shuffling stuff around intermingled with actual changes. Could we separate splitting Distribution.Types.Foo into Distribution.Types.Foo.{Internal,Parser,Pretty} into their own commits at the beginning of the branch?

(I'm not a maintainer here and you don't need my review, so feel free to ignore)

-- See also https://github.com/ekmett/transformers-compat/issues/35
, transformers (>= 0.3 && < 0.4) || (>=0.4.1.0 && <0.7)

, pretty-simple
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cabal-syntax is a GHC boot library and cannot depend on non-boot ones like pretty-simple.

-- | Skips /zero/ or more white space characters. See also 'skipMany'.
spaces :: CharParsing m => m ()
spaces :: (Monad m, CharParsing m) => m ()
spaces = skipMany space <?> "white space"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can define spaces = void spaces' now, so that the relation between combinators becomes even more apparent.

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE StrictData #-}

module Distribution.Types.Annotation where
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we possibly have more comments about the purpose and semantics of this module and its constituents? It's hard to review otherwise.

@ulysses4ever
Copy link
Collaborator

Point of order

(I'm not a maintainer here and you don't need my review, so feel free to ignore)

@Bodigrim's comments are very welcome by the Cabal maintainers, and we hope they will be addressed (and we hope for more of them!).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants