Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relative cross-references #27

Closed
analog-cbarber opened this issue Jun 22, 2022 · 18 comments
Closed

Relative cross-references #27

analog-cbarber opened this issue Jun 22, 2022 · 18 comments
Labels
feature New feature or request insiders Candidate for Insiders

Comments

@analog-cbarber
Copy link

analog-cbarber commented Jun 22, 2022

Limitations of absolute cross references

By default, mkdocstrings only supports cross-references where the path is
fully qualified or is empty, in which case it is taken from the title. But many times
you do not want to use fully qualified names everywhere in your doc strings, so you
must use the shorter name in the title and the full name in the path.

If you work with long package and class names or with namespace packages, this can result in a lot
of extra typing, harder to read doc-strings and more typographical errors.

For instance, here is a small example of what you need to do currently:

class MyClass:
    def this_method(self):
        """
        See [other_function][mypkg.mymod.MyClass.other_function] 
        from [MyClass][mypkg.mymod.Myclass]
        """

Relative cross-references

I propose that we support a relative_crossrefs option, that when enabled would instruct the python handler
to substitute relative path expressions into absolute paths based on the known path of the doc-string in
which it occurs along with the text in the title. For instance, the previous example could be written as:

class MyClass:
    def this_method(self):
        """
        See [other_function][.] from [MyClass][^]
        """

Relative path syntax

The relative path specifier has the following form:

  • If the path ends in . then the title text will be appended to the path
    (ignoring bold, italic or code markup).

  • If the path begins with . then it will be expanded relative to the path
    of the doc-string in which it occurs. As a special case, if the current
    doc-string is for a function or method, then . will instead be
    expanded relative to the function's parent (i.e. the same as ^.).

  • If the path begins with (c), that will be replaced by the path of the
    class that contains the doc-string

  • If the path begins with (m), that will be replaced by the path of the
    module that contains the doc-string

  • if the path begins with (p), that will be replaced by the path of the
    package that contains the doc-string. For this purpose, a package is
    a module that contains other modules or that does not have a parent
    module.

  • If the path begins with one or more ^ characters, then that will go
    up one level in the path of the current doc string for each ^

Examples

Using relative syntax

class MyClass:
    def this_method(self):
        """
        [`that_method`][.]
        [init method][(c).__init__]
        [this module][(m)]
        [OtherClass][(m).]
        [some_func][^^.]
        """

Using absolute syntax

class MyClass:
    def this_method(self):
        """
        [`that_method`][mypkg.mymod.MyClass.that_method]
        [init method][mypkg.mymod.MyClass.__init__]
        [this module][mypkg.mymod]
        [OtherClass][mypkg.mymod.OtherClass]
        [some_func][mypkg.mymod.some_func]
        [
        
        """
@analog-cbarber
Copy link
Author

I have implemented the above proposal and will submit a merge request shortly...

analog-cbarber added a commit to analog-cbarber/mkdocstrings-python that referenced this issue Jun 22, 2022
This fully implements the proposed feature.

Fixes mkdocstrings#27
@analog-cbarber
Copy link
Author

Comments?

@pawamoy
Copy link
Member

pawamoy commented Jun 25, 2022

Hi @analog-cbarber, I didn't get the time to write a thoughtful answer yet. I agree that the feature would be great to have, though I'm not sure the implementation you propose is the right approach. I believe there's a way to implement something within mkdocstrings and autorefs themselves, to support more than one language, rather than directly in the Python handler. I'll do my best to explain how as soon as I can!

@analog-cbarber
Copy link
Author

Indeed that doing it that way would be more general, although if you try it you will see that it is much less obvious how you would do it, it is not clear that the class/module syntax would be that easy, and it is not clear that you would even be able to provide source locations for errors as I have done here. Currently, you have to just search for identifiers that show up in mkdocstring errors and hope that there aren't too many duplicates.

Griffe has a very nice clean and easy to understand interface, so this was relatively easy to implement this way. The implementation is easy to understand,and did not require any significant modification to the existing code. I am pretty sure that trying to do this in a more general manner in mkdocstrings/autorefs would be a much more serious change and/or would not be as functional.

Also note that different programming languages and existing doc tools (e.g. doxygen) may already have their own conventions for how doc strings/comments are formatted and how cross references should be formatted, so even if you provide a general solution, other language handlers may not take advantage of it.

If you approve of the syntax, why not allow it and worry about how and whether to generalize it later? Most of your users are working with python projects, and this will save them a lot of typing if they use cross-references.

I personally have a number of projects currently using sphinx that I would like to convert and I want to use cross
references extensively, but so far it has been very painful due to the absolute path problem. A general solution sounds
nice in principle but doesn't actually have any particular benefit for python projects.

If we could at least agree on the syntax, I could at least use my fork of the python handler until an approved solution
is publicly available.

analog-cbarber added a commit to analog-cbarber/mkdocstrings-python that referenced this issue Jun 26, 2022
Fixes mkdocstrings#27

This commit fixes:
 - the regular expression for checking valid python identifiers
 - update of line offset into doc string for error messages
@analog-cbarber
Copy link
Author

I am pretty sure this implementation is not really correct at least in where the substitutions are done.

But it would be nice to get feedback on the relative cross reference syntax.

analog-cbarber added a commit to analog-cbarber/mkdocstrings-python that referenced this issue Jul 10, 2022
Fixes mkdocstrings#27

- Move subsitution to from collect to render
- Warn w/ source location about missing reference after relative path resolution
- Work around issue with bad source information in python 3.7
- Add debug logging
- Fix bug in regular expression
@analog-cbarber
Copy link
Author

I have updated the merge request but have subclassed the python handler for use in my own projects.

I don't have any need for the more general solution and since it seems like it would be much more difficult to implement with the same level of functionality, I doubt that I will have time to work on that.

@leandro-lucarella-frequenz

Thanks a lot for putting this together! As a first impression, to me it is a bit weird that there are so many rules for relative imports, I was expecting something more similar to Python relative imports, but it is true that when you are in the scope of a class, for example, it is not enough if you want to avoid repeating the class name all over the place.

Also, if the idea is that it works cross-language, then using a Pythonic thing might also not be super suitable, but if this were only for Python, I would use something like self.symbol (or cls.symbol) to say it's relative to the current class and just use leading dots to go to parents like in imports.

But I did some experiments trying different syntax and it is very difficult not having to end up with this complexity, at least if the goal is to avoid repetition as much as possible (for me some repetition would be acceptable). But sadly I have nothing that it is clearly better than the proposal here. I was about to remove the whole comment, as I guess it doesn't provide a lot of value, but maybe it serves as some sort of validation, so I'll leave it :)

@analog-cbarber
Copy link
Author

It would be nice to have a syntax that works for other languages, but this implementation is obviously python-specific and in practice any given use of the cross-reference syntax will always be language specific. It is unlikely that anyone is going to want to copy doc-strings from python and move them to another language or vice-versa, so having some language-specific differences may be ok.

I have to say that using this in my own code, that cross-references that use more than one ^ operator is a little bit confusing to read because you have to translate it in your head. This is why I added the (c), (m) and (p) syntaxes.

@pawamoy pawamoy added the feature New feature or request label Mar 5, 2023
@analog-cbarber
Copy link
Author

FYI, I just published a public package that implements this proposal. This is an alternate handler that subclasses the standard python handler and adds this capability:

https://analog-garage.github.io/mkdocstrings-python-xref/

@pawamoy
Copy link
Member

pawamoy commented Sep 6, 2023

Nice! I'll add a mention of it in the docs 🙂 (if you're OK with that)

@llucax
Copy link

llucax commented Jan 12, 2024

I found myself really wanting this. We use a very nested package/module structure, and we are also refactoring often this structure, which makes references break frequently. Looking at this proposal again (and now the new package implementing it), I must say that I find it extremely confusing. I think if only python-like import syntax is provided, it would cover for 90% (of course this number is 100% fabricated :P) of the use cases in a very clear (and hopefully easier to implement) way. Even if there is still a bit of extra verboseness by having to repeat the class name, it is much better than the current status.

What I suggest is:

class MyClass:
    def this_method(self):
        """
        See [other_function][.MyClass.other_function] from [.MyClass][].
        """

I assume the above will render as (bold is a link):

See other_function from mypkg.mymod.MyClass.

Also, at least for my use case, normally I only want the leaf symbol name as title, so it would be nice to have an option to make [f.q.n.symbol][] be converted to [symbol][f.q.n.symbol], or a different syntax for it (like [f.q.n.symbol!][] or something). Also, when an implied title is used, I'd like the title to look like code.

So following the example above:

class MyClass:
    def this_method(self):
        """
        See [.MyClass.other_function!][] from [.MyClass!][].
        """

And this to be rendered as:

See other_function from MyClass.

But maybe I should create a separate issue(s?) about this.

@analog-cbarber
Copy link
Author

I actually find that syntax even more confusing. You can always fork my project and implement your own syntax if you want.

@lmmx
Copy link

lmmx commented Jan 13, 2024

🤝 Realigning

Important

If different suggestions are confusing different people then I'd suggest to narrow the proposed DSL to what everyone agrees on.

It seems like the . notation is universal (it's used in Python imports at the package level and in qualnames at the module level), and gets the majority of the benefit provided. Keeping an initial implementation to this is less maintenance burden.

I actually just proposed that we do that over in the autorefs repo unaware of this prior discussion.

Caution

This issue being 1.5 years old (mid-2022) is a red flag. Complexity is the blocker here (associated with maintenance burden) and should be eliminated directly to get this over the line.

It looks like much of the work to be done has already been figured out and a subset of it would be accepted so we seem to have a path to success here, without splitting off from the officially maintained mkdocstrings/python repo.

🧘 Simplified proposal

I agree with the proposal in #28 to interpret the dotted paths "bottom-up"

  • . meaning "here" on whichever qualname element the docstring belongs to
  • .. meaning "the parent" within the qualname and then, once reaching the module level, ascending the import path.

Having read #28 I feel it best to introduce no new special characters. A DSL should be self-explanatory (as intuitive and idiomatic as possible) for ease of use and I couldn't glance at that as a naive user with immediate confidence to write it.

I like @pawamoy's point that there should be one canonical way of writing a reference.

The one question seems to be which way to root a reference we would choose.

To take the example of package my_pkg ⇢ module my_mod ⇢ class MyCls ⇢ method my_method()

We write its full dotted path as:

my_pkg.my_mod.MyCls.my_method

Tip

This "full dotted path" = the import path my_pkg.my_mod joined by a . onto its "qualname" MyCls.my_method

Rooted bottom-up, from the object at the end of a qualname:

  • . would refer to my_method
  • .. would refer to MyCls
# my_pkg/my_mod.py
class MyCls:
    """I am [`MyCls`][.] and I contain [`my_method()`][.my_method]"""
    def my_method(self):
        """I am [`my_method()`][.] and I am in [`MyCls`][..]"""

@analog-cbarber
Copy link
Author

That is pretty much what my implementation does. You don't have to use the '^', '(c)' and '(m)' notation if you don't want to. However, in actual practice, we found that once paths have more than two dots in them, the references become really hard for humans to grok. Something like [some_func][(m).some_func] or [some_func][(m).] is a lot more obvious than [some_func][...some_func]. The (m) syntax makes it clear you are referring to the containing module even if you have nested classes and you don't force the user to figure out how many levels of namespace they need to navigate using .. I don't have a problem with dropping the '^' syntax, but I won't drop it from my own implementation because I have projects using that syntax.

It is true that the .. syntax is idomatic for imports, but it also suffers from the same problem and I typically just use a full package name rather than write something like from ....foo.bar import baz and force the reader to compute what that actually means.

In any case, if you want a solution to this problem right now, you can just use https://analog-garage.github.io/mkdocstrings-python-xref/ and restrict yourself to the '.' based syntax. I will maintain and support that package indefinitely and at least as long as our own projects depend on it.

@llucax
Copy link

llucax commented Mar 4, 2024

@analog-cbarber I finally had some time to try your handler and yeah, it covers my use case (I think at least, haven't fully migrated any project yet), sorry I didn't understand that before. Thanks for the great work!

@pawamoy
Copy link
Member

pawamoy commented Sep 3, 2024

Released as v1.9.0 of the insiders version 🙂 (v1.11.1.1.9.0)

@analog-cbarber would you like to send a PR to list your project as an alternative in the #relative_crossrefs section of our docs?

@pawamoy pawamoy closed this as completed Sep 3, 2024
@analog-cbarber
Copy link
Author

Released as v1.9.0 of the insiders version 🙂 (v1.11.1.1.9.0)

@analog-cbarber would you like to send a PR to list your project as an alternative in the #relative_crossrefs section of our docs?

Yes, please do, since my version has some more features, but I wonder if it will still work. Since my implementation subclasses yours and you use the same config keyword to enable it, it looks like it might enable both of our extensions. Not sure which one would win.

@pawamoy
Copy link
Member

pawamoy commented Sep 3, 2024

Ah, sorry about that.

I think yours will take precedence. IIUC you regex-replace references in docstrings before they are converted from Markdown to HTML. Our own implementation does that on-the-fly when converting a docstring. Once we reach that, your handler will have replaced the references with absolute ones anyway, so nothing more will happen :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request insiders Candidate for Insiders
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants