Skip to content

Add ability to define a simple implementation in embed.fnc #23421

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: blead
Choose a base branch
from

Conversation

khwilliamson
Copy link
Contributor

This can replace mathoms.c and lots of essentially boiler plate
scattered throughout the code.

Quite a few of our functions and macros are specializations of more
general ones, taking fewer parameters than the generalized ones. Often
we will have a
foo_flags(a, flags)
and a
foo(a)
which just calls foo_flags with a flag of 0.

This commit allows for a single extra line to be added to embed.fnc to
indicate the implementation of plain foo, replacing the extra boiler
plate lines that were needed to do this prior to this commit.

It uses the current infrastructure in the embed regeneration to
automatically create any needed long names, inserting thread context if
necessary without the programmer needing to worry about this, and
placing this in embed.h and proto.h

I'm not thrilled about the syntax of this. You add an extra
continuation line to the declaration. This line begins with an '=',
followed by the implementation details. For example

Adp|U8 *|uv_to_utf8|NN U8 *d  \
                   |UV uv\
       = uv_to_utf8_flags(d,uv,0)

But embed.fnc is the file that already generates embed.h and proto.h,
and no extra information beyond what embed.fnc already contains needs to
be added.

The bottom line is that many few-line functions in our code can be
replaced by adding a single line to embed.fnc, which will make those
macros instead of functions. This requires fewer resources; the
generated macros require no resources unless actually called; and it's
simply more convenient to do this here in one place, than do the extra
boiler plate that has to be in multiple places otherwise.

embed.h will now contain long name definitions that formerly it didn't,
so autodoc.pl has to change to look for them

This commit changes nothing; just adds the capability. The next few
commits will start to use it

  • This set of changes does not require a perldelta entry.

embed.fnc Outdated
|NULLOK SV *ssv
Abdp |void |sv_setsv |NN SV *dsv \
|NULLOK SV *ssv \
= sv_setsv_flags(dsv,ssv,SV_GMAGIC|SV_DO_COW_SVSETSV)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If shim function "short/basic" variant, is function "extended/long/HAS_FLAGS" variant, with a dumb simple 0 passed as arg flags, then yes I agree embed.fnc should automate the codegen for "short/basic" variant.

If arg flags is NOT 0, that is active logic, and these don't round trip in behavior "short/basic" variant -> "extended/long" variant -> "short/basic" variant, the it shouldn't be part of embed.fnc codegen.

I dont think this sv_setsv() = sv_setsv_flags(dsv,ssv,SV_GMAGIC|SV_DO_COW_SVSETSV)` style of shim should be part of embed.fnc.

Hiding runtime C behavior in files exts not ending with .h/.c/.xs makes development harder, and makes grepping the code base harder for me, and makes even more work to set up an IDE/text editor to efficiently deal with the p5p repo.

Sure I can work around that and add Perl's fictional .fnc file format to my IDE's grep presets, but hiding runtime C behavior/C machine code inside files not ending with .h or .c makes everything harder. Some people may exclude proto.h and embed.h files permanently from their grep, since their contents are meaningless junk that are necessary for C lang and OS formalities.

sv_setsv() = sv_setsv_flags(dsv,ssv,SV_GMAGIC|SV_DO_COW_SVSETSV) specifically is a bug ticket/design problem right now, that #23409 and #13878 revolve around

the statement sv_setsv() = sv_setsv_flags shouldnt be hidden in a non obv place

automating pvs() variants would be similar to this PRs idea and less impact runtime wise, since pvs() variants have no "design" choices or pick 1 of many possible implementation options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand much of what you are saying here. Moving the define from sv.h to embed.fnc effectively is moving the define to embed.h, where lots of other defines already are. Your IDE, like mine, should find things in embed.h. This does not hide something in a non-obvious spot. Everyone should already know that if they find something in embed.h, that the source is embed.fnc.

I don't understand what you are saying about round-tripping.

If this facility were to start to be used for adding non-0 flags, then embed.h could be more important than it is now, and people might want to stop excluding it from their greps, if they are. But their IDE should point them to the right place. If people have an opinion on this, speak up

I have removed the sv_setv changes from this p.r. , and replaced with with one that doesn't have these kinds of issues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand much of what you are saying here. Moving the define from sv.h to embed.fnc effectively is moving the define to embed.h, where lots of other defines already are. Your IDE, like mine, should find things in embed.h. This does not hide something in a non-obvious spot. Everyone should already know that if they find something in embed.h, that the source is embed.fnc.

embed.fnc syntax doesn't have the ( from sv_set*v( to find the definition of a function of macro. If I search gv_fetch or gv_fetchmeth I will get a full laptop screen of irrelevant hits. If I search gv_fetch\( or gv_fetchmeth\( I know I am matching against 1 and only 1 C function definition. gv_fetch\( or gv_fetchmeth\( can't ever match anything inside embed.fnc. Using gv_fetch\| or gv_fetchmeth\| still doesn't match anything inside embed.fnc.

This is now extra work for me to scroll through 15-30+ irrelevant hits and filter with my eyes, to learn what the _flags() version of _no_flags() is. If I can't find the #define for _no_flags() -> _flags() in a .h, I want to be able to safely type _flags(, , , 0) assuming embed.fnc generated it, and the last arg flags can be assumed it is 0 without manually verifying it is 0.

I don't understand what you are saying about round-tripping.

When I say round tripping, I mean the 2 func calls are symmetric. xv_do_tv(a, b, c) can be safely assumed to be xv_do_tv_flags(a, b, c, 0) and vice versa. In other words, macro xv_do_tv(a, b, c) is NOT implemented and written as

#define xv_do_tv(a, b, c) xv_do_tv_flags(a, b, c, TV_NO_LOG_METRICS | TV_NO_OVERLOAD \
        | TV_NO_BACKTRACE | TV_NO_ALTERNATE_ERROR_DISPATCH)

And constants TV_NO_LOG_METRICS | TV_NO_OVERLOAD | TV_NO_BACKTRACE | TV_NO_ALTERNATE_ERROR_DISPATCH added up are 0xA9 aka 10101001. Macros that forward from _no_flags() to _flags(), with a non-0 flags constant on the far right side, probably need C comments saying why the _no_flags() to _flags() macro, that has a non-0 flags constant, and is the way is the way it is. That _no_flags() to _flags() macro is now active logic, and not a simple low risk mundane codegen template task, which is the specialty category of embed.fnc to do.

I am aware that "the ship has sailed" for any existing _no_flags() to _flags() forwarder macros with a non-0 flags constant on the far right side, since real backend function _flags() can't modify its P5P private API or CPAN XS API after it and its prototype was invented and added to perl 5-10-15-20 years ago.

I think _no_flags() to _flags() forwarder macros with a non-0 flags constant on the far right side should stay in .h files, and not be automated by embed.fnc. If _no_flags() to _flags() macro has a 0 value on the far right side, I'm fine with embed.fnc generating the macro.

In a patch I never published from around 2014-2017, I did convert WinPerl to use i386 __stdcall or i386 __fastcall calling convention through a build flag. If the optimization would've been published in WinPerl stable, I then would've written an embed.fnc patch, to enable i386 __stdcall or i386 __fastcall CC tailcalling, by automatically swapping the my_perl var from the far left side of arg list to far right side of arg list, and swap flags arg from far right side to far left side.

Automatic moving of my_perl from far left to far right, and flags from far right to far left, is only beneficial on i386 Windows and i386 Linux builds, and harmful everywhere else. All RISC and AMD64 OSes and CPUs that Perl runs on use a __regcall or __fastcall ABI of some kind. That allows function symbol sv_setsv() to be a tailcall, with just 2 instructions, on every RISC CPU/OS and every AMD64/x64 OS. I consider i386 legacy, so I doubt will actually publish that patch now, but this is an example of futuristic embed.fnc benefits over the current state, just like your PR here is a future benefit over the current state. embed.fnc/regen.pl itself has always been a large upgrade over a naked stock vendor CC toolchain.

mathoms.c Outdated
PERL_ARGS_ASSERT_SV_SETSV;

sv_setsv_flags(dsv, ssv, SV_GMAGIC);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mathom.c linker symbol stub's .c code, DOES NOT match the macro. Touching this sv.c/sv.h code logic absolutely shouldn't be part of a generic "improve embed.fnc for easier use" PR/commit, and needs to be done in a separate PR/commit.

#define sv_setsv(dsv, ssv) \
        sv_setsv_flags(dsv, ssv, SV_GMAGIC|SV_DO_COW_SVSETSV)
``

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I forgot that the mathoms.c code for this function and the sv.h definition had gotten out of sync. I changed to use a different function that doesn't have an inconsistency

This can replace most of mathoms.c and lots of essentially boiler plate
scattered throughout the code.

Quite a few of our functions and macros are specializations of more
general ones, taking fewer parameters than the generalized ones.  Often
we will have a
    foo_flags(a, flags)
and a
    foo(a)
which just calls foo_flags with a flag of 0.

This commit allows for a single extra line to be added to embed.fnc to
indicate the implementation of plain foo, replacing the extra boiler
plate lines that were needed to do this prior to this commit.

Another advantage is this lowers the likelihood of the implementation of
long-name functions getting out-of-sync with supposedly-equivalent short
name macros that are generally used to bypass them.  This commit causees
the long name form to be automatically generated, so there is no
possibility of getting out-of-sync.

It uses the current infrastructure in the embed regeneration to
automatically create any needed long names, inserting thread context if
necessary without the programmer needing to worry about this.  The
results are placed in embed.h and proto.h

I'm not thrilled about the syntax of this.  You add an extra
continuation line to the declaration.  This line begins with an '=',
followed by the implementation details.  For example

 Adp    |U8 *   |uv_to_utf8     |NN U8 *d   \
				|UV uv	    \
              = uv_to_utf8_flags(d,uv,0)

But embed.fnc is the file that already generates embed.h and proto.h,
and no extra information beyond what embed.fnc already contains needs to
be added.

The bottom line is that many few-line functions in our code can be
replaced by adding a single line to embed.fnc, which will make those
into macros instead of functions.  This requires fewer resources; the
generated macros require no resources unless actually called; and it's
simply more convenient to do this here in one place, than do the extra
boiler plate that has to be in multiple places otherwise.

embed.h will now contain long name definitions that formerly it didn't,
so autodoc.pl has to change to look for them

This commit changes nothing; just adds the capability.  The next few
commits will start to use it.
This shows how taking a #define for sv_utf8_downgrade() from sv.h and
moving it to embed.fnc creates the equivalent entry in embed.h, plus
automatically generating macros that implement the long form
Perl_sv_utf8_downgrade().  This means the manual implementation of
Perl_sv_utf8_downgrade() in mathoms.c can be and should be deleted,
which this commit also does.
This shows how two synonymous macros can be conveniently defined in a
single place
This shows how the new capability works for elements without a thread
context parameter
This shows how the new capability works for replacing elements with a thread
context parameter that have an inline function
This shows how the new capability works for chaining elements, as
uv_to_utf8 calls this, which now calls uv_to_utf8_msgs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants