Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The description of the behaviour of "default" and unsafe arithmetic is unclear, especially the term wrap around #425

Open
fdsc opened this issue Feb 28, 2020 · 11 comments

Comments

@fdsc
Copy link

fdsc commented Feb 28, 2020

In https://tutorial.ponylang.io/expressions/arithmetic.html

  • | add() | wrap around on over-/underflow
    +~ | add_unsafe() | Overflow E.g. I32.max_value() +~ I32(1)

But I see
var x10 = U32.max_value() + 1
env.out.print("U32.max_value() + 1: " + x10.string()) // print 0

x10 = U32.max_value() +~ U32(1)
env.out.print("U32.max_value() +~ 1: " + x10.string()) // 4294967295

That is, "+~" is performed without overflow, but "+" - with overflow

ponyc -v
0.33.2-aff95ec [release]
(windows 7 x64)

@SeanTAllen
Copy link
Member

You seem to have an expectation of what should be happening? +~ over the max in your example is undefined behavior.

The current description says that the result of an overflow is undefined. That is what you are seeing. +~ isn't performed without overflow, it is undefined what will happen when it overflows. It can vary by platform, including overflowing as you might expect and wrapping around.

What your snippet from the docs above doesn't include is:

"Undefined in case of" before "Overflow E.g. I32.max_value() +~ I32(1)" so the full statement is:

"Undefined in case of Overflow E.g. I32.max_value() +~ I32(1)"

@fdsc
Copy link
Author

fdsc commented Feb 28, 2020

U32.max_value() + 1 == 0 is a native result for the Intel x64
Overflow occured at safe operation. Incorrect operation.


BUT
U32.max_value() +~ 1

The unsafe fast operation by tutorial ended without a native result
The unsafe operation makes no overflow
U32.max_value() +~ U32(1) must equal "0" for the Intel x64 (but 4294967295)

Don't you find it strange?
The unsafe operation ends safely, and the safe operation ends with an overflow?

@SeanTAllen
Copy link
Member

  • is not safe in the way you are using it. it is merely predictable.
    +~ is unsafe because it is undefined what will happen, not because you will get a wrong result.
    +? is probably what you mean by "safe".

As detailed later in that page +? will error on overflow or underflow.

Please note nothing in the page says that "+" is safe. It says it is "default arithmetic".

@fdsc
Copy link
Author

fdsc commented Feb 28, 2020

Please reread what I wrote above before reading below (yes, safe is "default arithmetic" )
I assume also that you know the assembler for the Intel x64 (x86-x64; AMD64 and other names) architecture


Operator | Method | Description

  • | add() | wrap around on over-/underflow

It says here that this operator has a wrapper.
"tries to balance the needs for performance and correctness"

That is, this operator is safe in the sense that it has a wrapper.
That is, it should not just the simple overflow

But. Something like this happens
mov rax, 4294967295
add rax, 1

That's all (I didn't look in the debugger to see if it looks like this)

This is the first contradiction


For "+~"
"No checks, raw speed, possibilities of overflow, underflow or division by zero"

This assumes just the native addition operation
What we saw above

However, instead we see the result of something like this
mov rax, 4294967295
mov rdx, rax
add rax, 1
jnc success

mov rax, rdx
success:

So here we see traces of the wrapper. But there should be no wrappers. This is raw speed!
But the wrapper is there, although it should not be


Try it on your version of ponyc, will you have this behavior?
U32.max_value() +~ U32(1) == ???
U32.max_value() + U32(1) == ???

@SeanTAllen
Copy link
Member

SeanTAllen commented Feb 28, 2020

I see a source of confusion. In the case of "wrap around on over-/underflow"; we will need to clarify that. "wrap" there means literally if you consider the representable range of numbers to be circular, it will wrap around, that is: MAX + 1 = the lowest value representable for the type.

It doesn't mean there is some wrapper the current description led you to believe.

The description means the opposite of what you are taking it to mean. The intent is to convey that it will wrap on overflow or underflow, not that there is a wrapper. It will do as you put it "a simple underflow".


+~ makes no guarantees at all. It is dependent on LLVM functions that declare themselves as undefined behavior in the case of overflow, underflow, etc. Those LLVM functions are the "unsafe" variants that it provides.


I can also see how the description of unsafe might have lead to your conclusions. @mfelsche, I think part of the issue here is that the description of unsafe makes it sound like default is doing something different and that isn't necessarily the case. But the presentation suggests that.

Really, we want to be contrasting checked/partial with unsafe. But the current reading makes unsafe sound like it is directly contrasted against default and therefore default must be doing something safe.

I think we need to rewrite.

@fdsc
Copy link
Author

fdsc commented Feb 28, 2020

This would be possible if not for the behavior of the operation "+~"

The problem is unsafe "+~" it should not contain any wrappers.
It should make the fine machine operation of addition.

But "+~" doesn't do it. Instead, it does what I wrote above

@fdsc
Copy link
Author

fdsc commented Feb 28, 2020

I have the impression that there are two problems here


  1. The compiler or LLVM does not work correctly under Windows or at all

Since it is obvious that "+~" should give the machine addition command. But does not give (question, only if I have?)


  1. There is no clear explanation in the tutorial text

Perhaps it is better to say that the default operations give the same thing on different architectures
And this is similar to the behavior of Intel x64

This is especially true for the "neg" command. Not a wrap around. neg(-128) == -128 for U8

@mfelsche
Copy link
Contributor

So, if you look into the generated IR (obtained from compiling some pony code with ponyc --debug --pass=ir .) it is clear that:

If the sum has unsigned overflow, the result returned is the mathematical result modulo 2n, where n is the bit width of the result.

  • +~ creates an LLVM add instruction with the nuw flag (No unsigned wrap)
    • this will result in LLVM emitting a poison value on overflow, leading to undefined behaviour as the tutorial and Sean explained.

I am not an expert on x86 assembler and i am unable to make sense of the assembler emitted when calling ponyc as ponyc --pass=asm . as the debug info is stripped. Beware that the LLVM passes we use will do several optimizations on the generated IR, including inlining constant folding etc etc. especially in case of undefined behaviour, the results might be surprising.

I would say that if you think that ponyc is not emitting correct assembler, please investigate further on your machine, look into the generated assembler, and create an issue on the ponyc repo ( https://github.com/ponylang/ponyc/issues ).

I am closing this issue as this is not a matter of the tutorial imho.

@fdsc
Copy link
Author

fdsc commented Feb 28, 2020

I think this is a textbook question too.
I agree with what SeanTAllen indicated above.

The words about security compromise are not clear. They're confusing.
The words "wrap around" are also very confusing. In LLVM, in this sense, everything is described correctly.
In addition, the words "wrap around" are clearly incorrect when describing the "neg" operation"

@fdsc fdsc mentioned this issue Feb 28, 2020
@mfelsche
Copy link
Contributor

Ok, if we describe this issue as:
"The description of the behaviour of "default" arithmetic is unclear, especially the term wrap around."
This issue makes sense here, and I agree with your point. Does that match with what you have in mind?

Please feel free to edit the title accordingly.

@mfelsche mfelsche reopened this Feb 28, 2020
@fdsc fdsc changed the title description of unsafe arithmetic is incorrect The description of the behaviour of "default" arithmetic is unclear, especially the term wrap around Feb 28, 2020
@fdsc fdsc changed the title The description of the behaviour of "default" arithmetic is unclear, especially the term wrap around The description of the behaviour of "default" and unsafe arithmetic is unclear, especially the term wrap around Feb 28, 2020
@fdsc
Copy link
Author

fdsc commented Feb 28, 2020

Yes

  1. It would be nice to add that we are talking about adding by module.
  2. Complement the describe of the concept more clearly, what is the difference in the default concept compared to unsafe

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants