Add "Runtime basics" to the tutorial #408

SeanTAllen · 2019-10-22T16:18:53Z

While discussing ponylang/ponylang-website#502 (add quiescence FAQ).

Garbage collector basics (there are 2 entries in the appendix)
Quiescence
ASIO system
Env.exitcode

Perhaps a link to runtime options, also information on how to find more information.

rhagenson · 2019-11-20T02:19:14Z

Any thoughts on where this chapter should go? My opinion is after Packages and before Testing, as the former ends on the Standard Library and the latter starts into somewhat advanced Pony usage.

rhagenson · 2019-11-20T02:46:05Z

Adding a task list to ensure I stay organized (this task list was built over the course of writing):

Add new chapter and weight accordingly in config for menu placement
Add section on Env.exitcode
Add section on garbage collector
Add section on ASIO system
Add section on quiescence (see Section about quiescence as it relates to program termination. #157, as well)
Add Backpressure/Muting section
Add section on Runtime Options

rhagenson · 2019-11-20T02:53:32Z

After re-reading both the runtime-related appendices (Memory Allocation at Runtime, and Garbage Collection with Pony-ORCA), I think this information should be moved into the new chapter rather than repeated in the appendix.

Also, Memory Allocation at Runtime ends with an ellipsis. Was there something more intended to go there?

SeanTAllen · 2019-11-20T03:14:38Z

@rhagenson I have no idea why the ellipsis is there.

SeanTAllen · 2019-12-03T17:43:26Z

@rhagenson do you have all the info you need?

rhagenson · 2019-12-03T19:19:27Z

@SeanTAllen I believe so. I have begun work over on https://github.com/rhagenson/pony-tutorial/tree/runtime-chapter. So far it has mostly been a refactor to move the runtime content from the Appendices into the new chapter and rewrite that content to stress the pertinent details.

rhagenson · 2019-12-17T15:45:27Z

Friendly update that I am back on top of this. Had to take a hiatus while, among other things, I prepared a tutorial submission for the largest conference in my field. That tutorial will include Pony, if accepted.

Sort of preemptive question so feel free to answer solely off the cuff: what have been some past references for any of this information? E.g., blog post on how the ASIO system works, video on quiescence, past issue/PR that involved ample discussion of garbage collection, etc. Looking to make the language consistent between what has been deemed good/helpful in the past and the tutorial.

SeanTAllen · 2019-12-17T15:48:30Z

re: gc... see the orca paper: https://www.ponylang.io/media/papers/orca_gc_and_type_system_co-design_for_actor_languages.pdf and this video from @aturley: https://vimeo.com/181099993

rhagenson · 2019-12-17T16:06:17Z

Already had the ORCA paper, but not the video. Thank you for both.

Any similarly useful references for the other topics?

EpicEric · 2019-12-17T17:47:47Z

There are some early discussions from the beginning of the #runtime stream in Zulip (that were actually copied from the Slack) that cover a few runtime topics.

rhagenson · 2019-12-17T17:56:46Z

@EpicEric I had no idea there even was a runtime stream on Zulip. Thank you for pointing me toward it.

rhagenson · 2019-12-19T03:09:06Z

Reminder to self (and noted here for others to hold me to it):

Find good place to mention the number of scheduler threads is N where N is core count + 1 ASIO thread

The way I have it written now, this detail would feel forced or detract from the point wherever I put it. I am sure the content will change, so I am not going to force it in now when a more natural addition is still possible later.

SeanTAllen · 2019-12-19T03:10:42Z

Number of scheduler threads is N where N is the core count. There is additionally an asio thread that handles receiving asio messages, however, it never runs any actors. When we say "scheduler threads", the asio thread is not included in that.

rhagenson · 2019-12-19T03:13:12Z

Understand the distinction now. Still will need to find a good place for this detail at a later time to ensure it is not just dropped in somewhere.

SeanTAllen · 2019-12-19T03:14:45Z

Couple of things to know about scheduler threads.

You can change the default number using --ponymaxthreads to set to less than N where N is the number of cores.

By default, the runtime will stop using scheduler threads that "aren't needed". This helps keep excessive work stealing from happening. This can scale down to 0 scheduler threads. At 0, the only thread running will be the ASIO thread that is waiting to receive an event. Once an event is received, at least 1 scheduler thread will be started back up. You can set a minimum number of scheduler threads to always keep running using the --ponyminthreads option.

If you want to, you can turn off scheduler thread scaling by using --ponynoscale.

There is also a --ponysuspendthreshold option that has an impact on scheduler thread scaling.

rhagenson · 2019-12-19T03:15:40Z

Currently thinking, given the distinction just noted, that this might naturally fit into the ASIO system section in the reverse of the way stated here, i.e.: there is one ASIO thread + N scheduler threads...

rhagenson · 2019-12-19T03:19:00Z

For all the runtime-related options, rather than spread them throughout the new chapter, how about a section called "Runtime Options" that is the last section in the chapter and covers these runtime configuration options like pinning ASIO, changing minimum scheduler thread count, etc?

rhagenson · 2019-12-19T03:28:57Z

What states can an actor be in besides: alive, blocked, dead, and muted?

Alive: running a behavior or processing a message from its queue
Blocked: completed execution and no messages waiting in its queue
Dead: blocked itself and all actors with a reference to it are blocked
Muted: attempted to send to an overloaded actor while not overloaded itself (this is the result of backpressure; the actor will be scheduled again once backpressure decreases)

These are the ones I know of by reading through the #runtime stream and past runtime content from the tutorial. I want to ensure I am not neglecting an actor state.

SeanTAllen · 2019-12-19T03:43:05Z

@rhagenson I'm not aware of Dead being used as a term.

Alive -> Scheduled
Muted -> Muted
Blocked -> Unscheduled
Dead -> There is no state for this in the runtime.

For Unscheduled, a distinction could be made between "has no messages and therefore doesn't exist in a queue for a scheduler thread" and "has messages and is waiting in Scheduler thread's queue".

That would give 4 states. But we don't have agreed-upon terminology for those 2 possible "unscheduled" states.

Blocked would be one possible term for the first unscheduled state (and is noted in actor.c as being "logically blocked"). We don't have a name afaik for the 2nd of the 2. Generally "Blocked" is mostly used when the cycle detector is in use. I think it would be reasonable to use as you have defined.

There is also "overloaded" and "under pressure" that could be considered states as well that are separate.

Sorry if this doesn't help much. I'm trying to provide more info. You are asking good questions.

EDIT

I'm realizing that "Unscheduled" might be problematic as there is a flag you can set via the C api called FLAG_UNSCHEDULED to manually remove an actor from scheduling. It isn't used anymore but it exists. This conversation is making me realize that we should definitely discuss an RFC to remove.

EDIT 2

re: C api -> there is a C api that is exposed that allows you to control various parts of the runtime including starting it up, scheduling actors, creating them etc. It isn't used by Pony but could be used to embed the Pony runtime in other systems.

SeanTAllen · 2019-12-19T03:43:26Z

> For all the runtime-related options, rather than spread them throughout the new chapter, how about a section called "Runtime Options" that is the last section in the chapter and covers these runtime configuration options like pinning ASIO, changing minimum scheduler thread count, etc?

This sounds reasonable.

rhagenson · 2019-12-19T04:02:24Z

@SeanTAllen Thank you for the information.

For your own knowledge of where "dead" cropped up, it is in the Appendix on GC/ORCA that is being moved to the new chapter.

pony-tutorial/content/appendices/garbage-collection.md

Line 23 in 228ef53

 When an actor has completed local execution and has no pending messages on its queue, it is _blocked_. An actor is _dead_, if it is blocked and all actors that have a reference to it are blocked, transitively. A collection of dead actors depends on being able to collect closed cycles of blocked actors. 

Currently I use the three states: alive, blocked, dead and have not made mention of muted yet (same problem as the scheduler thread problem that I do not have a "natural" place for it yet). I then reuse the term "dead" in the Quiescence section to differentiate between collecting an individual actor and collecting a cycle of actors (i.e., it takes a cycle of dead actors to GC them all at once).

As for "overloaded" and "under pressure" I think given that those are both backpressure related I would categorize them into that system as the cause of muting. Of course given this is the tutorial I am trying to toe that line of just enough information at one time to be understood. Not suggesting it yet, but I would almost "hide" those backpressure states for now and put all three: muted, under pressure, and overloaded together in a backpressure-related chapter/section/FAQ/Appendix/etc.

I will progress with the "Runtime Options" section.

SeanTAllen · 2019-12-19T04:04:06Z

@rhagenson well, apparently we are using "Dead" somewhere. I never knew that.

SeanTAllen · 2019-12-19T04:04:48Z

> As for "overloaded" and "under pressure" I think given that those are both backpressure related I would categorize them into that system as the cause of muting. Of course given this is the tutorial I am trying to toe that line of just enough information at one time to be understood. Not suggesting it yet, but I would almost "hide" those backpressure states for now and put all three: muted, under pressure, and overloaded together in a backpressure-related chapter/section/FAQ/Appendix/etc.

agreed.

SeanTAllen · 2019-12-19T04:32:46Z

@rhagenson I think that definition of dead is not quite right.

To be dead an actor also can't be registered with the asio system to receive events.

Our current definitions don't really take that into account.

Perhaps

Alive/Dead would be a good distinction

Alive: Has messages in its queue or can receive messages (this includes ASIO events)
Dead: Has no message in its queue nor can it receive messages.

Running or Scheduled or Executing/Blocked/Waiting/Muted

Where "Running or Scheduled or Executing" is "currently processing its message queue"
Blocked is as you said
Waiting is "in the run queue for a scheduler with messages to process"
Muted is "not in a run queue for a scheduler. may or may not have messages in its queue"

An Alive actor can be Running, Blocked, Waiting or Muted.
A Dead actor can only be Blocked or Muted. (Although I'm not sure if the current implementation would consider a muted actor to be able to be collected by the GC- I would have to see what I did when I implemented muted).

rhagenson · 2019-12-19T14:51:12Z

@SeanTAllen My response below just grew and grew, so there is a lot to respond to here.

First, to be sure there was no typo, did you mean to say Alive is having messages or the ability to receive messages rather than having no messages?

So to summarize my understanding, it would shake down as (borrowing <: "subtype of" notation):

Running <: Alive
Waiting <: Alive
Executing <: Alive
Muted <: Alive
Muted <: Dead
Blocked <: Dead

Therefore, Muted is the only subtype that can be applied to either Alive or Dead actors. From your definitions, I merged Scheduled into Waiting as I do not understand the distinction between "waiting for scheduling" and "being scheduled" (the latter of which I assume places an actor as Running). I make the distinction between Running and Executing as loosely related to the semantics of "in a behavior" (Executing) and "has control of a scheduler thread" (Running) -- an Executing <: Running might then still be technically correct.

Running and Scheduled are not GCed. Blocked and Muted are grounds for GC, Dead is GCed as soon as possible. A backpressure transition due to overload places the actor into Waiting. (I want to say backpressure "kills" actors, but that is not the case given the Alive/Dead supertype names we are using here.)

Anything in here that I missed or is not consistent with your view?

SeanTAllen · 2019-12-19T14:53:39Z

@rhagenson yes, Alive should have been "has messages in its queue". I've edited accordingly.

SeanTAllen · 2019-12-19T14:55:57Z

Running and Executing are the same thing.

Blocked is applicable to both Alive and Dead. Either can be blocked. But a blocked actor can be alive in that it can receive messages still.

Running <: Alive
Waiting <: Alive
Muted <: Alive
Muted <: Dead
Blocked <: Alive
Blocked <: Dead

rhagenson · 2019-12-19T15:12:58Z

Got it. Consistent on the view of GCing as well? Dead actors are GCed, while Alive actors in Blocked/Muted are possibly GCed?

SeanTAllen · 2019-12-19T15:15:46Z

Only Dead actors can be GCed. Alive means they can't be GCed because they are still capable of receiving messages.

Dead - can be GCed
Alive - can not be GCed
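
As a rough illustration of the hierarchy above, the scheduling states can be written as Pony primitives with a helper that says whether a state is compatible with an actor being Dead. Every name here (`ActorState`, `ActorStates.may_be_dead`) is hypothetical and only mirrors the terms used in this thread; none of them are runtime types. Whether a Muted actor is actually collected by the current implementation is still an open question from earlier in the thread.

```pony
// Hypothetical names that mirror the terms used in this thread.
primitive Running
primitive Waiting
primitive Blocked
primitive Muted

type ActorState is (Running | Waiting | Blocked | Muted)

primitive ActorStates
  // Running and Waiting imply Alive; Blocked and Muted can be Alive or Dead.
  // Only a Dead actor can be GCed.
  fun may_be_dead(state: ActorState): Bool =>
    match state
    | Blocked => true
    | Muted => true
    else
      false
    end

actor Main
  new create(env: Env) =>
    env.out.print("Running: " + ActorStates.may_be_dead(Running).string())
    env.out.print("Waiting: " + ActorStates.may_be_dead(Waiting).string())
    env.out.print("Blocked: " + ActorStates.may_be_dead(Blocked).string())
    env.out.print("Muted: " + ActorStates.may_be_dead(Muted).string())
```

The helper only answers "could an actor in this state be Dead?". Being Blocked or Muted is necessary for collection under these definitions, not sufficient.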

rhagenson · 2019-12-19T15:27:29Z

I had a rebuttal based on backpressure along with what I had written so far in the chapter for quiescence, however after reading what I wrote again along with the hierarchy here it all agrees that an actor must be Dead to be GCed. All Alive states are some form of the actor still being active so whether it is actively Waiting due to backpressure or not that will not result in GC to reallocate resources from the cooperative scheduler.

Thank you for helping me clarify these states!

SeanTAllen · 2019-12-19T15:29:37Z

@rhagenson you're welcome

rhagenson · 2019-12-30T23:14:47Z

Points gained from Andrew Turley VUG video:

Actors can send themselves messages, so I need to check that Sean's and my Dead/Alive/etc. definitions factor this in (i.e., even if an actor is not referenced by other actors, it can still send a message to itself, or send a reference to itself out if it knows other actors, and therefore is not quite dead...yet)
The cycle detector is a special actor. It receives a message when an actor becomes blocked (i.e., no more messages in its queue), containing a reference to that actor and the actors it references, and another message when the actor becomes unblocked (messages in the queue, waiting for a thread)
GC tracing within an actor has six steps:
1. All owned objects are marked unreachable
2. All unowned objects with foreign reference count (FRC) > 0 marked unreachable
3. Tracing from actor fields, mark reachable objects (owned or not)
4. All owned objects with local reference count (LRC) > 0 are marked reachable
5. Unreachable owned objects are collected
6. Decrement messages are sent for unowned objects that are unreachable, and their FRC is set to 0

The final point above has the subtlety that objects are always owned by the actor that created them, even if that actor no longer has a local reference to the object. Therefore it is possible for an actor to be Alive simply because it created an object that is still referenced by some other actor.

rhagenson · 2019-12-31T16:31:02Z

Per Sean's and my conversation above, where we agreed to temporarily hide the details of backpressure/muting/overloading/etc. and place them into their own cohesive unit, I have decided that the section just before Runtime Options (the last section in this Runtime chapter) should be on the backpressure system. This will currently place it after ASIO and Env.exitcode, but I foresee those sections perhaps moving earlier: the Quiescence section mentions ASIO, and Env.exitcode should be movable to an earlier position without causing issues (I have not written it yet, so time will tell).

rhagenson · 2020-02-17T20:47:04Z

Update: I created/organized all the sections previously discussed in this thread, so it is now (checked if fully written):

I feel like I can effectively write the Backpressure and Runtime Options sections. However I have the following questions for the other sections:

Env.exitcode

This is meant to cover Env.exitcode, of course, but I feel like I am missing the importance of having this be its own section. Does it need to be its own section or is the importance of it being in this chapter that the means to set exitcode is introduced in the tutorial?
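
For reference, a minimal sketch of what such a section might demonstrate, assuming the only goal is to show how a program sets a non-zero exit code. The argument handling and messages are made up for illustration.

```pony
actor Main
  new create(env: Env) =>
    try
      // env.args(0) is the program name; index 1 is the first user argument.
      let first = env.args(1)?
      env.out.print("got argument: " + first)
    else
      env.err.print("usage: program <argument>")
      // Record the code the process should exit with. This does not stop
      // the program; it still ends through quiescence.
      env.exitcode(1)
    end
```

The point worth stressing in the tutorial is probably the comment in the `else` branch: setting the exit code and terminating the program are separate things.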

ASIO Subsystem

I have not yet needed to sink myself into the ASIO subsystem, so I feel unprepared to write anything more than the very generic interplay between it and GC (which I have already written). For introducing the idea of asynchronous IO and its use in Pony, are there existing blog posts or videos on the topic? I read through the #runtime channel on Zulip and still feel I have not yet plucked out the important details.

rhagenson · 2020-06-24T02:40:26Z

Perhaps not a "Runtime Basic" but I had yet to mention that the default options of the runtime can be overwritten via the RuntimeOptions struct in builtin, see here. In brief, the --pony* CLI options can be set to different defaults via adding an FFI function to Main:

actor Main
  ...
  fun @runtime_override_defaults(rto: RuntimeOptions) =>
    ...
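
A filled-in sketch of the same idea, assuming the fields of RuntimeOptions are named after the corresponding --pony* flags (for example `ponynoscale` for --ponynoscale). Check the struct in builtin before relying on a field name.

```pony
actor Main
  new create(env: Env) =>
    env.out.print("started with overridden runtime defaults")

  // Changes the compiled-in defaults for the runtime options.
  fun @runtime_override_defaults(rto: RuntimeOptions) =>
    // Same effect as passing --ponynoscale: no scheduler thread scaling.
    rto.ponynoscale = true
    // Assumed alternative: rto.ponyminthreads = 1 would keep scaling on
    // but always leave at least one scheduler thread running.
```

As far as I understand it, options given on the command line still take precedence over values set this way, so this only changes what the program does when no flag is passed.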

rhagenson · 2020-09-21T23:18:12Z

For when work on this issue continues: as part of the concept of quiescence, it is important for the reader to understand that Main.create(...) can exit and the program will continue running if there are other actors exchanging messages and doing work.

This misunderstanding of the Pony runtime led to a conversation in Zulip on GC and actors in busy loops which appeared to draw the reasonable, but incorrect conclusion that since the program starts at Main.create(...) that it also exits when Main.create(...) exits.
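
A small program that could illustrate the point in the chapter. The `Counter` actor and its `count_down` behavior are hypothetical names chosen for this sketch.

```pony
actor Counter
  let _out: OutStream

  new create(out: OutStream) =>
    _out = out

  be count_down(n: U32) =>
    _out.print("counting: " + n.string())
    if n > 0 then
      // Sending itself another message keeps this actor scheduled.
      count_down(n - 1)
    end

actor Main
  new create(env: Env) =>
    // This only sends a message; it does not wait for the counting to finish.
    Counter(env.out).count_down(3)
    env.out.print("Main.create is done")
```

Main.create returns right after sending the first message, yet the program keeps running until Counter has no more messages to process. The position of the "Main.create is done" line relative to the counting lines is not guaranteed.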

SeanTAllen · 2020-09-22T18:37:18Z

@rhagenson are you working on this or should we be trying to find someone to work on it?

rhagenson · 2020-09-22T18:40:43Z

I am not actively working on it, but I do have a branch in my fork with existing work. If this is a priority, I have no issue with someone taking it over. Otherwise, I will get a PR open before the end of this year. Wish I could say sooner, but I am not able to promise sooner.

SeanTAllen · 2022-02-10T00:19:23Z

@rhagenson q... do you want to finish this off or should I take over?

rhagenson · 2022-02-10T01:31:44Z

@SeanTAllen As much as I would like to say I am on top of this, I am not. My existing work exists over on https://github.com/rhagenson/pony-tutorial/commits/runtime-chapter, but has not been updated since early 2020. Please take it over and save as much of my former work as you see fit.

SeanTAllen · 2022-02-10T15:00:34Z

I've grabbed ryan's commits from his fork and pushed to this repo on the branch runtime-basics

SeanTAllen mentioned this issue Oct 22, 2019

Add FAQ for "quiescence" ponylang/ponylang-website#502

Closed

EpicEric mentioned this issue Nov 19, 2019

Section about quiescence as it relates to program termination. #157

Closed

rhagenson self-assigned this Dec 23, 2019

ponylang-main added the "discuss during sync" label Feb 10, 2022

jemc removed the "discuss during sync" label Mar 8, 2022

jemc unassigned rhagenson Mar 8, 2022
