Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Are command GUID really mandatory? #75

Open
wischi-chr opened this issue Jan 17, 2020 · 5 comments
Open

Are command GUID really mandatory? #75

wischi-chr opened this issue Jan 17, 2020 · 5 comments
Labels
documentation documentation needed, non existant or should be improved question

Comments

@wischi-chr
Copy link

wischi-chr commented Jan 17, 2020

Intro (skip it if you are in a hurry ^^ )
A few weeks ago I started a new project and keeping all data in memory and just persist a log of events is a perfect match for the requirements. I started implementing my own solution (protobuf on filesytem to persist the journal, events and an in memory projection of the data).

A few days ago I found this project and it looks like someone took my prototype and put a bit more thought and effort into it so I'm consider switching to memstate, because why maintain some quirky homebew version of it.

Reading through the documentation I noticed a few differences and don't quite understand the design choice with the command ids (GUID)

Question about command id
Are those really necessary? Couldn't just the object reference (because we are in memory) be used to track if an command was already executed or not. Persisted commands do have an implicit integer id: the position in the journal.

If I read the code right, the GUIDs are also serialized and thus every command has at least 16 byte (uncompressable, because they are random) overhead that is not needed, because it will never be used again after it's serialized.

Would be great I you could let me know if I got something wrong or what lead to this design choices. Thanks in advance

@goblinfactory
Copy link
Collaborator

+1 for this question. I'd be keen to either spike out some tests, explore whether the GUID can be made optional. Keen to hear the official reasoning. I'll happily volunteer to update the docs with any replies from @rofr. I'm now super keen to dive into more documentation now that I discovered Simon Copp's snippets tool to reference unit tests inside readme's, so that your code is gauranteed to always be correct! booya... sorry I digress, yes, good question :D

@goblinfactory goblinfactory added documentation documentation needed, non existant or should be improved question labels Jan 17, 2020
@rofr
Copy link
Member

rofr commented Jan 17, 2020

Great question. Each node sends its commands to the underlying storage and never executes them. Execution happens off the stream of commands coming back from the underlying storage. This is to ensure a global order of commands across multiple nodes. The commands received do not have the same object identity becuase they are constructed from a network stream.

The main point of the GUID is to correlate the command sent with the command received so that we can respond to the client which issued the command. But as you point out there is no value in persisting the GUID per se. It only needs to travel back to the engine from which it originated.

@wischi-chr
Copy link
Author

Oh I get it. Maybe the GUID could be moved to a wrapper type with a nullable GUID and the storage provider makes sure that the command that was written is returned with the correct guid or if the command is replayed from storage just use null (because the GUIDs are not serialized).

While digging through the code to get some ideas how this could be solved I found a few things, I'm not sure if they are intended or not and want to ask if you like me to collect them in a single issue or if I should open a separate issue for each.

A few examples:

  • Whitespace issues
  • Unnecessary usings
  • JournalRecord has public fields (CA1051), should probably be changed to properties
  • The Command class has an internal abstract method, but is itself public. This has the effect, that the class looks like it could be derived outside the Memstate namespace but can't be implemented because the abstract method ExecuteImpl can't be implemented. Solutions would be to either make the abstract method "protected internal" or internalize the class itself
  • inactive/commented out code

@rofr
Copy link
Member

rofr commented Jan 17, 2020

Maybe the GUID could be moved to a wrapper type with a nullable GUID and the storage provider makes sure that the command that was written is returned with the correct guid or if the command is replayed from storage just use null (because the GUIDs are not serialized).

That would require a significant redesign. Writing is a fire and forget operation. We would need to do a request/response for the write operation, then merge sort on the fly with incoming committed commands based on their sequence number in the stream. Perhaps that would be a better design! Let us contemplate :)

@rofr
Copy link
Member

rofr commented Jan 17, 2020

I haven't been very strict on tidyness, mostly trying to get the overall design right before adding polish. Feel free to submit PRs or create issues as you see fit. I recently switched to jetbrains rider as my main IDE and the code style recommendations differ a bit from VS2019.

IMO public readonly fields are cleaner/clearer than properties, so any such fields are intentionally public.

The Command class shouldn't be inherited directly, use the generic Command and Command<TModel, TResult>. So the Command class could very well be internal. On the other hand users might find it useful to refer to commands in a non-generic fashion, for example reading/writing from/to storage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation documentation needed, non existant or should be improved question
Projects
None yet
Development

No branches or pull requests

3 participants