This repository was archived by the owner on Sep 1, 2022. It is now read-only.
Architecture overview
Konstantin edited this page Jul 26, 2014 · 6 revisions
The system involves the following entities:
- Client
- MetaScheduler
- Executor
A Client needs to compute something. It creates a JobDescriptor and sends it to the MetaScheduler, which replies with a TrackingID.
The Client can then ask the MetaScheduler for a task's status or request its termination.
Depending on its internal state, an Executor asks the MetaScheduler for new tasks. The MetaScheduler can split the Client's original JobDescriptor into several pieces and return only some of them to the Executor. The Executor must report a TaskID back to the MetaScheduler so that the MetaScheduler can later determine the task's status or terminate the task.
+------+   +-------------+   +--------+   +--------+
|Client|   |MetaScheduler|   |Executor|   |Executor|
+--+---+   +------+------+   +----+---+   +----+---+
   |              |               |            |
   |JobDescriptor |               |            |
   +------------> |               |            |
   |              |               |            |
   |   TrackingID |               |            |
   | <------------+               |            |
   |              |               |            |
   |              |               |            |
   |              |     I'm ready |            |
   |              | <-------------+            |
   |              |               |            |
   |              |JobDescriptor1 |            |
   |              +-------------> |            |
   |              |               |            |
   |              |       TaskID1 |            |
   |              | <-------------+            |
   |              |               |            |
   |              |          I'm ready         |
   |              | <-------------+------------+
   |              |               |            |
   |              |       JobDescriptor2       |
   |              +---------------+----------> |
   |              |               |            |
   |              |          TaskID2           |
   |              | <-------------+------------+
   |              |               |            |
   +              +               +            +
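The message flow in the diagram can be sketched in a few lines of Python. This is only an in-process illustration under several assumptions: the field `items`, the fixed-size splitting policy, the ID formats, and the method names `submit`, `ready`, `register`, and `poll` are all hypothetical and do not come from the project itself.

```python
import itertools
from collections import deque
from dataclasses import dataclass


@dataclass
class JobDescriptor:
    # Hypothetical shape: this page does not define the fields of a JobDescriptor.
    items: list


class MetaScheduler:
    def __init__(self, chunk_size=2):
        self.chunk_size = chunk_size  # assumed splitting policy: fixed-size chunks
        self.pending = deque()        # (tracking_id, piece) pairs not yet handed out
        self.tasks = {}               # tracking_id -> TaskIDs reported by executors
        self._ids = itertools.count(1)

    def submit(self, job):
        """Client -> MetaScheduler: accept a job, split it, return a TrackingID."""
        tracking_id = f"track-{next(self._ids)}"
        self.tasks[tracking_id] = []
        for i in range(0, len(job.items), self.chunk_size):
            piece = JobDescriptor(job.items[i:i + self.chunk_size])
            self.pending.append((tracking_id, piece))
        return tracking_id

    def ready(self):
        """Executor -> MetaScheduler ("I'm ready"): hand out one piece, or None."""
        return self.pending.popleft() if self.pending else None

    def register(self, tracking_id, task_id):
        """Executor -> MetaScheduler: record the TaskID of a started piece."""
        self.tasks[tracking_id].append(task_id)


class Executor:
    def __init__(self, name):
        self.name = name
        self._ids = itertools.count(1)

    def poll(self, scheduler):
        """Ask for work and report the TaskID back. Returns the TaskID or None."""
        offer = scheduler.ready()
        if offer is None:
            return None
        tracking_id, piece = offer
        # A real backend would launch `piece` here; the sketch only names the task.
        task_id = f"{self.name}-task-{next(self._ids)}"
        scheduler.register(tracking_id, task_id)
        return task_id


scheduler = MetaScheduler(chunk_size=2)
tracking_id = scheduler.submit(JobDescriptor(items=[1, 2, 3, 4]))
first = Executor("a").poll(scheduler)
second = Executor("b").poll(scheduler)
print(tracking_id, scheduler.tasks[tracking_id])
```

Here the job is split into two pieces, and each of the two executors pulls one piece and reports its own TaskID, mirroring the JobDescriptor1/TaskID1 and JobDescriptor2/TaskID2 exchanges in the diagram.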
This flow has the following advantages:
- The MetaScheduler does not need to know each Executor's computational load. An Executor that asks for a job gets one; if it does not ask, the MetaScheduler does not care.
- An Executor can have many implementations (YARN, EC2, Cocaine).
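Both points can be illustrated with a small Python sketch, assuming a hypothetical `Executor` base class: load is checked locally in `has_capacity`, and each backend only has to supply `has_capacity` and `launch`. The class names, method names, and the callables standing in for the MetaScheduler connection are illustrative assumptions, and `LocalExecutor` is a toy backend rather than a YARN, EC2, or Cocaine integration.

```python
from abc import ABC, abstractmethod


class Executor(ABC):
    """Backend-agnostic side of the pull loop (hypothetical interface)."""

    @abstractmethod
    def has_capacity(self):
        """Return True if this backend can take one more task right now."""

    @abstractmethod
    def launch(self, job_descriptor):
        """Start the work on the backend and return a TaskID."""

    def step(self, ask_for_job, report_task_id):
        # Load is checked locally; the MetaScheduler is contacted only when there is room.
        if not self.has_capacity():
            return None
        job_descriptor = ask_for_job()
        if job_descriptor is None:
            return None
        task_id = self.launch(job_descriptor)
        report_task_id(task_id)
        return task_id


class LocalExecutor(Executor):
    """Toy backend that only records what it was given."""

    def __init__(self, slots):
        self.slots = slots
        self.running = {}

    def has_capacity(self):
        return len(self.running) < self.slots

    def launch(self, job_descriptor):
        task_id = f"local-{len(self.running) + 1}"
        self.running[task_id] = job_descriptor
        return task_id


queue = ["piece-1", "piece-2", "piece-3"]  # stand-in for the MetaScheduler's pending pieces
reported = []                              # TaskIDs the MetaScheduler would have received


def ask_for_job():
    return queue.pop(0) if queue else None


executor = LocalExecutor(slots=2)
results = [executor.step(ask_for_job, reported.append) for _ in range(3)]
```

With two slots, the third call to `step` returns without contacting the scheduler at all, so the last piece stays queued for whichever executor asks next.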
How IPython Cluster can communicate with this architecture remains to be investigated. I hope that it will work on top of the JobDescriptor protocol.