Launch separate TaskTracker instances for Map and Reduce slots #47

Open
tarnfeld opened this issue Apr 11, 2015 · 0 comments

With the recent enhancements that free up some resources when a TaskTracker becomes idle, Hadoop is a little less greedy about holding onto cluster resources it isn't actually using. However, because this depends on the whole TaskTracker being idle, we rarely get the chance to free resources when a TT has mixed slots (both map and reduce): it only counts as idle once both kinds of slot are unused.

We should launch separate TTs for map and reduce slots. To do this effectively, we probably want to pack as many map or reduce slots onto each node as possible, as opposed to the current logic, which applies the map/reduce slot ratio to each incoming offer. Take the following example:


1 Slot = 1 CPU and 1GB RAM

Offers:

  • Slave(1): 10 CPUs, 10GB RAM
  • Slave(2): 10 CPUs, 10GB RAM
  • Slave(3): 10 CPUs, 10GB RAM

Pending tasks:

  • 1000 Map
  • 100 Reduce

Current result:

  • Slave(1) -> TaskTracker(9 Map, 1 Reduce)
  • Slave(2) -> TaskTracker(9 Map, 1 Reduce)
  • Slave(3) -> TaskTracker(9 Map, 1 Reduce)

Ideal result:

  • Slave(1) -> TaskTracker(10 Map)
  • Slave(2) -> TaskTracker(10 Map)
  • Slave(3) -> TaskTracker(7 Map)
  • Slave(3) -> TaskTracker(3 Reduce)