The ingest framework currently supports remote job scheduling (through direct JobTracker interaction). This breaks when running MR on YARN. (And YARN is where we want to go.)
Options I can see are:
- Drop support for MR1 (don't feel comfortable doing this quite yet, though can discuss)
- Maven munge/conditional compilation (might be the best option; I think it's tenable only because there's a sunset in sight, namely eventually dropping MR1 support)
- Moving over to Oozie for job scheduling. Adds another dependency on the cluster, but it's pretty minimal and common. I think this might be the right way to go. (I'm not stuck on Oozie particularly - that's just the only "answer" I'm aware of).
My vote is for the Oozie option (assuming I'm not missing something obvious).