MPIP: built-in Job Scheduler support in MLSQL Stack #1045

Closed
allwefantasy opened this issue Apr 23, 2019 · 0 comments

allwefantasy commented Apr 23, 2019

Background

MLSQL Engine provides REST APIs to run a job script. This makes it easy to integrate with an external job scheduler: the scheduler only needs to send a POST request.
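
As a rough illustration of the scheduler side of such an integration, the sketch below builds (but does not send) a POST request that carries a script. The endpoint path `/run/script`, the form field `sql`, and the host and port are assumptions made for this example; check them against the Engine's actual REST API before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_run_script_request(engine_url: str, script: str) -> Request:
    """Build a POST request that submits `script` to an engine.

    The "/run/script" path and the "sql" form field are assumptions,
    not confirmed parts of the MLSQL Engine API.
    """
    body = urlencode({"sql": script}).encode("utf-8")
    return Request(
        url=engine_url.rstrip("/") + "/run/script",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical engine address and a trivial script.
req = build_run_script_request("http://127.0.0.1:9003", "select 1 as a as t;")
# An external scheduler would send it with urllib.request.urlopen(req).
```

Building the request separately from sending it keeps the example free of network side effects and makes the payload easy to inspect.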

However, a built-in job scheduler would make it easier to use MLSQL Stack in production.

The key questions are where to put the scheduler and how to use it.

Where to put?

Console -> Cluster -> Engine 1
                   -> Engine 2

The best place is Cluster. We also want the scheduler to be usable in an MLSQL-native style.

You can use it like this:

!crontab */5 * * * * "/project/dir1/dir2/a.mlsql";

Or, going one step further, let a script schedule itself:

!crontab */5 * * * * self;

-- your script content

select * from hive1 as hiveTable2;
save......

This way the script itself declares how it should be executed.
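
To make the proposed command shape concrete, here is a minimal sketch of how a `!crontab` line could be split into its cron expression and its target. The parser, the `CrontabCommand` type, and the assumption that the command always has five cron fields followed by a quoted path or the word `self` are illustrative only; the proposal does not specify a grammar.

```python
import re
from typing import NamedTuple


class CrontabCommand(NamedTuple):
    cron: str    # five-field cron expression, e.g. "*/5 * * * *"
    target: str  # script path, or "self" for the enclosing script


# Five whitespace-separated cron fields, then a quoted path or the word self.
_CRONTAB = re.compile(
    r'^!crontab\s+((?:\S+\s+){4}\S+)\s+(?:"([^"]+)"|(self))\s*;$'
)


def parse_crontab(line: str) -> CrontabCommand:
    """Parse one `!crontab` command line (hypothetical grammar)."""
    match = _CRONTAB.match(line.strip())
    if match is None:
        raise ValueError(f"not a !crontab command: {line!r}")
    cron, path, self_keyword = match.groups()
    # Normalise runs of whitespace between cron fields to single spaces.
    return CrontabCommand(" ".join(cron.split()), path or self_keyword)


by_path = parse_crontab('!crontab */5 * * * * "/project/dir1/dir2/a.mlsql";')
by_self = parse_crontab("!crontab */5 * * * * self;")
```

With `self` as the target, whoever receives the command still has to know which script the line came from, so a real implementation would need to pass that path along with the command.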

How should dependencies be configured? The first way looks like this:

set a_script = "/project/dir1/dir2/a.mlsql";
set b_script = "/project/dir1/dir2/b.mlsql";
set c_script = "/project/dir1/dir2/c.mlsql";

!build a_script depends on c_script;
!build b_script depends on c_script;

-- trigger c with crontab.
!crontab */5 * * * * "${c_script}";

The second way is to use it like this:

!build self depends on c_script;
-- your script content
select * from hive1 as hiveTable2;
save......

The system should then build the dependency graph by scanning all scripts.
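
One possible shape for that scan is sketched below: collect every `!build X depends on Y;` line, resolve `self` to the script being scanned, and derive a run order with a topological sort. The function names and the in-memory `scripts` mapping are hypothetical, and resolving `set` variables to real paths is deliberately left out.

```python
import re
from collections import defaultdict
from graphlib import TopologicalSorter  # Python 3.9+

# Matches lines such as: !build a_script depends on c_script;
_BUILD = re.compile(
    r"^!build\s+(\S+)\s+depends\s+on\s+(\S+)\s*;\s*$", re.MULTILINE
)


def scan_dependencies(scripts: dict[str, str]) -> dict[str, set[str]]:
    """Map each script name to the set of scripts it depends on.

    `scripts` maps a script name to its text; "self" resolves to that name.
    """
    graph: dict[str, set[str]] = defaultdict(set)
    for name, text in scripts.items():
        graph[name]  # register scripts that declare no dependencies
        for dependent, dependency in _BUILD.findall(text):
            graph[name if dependent == "self" else dependent].add(dependency)
    return dict(graph)


def run_order(graph: dict[str, set[str]]) -> list[str]:
    """Return the scripts with every dependency before its dependents."""
    return list(TopologicalSorter(graph).static_order())


scripts = {
    "a_script": "!build self depends on c_script;\nselect * from hive1 as hiveTable2;",
    "b_script": "!build self depends on c_script;\nselect * from hive1 as hiveTable2;",
    "c_script": "select * from hive1 as hiveTable2;",
}
graph = scan_dependencies(scripts)
order = run_order(graph)
```

`TopologicalSorter` raises `graphlib.CycleError` when scripts depend on each other in a loop, which gives the scheduler a natural place to reject an invalid configuration.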

How to run?

When the !crontab command is executed, the MLSQL Engine forwards it to Cluster, and
the scheduler in Cluster records it and schedules the job as required.
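
The record-then-schedule step could be sketched as follows. This is an in-memory stand-in, not the Cluster implementation: the `Scheduler` class and its methods are hypothetical, and the cron matcher supports only `*`, `*/n`, and plain numbers.

```python
from datetime import datetime

# Lowest legal value of each cron field, in order:
# minute, hour, day of month, month, day of week.
_FIELD_START = (0, 0, 1, 1, 0)


def _field_matches(field: str, value: int, start: int) -> bool:
    """Match one cron field; supports "*", "*/n" and a plain number only."""
    if field == "*":
        return True
    if field.startswith("*/"):
        # Steps are counted from the start of the field's range.
        return (value - start) % int(field[2:]) == 0
    return value == int(field)


def cron_matches(cron: str, when: datetime) -> bool:
    """Simplified check: every field must match (no lists or ranges)."""
    fields = cron.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")
    # Cron counts Sunday as 0, while datetime.weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    values = (when.minute, when.hour, when.day, when.month, cron_weekday)
    return all(
        _field_matches(f, v, s) for f, v, s in zip(fields, values, _FIELD_START)
    )


class Scheduler:
    """Hypothetical in-memory stand-in for the scheduler in Cluster."""

    def __init__(self) -> None:
        self.jobs: list[tuple[str, str]] = []

    def record(self, cron: str, script: str) -> None:
        """Remember a forwarded !crontab command."""
        self.jobs.append((cron, script))

    def due(self, when: datetime) -> list[str]:
        """Return the scripts whose cron expression matches `when`."""
        return [script for cron, script in self.jobs if cron_matches(cron, when)]


scheduler = Scheduler()
scheduler.record("*/5 * * * *", "/project/dir1/dir2/c.mlsql")
due_now = scheduler.due(datetime(2019, 4, 23, 10, 5))
due_later = scheduler.due(datetime(2019, 4, 23, 10, 7))
```

A production scheduler would persist the recorded jobs and would most likely rely on an existing cron library rather than a hand-written matcher; the sketch only shows the division of labour between recording and triggering.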

allwefantasy changed the title from "MPIP: build-in Job Scheduler support in MLSQL Stack" to "MPIP: built-in Job Scheduler support in MLSQL Stack" on Apr 23, 2019