iamduo / workq
- Wednesday, August 24, 2016, 03:15:05
Go
Job server in Go.
Workq is a job scheduling server strictly focused on simplifying job processing and streamlining coordination. It can run jobs in blocking foreground or non-blocking background mode.
Workq runs as a standalone TCP server and implements a simple, text-based protocol. Clients interact with Workq over a TCP socket in a request/response model with text commands. Please refer to the full protocol doc for details.
Scheduling & Timers
Schedule ad hoc one-time jobs at a future UTC time. Workq can be used as a timer for any application event scheduling. Compare this to the usual alternative of building a scheduling schema into an existing datastore, writing a custom worker to poll the datastore, and finally building in locking and state management to handle dispatching.
In Workq, you schedule the job, lease the job, and complete the job.
Concurrency Control
Workq can enable concurrency, especially in languages that do not have built-in concurrency. Background multiple jobs, process them with multiple workers, and retrieve results from within a single process.
Streamlining & Persistent Processing
Workq will retry jobs from TTR (time-to-run) timeouts and/or from explicit job failures through the max-attempts and max-fails flags. Retry is a job execution specification and does not require any custom worker logic.
Distributing Work
Distribute work to multiple workers using Workq as the coordinator. In addition, Workq is language agnostic: submit a job from one language and process it in another.
Workq can also naturally store job execution results for later retrieval up to the job's TTL.
Workq is in alpha status and not yet stable. The protocol may still be subject to small changes. Full unit test coverage, system testing, and fuzz testing are used to ensure maximum coverage and safety throughout development. Workq is currently suitable for development experimentation and evaluation. Stabilization will happen once the test coverage is thorough enough and there are no major gaps in the protocol. There will be absolutely no reliance on manual testing where technically possible. Workq will slow bake until ready.
Client/Worker examples are shown using the Go client.
make server
# Builds into bin/workq-server
bin/workq-server
Starts a workq-server on port 9922.
import "github.com/iamduo/go-workq"
// ...
client, err := workq.Connect("localhost:9922")
if err != nil {
// ...
}
Add a background job. The result can be retrieved through the "result" command.
job := &workq.BgJob{
ID: "6ba7b810-9dad-11d1-80b4-00c04fd430c4",
Name: "ping",
TTL: 60000, // Expire after 60 seconds
TTR: 5000, // 5 second time-to-run limit
Payload: []byte("ping"),
Priority: 10, // @OPTIONAL Numeric priority, default 0.
MaxAttempts: 3, // @OPTIONAL Absolute max num of attempts.
MaxFails: 1, // @OPTIONAL Absolute max number of failures.
}
err := client.Add(job)
if err != nil {
// ...
}
Run a job and wait for its result.
job := &workq.FgJob{
ID: "6ba7b810-9dad-11d1-80b4-00c04fd430c4",
Name: "ping",
TTR: 5000, // 5 second time-to-run limit
Timeout: 60000, // Wait up to 60 seconds for a worker to pick up.
Payload: []byte("ping"),
Priority: 10, // @OPTIONAL Numeric priority, default 0.
}
result, err := client.Run(job)
if err != nil {
// ...
}
fmt.Printf("Success: %t, Result: %s", result.Success, result.Result)
Schedule a job at a UTC time. The result can be retrieved through the "result" command.
job := &workq.ScheduledJob{
ID: "6ba7b810-9dad-11d1-80b4-00c04fd430c4",
Name: "ping",
Time: "2016-12-01T00:00:00Z", // Start job at this UTC time.
TTL: 60000, // Expire after 60 seconds
TTR: 5000, // 5 second time-to-run limit
Payload: []byte("ping"),
Priority: 10, // @OPTIONAL Numeric priority, default 0.
MaxAttempts: 3, // @OPTIONAL Absolute max num of attempts.
MaxFails: 1, // @OPTIONAL Absolute max number of failures.
}
err := client.Schedule(job)
if err != nil {
// ...
}
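The `Time` value above is a UTC timestamp in the `2016-12-01T00:00:00Z` form, which is what Go's `time.RFC3339` layout produces for UTC times. The following is a minimal sketch for building such a string from a delay, assuming that this layout is what the server accepts (the protocol doc is the authority). `scheduleTime` and `exampleTime` are hypothetical helpers and are not part of the Go client.

```go
package main

import (
	"fmt"
	"time"
)

// scheduleTime is a hypothetical helper (not part of go-workq). It returns
// the instant "delay" after "now" as a UTC string such as
// "2016-12-01T00:00:00Z".
func scheduleTime(now time.Time, delay time.Duration) string {
	return now.Add(delay).UTC().Format(time.RFC3339)
}

// exampleTime uses a fixed clock so the result is reproducible.
func exampleTime() string {
	now := time.Date(2016, time.November, 30, 23, 0, 0, 0, time.UTC)
	return scheduleTime(now, time.Hour)
}

func main() {
	fmt.Println(exampleTime()) // 2016-12-01T00:00:00Z
}
```

Converting to UTC before formatting keeps the string independent of the local time zone of the submitting process.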
Lease the first job within a set of one or more job names with a wait-timeout (milliseconds).
// Lease the first available job in "ping1", "ping2", "ping3"
// waiting up to 60 seconds.
job, err := client.Lease([]string{"ping1", "ping2", "ping3"}, 60000)
if err != nil {
// ...
}
fmt.Printf("Leased Job: ID: %s, Name: %s, Payload: %s", job.ID, job.Name, job.Payload)
Mark a job successfully completed with a result.
err := client.Complete("6ba7b810-9dad-11d1-80b4-00c04fd430c4", []byte("pong"))
if err != nil {
// ...
}
Mark a job failed with a result.
err := client.Fail("6ba7b810-9dad-11d1-80b4-00c04fd430c4", []byte("failed"))
if err != nil {
// ...
}
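The lease, complete, and fail calls shown above combine naturally into a worker step: lease a job, process it, then report the outcome. The sketch below illustrates that flow against an in-memory stand-in so it runs without a server. `jobClient`, `leasedJob`, `processOne`, `pingHandler`, and `fakeClient` are all hypothetical names that only mirror how the calls are used in this README; the real Go client types may differ, so treat this as an outline rather than the client's API.

```go
package main

import (
	"errors"
	"fmt"
)

// leasedJob and jobClient are hypothetical types that mirror how the
// client calls are used in this README; the real go-workq types may differ.
type leasedJob struct {
	ID      string
	Name    string
	Payload []byte
}

type jobClient interface {
	Lease(names []string, timeoutMs int) (*leasedJob, error)
	Complete(id string, result []byte) error
	Fail(id string, result []byte) error
}

// processOne leases one job, runs handle on it, and reports the outcome.
func processOne(c jobClient, names []string, timeoutMs int,
	handle func(*leasedJob) ([]byte, error)) error {
	job, err := c.Lease(names, timeoutMs)
	if err != nil {
		return err // for example, the wait-timeout elapsed
	}
	result, err := handle(job)
	if err != nil {
		return c.Fail(job.ID, []byte(err.Error()))
	}
	return c.Complete(job.ID, result)
}

// pingHandler answers "ping" payloads with "pong".
func pingHandler(j *leasedJob) ([]byte, error) {
	if string(j.Payload) != "ping" {
		return nil, errors.New("unexpected payload")
	}
	return []byte("pong"), nil
}

// fakeClient is an in-memory stand-in so the sketch runs without a server.
type fakeClient struct {
	pending   []*leasedJob
	completed map[string]string
	failed    map[string]string
}

func newFakeClient(jobs ...*leasedJob) *fakeClient {
	return &fakeClient{pending: jobs, completed: map[string]string{}, failed: map[string]string{}}
}

func (f *fakeClient) Lease(names []string, timeoutMs int) (*leasedJob, error) {
	for i, j := range f.pending {
		for _, n := range names {
			if j.Name == n {
				f.pending = append(f.pending[:i], f.pending[i+1:]...)
				return j, nil
			}
		}
	}
	return nil, errors.New("no job available")
}

func (f *fakeClient) Complete(id string, result []byte) error {
	f.completed[id] = string(result)
	return nil
}

func (f *fakeClient) Fail(id string, result []byte) error {
	f.failed[id] = string(result)
	return nil
}

func main() {
	c := newFakeClient(&leasedJob{ID: "1", Name: "ping", Payload: []byte("ping")})
	err := processOne(c, []string{"ping"}, 60000, pingHandler)
	fmt.Println(err, c.completed["1"]) // <nil> pong
}
```

A long-running worker would call `processOne` in a loop. Reporting handler errors through the fail call, rather than letting the lease time out, keeps explicit failures distinct from TTR timeouts.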
Get a job result previously executed by Add or Schedule commands.
// Get a job result, waiting up to 60 seconds if the job is still executing.
result, err := client.Result("6ba7b810-9dad-11d1-80b4-00c04fd430c4", 60000)
if err != nil {
// ...
}
fmt.Printf("Success: %t, Result: %s", result.Success, result.Result)
Please see the full protocol doc and Go client docs for more details.
While there is no official client development guide yet, developing a Workq client is fairly straightforward simply by reading the protocol doc. In addition, you can review the Go client as a working example. If you have questions, please join our mailing list.
Workq can't and will not do everything you want. Some things to keep in mind:
Priorities are numeric, from 0 (default and lowest) to 4,294,967,295 (highest, 2^32 - 1), and control the order of job execution within a single job queue. Higher priorities are executed first.
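To illustrate the ordering rule, here is a small sketch that models one job queue as a max-heap keyed on a `uint32` priority, using Go's `container/heap`. This only demonstrates "higher priorities are executed first"; it is not Workq's implementation, and `queuedJob`, `jobQueue`, and `leaseOrder` are hypothetical names. How jobs with equal priority are ordered is not covered by this sketch.

```go
package main

import (
	"container/heap"
	"fmt"
	"math"
)

// queuedJob is a hypothetical stand-in for a job waiting in one named queue.
type queuedJob struct {
	ID       string
	Priority uint32 // 0 (default, lowest) up to math.MaxUint32 (highest)
}

// jobQueue implements heap.Interface so that heap.Pop returns the
// highest-priority job first.
type jobQueue []queuedJob

func (q jobQueue) Len() int            { return len(q) }
func (q jobQueue) Less(i, j int) bool  { return q[i].Priority > q[j].Priority }
func (q jobQueue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *jobQueue) Push(x interface{}) { *q = append(*q, x.(queuedJob)) }
func (q *jobQueue) Pop() interface{} {
	old := *q
	n := len(old)
	item := old[n-1]
	*q = old[:n-1]
	return item
}

// leaseOrder drains the queue and returns job IDs in the order they
// would be handed out under the "higher priority first" rule.
func leaseOrder(jobs []queuedJob) []string {
	q := &jobQueue{}
	for _, j := range jobs {
		heap.Push(q, j)
	}
	order := make([]string, 0, len(jobs))
	for q.Len() > 0 {
		order = append(order, heap.Pop(q).(queuedJob).ID)
	}
	return order
}

func main() {
	jobs := []queuedJob{
		{ID: "default", Priority: 0},
		{ID: "urgent", Priority: math.MaxUint32}, // 4,294,967,295
		{ID: "normal", Priority: 10},
	}
	fmt.Println(leaseOrder(jobs)) // [urgent normal default]
}
```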
TTR stands for time-to-run and is the maximum time a job can run for after being leased before it is requeued. This value is in milliseconds.
TTL stands for time-to-live and is the expiration time before a job is deleted, regardless of state. This value is in milliseconds.
A worker receives work by leasing jobs by their name. A lease is bound by the job's TTR.
Max Attempts is used in the add and schedule commands to specify the number of times a job can be attempted before being marked as failed. An attempt is counted any time a job is leased, regardless of the outcome. This limit primarily guards against a job repeatedly timing out due to TTR.
Max Fails is the maximum number of explicit job failures. Explicit failures do not come from timeouts; they come from workers explicitly reporting a result with the fail command. Use this when you want a job to be retried on "true" worker failures and not on timeouts from TTR.
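The difference between the two counters can be made concrete with a small model: every lease increments the attempt count, while only an explicit fail increments the failure count. The sketch below assumes that a job stops being retried as soon as either counter reaches its limit, and that both limits are set; the exact boundary and the behavior of unset limits are defined by the protocol doc, not by this sketch. `retryState`, `outcome`, and `record` are hypothetical names.

```go
package main

import "fmt"

type outcome int

const (
	timedOut  outcome = iota // lease expired after TTR with no result
	failed                   // worker reported a result with the fail command
	completed                // worker reported a result with the complete command
)

// retryState is a hypothetical model of the counters described above.
type retryState struct {
	Attempts, Fails       int
	MaxAttempts, MaxFails int
}

// record counts one lease and its outcome and reports whether the job
// would be queued again. ASSUMPTION: retrying stops as soon as either
// counter reaches its limit.
func (s *retryState) record(o outcome) (retry bool) {
	s.Attempts++ // every lease is an attempt, whatever the outcome
	switch o {
	case completed:
		return false
	case failed:
		s.Fails++ // only explicit failures count here, not TTR timeouts
	}
	return s.Attempts < s.MaxAttempts && s.Fails < s.MaxFails
}

func main() {
	s := &retryState{MaxAttempts: 3, MaxFails: 2}
	fmt.Println(s.record(timedOut)) // true: 1 attempt, 0 fails
	fmt.Println(s.record(failed))   // true: 2 attempts, 1 fail
	fmt.Println(s.record(timedOut)) // false: 3 attempts reached MaxAttempts
}
```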
The project maintainer has the last word on whether or not a contribution is suitable for Workq. All contributions will be considered carefully, but from time to time, contributions will be rejected because they do not suit the current goals or needs of the project.
If your contribution is rejected, don't despair! As long as you followed these guidelines, you will have a much better chance of getting your next contribution accepted.
Bug reports are hugely important! Before you raise one, though, please check through the GitHub issues, both open and closed, to confirm that the bug hasn't been reported before. Duplicate bug reports are a huge drain on the time of other contributors, and should be avoided as much as possible.
Workq is intended to have a small API to solve a very specific set of use cases. Maintaining a small API allows Workq to be maintainable in the future and understandable.
One of the most important skills to have while maintaining an open source project is learning the ability to say "no" to suggested changes, while keeping an open ear and mind.
Admittedly Workq is still in its early stages. If you believe there is a feature missing, feel free to raise a feature request, but please do be aware that not all requests are suitable.
If enough users find value in Workq, I'll attempt to set up a strategy to sustain it for the long term. If you find value in Workq for your company, please say hello on our mailing list.
Workq is inspired by prior works. The following projects have shaped Workq's feature set:
Special thanks to Theofanis M3 Industries, the initial Workq Supporter, which provided me the hardware to build Workq.