I am working on scheduler-like code (in PHP, if that matters) and ran into an interesting problem: it's easy to reschedule a recurring task, but what if, for some reason, it ran significantly later than it was supposed to?
For example, let's say a job needs to run every hour and its next scheduled run is 13.05.2021 18:00, but it actually runs at 13.05.2021 20:00. The normal rescheduling logic would take the original scheduled time and add the recurrence frequency (1 hour in this case), but that would make the new time 13.05.2021 19:00, which can cause the job to run twice. We could, theoretically, use the time of the "last run", but it can be something like 13.05.2021 20:03, which would make the new time 13.05.2021 21:03.
Now my question is: what logic can we use so that, in this case, the next run time would be 13.05.2021 21:00? I've tried googling for something like this but was not able to find anything. I do see that the Event Scheduler in Windows, for example, reschedules jobs the way I want.
I actually found a pretty easy way to do what I needed, so posting it as an answer.
If we have a value of frequency in seconds (in my case, at least) and we have the original nextrun, which is when the task was supposed to run initially, then the logic is as follows:
1. Get the current time (time(), UTC_TIMESTAMP() or whatever).
2. Compare it with nextrun and get the difference between them in seconds.
3. Divide the difference by frequency.
4. Round the result up (ceil()). If we get a value lower than 1, we may want to sanitize it.
5. Multiply the result by frequency, which will give us a different result than in step 2, which is the salt of this method.
6. Add the result to nextrun.

And that's it. This does not guarantee that a task will never run twice, if it ended just a few seconds before the time value from step 6, but to my knowledge MS Event Scheduler has the same "flaw".
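The steps above can be sketched in PHP roughly as follows. This is a minimal illustration, not the code actually used in the answer: the function name rescheduleNextRun() and its parameters are made up for the example, and all times are assumed to be Unix timestamps in seconds.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: move $nextrun forward by whole multiples of $frequency.
 * $nextrun and $now are Unix timestamps, $frequency is in seconds.
 */
function rescheduleNextRun(int $nextrun, int $frequency, int $now): int
{
    if ($frequency <= 0) {
        throw new InvalidArgumentException('frequency must be a positive number of seconds');
    }

    // Steps 2-4: difference in seconds, divided by frequency, rounded up.
    $periods = (int) ceil(($now - $nextrun) / $frequency);

    // Sanitize: always advance by at least one period.
    $periods = max(1, $periods);

    // Steps 5-6: multiply back by frequency and add to the original nextrun.
    return $nextrun + $periods * $frequency;
}

// Example from the question: scheduled for 18:00, actually finished at 20:03.
$utc     = new DateTimeZone('UTC');
$nextrun = (new DateTimeImmutable('2021-05-13 18:00:00', $utc))->getTimestamp();
$now     = (new DateTimeImmutable('2021-05-13 20:03:00', $utc))->getTimestamp();

$new = rescheduleNextRun($nextrun, 3600, $now);
echo gmdate('d.m.Y H:i', $new), PHP_EOL; // 13.05.2021 21:00
```

Here the difference is 7380 seconds, which is 2.05 periods, rounded up to 3, so the new time is 18:00 plus 3 hours.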
Since I am doing this calculation in SQL, here's how this would look in SQL (at least for MySQL/MariaDB):
UPDATE `cron__schedule`
SET `nextrun` = TIMESTAMPADD(
    SECOND,
    IF(
        CEIL(TIMESTAMPDIFF(SECOND, `nextrun`, UTC_TIMESTAMP()) / `frequency`) > 0,
        CEIL(TIMESTAMPDIFF(SECOND, `nextrun`, UTC_TIMESTAMP()) / `frequency`),
        1
    ) * `frequency`,
    `nextrun`
)
To explain by referencing the logic above:
1. UTC_TIMESTAMP()
2. TIMESTAMPDIFF(SECOND, `nextrun`, UTC_TIMESTAMP()) - time comparison in seconds.
3. TIMESTAMPDIFF(...)/`frequency`
4. CEIL(...) to round up the value. IF(...) is used to sanitize, since we can get 0 seconds, which would result in the time not changing at all.
5. CEIL(...)*`frequency`
6. TIMESTAMPADD(...)
I do not like having to use TIMESTAMPDIFF(...) twice because of IF(...), but I do not know a way to avoid that without moving to a stored procedure, which feels like overkill. Besides, as far as I know, MySQL should calculate this value only once regardless. But if someone can suggest a cleaner approach, I'll update the answer.
There isn't a right or wrong answer in this situation; it really depends on your business logic and how you want to build this.
WordPress and Drupal, two of the largest CMSs out there, have faced this problem, too, and it boils down to "poor man's cron" versus "system cron". With a "poor man's cron", these systems rely on someone hitting the website in order to "wake" the scheduler up, and if no one visits your site in a month, your tasks don't run, either. Both of these systems instead recommend using the system's cron to be more consistent and "wake up" the scheduler at certain intervals. I would encourage you to explore this in your system, too.
The next problem is, how are you storing your recurrence? Do you have (effectively) a table with every possible run time, so that for an hourly run there are 24 entries? Or is there just a single task that has an ideal run date/time? The latter is generally easier to control than the former, which stores a lot of duplicated data.
Then, do tasks reschedule themselves, does the scheduler do that, or is there a middle ground where the scheduler asks the task for the next best run? Figuring this out is very important, and there are some nuances.
Another thing to think about: what happens if a task runs earlier than planned? For instance, does the world break if a task runs at 01:00 and 01:15, or is it just sub-optimal?
Generally, when I build these types of systems, my tasks conform to a pattern (an interface in OOP) and support a "next run time". The scheduler pulls all of the tasks that have an expired "next run time" from a data store and runs them. Done this way, there's no chance for a single task to exist at both 01:00 and 02:00, because it only exists in the data store once, for instance at 01:00. If the scheduler then wakes up at 01:15, it finds the 01:00 task, which has expired, runs it, and then asks the task for its next run. The task looks at the clock (or the time provided by the scheduler, if you are running in a distributed environment) and performs its own logic to determine that. If the logic is "every hour", you can add 60 minutes to "now" and then remove the minutes portion, so 01:15 becomes 02:00.
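As a rough sketch of that pattern in PHP, assuming a hypothetical ScheduledTask interface and HourlyTask class (neither name comes from a real library), the "add 60 minutes, then drop the minutes" rule could look like this:

```php
<?php
declare(strict_types=1);

// Hypothetical interface: each task knows how to compute its own next run.
interface ScheduledTask
{
    public function run(): void;

    public function nextRunAfter(DateTimeImmutable $now): DateTimeImmutable;
}

final class HourlyTask implements ScheduledTask
{
    public function run(): void
    {
        // The real work would go here.
    }

    public function nextRunAfter(DateTimeImmutable $now): DateTimeImmutable
    {
        // Add 60 minutes to "now", then drop the minutes and seconds.
        $next = $now->add(new DateInterval('PT1H'));

        return $next->setTime((int) $next->format('G'), 0, 0);
    }
}

// The scheduler woke up at 01:15 and ran the task that was due at 01:00.
$now  = new DateTimeImmutable('2021-05-13 01:15:00', new DateTimeZone('UTC'));
$next = (new HourlyTask())->nextRunAfter($now);

echo $next->format('H:i'), PHP_EOL; // 02:00
```

Passing the time in as an argument, rather than reading the clock inside the task, is what lets a scheduler in a distributed setup supply its own notion of "now".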
Throw some exception handling and possibly database transactions into this mix to guarantee that a task can't fail but still get rescheduled, too.
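One possible shape for that, as a sketch only: the runDueJobs() function and the in-memory $jobs array below are invented for illustration and stand in for a real data store. With a database, the reschedule step would typically sit inside a transaction (for example PDO's beginTransaction(), commit() and rollBack()).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical scheduler pass: run every due job and reschedule it
 * only if it finished without throwing.
 * Each entry in $jobs has 'nextrun' (Unix timestamp), 'frequency' (seconds)
 * and 'callback' (the work to do).
 */
function runDueJobs(array $jobs, int $now): array
{
    foreach ($jobs as $name => $job) {
        if ($job['nextrun'] > $now) {
            continue; // Not due yet.
        }

        try {
            ($job['callback'])();
        } catch (Throwable $e) {
            // Failed: leave nextrun untouched so the job is picked up again.
            error_log("Job {$name} failed: " . $e->getMessage());
            continue;
        }

        // Succeeded: advance by whole periods, skipping the missed slots.
        $periods = max(1, (int) ceil(($now - $job['nextrun']) / $job['frequency']));
        $jobs[$name]['nextrun'] = $job['nextrun'] + $periods * $job['frequency'];
    }

    return $jobs;
}

$jobs = [
    'ok'     => ['nextrun' => 1000, 'frequency' => 3600, 'callback' => function (): void {
    }],
    'broken' => ['nextrun' => 1000, 'frequency' => 3600, 'callback' => function (): void {
        throw new RuntimeException('boom');
    }],
];

// Both jobs are 7380 seconds overdue; only the successful one moves forward.
$jobs = runDueJobs($jobs, 1000 + 7380);
```

After this pass the 'ok' job has been moved three periods ahead, while the 'broken' job keeps its original nextrun and will be retried on the next pass.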