GSOC 2016 #1: with_timeout
June 09, 2016 · 3 mins to read
Preamble
The end of May and the beginning of June is a tough time for me. I’ve passed the final exam for my Master’s degree and, finally, I am an M.S. in Computer Science. Now I’m preparing for the Ph.D. exams. Meanwhile, I’m working on Splash this summer, and I’ve got something to tell you.
with_timeout
The first thing I have to do is the splash:with_timeout API. It wraps any function and lets it run for only a specified amount of time in Splash Lua scripts.
Originally, I was going to add timeout functionality to only one API, splash:go, but after discussing it with my mentor, we decided to create a more general API.
Implementation
There are two possible ways to implement this API:
- Using Lua.
- Using Python.
Both have their advantages and disadvantages. The Lua implementation is simpler. It needs only the existing splash:call_later and splash:wait APIs, plus some kind of polling (an infinite loop) to check whether the running callback has finished its work. Polling isn’t the best solution, because it makes the CPU do a lot of unnecessary work.
On the other hand, the Python implementation knows when the callback has finished its execution and can notify the main event loop. It’s also more flexible and configurable. So we decided to write this API in Python.
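To make the trade-off concrete, here is a minimal sketch of the two waiting strategies. It uses Python's asyncio as a stand-in for Splash's own event loop, which is an assumption made purely for illustration; the helper names (work, wait_by_polling, wait_by_notification) are hypothetical and are not part of Splash.

```python
import asyncio


async def work(duration):
    # Stand-in for the wrapped callback.
    await asyncio.sleep(duration)
    return "done"


async def wait_by_polling(task, timeout, interval=0.01):
    # Polling: wake up every `interval` seconds to ask whether the task finished.
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    checks = 0
    while loop.time() < deadline:
        checks += 1
        if task.done():
            return True, checks
        await asyncio.sleep(interval)
    return False, checks


async def wait_by_notification(task, timeout):
    # Notification: suspend once; the loop wakes us on completion or on timeout.
    done, _pending = await asyncio.wait({task}, timeout=timeout)
    return task in done


async def demo():
    polled = await wait_by_polling(asyncio.create_task(work(0.05)), timeout=1.0)
    notified = await wait_by_notification(asyncio.create_task(work(0.05)), timeout=1.0)
    return polled, notified
```

The polling version has to wake up several times before it sees the result, while the notification version is resumed exactly once.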
Callbacks
The first thing to think about is callback execution. In the current Splash version, some API functions take a callback as an argument. These callbacks are executed as coroutines, which are created using the Splash#get_coroutine_run_func method. Until now, there was no need to stop a running coroutine. However, the main idea of splash:with_timeout is the ability to run a function for only a specified amount of time.
The first solution was to simply ignore the success and error callbacks from the running function. The idea is simple but not correct. Consider the following Lua script:
function main(splash)
    local ok, result = splash:with_timeout(1, function()
        splash:wait(2)
        assert(splash:go("https://google.com"))
    end)
    splash:go("https://www.python.org")
    splash:wait(3)
    return splash:url()
end
The first argument of splash:with_timeout is the number of seconds you want to wait, and the second one is your callback. As you can see, we set the timeout to 1 second, and in the callback we wait for 2 seconds and then try to go to https://google.com. After that, we navigate to https://www.python.org, wait for 3 seconds and return the current URL. The resulting URL should obviously be https://www.python.org, because the callback of splash:with_timeout exceeds its timeout. However, the resulting URL will be https://google.com. The reason is that we didn’t stop the callback when 1 second had elapsed.
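The timeline of that script can be replayed with a small simulation. This is only a sketch under simplifying assumptions: navigation is treated as instantaneous, and the function final_url and its parameters are hypothetical names, not Splash code.

```python
def final_url(stop_callback_on_timeout, timeout=1.0):
    # Each entry is (time in seconds, who navigates, target URL).
    timeline = [
        (2.0, "callback", "https://google.com"),          # callback: wait 2 s, then go
        (timeout, "main", "https://www.python.org"),      # main resumes when the timeout fires
    ]
    script_ends_at = timeout + 3.0  # main waits 3 s more, then reads the URL

    url = None
    for when, actor, target in sorted(timeline):
        if when > script_ends_at:
            break
        if actor == "callback" and stop_callback_on_timeout and when > timeout:
            continue  # a stopped callback never reaches its navigation
        url = target
    return url


print(final_url(stop_callback_on_timeout=False))  # https://google.com
print(final_url(stop_callback_on_timeout=True))   # https://www.python.org
```

Merely ignoring the callback's result leaves its late navigation in place, so the script ends on the wrong page; only actually stopping the callback gives the expected URL.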
So I implemented a way to stop a coroutine. I added a new method to BaseScriptRunner, BaseScriptRunner#stop. It sets the flag self._is_stopped to True, and that flag is checked during the coroutine execution: if it’s True, a StopIteration exception is raised and the coroutine stops its execution.
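The stop-flag idea can be sketched with a plain Python generator. The class below is a simplified, hypothetical stand-in and not the actual BaseScriptRunner; only the flag name _is_stopped and the use of StopIteration are taken from the description above.

```python
class StoppableRunner:
    """Hypothetical, simplified runner; not Splash's real BaseScriptRunner."""

    def __init__(self, coroutine):
        self._coroutine = coroutine
        self._is_stopped = False
        self.finished = False

    def stop(self):
        # Only flips the flag; the coroutine is abandoned at its next resume point.
        self._is_stopped = True

    def step(self):
        # Resume the coroutine once; return what it yields, or None when done or stopped.
        if self.finished:
            return None
        try:
            if self._is_stopped:
                raise StopIteration
            return next(self._coroutine)
        except StopIteration:
            self.finished = True
            self._coroutine.close()
            return None


def callback(log):
    # Stand-in for the Lua callback: every yield is a point where it can be resumed.
    log.append("wait 2 s")
    yield "wait"
    log.append("go to https://google.com")
    yield "go"
```

If stop() is called while the callback is still waiting, the next step() ends the coroutine before the navigation line is ever reached.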
Error handling
There was an interesting conversation with my mentor about how I should handle errors from the callback of splash:with_timeout.
There are two ways to handle errors and exceptions in Splash Lua scripts:
- Return a flag ok, which tells whether the operation was successful, and result, which contains either the result of the operation or the reason for the failure.
- Return result, which contains the result of the operation, and raise an exception using error(...) if the operation failed.
In Lua, exceptions are thrown only when the user did something wrong (e.g. passed wrong arguments). So we chose the first solution, because a timed-out callback is not a mistake on the user’s side but a matter of the API implementation.
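The two conventions look roughly like this when written out in Python. This is an illustrative sketch only: the function names, the ScriptError type (standing in for Lua's error(...)) and the "timeout" reason string are all placeholders, not Splash's actual values.

```python
class ScriptError(Exception):
    """Hypothetical error type standing in for Lua's error(...)."""


def with_timeout_flag(timed_out, value=None):
    # Convention 1: always return; the flag says whether the second item
    # is a real result or the reason for the failure.
    if timed_out:
        return False, "timeout"
    return True, value


def with_timeout_raising(timed_out, value=None):
    # Convention 2: return the bare result and raise when the operation failed.
    if timed_out:
        raise ScriptError("timeout")
    return value


# With the first convention a timeout is ordinary data the caller can branch on.
ok, result = with_timeout_flag(timed_out=True)
if not ok:
    print("callback did not finish:", result)
```

With the flag convention, a timeout is handled by an ordinary branch, and exceptions stay reserved for genuine misuse such as wrong arguments.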
Results
You can see my work in PR #465.
Further plans
This week I’m finishing all my exams, so I can start spending more time working on GSoC.
Thank you for reading. See you next time 😉