GSoC 2016 #1: with_timeout

June 09, 2016 · 3 mins to read

Preamble

The end of May and the beginning of June is a tough time for me. I’ve passed the final exam for my master’s degree, and I finally have an M.S. in Computer Science. Now I’m preparing for my Ph.D. exams. Meanwhile, I’m working on Splash this summer, and I’ve got something to tell you.

with_timeout

The first thing I need to do is the splash:with_timeout API. It wraps any function and lets it run for only a specified amount of time in Splash Lua scripts.

Originally, I was going to add timeout functionality to only one API, splash:go, but after discussing it with my mentor, we decided to create a more general API.

Implementation

There are two possible ways to implement this API:

Using Lua.
Using Python.

Both have their advantages and disadvantages. The Lua implementation is simpler. It requires the existing splash:call_later and splash:wait APIs, plus some kind of polling (an infinite loop) to check whether the running callback has finished its work. Polling isn’t the best solution, because it makes the CPU do a lot of unnecessary work.

On the other hand, the Python implementation can know when the callback has finished its execution and notify the main event loop. It’s also more flexible and configurable. So we decided to write this API in Python.
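To make the trade-off concrete, here is a minimal sketch of the two waiting strategies. It uses plain standard-library threading rather than Splash’s event loop, and the names wait_by_polling, wait_by_notification and start_worker are hypothetical helpers invented for this illustration, not part of Splash.

```python
import threading
import time


def wait_by_polling(is_done, timeout, interval=0.01):
    """Repeatedly check a flag until it is set or the timeout expires.

    Returns (finished, number_of_checks).
    """
    deadline = time.monotonic() + timeout
    checks = 0
    while True:
        checks += 1
        if is_done():
            return True, checks
        if time.monotonic() >= deadline:
            return False, checks
        time.sleep(interval)  # the waiter wakes up again and again


def wait_by_notification(done_event, timeout):
    """Block until the worker signals completion or the timeout expires."""
    return done_event.wait(timeout)  # True if set in time, False otherwise


def start_worker(duration):
    """Hypothetical callback: sets an event after `duration` seconds."""
    done = threading.Event()
    timer = threading.Timer(duration, done.set)
    timer.daemon = True  # do not keep the process alive for the timer
    timer.start()
    return done
```

With polling, the waiter has to wake up on every interval just to look at the flag. With notification, it sleeps until the worker reports that it is done or the timeout fires, which is the property that made the Python implementation attractive.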

Callbacks

The first thing to think about is callback execution. In the current Splash version, some API functions take a callback as an argument. These callbacks are executed as coroutines, which are created using the Splash#get_coroutine_run_func method. Until now, there was no need to stop a running coroutine. However, the main idea of splash:with_timeout is to let a function run only for the specified amount of time.

The first solution was to simply ignore the success and error callbacks from the running function. The idea is simple but incorrect. Consider the following Lua script:

function main(splash)
    local ok, result = splash:with_timeout(1, function()
        splash:wait(2)
        assert(splash:go("https://google.com"))
    end)

    splash:go("https://www.python.org")

    splash:wait(3)

    return splash:url()
end

The first argument of splash:with_timeout is the number of seconds you want to wait, and the second one is your callback. As you can see, we set the timeout to 1 second, and in the callback we wait for 2 seconds and then try to go to https://google.com. After the splash:with_timeout call, we navigate to https://www.python.org, wait for 3 seconds, and return the current URL. The resulting URL should obviously be https://www.python.org, because the callback of splash:with_timeout exceeds its timeout. However, with this first solution the resulting URL is https://google.com. The reason is that the callback wasn’t stopped when 1 second elapsed: it kept running in the background and navigated to https://google.com a second later, after the main script had already loaded https://www.python.org.

So I implemented a way to stop coroutines. I added a new method, BaseScriptRunner#stop, which sets the flag self._is_stopped to True. That flag is checked during coroutine execution: if it’s True, a StopIteration exception is raised and the coroutine stops its execution.
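The difference between ignoring the callback and actually stopping it can be replayed with a small simulation. This is only a sketch under simplifying assumptions: Runner is a hypothetical stand-in and not Splash’s BaseScriptRunner, coroutines are plain Python generators that yield the number of seconds to sleep, time is a simulated counter, and a stopped coroutine is closed rather than interrupted with StopIteration.

```python
class Runner:
    """Simplified, hypothetical coroutine runner (not Splash's BaseScriptRunner)."""

    def __init__(self, coro):
        self.coro = coro          # generator that yields "seconds to sleep"
        self.wake_at = 0          # simulated time at which to resume
        self.finished = False
        self._is_stopped = False

    def stop(self):
        self._is_stopped = True

    def step(self, now):
        if self.finished or now < self.wake_at:
            return
        if self._is_stopped:
            self.coro.close()     # abandon the coroutine instead of resuming it
            self.finished = True
            return
        try:
            self.wake_at = now + next(self.coro)
        except StopIteration:
            self.finished = True


def simulate(stop_on_timeout):
    """Replay the Lua script above on a simulated clock; return the final URL."""
    browser = {"url": None}

    def callback():
        yield 2                                  # splash:wait(2)
        browser["url"] = "https://google.com"    # splash:go(...)

    def main():
        yield 1                                  # the 1-second timeout elapses
        if stop_on_timeout:
            cb_runner.stop()                     # the fix: stop the callback
        browser["url"] = "https://www.python.org"
        yield 3                                  # splash:wait(3)

    cb_runner = Runner(callback())
    main_runner = Runner(main())
    for now in range(6):                         # simulated seconds 0..5
        for runner in (cb_runner, main_runner):
            runner.step(now)
    return browser["url"]
```

Calling simulate(stop_on_timeout=False) ends on https://google.com, which mirrors the bug described above, while simulate(stop_on_timeout=True) ends on https://www.python.org.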

Error handling

I had an interesting conversation with my mentor about how to handle errors from the callback of splash:with_timeout.

There are two ways to handle errors and exceptions in Splash Lua scripts:

Return a flag ok, which tells whether the operation was successful, and result, which contains either the result of the operation or the reason it failed.
Return result, which contains the result of the operation, and raise an exception using error(...) if the operation failed.

In Lua, exceptions are thrown only if the user did something wrong (e.g. passed wrong arguments). We chose the first solution because a callback timing out isn’t the user’s mistake; it is related to the API implementation instead.
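The two conventions can be compared side by side in a short Python sketch. The names run_flag_style, run_raise_style and TimeoutOver, as well as the error message, are hypothetical and chosen only for this illustration; they are not taken from the Splash code base.

```python
class TimeoutOver(Exception):
    """Hypothetical error for a callback that ran out of time."""


def run_flag_style(func, *args):
    """Convention 1: never raise; return (ok, result_or_reason)."""
    try:
        return True, func(*args)
    except Exception as exc:
        return False, str(exc)


def run_raise_style(func, *args):
    """Convention 2: return the result and let failures propagate."""
    return func(*args)


def slow_callback():
    # Stand-in for a callback that exceeded its timeout.
    raise TimeoutOver("timeout exceeded")
```

With the first convention the caller always gets a pair back and decides what to do with a failure, for example ok, result = run_flag_style(slow_callback). With the second, the same call raises, and the script stops unless the caller wraps it in its own error handling.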

Results

You can see my work in PR #465.

Further plans

This week I’m finishing all my exams, so I can start spending more time working on GSoC.

Thank you for reading. See you next time 😉