Synchronous Operations are So Outdated
Understanding asynchronous events
The best way to explain why synchronous code can sometimes be daunting is to use an example from Real Life™. A single day in our lives can contain plenty of actions that make us cringe and growl. Take, for instance, trying to make a meal.
Imagine you're cooking. You wouldn't wait for the water to boil before you prepared the potatoes. Nor would you wait for the potatoes to be done before you started working on the salad.
Asynchronous programming means having multiple events happen at the same time. It allows you to get more things done while you're waiting for other things to happen.
The fundamental element of asynchronous programming is the callback, so let's review that first, and then take a look at some examples of async in code-land.
We will be using AnyEvent for this article, but the same principles exist in all other async frameworks.
Introduction to callbacks
Since multiple events run at the same time, the application (much like the spice) must flow. To make this work, whenever we start one event we include references to the code that should be run when it finishes or hits other milestones. Since the event then "knows" how to proceed on its own, it can start up and work in the background while the rest of the program continues on doing more things.
We're going to be using a technique that some are not familiar with: callbacks. Just to get you up to speed, let me start by explaining callbacks in a nutshell: callbacks are just references to subroutines.
These subroutines can be defined using names or they can be anonymous. We can call those subroutines by their reference instead of their name.
If we use an anonymous sub to create the reference, we can pass the callback as a parameter directly, without saving it in a variable first.
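To make that concrete, here is a small sketch of the three flavors: a reference to a named subroutine, an anonymous subroutine, and a callback passed straight in as a parameter. The names greet and when_done are made up for the illustration; they are not part of any library.

```perl
use strict;
use warnings;

# A plain named subroutine.
sub greet {
    my ($name) = @_;
    return "Hello, $name!";
}

# Take a reference to the named subroutine and call it through the reference.
my $named_cb = \&greet;
my $first    = $named_cb->('Alice');

# An anonymous subroutine: "sub" without a name returns a code reference.
my $anon_cb = sub {
    my ($name) = @_;
    return "Goodbye, $name!";
};
my $second = $anon_cb->('Alice');

# Hypothetical helper that accepts a callback and runs it when its work is done.
sub when_done {
    my ($cb) = @_;
    return $cb->('Bob');
}

# Pass the callback directly, without storing it in a variable first.
my $third = when_done( sub { my ($name) = @_; return "All done, $name!" } );

print "$_\n" for $first, $second, $third;
```

The last call is the shape you will see most in async code: the callback is written right where it is handed over.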
Reading from input
You have an application that needs to read from a handle (which could be a file descriptor, a socket, or even the standard input), but you don't know when it will be ready to be read.
In a synchronous application, you'll be waiting for it to become available, possibly calling sleep in between. But these days we're busy people, and we can't just be waiting by the phone. We have stuff to do!
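A minimal sketch of the asynchronous alternative, assuming the handle is the standard input. It only registers the watcher; nothing is read until an event loop is running.

```perl
use strict;
use warnings;
use AnyEvent;

# Watch standard input for readability. The callback runs once the
# event loop notices that there is something to read.
my $watcher = AnyEvent->io(
    fh   => \*STDIN,
    poll => 'r',
    cb   => sub {
        my $line = <STDIN>;
        return unless defined $line;    # end of input
        chomp $line;
        print "Read: $line\n";
    },
);

# We are not blocked: the program carries on immediately.
print "Watcher registered, moving on to other work\n";
```

Keep the returned watcher in a variable. As soon as it goes out of scope, AnyEvent cancels it.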
How does that work? By calling AnyEvent's io method, you're creating a new watcher that checks a file handle for new read events. If it has something to read, it will call the code reference we provided. Both the checks and the subroutine call will happen in the background.
Also, since we've given it all the information it needs (which file handle to poll, what kind of events we want, and what to do in that case), it doesn't need to hold us back. That way we can continue with some other code, and the watcher will wait and run in the background, without bothering us.
Keeping the watchers alive
There is a problem I haven't mentioned. That code is fine, except that once it executes the additional code, the application will close, simply because it reached the end of the script. We want to keep the application running, so our watchers will continue to work. How do we do that? Condition variables!
Condition variables are variables that represent a condition waiting to come true, like your cat waiting for you to get comfortable with a laptop. When the variable becomes true, the cat comes over and lies on your lap, disrupting your work.
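Here is a sketch of a watcher paired with a condition variable. To keep it self-contained I use a pipe that already holds three lines as a stand-in for a real input source, and the END marker line is my own convention, not something AnyEvent defines.

```perl
use strict;
use warnings;
use AnyEvent;

# Stand-in for a real input source: a pipe that already holds three lines.
pipe( my $reader, my $writer ) or die "pipe failed: $!";
print {$writer} "first\nsecond\nEND\n";
close $writer;

my $cv = AnyEvent->condvar;
my @seen;    # every line the watcher has read

my $watcher;
$watcher = AnyEvent->io(
    fh   => $reader,
    poll => 'r',
    cb   => sub {
        my $line = <$reader>;
        if ( !defined $line ) {    # input ran out without an END marker
            undef $watcher;
            $cv->send;
            return;
        }
        chomp $line;
        push @seen, $line;
        if ( $line eq 'END' ) {    # assumed marker meaning "no more input"
            undef $watcher;        # stop watching
            $cv->send;             # the condition is now true
        }
    },
);

print "Waiting for the watcher...\n";
$cv->recv;    # run the event loop until someone calls send
print "All done!\n";
```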
This time we created a condition variable that is available to the watcher. The watcher still diligently continues its work. Only this time, as soon as it finds a line in the file indicating the end of it, it will call send on the condition variable, making the condition true, effectively saying "that's it, we're done".
If someone has called recv on the condition variable, it will wait until something else in the background (like our watcher) calls send, and then it will continue running.
That means that the line "All done!" will only get written once our watcher has finished reading the line.
Another ramification of the condition variable's behavior is that it is possible to create an infinite loop by creating a condition variable, calling recv, and not having anything call send on it. In AnyEvent that fits in a single statement: AnyEvent->condvar->recv;
Since the application is now waiting for a condition variable to come true, it will not terminate. Because nothing can call send on this variable, it basically means the application will stay up indefinitely. The most common use for this is daemons, which should always be running.
Timing your cooking
The last element in AnyEvent that we'll be looking at is the timer. Timers are events (any kind of event) that get run at some point in time. It can be in a few minutes from now or at a specific hour. It can happen once or it can repeat itself several times, or even forever.
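A minimal timer definition, using AnyEvent's timer method with its after, interval, and cb parameters. The message it prints is just a placeholder, and like any watcher it only fires while an event loop is running.

```perl
use strict;
use warnings;
use AnyEvent;

my $timer = AnyEvent->timer(
    after    => 3.5,    # seconds to wait before the first call
    interval => 5,      # seconds between the calls after that
    cb       => sub { print "Timer fired\n" },
);
```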
This defines a timer that will wait 3.5 seconds, and then call the subroutine every 5 seconds. Fairly simple. Let's try a few timers.
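Here is a sketch of a meal run by two timers. Two things are my own assumptions: a "minute" is shrunk to 0.02 seconds so the whole thing finishes in about a second, and both timers first fire after one full interval. Because of that second assumption, the status lines may interleave a little differently from one setup to another.

```perl
use strict;
use warnings;
use AnyEvent;

# Assumption for the sketch: one "minute" lasts 0.02 seconds.
# Set this to 60 to cook in real time.
my $MINUTE = 0.02;

my $cv      = AnyEvent->condvar;
my $state   = 'Preparing';
my @actions = qw(Cutting Simmering Cooking Seasoning Serving);
my @log;    # every line we print, kept for inspection

sub report {
    my ($line) = @_;
    push @log, $line;
    print "$line\n";
}

sub do_step {
    my ($step) = @_;
    report(qq{(do_step() called with "$step")});
    $state = $step;
}

# Status report every seven "minutes".
my $t1 = AnyEvent->timer(
    after    => 7 * $MINUTE,
    interval => 7 * $MINUTE,
    cb       => sub { report("Current cooking state: $state") },
);

# Next action every ten "minutes". When none are left, signal completion.
my $t2 = AnyEvent->timer(
    after    => 10 * $MINUTE,
    interval => 10 * $MINUTE,
    cb       => sub {
        my $next = shift @actions
            or return $cv->send;    # nothing left: leave and notify
        do_step($next);
    },
);

$cv->recv;    # wait here while the timers keep firing
report('Dinner is served!');
```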
What we have here isn't the best example for how to make a meal, but it does give us an example showing multiple timers. The first timer ($t1) keeps alerting us every seven minutes about our progress. Meanwhile, our second timer picks up an action to do every 10 minutes, and does it. Once no more actions are available, it tells the condition variable that it's done. It does this by simply returning out of the subroutine (so we don't call do_step again) and calling send at the same time.
After we created our timers, we set up a recv on a condition variable, meaning "don't continue running the rest of the application until we are notified that the timers finished their work". It will wait at that point (without blocking the timers) until send is called. Then it will continue and say dinner is finally served. Since it's the end of the application, the timers will close and the application will end.
Here is the output we'll get from running the application:
Current cooking state: Preparing
(do_step() called with "Cutting")
Current cooking state: Cutting
(do_step() called with "Simmering")
Current cooking state: Simmering
Current cooking state: Simmering
(do_step() called with "Cooking")
Current cooking state: Cooking
(do_step() called with "Seasoning")
Current cooking state: Seasoning
(do_step() called with "Serving")
Current cooking state: Serving
Current cooking state: Serving
Dinner is served!
Condition variables with multiple calls
Sometimes the behavior of the condition variable's recv is not flexible enough to handle instances in which you need to be able to wait on multiple calls.
Suppose you have a calculation to do that depends on the result of multiple database queries. Before the SQL experts jump at it, let's also suppose these queries are made across different databases.
A database connection is in fact a network operation, which means it blocks. This is an ideal example for async programming. You could initiate several connections and queries concurrently instead of consecutively. Using condition variables, you would probably try to create three condition variables and then wait for each to come true. That won't work, since you can only call recv on one variable at a time.
Instead, condition variables can accept begin and end calls to signify a multi-call request. Once there's been an end call for each begin call, it will return to the recv call.
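A sketch of the begin and end pairing on a single condition variable. There are no real databases here: three short timers stand in for the slow queries, and the query names and delays are invented for the example.

```perl
use strict;
use warnings;
use AnyEvent;

my $cv = AnyEvent->condvar;
my ( %result, @timers );

# Hypothetical stand-ins for three slow queries: name => delay in seconds.
my %delay = ( users => 0.03, orders => 0.01, stock => 0.02 );

for my $name ( sort keys %delay ) {
    $cv->begin;    # one more outstanding request
    push @timers, AnyEvent->timer(
        after => $delay{$name},
        cb    => sub {
            $result{$name} = "rows from $name";
            $cv->end;    # this request is finished
        },
    );
}

$cv->recv;    # returns once every begin has a matching end
print "Got results for: ", join( ', ', sort keys %result ), "\n";
```

All three "queries" wait at the same time, so the total wait is roughly the longest delay rather than the sum of all three.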
Bringing it all together
Suppose we have a file that contains a lot of links and we want to download every image listed in it. These are two different actions: (1) reading the file and (2) downloading the images. We will also have a timer that gives us the progress every two seconds.
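Here is one way such a program could look. Treat it as a sketch under a few assumptions of mine: it uses AnyEvent::HTTP's http_get for the downloads, it saves each image under the last part of its URL, and it takes the name of the links file from the command line. The download_all and fetch_and_save subroutines, and the optional $fetch parameter that lets you swap in a different downloader, are names I made up.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::HTTP;
use File::Basename 'basename';

# Default downloader: fetch a URL, save the body, then report completion.
sub fetch_and_save {
    my ( $url, $done ) = @_;
    http_get $url, sub {
        my ( $body, $hdr ) = @_;
        if ( $hdr->{Status} =~ /^2/ ) {
            my $name = basename($url);    # assumed naming scheme
            if ( open my $out, '>:raw', $name ) {
                print {$out} $body;
                close $out;
            }
            else { warn "Cannot write $name: $!\n" }
        }
        else { warn "Failed $url: $hdr->{Status} $hdr->{Reason}\n" }
        $done->();
    };
    return;
}

sub download_all {
    my ( $fh, $fetch ) = @_;
    $fetch ||= \&fetch_and_save;

    my $cv    = AnyEvent->condvar;
    my $sent  = 0;
    my $start = AnyEvent->now;

    $cv->begin;    # guard: stays open until the whole file has been read

    my $reader;
    $reader = AnyEvent->io(
        fh   => $fh,
        poll => 'r',
        cb   => sub {
            my $url = <$fh>;
            if ( !defined $url ) {    # end of file
                undef $reader;
                $cv->end;             # release the guard
                return;
            }
            chomp $url;
            return unless $url =~ m{^https?://};    # skip non-links
            $cv->begin;               # one more download in flight
            $sent++;
            $fetch->( $url, sub { $cv->end } );
        },
    );

    # Progress report every two seconds.
    my $progress = AnyEvent->timer(
        after    => 2,
        interval => 2,
        cb       => sub {
            printf "[%.0fs] links sent so far: %d\n",
                AnyEvent->now - $start, $sent;
        },
    );

    $cv->recv;    # wait until every begin has been matched by an end
    return $sent;
}

if ( my $file = shift @ARGV ) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $count = download_all($fh);
    print "All done: requested $count image(s)\n";
}
```

The extra begin before the watcher, with its end at the end of the file, is a guard I added: without it, a download that finishes before the next line is read would bring the count back to zero and let recv return too early.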
Let's analyze what we've got here. We use some modules that you should recognize. If you don't, you should check them out.
The next thing is opening a file handle. We then set up a watcher for some I/O operations using AnyEvent's io method. It needs the file handle we are going to operate on, the kind of operation we'll do (we pick r for reading), and a callback to run. This callback is the main thing that takes a bit to understand.
Every time we read a line that has a URL, we call begin on the condition variable. We issue an HTTP request for that URL and once we finish fetching it and saving it, we issue the corresponding end call. When all begin calls have ended, it will return to the recv method, much like calling send.
We also created a progress timer that announces, every two seconds, the number of links we've sent. You'll notice it uses AnyEvent's now, which is the recommended way to call time when running in an event loop.
The recv call at the end will wait until every begin call has been closed by a matching end. Once we've worked on the entire file, it will print a nice message and the application will end.
Just the beginning...
Once you get used to programming asynchronously, it's like having scissors: you just run with it! Note: Do not run with scissors. ✂ 🏃