Comet Long Polling with PHP and jQuery
Comet describes a number of techniques with which a web server may push information to a client in a non-transactional format. Long Polling is one such technique, in which a browser’s request remains open until the server has information to send.
Long Polling is particularly useful in semi-synchronous applications, such as chat rooms and turn-based games, and is straightforward to implement with PHP and jQuery:
Serverside
The serverside component of Long Polling requires that, contrary to the usual transactional model, the request not be answered right away. To achieve this, we put PHP to sleep():
// How often to poll, in microseconds (1,000,000 μs equals 1 s)
define('MESSAGE_POLL_MICROSECONDS', 500000);
// How long to keep the Long Poll open, in seconds
define('MESSAGE_TIMEOUT_SECONDS', 30);
// Timeout padding in seconds, to avoid a premature timeout in case the last call in the loop is taking a while
define('MESSAGE_TIMEOUT_SECONDS_BUFFER', 5);
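// Start the session if it is not already active, so that $_SESSION is populated
if (session_status() === PHP_SESSION_NONE) { session_start(); }
// Hold on to any session data you might need now, since we need to close the session before entering the sleep loop
$user_id = $_SESSION['id'];
// Close the session early so that its lock does not block this user's other requests while we sleep
session_write_close();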
// Ask PHP to die after the timeout (plus buffer); on non-Windows systems, time spent in usleep() does not count toward this limit
set_time_limit(MESSAGE_TIMEOUT_SECONDS+MESSAGE_TIMEOUT_SECONDS_BUFFER);
// Counter to manually keep track of time elapsed, since PHP's timeout alone cannot be relied on here
$counter = MESSAGE_TIMEOUT_SECONDS;
// Poll for messages and hang if nothing is found, until the timeout is exhausted
while($counter > 0)
{
    // Check for new data (not illustrated)
    if($data = getNewData($user_id))
    {
        // Break out of while loop if new data is populated
        break;
    }
    else
    {
        // Otherwise, sleep for the specified time, after which the loop runs again
        usleep(MESSAGE_POLL_MICROSECONDS);
        // Decrement seconds from counter (the interval was set in μs, see above)
        $counter -= MESSAGE_POLL_MICROSECONDS / 1000000;
    }
}
// If we've made it this far, we've either timed out or have some data to deliver to the client
// (isset() would also pass for an empty result left over after a timeout, so test for a non-empty value)
if(!empty($data))
{
    // Send data to client; you may want to precede it with a Content-Type header, e.g. in the case of JSON or XML
    echo $data;
}
This example holds on to the connection, for up to 30 seconds, until new information is found in the database, at which point it is delivered to the client. Note that we are using usleep(), which takes microseconds instead of seconds.
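As the discussion further down points out, time spent sleeping does not count toward PHP's own timeout on most systems, and the counter above also ignores however long getNewData() takes. One way around both is to compare against a wall-clock deadline. The following is a minimal sketch of that timing logic only, written in JavaScript to match the client code; `pollUntilDeadline` and its injected `now` and `sleep` parameters are hypothetical names, and in PHP the equivalent calls would presumably be microtime(true) and usleep().

```javascript
// Sketch of a deadline-based poll loop (timing logic only, not the author's code).
// check():   returns new data, or a falsy value when there is none
// now():     returns the current time in milliseconds
// sleep(ms): blocks for the given number of milliseconds
function pollUntilDeadline(check, timeoutMs, intervalMs, now, sleep) {
  var deadline = now() + timeoutMs;
  while (true) {
    var data = check();
    if (data) {
      return data; // new data found: deliver it right away
    }
    var remaining = deadline - now();
    if (remaining <= 0) {
      return null; // timed out with nothing to report
    }
    // Never sleep past the deadline
    sleep(Math.min(intervalMs, remaining));
  }
}
```

Because the deadline is measured against the clock rather than summed from sleep intervals, a slow data check shortens the number of remaining polls instead of silently extending the request.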
Clientside
The client is responsible for keeping the request alive at all times: once the process begins (usually on DOM ready, although in some browsers it might be offset, as noted below), when the request times out, and when the request returns useful information, at which point it must handle it:
$(function()
{
    // Main Long Poll function
    function longPoll()
    {
        // Open an AJAX call to the server's Long Poll PHP file
        $.get('longpoll.php', function(data)
        {
            // Callback to handle message sent from server (not illustrated)
            handleServerMessage(data);
            // Open the Long Poll again
            longPoll();
        });
    }
    // Make the initial call to Long Poll
    longPoll();
});
In JavaScript, we immediately open another Long Poll upon the previous call returning useful data or timing out; all of the timeouts are handled serverside.
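One gap worth noting: the success callback passed to $.get() only fires when the request succeeds, so if a request fails outright (for example, the server is briefly unreachable) the loop above stops polling. The sketch below shows one way to keep the loop alive, under some assumptions: `createLongPoller`, its option names, and the 5-second retry delay are all hypothetical, and the request function is injected so the loop logic can be exercised without a browser.

```javascript
// Hypothetical helper (not part of jQuery): wraps a request function in a long-poll loop.
// options.request(onSuccess, onFailure) must start one request and call exactly one callback.
function createLongPoller(options) {
  var request = options.request;
  var onMessage = options.onMessage;
  var retryDelayMs = options.retryDelayMs || 5000; // assumed back-off, tune to taste
  var schedule = options.schedule || setTimeout;   // injectable for testing
  var stopped = false;

  function poll() {
    if (stopped) {
      return;
    }
    request(
      function (data) {
        // An empty body means the server timed out with nothing to report
        if (data) {
          onMessage(data);
        }
        // Reopen the Long Poll right away
        poll();
      },
      function () {
        // Request failed: wait before retrying rather than hammering the server
        schedule(poll, retryDelayMs);
      }
    );
  }

  return {
    start: poll,
    stop: function () { stopped = true; }
  };
}

// Possible wiring with jQuery (assumes the longpoll.php endpoint from the example above):
// createLongPoller({
//   request: function (ok, fail) { $.get('longpoll.php').done(ok).fail(fail); },
//   onMessage: handleServerMessage
// }).start();
```

Skipping the handler on an empty response also avoids calling handleServerMessage() with nothing when the server simply timed out.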
Caveats
At first glance, Long Polling is a no-brainer upgrade from classic polling: it’s more responsive, has less network overhead, and is fairly straightforward to implement. However, it is not without its flaws:
Database is hit more often
In the example above, the database is hit twice every second for each open Long Poll. If we had 100 concurrent users for one hour, that would be 720,000 database requests, the overwhelming majority of which would return no data. For certain light database systems, this overhead might be negligible, but you’ll want to pair MySQL with a Memcache server that will handle these frequent requests with little overhead. Memcache works well with Long Polling, since the database access consists entirely of INSERTs, for the part of the program adding new data to be retrieved, and SELECTs, for the part that is fetching the new messages, as illustrated above.
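The figure above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: `pollQueriesPerHour` is a hypothetical name, and it assumes every user keeps a Long Poll open for the whole hour, that no poll ever finds data, and that each query takes negligible time, so it gives an upper bound.

```javascript
// Upper bound on idle polling queries per hour.
// pollIntervalMicroseconds mirrors MESSAGE_POLL_MICROSECONDS from the PHP example.
function pollQueriesPerHour(concurrentUsers, pollIntervalMicroseconds) {
  var pollsPerSecond = 1000000 / pollIntervalMicroseconds;
  var secondsPerHour = 3600;
  return concurrentUsers * pollsPerSecond * secondsPerHour;
}

console.log(pollQueriesPerHour(100, 500000)); // 720000
```

Doubling the poll interval to one second halves the load, at the cost of up to an extra half second of latency per message.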
PHP sleeps across the entire session
While a script that has the session open is waiting in sleep() or usleep(), it keeps holding PHP’s lock on that session, which means that any other AJAX requests or even page loads in the same session will have to wait until the request returns or times out. To address this, ensure the session is closed with session_write_close() before entering the serverside sleep loop.
Chrome and iOS Safari think the page is loading
Chrome and Safari on iOS devices will display the “waiting” or “loading” messages, along with their respective spinning or bar loaders, while a Long Poll is open. In Chrome, this can be combated by offsetting the execution of the initial Long Poll request by a few seconds after DOM load, with setTimeout(). I have not tested whether this approach works on iOS devices as well.
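The setTimeout() offset could look something like the following. This is a minimal sketch, not a tested fix: `startAfterDelay` is a hypothetical helper, and the 2-second delay in the usage comment is an arbitrary guess at "a few seconds".

```javascript
// Hypothetical helper: defers the first Long Poll so the browser can finish loading the page.
// `schedule` is injectable for testing and defaults to the standard setTimeout.
function startAfterDelay(start, delayMs, schedule) {
  var scheduleFn = schedule || setTimeout;
  return scheduleFn(start, delayMs);
}

// Possible usage on DOM ready (assumes jQuery and the longPoll() function from the example above):
// $(function () {
//   startAfterDelay(longPoll, 2000);
// });
```

Only the first request is delayed; later Long Polls are reopened from the callback as before.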
Header image by Jack Newton.
From a server administration standpoint, what does it do to your memory usage to have all these persistent connections open in PHP? Will it be trying to spawn an apache instance for each long poll you have open?
Memory usage from the PHP end is very light, since it will be sleeping over 99% of the time. The call to session_write_close() will not destroy the session in either PHP or Apache; it just circumvents PHP’s session variable locking (PHP’s session data is locked so as to prevent concurrent writes from multiple scripts, so only one script may operate on a session at a time).
Apache will treat a Long Poll request much as any other AJAX call, or any other request to a resource within a session, so it will not spawn a new instance of itself.
Where you might get in trouble with performance is on the MySQL end, hence the Memcache recommendation, which results in more memory usage altogether, but much less IO overhead.
How big is the memory usage if you use an infinite while loop instead of timing out after 30 seconds? I’m currently working on a reverse AJAX chat, and I saw one example on the web (I don’t really remember where) that uses an infinite loop with set_time_limit(0) and while(true){}
wouldn’t that be an overkill because if that’s the case, it will never expire, even if you close ajax request, it won’t expire. If you go again to that chat it will just add a new instance of the loop over the previous one, and in no time you’ll have …che of apache :\
L, I think you’ve answered your own question here: in a Long Poll, the combination of an infinite loop and disabling PHP’s timeout can certainly crash the server. The author of the example you mention is probably making the potentially dangerous assumption that data will be available quickly and that Long Polls won’t stay open for long.
We should be prudent and handle the not-so-edge-case in which data is not immediately available and we spin each client for enough time to crash the server. In a chat room situation, for example, you can have a huge number of people connected to a system overnight, all idle. Worse yet, if a user closes the client and no timeouts are in place, as you note, the Long Poll remains open.
I would never, without exceptions, use a deliberately infinite loop such as while(true){} and set_time_limit(0). You’ll notice, in fact, that in the example above I use PHP’s timeout as well as counting the time explicitly to make absolutely certain that there are no Long Polls spinning longer than would be safe. This might be overkill, but an ounce of prevention is worth a ton of trauma!
Since you should be reopening the connection from the JavaScript end both when data is received and when it times out serverside, the overhead of reestablishing a connection every 30 seconds or so is a small price to pay to ensure your server stays up.
Thank you for your fast reply, E. I just found a function called ignore_user_abort(): if I set it to false, then when the user exits or aborts the AJAX call, it will stop that instance of the script. Would that be helpful? I mean, would that be the same as timing out after 30 seconds? In my example it would never expire until the user exits the script, and in your example it would time out every 30 seconds, and if the user exits the script, it would stop execution of that instance…?
I have not played around with the ignore_user_abort() function/setting, but from what I can gather from the PHP documentation, it only works for CLI connections, not through the web.
Something I’ve also recently found out while delving deeper into this topic is that set_time_limit() is also irrelevant in a Long Polling situation, since time spent sleeping is not counted towards program execution (take a look at the sleep() function reference). In a Long Poll, we are sleeping most of the time, and the small fractions of time we use to check for new data are the only time counted by PHP’s internal timeout clock. Although I originally hinted at it being overkill to measure the time yourself in addition to PHP, I now retract that statement: you must count the time manually if you want to enforce a reliable timeout; you might as well ignore PHP’s timeout altogether.
L, what are your concerns with reestablishing the connection every 30 seconds?
I don’t have many concerns about reestablishing the connection after 30 seconds; if that’s the best way, I’ll rewrite my script a little bit, it’s not a problem. I just thought it would be logically better to open it once and keep it open than to open and close it every 30 seconds [if it amounts to the same thing and you can use the ignore_user_abort() function, but apparently, as you mention, I can’t... I didn’t see that about CLI].