Setting the SO_KEEPALIVE Option
When connections are used, they can sometimes be idle for long periods. For example, a telnet session can be established to access a stock quotation service by a portfolio manager of a mutual fund company. He might perform a few initial inquiries and then leave the connection to the service open in case he wants to go back for more. In the meantime, however, the connection remains idle, possibly for hours at a time.
Any server that thinks it has a connected client must dedicate some resources to it. If the server is of the forking type, then an entire Linux process with its associated memory is dedicated to that client. When things are going well, this scenario does not present any problem. The difficulty arises when a network disruption occurs, and all 578 of your clients become disconnected from your stock quotation service.
After the network service is restored, an additional 578 clients will be attempting to connect to your server, as they re-establish connections. This is a real problem for you because your server has not yet realized that it lost the idle clients earlier—option SO_KEEPALIVE to the rescue!
The following example shows how to enable SO_KEEPALIVE on a socket s so that a disconnected idle connection can eventually be detected:
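#include <stdio.h>
#include <sys/socket.h>
#define TRUE 1
#define FALSE 0
int z;            /* Status code */
int s;            /* Socket s (assumed to be open already) */
int so_keepalive; /* Option value passed to setsockopt(2) */

so_keepalive = TRUE;
z = setsockopt(s, SOL_SOCKET, SO_KEEPALIVE,
               &so_keepalive, sizeof so_keepalive);
if ( z )
    perror("setsockopt(2)"); /* Report why the option could not be set */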
The preceding example enables the SO_KEEPALIVE option so that when the socket connection is idle for long periods, a probe message is sent to the remote end. This is usually done after two hours of inactivity. There are three possible responses to a keep-alive probe message:
The peer responds appropriately to indicate that all is well. No indication is returned to the application, because this is the application's assumption to begin with.
The peer can respond indicating that it knows nothing about the connection. This usually means that the peer host has been rebooted since the last communication with it. The error ECONNRESET will then be returned to the application with the next socket operation.
No response is received from the peer. In this case, the kernel might make several more attempts to make contact. TCP will usually give up in approximately 11 minutes if no response is elicited. The error ETIMEDOUT is returned with the next socket operation when this happens. Other errors, such as EHOSTUNREACH, can be returned instead if the network is no longer able to reach the host (this can happen because of bad routing tables or router failures).
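The error cases above only become visible to the server on its next socket operation, so the server has to check for them there. The following is a minimal sketch of how that check might look, not a definitive implementation. The helper names peer_is_gone() and read_client() are hypothetical and do not come from the text; only the errno values are taken from the list above.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if err is one of the errors that a
 * failed keep-alive probe can leave pending on a socket, else 0. */
int peer_is_gone(int err)
{
    switch (err) {
    case ECONNRESET:    /* Peer answered but knows nothing of the connection */
    case ETIMEDOUT:     /* Peer never answered the probes */
    case EHOSTUNREACH:  /* Network can no longer reach the host */
        return 1;
    default:
        return 0;
    }
}

/* Hypothetical helper: read from client socket s.
 * Returns the byte count, 0 if the peer closed the connection in an
 * orderly way, or -1 on error (errno is left set by read(2)). */
ssize_t read_client(int s, char *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(s, buf, len);          /* Retry if interrupted by a signal */
    } while (n == -1 && errno == EINTR);

    if (n == -1 && peer_is_gone(errno)) {
        int saved = errno;              /* fprintf(3) may modify errno */
        fprintf(stderr, "client on socket %d is gone: %s\n",
                s, strerror(saved));
        errno = saved;
    }
    return n;
}
```

A server would typically close the socket and release the resources it dedicated to that client whenever read_client() returns 0 or -1.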
The time frames involved for SO_KEEPALIVE limit its general usefulness. The probe message is sent only after approximately two hours of inactivity. Then, when no response is elicited, it might take another 11 minutes before the connection returns an error. Nevertheless, this facility does eventually allow idle disconnected sockets to be detected, and then closed by the server. Consequently, servers that support potentially long idle connections should enable this feature.
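On Linux, the time frames can be shortened for an individual socket. This goes beyond what the text above describes: the TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT options used here are Linux-specific settings documented in tcp(7), and other systems may not provide them. The sketch below assumes a TCP socket; the helper names enable_keepalive() and detection_secs() are hypothetical, and the detection time it computes is only an approximation.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable SO_KEEPALIVE on TCP socket s and override
 * the Linux per-socket probe timers.
 *   idle_secs     - idle time before the first probe is sent
 *   interval_secs - time between unanswered probes
 *   probes        - unanswered probes allowed before giving up
 * Returns 0 on success, -1 on failure (errno set by setsockopt(2)). */
int enable_keepalive(int s, int idle_secs, int interval_secs, int probes)
{
    int on = 1;

    if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == -1)
        return -1;
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof idle_secs) == -1)
        return -1;
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_secs, sizeof interval_secs) == -1)
        return -1;
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT,
                   &probes, sizeof probes) == -1)
        return -1;
    return 0;
}

/* Approximate number of seconds before a silent peer is reported:
 * the idle period plus one interval for each unanswered probe. */
int detection_secs(int idle_secs, int interval_secs, int probes)
{
    return idle_secs + interval_secs * probes;
}
```

With the usual Linux defaults of 7200 seconds idle, 75 seconds between probes, and 9 probes, detection_secs(7200, 75, 9) gives 7875 seconds: the two hours of inactivity plus 675 seconds, which is the "approximately 11 minutes" mentioned above. A call such as enable_keepalive(s, 600, 30, 4) would instead report a silent peer after roughly 12 minutes in total.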