How to detect when the client closes the connection?

Imagine the following scenario: You are writing a server application. Clients send their queries to the server, and for every new client connection, the server starts a new process that is responsible for answering all queries received from that client. For every query received, the process works for some time and finally sends the results of the query back to the client.

So far, so good. But what happens if the client closes the connection to the server while the process is working on the query it just received from the client? In this case, nobody will ever look at the query results, so it makes a lot of sense to kill the process right away, as soon as the client closes the connection. The problem is: How can we tell if a socket connection has been closed by the client?

We need a separate process that does nothing but check whether a connection has been closed by the client (of course, this process may monitor several connections at the same time). The most obvious way to accomplish this is to have that process call read on the socket for a connection and check whether read returns 0 (i.e. reads zero bytes from the socket), in which case we know that the connection has been closed. The problem is that we actually do not want to call read, because this would remove data from the stream that normally would have been read by the process responsible for that connection. How would we get the data from the monitoring process to the query processor in this case? So, clearly, calling read is not a good idea.

Fortunately, we have the select and poll functions, which can be used to detect whether a socket has been closed by the remote host. Unfortunately, the problem with both is that they only tell us "data available", even when the connection has been closed. This is because "data available" in fact means something completely different, namely "a call to read on this socket will not block". In order to determine whether the connection has actually been closed (or whether new data is in fact available to be read), we need to call read after all. Bad.

The solution to the problem is called recv, and it does a wonderful job. The trick is that recv supports two very useful flags: MSG_PEEK tells recv not to remove anything from the stream, while MSG_DONTWAIT makes the function return immediately if there is no data available to be read. Below you will find a code snippet that accepts a new client connection and spawns a new process that responds to queries received over that connection. The parent process waits until the socket has been closed, which happens either because the child process has finished execution (and closed the socket) or because the client has closed the socket before the child process could finish its task.

int fd = accept(listenSocket, NULL, NULL);
if (fd == -1) {
	perror("Unable to accept client connection");
	exit(1);
}
pid_t childProcess = fork();
if (childProcess == (pid_t)-1) {
	perror("Unable to create new process for client connection");
	exit(1);
}
else if (childProcess == 0) {
	// child: read from socket, process queries, etc.
}
else {
	// parent: use the poll system call to be notified about socket status changes
	struct pollfd pfd;
	pfd.fd = fd;
	pfd.events = POLLIN | POLLHUP | POLLRDNORM;
	pfd.revents = 0;
	while (pfd.revents == 0) {
		// call poll with a timeout of 100 ms
		if (poll(&pfd, 1, 100) > 0) {
			// if result > 0, this means that there is either data available on the
			// socket, or the socket has been closed
			char buffer[32];
			if (recv(fd, buffer, sizeof(buffer), MSG_PEEK | MSG_DONTWAIT) == 0) {
				// recv returns 0 only when the peer has closed the connection:
				// kill the child process and reap it
				int status;
				kill(childProcess, SIGKILL);
				waitpid(childProcess, &status, 0);
				close(fd);
				// do something else, e.g. go on vacation
			}
			else {
				// data is still pending; keep monitoring the connection
				pfd.revents = 0;
			}
		}
	}
}