Chapter 91: Non-blocking Socket I/O with ESP-IDF
Chapter Objectives
After completing this chapter, you will be able to:
- Understand the difference between blocking and non-blocking socket operations.
- Configure a socket for non-blocking I/O.
- Use the select() system call to monitor multiple socket descriptors for readability, writability, and error conditions.
- Implement non-blocking TCP clients and servers.
- Handle common errors associated with non-blocking sockets, such as EWOULDBLOCK and EINPROGRESS.
- Appreciate the benefits of non-blocking I/O in creating responsive, single-threaded network applications.
- Understand how non-blocking I/O can be used to manage multiple connections efficiently.
Introduction
In network programming, responsiveness is often a key requirement. Traditional (blocking) socket calls can cause an application to hang if data is not immediately available or if a connection cannot be established instantly. This is particularly problematic in embedded systems like the ESP32, where a single unresponsive task can affect the entire system’s performance or even lead to watchdog timeouts.
Chapter 90, “Asynchronous Socket Programming,” introduced concepts where operations could proceed without waiting. Non-blocking socket I/O is a fundamental technique that underpins many asynchronous and event-driven programming models. By setting a socket to non-blocking mode, operations like connect(), accept(), send(), and recv() return immediately, even if they cannot complete their task at that moment. This allows the application to perform other work or manage multiple connections concurrently without resorting to a multi-threaded approach where each connection has its own thread (which can be resource-intensive).
This chapter focuses on using non-blocking sockets in conjunction with the select() system call. The select() call allows a program to monitor multiple file descriptors (including sockets) to see if I/O operations on any of them can be performed without blocking. This event-driven approach is crucial for building efficient and scalable network applications on resource-constrained devices like the ESP32.
Theory
Blocking vs. Non-blocking Sockets
By default, socket operations in most systems, including those using the Berkeley Sockets API (on which the socket layer of lwIP, the TCP/IP stack used in ESP-IDF, is based), are blocking.
- Blocking Sockets:
  - connect(): Blocks until the connection is established or an error occurs.
  - accept(): Blocks until an incoming connection is received.
  - send(): Blocks until all data is sent (or at least buffered by the TCP/IP stack).
  - recv(): Blocks until some data is received or the connection is closed.
  While a task is blocked in a recv() call for one client, it cannot service other clients or perform other duties.
- Non-blocking Sockets: When a socket is set to non-blocking mode:
  - connect(): Initiates the connection and returns immediately. If the connection is not yet established, it typically returns an error like EINPROGRESS. The application must then use a mechanism like select() to determine when the connection is complete.
  - accept(): Returns immediately. If no pending connections are present, it returns an error like EWOULDBLOCK or EAGAIN.
  - send(): Tries to send data. If the socket’s send buffer is full, it may send only part of the data or no data at all, returning the number of bytes sent or an error like EWOULDBLOCK or EAGAIN.
  - recv(): Tries to read data. If no data is available in the socket’s receive buffer, it returns an error like EWOULDBLOCK or EAGAIN.
Errors such as EWOULDBLOCK (or EAGAIN, which is often the same value) are not fatal in non-blocking mode. They simply indicate that the operation would have blocked if the socket were in blocking mode, and the application should try the operation again later.
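To make the distinction between "try again later" and a real failure concrete, here is a minimal sketch of a single receive attempt that classifies its outcome. The names try_recv and recv_status_t are hypothetical and not part of any library. The sketch uses the MSG_DONTWAIT flag to request non-blocking behavior for one call, and it is written against the plain POSIX headers; on ESP-IDF the same declarations are assumed to come from lwip/sockets.h, as in the examples later in this chapter.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Outcome of one non-blocking receive attempt (hypothetical type). */
typedef enum {
    RECV_DATA,        /* one or more bytes were read */
    RECV_WOULD_BLOCK, /* nothing available yet; retry later */
    RECV_CLOSED,      /* peer closed the connection */
    RECV_ERROR        /* genuine failure; errno holds the cause */
} recv_status_t;

/* Hypothetical helper: try to read once without blocking.
 * MSG_DONTWAIT applies to this call only, whatever the socket's mode. */
static recv_status_t try_recv(int sock, void *buf, size_t cap, size_t *out_len)
{
    ssize_t n = recv(sock, buf, cap, MSG_DONTWAIT);

    *out_len = 0;
    if (n > 0) {
        *out_len = (size_t)n;
        return RECV_DATA;
    }
    if (n == 0) {
        return RECV_CLOSED;
    }
    /* n < 0: decide whether this is "no data yet" or a real error */
    if (errno == EWOULDBLOCK || errno == EAGAIN) {
        return RECV_WOULD_BLOCK;
    }
    return RECV_ERROR;
}
```

A caller would typically treat RECV_WOULD_BLOCK as a signal to go do other work, and only close the socket on RECV_CLOSED or RECV_ERROR.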

Socket Operation | Blocking Mode Behavior | Non-Blocking Mode Behavior |
---|---|---|
connect() | Blocks task until connection established or error occurs. | Initiates connection, returns immediately. Typically returns -1 with errno = EINPROGRESS. Completion checked later (e.g., via select()). |
accept() | Blocks task until an incoming connection is received. | Returns immediately. If no pending connections, returns -1 with errno = EWOULDBLOCK or EAGAIN. |
send() | Blocks task if send buffer is full, until space is available or all data is buffered. | Tries to send data. If buffer is full, may send partial data or nothing. Returns bytes sent, or -1 with errno = EWOULDBLOCK or EAGAIN if no data could be queued. |
recv() | Blocks task if no data is available, until data arrives or connection is closed. | Tries to read data. If no data available, returns -1 with errno = EWOULDBLOCK or EAGAIN. |
Task Impact | Task execution halts, potentially making the application unresponsive to other events. | Task execution continues, allowing it to perform other work or manage multiple I/O operations. Requires mechanisms like select() for efficient event handling. |
Error Handling for EWOULDBLOCK/EAGAIN | N/A (Operation blocks instead of returning these errors). | These are not fatal errors. They indicate the operation would have blocked and should be retried when the socket is ready (e.g., as indicated by select()). |
Setting a Socket to Non-blocking Mode
A socket can be configured to be non-blocking using the fcntl() (file control) function. This is a standard POSIX function available through lwIP in ESP-IDF.
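#include <errno.h>      // errno
#include <string.h>     // strerror()
#include <sys/fcntl.h>  // fcntl(), F_GETFL, F_SETFL, O_NONBLOCK
#include "esp_log.h"    // ESP_LOGE()

// sock is an existing socket descriptor; TAG is the module's log tag
int flags = fcntl(sock, F_GETFL, 0);
if (flags < 0) {
    ESP_LOGE(TAG, "fcntl F_GETFL failed: %s", strerror(errno));
    // Handle error
} else if (fcntl(sock, F_SETFL, flags | O_NONBLOCK) < 0) {
    // Only reached when F_GETFL succeeded, so flags holds valid bits
    ESP_LOGE(TAG, "fcntl F_SETFL O_NONBLOCK failed: %s", strerror(errno));
    // Handle error
}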
- fcntl(sock, F_GETFL, 0): Retrieves the current file status flags for the socket.
- flags | O_NONBLOCK: Adds the O_NONBLOCK flag to the existing flags using a bitwise OR operation.
- fcntl(sock, F_SETFL, flags | O_NONBLOCK): Sets the modified flags back to the socket.
Once this is done, all subsequent operations on sock will be non-blocking.
Polling and the Need for select()
If you simply set a socket to non-blocking and then try to read from it in a loop, you might end up in a “busy-wait” loop:
// Inefficient busy-waiting (DO NOT DO THIS)
while (1) {
    ssize_t len = recv(sock, buffer, sizeof(buffer) - 1, 0);
    if (len > 0) {
        // Process data
    } else if (len == 0) {
        // Connection closed
        break;
    } else { // len < 0
        if (errno == EWOULDBLOCK || errno == EAGAIN) {
            // No data right now, try again immediately (bad!)
            // This will consume CPU cycles unnecessarily.
            // vTaskDelay(1); // A small delay helps but is still not ideal
            continue;
        } else {
            // Real error
            break;
        }
    }
}
This approach is highly inefficient as it continuously polls the socket, consuming CPU resources. A much better approach is to use a mechanism that notifies the application when a socket is ready for an I/O operation. This is where select() comes in.
graph LR
    subgraph "Polling Approach 2: Using select() (Efficient)"
        direction LR
        S_Loop[Event Loop] --> S_Prepare[Prepare fd_set<br>Add sock to readfds];
        S_Prepare --> S_Select{"Call select(nfds, &readfds, ..., &timeout)"};
        S_Select -- "Socket Ready (Readable)" --> S_Recv["Call non-blocking recv()<br>(Should not block significantly)"];
        S_Recv --> S_Process[Process Data];
        S_Select -- "Timeout / No Activity" --> S_OtherWork[Perform Other Tasks<br>or Sleep];
        S_OtherWork --> S_Loop;
        S_Process --> S_Loop;
        style S_Select fill:#DBEAFE,stroke:#2563EB,color:#1E40AF
        style S_OtherWork fill:#D1FAE5,stroke:#059669,color:#065F46
        style S_Loop fill:#FEF3C7,stroke:#D97706,color:#92400E
    end
    subgraph "Polling Approach 1: Busy-Waiting (Inefficient)"
        direction LR
        BW_Loop[Loop Continuously] --> BW_Recv{"Call non-blocking recv()"};
        BW_Recv -- "Data Received" --> BW_Process[Process Data];
        BW_Recv -- "EWOULDBLOCK" --> BW_Retry[Consume CPU,<br>Try Again Immediately];
        BW_Retry --> BW_Loop;
        BW_Process --> BW_Loop;
        style BW_Retry fill:#FEE2E2,stroke:#DC2626,color:#991B1B
        style BW_Loop fill:#FEF3C7,stroke:#D97706,color:#92400E
    end
    classDef primary fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6;
    classDef decision fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E;
    classDef process fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF;
    classDef check fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B;
    classDef success fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46;
    class BW_Recv decision;
    class BW_Process process;
    class S_Prepare process;
    class S_Recv process;
    class S_Process process;
The select() System Call
The select() system call allows a program to monitor multiple file descriptors (sockets, in our case) to see if any of them are “ready” for a particular class of I/O operation (e.g., reading, writing) or have an exceptional condition pending.
#include <sys/select.h>
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
Parameters:
- nfds: This argument should be set to the highest-numbered file descriptor in any of the three sets, plus 1.
- readfds: An optional pointer to an fd_set structure that, on input, specifies the file descriptors to be checked for readability. On successful return, it is modified to indicate which of these file descriptors are actually readable.
- writefds: An optional pointer to an fd_set that, on input, specifies the file descriptors to be checked for writability. On successful return, it is modified to indicate which of these file descriptors are actually writable.
- exceptfds: An optional pointer to an fd_set that, on input, specifies the file descriptors to be checked for exceptional conditions (e.g., out-of-band data). On successful return, it is modified to indicate which of these file descriptors have exceptional conditions.
- timeout: An optional pointer to a struct timeval that specifies the maximum interval select() should block waiting for a file descriptor to become ready.
  - If timeout is NULL, select() blocks indefinitely.
  - If timeout points to a struct timeval with zero values (tv_sec = 0, tv_usec = 0), select() returns immediately (polling).
  - Otherwise, select() blocks for the specified duration.
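The three timeout modes can be captured in one small conversion routine. This is an illustrative sketch; select_timeout_from_ms is a hypothetical name, and the convention that a negative value means "wait forever" is an assumption made for the example, not something select() itself defines.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper mapping a millisecond timeout to the pointer
 * that select() expects as its last argument:
 *   timeout_ms <  0 -> NULL          (block indefinitely)
 *   timeout_ms == 0 -> {0, 0}        (poll and return immediately)
 *   timeout_ms >  0 -> {sec, usec}   (block for at most that long)
 * The caller provides the storage so the result stays valid. */
static struct timeval *select_timeout_from_ms(int timeout_ms,
                                              struct timeval *storage)
{
    if (timeout_ms < 0) {
        return NULL;
    }
    storage->tv_sec = timeout_ms / 1000;
    storage->tv_usec = (timeout_ms % 1000) * 1000;
    return storage;
}
```

For instance, passing 2500 fills the structure with tv_sec = 2 and tv_usec = 500000.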
Return Value:
- On success, select() returns the total number of file descriptors that are ready across all the sets.
- Returns 0 if the timeout expired before any file descriptors became ready.
- Returns -1 on error, with errno set appropriately.
fd_set and Associated Macros:
An fd_set is a data structure that can hold a set of file descriptors. Several macros are used to manipulate these sets:
- FD_ZERO(fd_set *set): Initializes the set to be empty. This must be called before using an fd_set for the first time and typically before each call to select(), as select() modifies the sets.
- FD_SET(int fd, fd_set *set): Adds the file descriptor fd to the set.
- FD_CLR(int fd, fd_set *set): Removes the file descriptor fd from the set.
- FD_ISSET(int fd, fd_set *set): Returns a non-zero value if fd is a member of the set, and zero otherwise. This is used after select() returns to check which file descriptors are ready.
graph LR
    subgraph "Application Task"
        AppTask["Application Task<br>(Single Thread)"]
    end
    subgraph "Socket Descriptors (fd_set)"
        ReadSet["<b>readfds</b><br>(Sockets to check for readability)"]
        WriteSet["<b>writefds</b><br>(Sockets to check for writability)"]
        ExceptSet["<b>exceptfds</b><br>(Sockets to check for errors)"]
    end
    subgraph "Monitored Sockets"
        Sock1["Socket 1 (e.g., Listening)"]
        Sock2["Socket 2 (e.g., Client A)"]
        Sock3["Socket 3 (e.g., Client B)"]
        SockN["Socket N ..."]
    end
    AppTask -- "Prepares fd_sets" --> ReadSet;
    AppTask -- "Prepares fd_sets" --> WriteSet;
    AppTask -- "Prepares fd_sets" --> ExceptSet;
    ReadSet -.-> Sock1;
    ReadSet -.-> Sock2;
    WriteSet -.-> Sock2;
    ReadSet -.-> Sock3;
    ExceptSet -.-> SockN;
    SelectCall["<b>select(nfds, &readfds, &writefds, &exceptfds, &timeout)</b><br>Task blocks here (efficiently)"]
    AppTask --> SelectCall;
    SelectCall -- "Returns > 0<br>(One or more sockets ready)" --> AppTaskReady["Task Unblocks:<br>Checks modified fd_sets (FD_ISSET)<br>Handles I/O on ready sockets"];
    SelectCall -- "Returns 0<br>(Timeout expired)" --> AppTaskTimeout["Task Unblocks:<br>Performs periodic tasks"];
    SelectCall -- "Returns -1<br>(Error occurred)" --> AppTaskError["Task Unblocks:<br>Handles select() error (checks errno)"];
    AppTaskReady --> AppTask;
    AppTaskTimeout --> AppTask;
    AppTaskError --> AppTask;
    classDef primary fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6;
    classDef decision fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E;
    classDef process fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF;
    classDef check fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B;
    classDef success fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46;
    classDef set fill:#FFF9C4,stroke:#FBC02D,stroke-width:1px,color:#795548;
    class AppTask primary;
    class ReadSet,WriteSet,ExceptSet set;
    class Sock1,Sock2,Sock3,SockN process;
    class SelectCall check;
    class AppTaskReady success;
    class AppTaskTimeout decision;
    class AppTaskError check;
Typical Workflow with select():
1. Create Sockets: Create and configure all sockets you want to monitor. Set them to non-blocking mode if you intend to perform non-blocking operations after select() indicates readiness.
2. Initialize fd_sets: Call FD_ZERO() for readfds, writefds, and exceptfds.
3. Add Sockets to fd_sets: Use FD_SET() to add the file descriptors of interest to the appropriate sets. For example, a listening socket would be added to readfds to check for incoming connections. A connected socket might be added to readfds to check for incoming data, or to writefds if you want to send data and need to know when the send buffer has space.
4. Determine nfds: Find the highest socket descriptor value among all sockets added to the sets and add 1 to it.
5. Set Timeout: Configure the struct timeval for the desired timeout behavior.
6. Call select(): Invoke select() with the prepared sets, nfds, and timeout.
7. Check Return Value:
   - If > 0: One or more sockets are ready.
   - If = 0: Timeout occurred.
   - If < 0: An error occurred (check errno).
8. Check fd_sets: If select() returned > 0, use FD_ISSET() to iterate through your original list of sockets and determine which ones are ready in each set (readfds, writefds, exceptfds).
9. Perform I/O: For each ready socket, perform the corresponding non-blocking operation (e.g., accept(), recv(), send()). Since select() indicated readiness, these operations should not block, but you still need to handle short counts or EWOULDBLOCK if the condition changes rapidly or if only a small amount of data/buffer space is available.
10. Loop: Go back to step 2 (or 3 if the set of monitored sockets doesn’t change often, but remember select() modifies the sets passed to it, so they usually need to be re-populated).
graph TD
    Start[Start: Initialize Sockets<br>Set to Non-Blocking] --> Loop{Main Event Loop};
    Loop --> PrepFDs["1- Initialize & Populate fd_sets<br>FD_ZERO(all_sets)<br>FD_SET(relevant_sockets, specific_set)"];
    PrepFDs --> MaxFD["2- Determine nfds<br>(max_socket_fd + 1)"];
    MaxFD --> SetTimeout["3- Configure struct timeval (timeout)"];
    SetTimeout --> CallSelect["4- Call select(nfds, &readfds, &writefds, &exceptfds, &timeout)"];
    CallSelect --> CheckReturn{"5- Check select() Return Value"};
    CheckReturn -- "> 0 (Sockets Ready)" --> CheckSets["6- Check fd_sets (FD_ISSET)<br>Iterate through monitored sockets"];
    CheckSets --> HandleIO["7- Perform Non-Blocking I/O<br>accept(), recv(), send()<br>Handle results (data, close, EWOULDBLOCK)"];
    HandleIO --> Loop;
    CheckReturn -- "= 0 (Timeout)" --> HandleTimeout["Perform Periodic/Timeout Tasks"];
    HandleTimeout --> Loop;
    CheckReturn -- "< 0 (Error)" --> HandleError["Handle select() Error (check errno, e.g., EINTR)"];
    HandleError --> Loop;
    Loop -- "Shutdown Condition" --> End[End: Clean up sockets, Exit Loop];
    classDef primary fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6;
    classDef decision fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E;
    classDef process fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF;
    classDef check fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B;
    classDef success fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46;
    class Start primary;
    class Loop decision;
    class PrepFDs,MaxFD,SetTimeout,CheckSets,HandleIO,HandleTimeout,HandleError process;
    class CallSelect check;
    class CheckReturn decision;
    class End success;
Non-blocking connect()
When connect() is called on a non-blocking socket, it usually returns -1 immediately with errno set to EINPROGRESS. This means the TCP handshake has started but is not yet complete. To determine when the connection is established or has failed:
- Add the socket descriptor to the writefds set for select(). A successful connection is indicated by the socket becoming writable.
- Optionally, add the socket descriptor to the readfds set as well. Some systems might indicate a connection error by making the socket readable (and a subsequent read would return the error).
- Call select().
- If select() indicates the socket is writable:
  - The connection might be successful. To confirm, you can use getsockopt() with SO_ERROR to retrieve any pending error on the socket. If the retrieved error value is 0, the connection is established.
  - If the retrieved error value is non-zero (e.g., ECONNREFUSED), the connection failed.
- If select() indicates an error (e.g., in exceptfds, or readfds for some systems when an error occurs), the connection likely failed. Again, use getsockopt(sock, SOL_SOCKET, SO_ERROR, ...) to get the specific error.
graph TD
    A["Start: Create Socket<br>Set to Non-Blocking (fcntl)"] --> B["Call connect(sock, server_addr, ...)"];
    B --> C{"connect() returns -1 AND<br>errno == EINPROGRESS?"};
    C -- "Yes (Connection Initiated)" --> D["Add sock to <b>writefds</b> for select()<br>(Optionally readfds/exceptfds for errors on some systems)"];
    D --> E["Call select(nfds, ..., &writefds, ..., &timeout)"];
    E --> F{"select() indicates sock is Writable?"};
    F -- "Yes" --> G["Connection attempt completed.<br>Verify actual outcome."];
    G --> H["Call getsockopt(sock, SOL_SOCKET, SO_ERROR, &error_val, &len)"];
    H --> I{error_val == 0?};
    I -- "Yes" --> J["<b>Connection Successful!</b><br>Socket is ready for send/recv."];
    I -- "No (error_val != 0)" --> K["<b>Connection Failed.</b><br>Error: strerror(error_val)<br>Close socket."];
    F -- "No (e.g., select() Timeout)" --> L["Handle select() Timeout or Error.<br>Connection attempt may have failed or still pending.<br>Consider retry or abort."];
    C -- "No (Immediate Result)" --> M{"connect() returned 0?"};
    M -- "Yes (Immediate Success - Rare)" --> J;
    M -- "No (Immediate Failure, errno != EINPROGRESS)" --> N["<b>Connection Failed Immediately.</b><br>Error: strerror(errno)<br>Close socket."];
    classDef primary fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6;
    classDef decision fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E;
    classDef process fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF;
    classDef success fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46;
    classDef error fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B;
    class A primary;
    class B,D,E,G,H process;
    class C,F,I,M decision;
    class J success;
    class K,L,N error;
Integration with FreeRTOS
In an ESP-IDF FreeRTOS environment, a common pattern is to have a dedicated task that manages network connections. This task would contain the main loop that calls select() and dispatches events.
- Single Task for Multiple Connections: One FreeRTOS task can manage many non-blocking sockets using select(). This is much more memory-efficient than creating a separate task for each connection.
- Timeout for select(): The timeout value for select() can be used to allow the network task to periodically perform other housekeeping duties or check for signals from other tasks (e.g., via a queue or event group) if no socket activity occurs.
- Yielding: If select() has a non-zero timeout, the calling task will block (efficiently, without consuming CPU) until either a socket becomes ready or the timeout expires. This allows other FreeRTOS tasks to run.
Practical Examples
Before running these examples, ensure your ESP32 is configured with Wi-Fi credentials (e.g., via idf.py menuconfig under Example Connection Configuration).
Example 1: Non-blocking TCP Client
This example demonstrates a TCP client that connects to a server (e.g., tcpbin.com on port 4242 or a local Python server) in a non-blocking manner, sends a message, and reads the response.
Project Setup:
- Create a new ESP-IDF project: idf.py create-project non_blocking_client
- Navigate into the project: cd non_blocking_client
- Replace the content of main/main.c with the code below.
- Ensure your sdkconfig has Wi-Fi enabled and configured.
main/main.c:
#include <stdio.h>
#include <string.h>
#include <sys/fcntl.h>
#include <sys/errno.h>
#include <sys/param.h> // For MIN/MAX
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "freertos/event_groups.h"
#include "esp_system.h"
#include "esp_wifi.h"
#include "esp_event.h"
#include "esp_log.h"
#include "nvs_flash.h"
#include "lwip/err.h"
#include "lwip/sockets.h"
#include "lwip/sys.h"
#include <lwip/netdb.h>
#define WIFI_SSID CONFIG_EXAMPLE_WIFI_SSID
#define WIFI_PASS CONFIG_EXAMPLE_WIFI_PASSWORD
#define MAX_FAILURES 10
// Server details - use a public echo server or your own
#define SERVER_HOST "tcpbin.com"
#define SERVER_PORT "4242"
#define MESSAGE "Hello from ESP32 (non-blocking)!\n"
static const char *TAG = "nb_client";
/* FreeRTOS event group to signal when we are connected*/
static EventGroupHandle_t s_wifi_event_group;
/* The event group allows multiple bits for each event, but we only care about two events:
* - we are connected to the AP with an IP
* - we failed to connect after the maximum amount of retries */
#define WIFI_CONNECTED_BIT BIT0
#define WIFI_FAIL_BIT BIT1
static int s_retry_num = 0;
static void event_handler(void* arg, esp_event_base_t event_base,
                          int32_t event_id, void* event_data)
{
    if (event_base == WIFI_EVENT && event_id == WIFI_EVENT_STA_START) {
        esp_wifi_connect();
    } else if (event_base == WIFI_EVENT && event_id == WIFI_EVENT_STA_DISCONNECTED) {
        if (s_retry_num < MAX_FAILURES) {
            esp_wifi_connect();
            s_retry_num++;
            ESP_LOGI(TAG, "retry to connect to the AP");
        } else {
            xEventGroupSetBits(s_wifi_event_group, WIFI_FAIL_BIT);
        }
        ESP_LOGI(TAG,"connect to the AP fail");
    } else if (event_base == IP_EVENT && event_id == IP_EVENT_STA_GOT_IP) {
        ip_event_got_ip_t* event = (ip_event_got_ip_t*) event_data;
        ESP_LOGI(TAG, "got ip:" IPSTR, IP2STR(&event->ip_info.ip));
        s_retry_num = 0;
        xEventGroupSetBits(s_wifi_event_group, WIFI_CONNECTED_BIT);
    }
}
void wifi_init_sta(void)
{
    s_wifi_event_group = xEventGroupCreate();
    ESP_ERROR_CHECK(esp_netif_init());
    ESP_ERROR_CHECK(esp_event_loop_create_default());
    esp_netif_create_default_wifi_sta();

    wifi_init_config_t cfg = WIFI_INIT_CONFIG_DEFAULT();
    ESP_ERROR_CHECK(esp_wifi_init(&cfg));

    esp_event_handler_instance_t instance_any_id;
    esp_event_handler_instance_t instance_got_ip;
    ESP_ERROR_CHECK(esp_event_handler_instance_register(WIFI_EVENT,
                                                        ESP_EVENT_ANY_ID,
                                                        &event_handler,
                                                        NULL,
                                                        &instance_any_id));
    ESP_ERROR_CHECK(esp_event_handler_instance_register(IP_EVENT,
                                                        IP_EVENT_STA_GOT_IP,
                                                        &event_handler,
                                                        NULL,
                                                        &instance_got_ip));

    wifi_config_t wifi_config = {
        .sta = {
            .ssid = WIFI_SSID,
            .password = WIFI_PASS,
            .threshold.authmode = WIFI_AUTH_WPA2_PSK, // Adjust if needed
        },
    };
    ESP_ERROR_CHECK(esp_wifi_set_mode(WIFI_MODE_STA));
    ESP_ERROR_CHECK(esp_wifi_set_config(WIFI_IF_STA, &wifi_config));
    ESP_ERROR_CHECK(esp_wifi_start());
    ESP_LOGI(TAG, "wifi_init_sta finished.");

    /* Waiting until either the connection is established (WIFI_CONNECTED_BIT) or connection failed for the maximum
     * number of re-tries (WIFI_FAIL_BIT). The bits are set by event_handler() (see above) */
    EventBits_t bits = xEventGroupWaitBits(s_wifi_event_group,
                                           WIFI_CONNECTED_BIT | WIFI_FAIL_BIT,
                                           pdFALSE,
                                           pdFALSE,
                                           portMAX_DELAY);
    if (bits & WIFI_CONNECTED_BIT) {
        ESP_LOGI(TAG, "connected to ap SSID:%s", WIFI_SSID);
    } else if (bits & WIFI_FAIL_BIT) {
        ESP_LOGI(TAG, "Failed to connect to SSID:%s, password:%s", WIFI_SSID, WIFI_PASS);
    } else {
        ESP_LOGE(TAG, "UNEXPECTED EVENT");
    }
}
void non_blocking_tcp_client_task(void *pvParameters)
{
    char rx_buffer[128];
    int sock = -1;

    // Wait for Wi-Fi connection
    xEventGroupWaitBits(s_wifi_event_group, WIFI_CONNECTED_BIT, pdFALSE, pdTRUE, portMAX_DELAY);
    ESP_LOGI(TAG, "WiFi Connected. Starting TCP client...");

    while (1) { // Main loop for retrying connection
        const struct addrinfo hints = {
            .ai_family = AF_INET, // Use AF_INET6 for IPv6
            .ai_socktype = SOCK_STREAM,
        };
        struct addrinfo *res = NULL;
        int err;

        ESP_LOGI(TAG, "DNS lookup for host %s", SERVER_HOST);
        err = getaddrinfo(SERVER_HOST, SERVER_PORT, &hints, &res);
        if (err != 0 || res == NULL) {
            ESP_LOGE(TAG, "DNS lookup failed err=%d res=%p", err, res);
            vTaskDelay(pdMS_TO_TICKS(1000));
            continue;
        }

        // LWIP_SO_RCVTIMEO: sets timeout for socket receiving.
        // If this option is enabled, the function `recv` will not block forever,
        // instead it will return if timeout.
        // This is another way to prevent blocking, but select() is more versatile for multiple sockets.
        // For this example, we focus on O_NONBLOCK and select().
        // struct timeval receiving_timeout;
        // receiving_timeout.tv_sec = 5;
        // receiving_timeout.tv_usec = 0;

        sock = socket(res->ai_family, res->ai_socktype, 0);
        if (sock < 0) {
            ESP_LOGE(TAG, "Failed to create socket. errno: %d", errno);
            freeaddrinfo(res);
            vTaskDelay(pdMS_TO_TICKS(1000));
            continue;
        }
        ESP_LOGI(TAG, "Socket created");

        // Set socket to non-blocking
        int flags = fcntl(sock, F_GETFL, 0);
        if (flags < 0) {
            ESP_LOGE(TAG, "fcntl(F_GETFL) failed. errno: %d", errno);
            freeaddrinfo(res); // res is still allocated on this path
            goto error_cleanup;
        }
        if (fcntl(sock, F_SETFL, flags | O_NONBLOCK) < 0) {
            ESP_LOGE(TAG, "fcntl(F_SETFL, O_NONBLOCK) failed. errno: %d", errno);
            freeaddrinfo(res); // res is still allocated on this path
            goto error_cleanup;
        }
        ESP_LOGI(TAG, "Socket set to non-blocking mode");

        ESP_LOGI(TAG, "Attempting to connect to %s:%s...", SERVER_HOST, SERVER_PORT);
        err = connect(sock, res->ai_addr, res->ai_addrlen);
        int connect_errno = errno; // Save before freeaddrinfo() can disturb errno
        freeaddrinfo(res); // Free addrinfo after connect is initiated

        if (err < 0) {
            if (connect_errno == EINPROGRESS) {
                ESP_LOGI(TAG, "Connection in progress (EINPROGRESS)...");
                fd_set wfds;
                struct timeval tv;
                FD_ZERO(&wfds);
                FD_SET(sock, &wfds);
                tv.tv_sec = 5; // 5 second timeout for connection
                tv.tv_usec = 0;

                // Wait for the socket to become writable (connection established) or timeout
                int select_err = select(sock + 1, NULL, &wfds, NULL, &tv);
                if (select_err < 0) {
                    ESP_LOGE(TAG, "select() error. errno: %d", errno);
                    goto error_cleanup;
                } else if (select_err == 0) {
                    ESP_LOGE(TAG, "select() timeout: Connection timed out.");
                    goto error_cleanup;
                } else { // select_err > 0
                    if (FD_ISSET(sock, &wfds)) {
                        int sockopt_err = 0;
                        socklen_t optlen = sizeof(sockopt_err);
                        // Writable only means the attempt finished; SO_ERROR holds the result
                        if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &sockopt_err, &optlen) < 0) {
                            ESP_LOGE(TAG, "getsockopt(SO_ERROR) failed. errno: %d", errno);
                            goto error_cleanup;
                        }
                        if (sockopt_err == 0) {
                            ESP_LOGI(TAG, "Connection established!");
                        } else {
                            ESP_LOGE(TAG, "Connection failed with SO_ERROR: %d (%s)", sockopt_err, strerror(sockopt_err));
                            goto error_cleanup;
                        }
                    } else {
                        ESP_LOGE(TAG, "select() returned but socket not in writefds? Should not happen.");
                        goto error_cleanup;
                    }
                }
            } else { // Other connect error
                ESP_LOGE(TAG, "Socket connect failed. errno: %d (%s)", connect_errno, strerror(connect_errno));
                goto error_cleanup;
            }
        } else { // err == 0, connect returned immediately (rare for non-blocking)
            ESP_LOGI(TAG, "Connection established immediately!");
        }

        // Socket is connected and non-blocking
        // Send data
        ESP_LOGI(TAG, "Sending message: %s", MESSAGE);
        int written = send(sock, MESSAGE, strlen(MESSAGE), 0);
        if (written < 0) {
            if (errno == EWOULDBLOCK || errno == EAGAIN) {
                ESP_LOGW(TAG, "Send would block, try later or use select for writability");
                // For this example, we'll just log and proceed to receive attempt
                // A robust client would use select() on writefds before retrying send.
            } else {
                ESP_LOGE(TAG, "Error occurred during sending. errno: %d", errno);
                goto error_cleanup;
            }
        } else {
            // Note: written may be less than strlen(MESSAGE) (short count)
            ESP_LOGI(TAG, "Sent %d bytes", written);
        }

        // Receive data
        ESP_LOGI(TAG, "Waiting to receive data...");
        struct timeval tv_recv;
        tv_recv.tv_sec = 5; // Timeout for receiving data
        tv_recv.tv_usec = 0;
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(sock, &rfds);

        int select_recv_err = select(sock + 1, &rfds, NULL, NULL, &tv_recv);
        if (select_recv_err < 0) {
            ESP_LOGE(TAG, "select() for receive failed. errno: %d", errno);
            goto error_cleanup;
        } else if (select_recv_err == 0) {
            ESP_LOGW(TAG, "Receive timeout (no data received within 5s).");
            // This is not necessarily an error, server might not send anything back
        } else { // select_recv_err > 0
            if (FD_ISSET(sock, &rfds)) {
                ESP_LOGI(TAG, "Socket is readable. Receiving data...");
                ssize_t len = recv(sock, rx_buffer, sizeof(rx_buffer) - 1, 0);
                if (len < 0) {
                    if (errno == EWOULDBLOCK || errno == EAGAIN) {
                        ESP_LOGW(TAG, "Recv would block, even after select. This can happen if data arrives and is consumed by stack between select and recv.");
                    } else {
                        ESP_LOGE(TAG, "recv failed. errno: %d", errno);
                        goto error_cleanup;
                    }
                } else if (len == 0) {
                    ESP_LOGI(TAG, "Connection closed by server.");
                    goto error_cleanup;
                } else {
                    rx_buffer[len] = 0; // Null-terminate whatever we received
                    ESP_LOGI(TAG, "Received %d bytes: '%s'", (int)len, rx_buffer);
                }
            }
        }

error_cleanup:
        // Reached on both the success path and every failure path
        if (sock != -1) {
            ESP_LOGI(TAG, "Shutting down socket and closing connection...");
            shutdown(sock, SHUT_RDWR); // Graceful shutdown
            close(sock);
            sock = -1;
        }
        ESP_LOGI(TAG, "Client task finished or error. Restarting in 10 seconds...");
        vTaskDelay(pdMS_TO_TICKS(10000)); // Wait before retrying
    } // end while(1)

    vTaskDelete(NULL); // Not reached: the loop above never exits
}
void app_main(void)
{
    // Initialize NVS
    esp_err_t ret = nvs_flash_init();
    if (ret == ESP_ERR_NVS_NO_FREE_PAGES || ret == ESP_ERR_NVS_NEW_VERSION_FOUND) {
        ESP_ERROR_CHECK(nvs_flash_erase());
        ret = nvs_flash_init();
    }
    ESP_ERROR_CHECK(ret);

    ESP_LOGI(TAG, "ESP_WIFI_MODE_STA");
    wifi_init_sta();

    xTaskCreate(non_blocking_tcp_client_task, "nb_tcp_client", 4096, NULL, 5, NULL);
}
CMakeLists.txt (in main directory):
Ensure it contains:
idf_component_register(SRCS "main.c"
INCLUDE_DIRS ".")
Build and Flash Instructions:
- Set your target ESP32 variant: idf.py set-target esp32 (or esp32s2, esp32s3, esp32c3, etc.)
- Configure Wi-Fi: idf.py menuconfig -> Example Connection Configuration -> Enter SSID and Password. Save and exit.
- Build: idf.py build
- Flash: idf.py -p /dev/ttyUSB0 flash (Replace /dev/ttyUSB0 with your ESP32’s serial port)
- Monitor: idf.py -p /dev/ttyUSB0 monitor
Observe:
The client will attempt to connect to tcpbin.com:4242. It will log the non-blocking connection process. If successful, it sends the message and attempts to read a response (tcpbin.com echoes the message back).
Tip: You can set up a simple Python echo server locally for testing:
# simple_echo_server.py
import socket

HOST = '0.0.0.0'  # Listen on all available interfaces
PORT = 12345

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind((HOST, PORT))
    s.listen()
    print(f"Listening on {HOST}:{PORT}")
    while True:  # Keep accepting: the ESP32 client reconnects every 10 seconds
        conn, addr = s.accept()
        with conn:
            print(f"Connected by {addr}")
            while True:
                data = conn.recv(1024)
                if not data:
                    break
                print(f"Received: {data.decode().strip()}, sending back.")
                conn.sendall(data)
Then change SERVER_HOST to your computer’s IP address and SERVER_PORT to "12345" in the C code (SERVER_PORT is a string there because it is passed to getaddrinfo()).
Example 2: Non-blocking TCP Server (Handling Multiple Clients)
This example demonstrates a TCP server that listens for incoming connections on a specific port. It uses select() to manage the listening socket and multiple connected client sockets in a non-blocking manner. The server echoes back any data received from clients.
Project Setup:
- Create a new ESP-IDF project: idf.py create-project non_blocking_server
- Navigate into the project: cd non_blocking_server
- Replace the content of main/main.c with the code below.
- Ensure your sdkconfig has Wi-Fi enabled and configured.
main/main.c:
#include <stdio.h>
#include <string.h>
#include <sys/fcntl.h>
#include <sys/errno.h>
#include <sys/param.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "freertos/event_groups.h"
#include "esp_system.h"
#include "esp_wifi.h"
#include "esp_event.h"
#include "esp_log.h"
#include "nvs_flash.h"
#include "lwip/err.h"
#include "lwip/sockets.h"
#include "lwip/sys.h"
#include <lwip/netdb.h>
#define WIFI_SSID CONFIG_EXAMPLE_WIFI_SSID
#define WIFI_PASS CONFIG_EXAMPLE_WIFI_PASSWORD
#define MAX_FAILURES 10
#define SERVER_PORT 12345
#define MAX_CLIENTS 5 // Maximum number of concurrent clients
static const char *TAG = "nb_server";
static EventGroupHandle_t s_wifi_event_group;
#define WIFI_CONNECTED_BIT BIT0
#define WIFI_FAIL_BIT BIT1
static int s_retry_num = 0;
// Client socket tracking
static int client_sockets[MAX_CLIENTS];
static int num_clients = 0;
// Wi-Fi event handler (same as client example)
static void event_handler(void* arg, esp_event_base_t event_base,
int32_t event_id, void* event_data)
{
if (event_base == WIFI_EVENT && event_id == WIFI_EVENT_STA_START) {
esp_wifi_connect();
} else if (event_base == WIFI_EVENT && event_id == WIFI_EVENT_STA_DISCONNECTED) {
if (s_retry_num < MAX_FAILURES) {
esp_wifi_connect();
s_retry_num++;
ESP_LOGI(TAG, "retry to connect to the AP");
} else {
xEventGroupSetBits(s_wifi_event_group, WIFI_FAIL_BIT);
}
ESP_LOGI(TAG,"connect to the AP fail");
} else if (event_base == IP_EVENT && event_id == IP_EVENT_STA_GOT_IP) {
ip_event_got_ip_t* event = (ip_event_got_ip_t*) event_data;
ESP_LOGI(TAG, "got ip:" IPSTR, IP2STR(&event->ip_info.ip));
s_retry_num = 0;
xEventGroupSetBits(s_wifi_event_group, WIFI_CONNECTED_BIT);
}
}
// Wi-Fi init (same as client example)
void wifi_init_sta(void)
{
s_wifi_event_group = xEventGroupCreate();
ESP_ERROR_CHECK(esp_netif_init());
ESP_ERROR_CHECK(esp_event_loop_create_default());
esp_netif_create_default_wifi_sta();
wifi_init_config_t cfg = WIFI_INIT_CONFIG_DEFAULT();
ESP_ERROR_CHECK(esp_wifi_init(&cfg));
esp_event_handler_instance_t instance_any_id;
esp_event_handler_instance_t instance_got_ip;
ESP_ERROR_CHECK(esp_event_handler_instance_register(WIFI_EVENT, ESP_EVENT_ANY_ID, &event_handler, NULL, &instance_any_id));
ESP_ERROR_CHECK(esp_event_handler_instance_register(IP_EVENT, IP_EVENT_STA_GOT_IP, &event_handler, NULL, &instance_got_ip));
wifi_config_t wifi_config = {
.sta = { .ssid = WIFI_SSID, .password = WIFI_PASS, .threshold.authmode = WIFI_AUTH_WPA2_PSK },
};
ESP_ERROR_CHECK(esp_wifi_set_mode(WIFI_MODE_STA) );
ESP_ERROR_CHECK(esp_wifi_set_config(WIFI_IF_STA, &wifi_config) );
ESP_ERROR_CHECK(esp_wifi_start() );
ESP_LOGI(TAG, "wifi_init_sta finished.");
EventBits_t bits = xEventGroupWaitBits(s_wifi_event_group, WIFI_CONNECTED_BIT | WIFI_FAIL_BIT, pdFALSE, pdFALSE, portMAX_DELAY);
if (bits & WIFI_CONNECTED_BIT) {
ESP_LOGI(TAG, "connected to ap SSID:%s", WIFI_SSID);
} else if (bits & WIFI_FAIL_BIT) {
ESP_LOGI(TAG, "Failed to connect to SSID:%s", WIFI_SSID);
} else { ESP_LOGE(TAG, "UNEXPECTED EVENT"); }
}
// Helper to set a socket to non-blocking
static int make_socket_non_blocking(int sock_fd) {
int flags = fcntl(sock_fd, F_GETFL, 0);
if (flags < 0) {
ESP_LOGE(TAG, "fcntl(F_GETFL) failed for fd %d: %s", sock_fd, strerror(errno));
return -1;
}
if (fcntl(sock_fd, F_SETFL, flags | O_NONBLOCK) < 0) {
ESP_LOGE(TAG, "fcntl(F_SETFL, O_NONBLOCK) failed for fd %d: %s", sock_fd, strerror(errno));
return -1;
}
return 0;
}
// Helper to add client socket
static void add_client_socket(int sock_fd) {
if (num_clients < MAX_CLIENTS) {
client_sockets[num_clients++] = sock_fd;
ESP_LOGI(TAG, "Client connected, socket %d. Total clients: %d", sock_fd, num_clients);
} else {
ESP_LOGW(TAG, "Max clients reached. Rejecting new connection.");
close(sock_fd);
}
}
// Helper to remove client socket
static void remove_client_socket(int sock_fd) {
int i;
for (i = 0; i < num_clients; i++) {
if (client_sockets[i] == sock_fd) {
ESP_LOGI(TAG, "Client disconnected, socket %d.", sock_fd);
close(sock_fd); // Ensure socket is closed
// Shift remaining sockets
for (int j = i; j < num_clients - 1; j++) {
client_sockets[j] = client_sockets[j + 1];
}
num_clients--;
break;
}
}
}
void non_blocking_tcp_server_task(void *pvParameters)
{
char rx_buffer[128];
int listen_sock = -1;
struct sockaddr_in server_addr, client_addr;
socklen_t client_addr_len = sizeof(client_addr);
// Initialize client sockets array
for(int i=0; i<MAX_CLIENTS; ++i) client_sockets[i] = -1;
// Wait for Wi-Fi connection
xEventGroupWaitBits(s_wifi_event_group, WIFI_CONNECTED_BIT, pdFALSE, pdTRUE, portMAX_DELAY);
ESP_LOGI(TAG, "WiFi Connected. Starting TCP server on port %d...", SERVER_PORT);
listen_sock = socket(AF_INET, SOCK_STREAM, 0);
if (listen_sock < 0) {
ESP_LOGE(TAG, "Failed to create listening socket. errno: %s", strerror(errno));
vTaskDelete(NULL);
return;
}
if (make_socket_non_blocking(listen_sock) != 0) {
close(listen_sock);
vTaskDelete(NULL);
return;
}
memset(&server_addr, 0, sizeof(server_addr));
server_addr.sin_family = AF_INET;
server_addr.sin_addr.s_addr = htonl(INADDR_ANY);
server_addr.sin_port = htons(SERVER_PORT);
if (bind(listen_sock, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) {
ESP_LOGE(TAG, "Bind failed. errno: %s", strerror(errno));
close(listen_sock);
vTaskDelete(NULL);
return;
}
if (listen(listen_sock, 5) < 0) { // Backlog of 5
ESP_LOGE(TAG, "Listen failed. errno: %s", strerror(errno));
close(listen_sock);
vTaskDelete(NULL);
return;
}
ESP_LOGI(TAG, "Server listening on port %d", SERVER_PORT);
fd_set readfds;
struct timeval tv;
while (1) {
FD_ZERO(&readfds);
FD_SET(listen_sock, &readfds); // Add listening socket to set
int max_sd = listen_sock;
// Add client sockets to set
for (int i = 0; i < num_clients; i++) {
int sd = client_sockets[i];
if (sd > 0) {
FD_SET(sd, &readfds);
}
if (sd > max_sd) {
max_sd = sd;
}
}
tv.tv_sec = 5; // Timeout for select (e.g., 5 seconds)
tv.tv_usec = 0;
int activity = select(max_sd + 1, &readfds, NULL, NULL, &tv);
if (activity < 0) {
if (errno == EINTR) {
// Interrupted before any descriptor became ready; readfds is not valid, so retry
continue;
}
ESP_LOGE(TAG, "select error: %s", strerror(errno));
// Potentially break or attempt recovery
vTaskDelay(pdMS_TO_TICKS(100)); // Avoid busy loop on persistent error
continue;
}
if (activity == 0) {
// ESP_LOGD(TAG, "select() timeout. No activity.");
// This is normal, can be used for periodic tasks
continue;
}
// Check for incoming connection on listening socket
if (FD_ISSET(listen_sock, &readfds)) {
client_addr_len = sizeof(client_addr); // Value-result argument: reset before every accept()
int new_socket = accept(listen_sock, (struct sockaddr *)&client_addr, &client_addr_len);
if (new_socket < 0) {
if (errno == EWOULDBLOCK || errno == EAGAIN) {
// This shouldn't happen if select indicated readability for accept,
// but handle defensively.
ESP_LOGW(TAG, "accept would block, errno: %s", strerror(errno));
} else {
ESP_LOGE(TAG, "accept failed: %s", strerror(errno));
}
} else {
ESP_LOGI(TAG, "New connection, socket fd is %d, ip is: %s, port: %d",
new_socket, inet_ntoa(client_addr.sin_addr), ntohs(client_addr.sin_port));
if (make_socket_non_blocking(new_socket) != 0) {
close(new_socket);
} else {
add_client_socket(new_socket);
}
}
}
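// Check for I/O on client sockets.
// Iterate backwards: remove_client_socket() shifts later entries down,
// so a forward loop would skip the client that moves into slot i.
for (int i = num_clients - 1; i >= 0; i--) {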
int sd = client_sockets[i];
if (sd > 0 && FD_ISSET(sd, &readfds)) {
ssize_t len = recv(sd, rx_buffer, sizeof(rx_buffer) - 1, 0);
if (len > 0) {
rx_buffer[len] = 0;
ESP_LOGI(TAG, "Received %d bytes from socket %d: %s", (int)len, sd, rx_buffer);
// Echo back
if (send(sd, rx_buffer, len, 0) < 0) {
if (errno != EWOULDBLOCK && errno != EAGAIN) {
ESP_LOGE(TAG, "Send failed on socket %d: %s", sd, strerror(errno));
remove_client_socket(sd); // Remove on send error
} else {
ESP_LOGW(TAG, "Send would block on socket %d", sd);
// Data will be sent later when socket is writable.
// For a robust server, you'd add sd to writefds for select().
}
}
} else if (len == 0) {
ESP_LOGI(TAG, "Connection closed by client on socket %d", sd);
remove_client_socket(sd);
} else { // len < 0
if (errno == EWOULDBLOCK || errno == EAGAIN) {
// This means no data was available right now, which is fine.
// select() should prevent this mostly, but race conditions are possible.
ESP_LOGV(TAG, "Recv would block on socket %d", sd);
} else {
ESP_LOGE(TAG, "recv failed on socket %d: %s", sd, strerror(errno));
remove_client_socket(sd); // Remove on error
}
}
}
}
} // end while(1)
ESP_LOGI(TAG, "Shutting down server...");
for(int i=0; i < num_clients; ++i) {
if(client_sockets[i] != -1) {
shutdown(client_sockets[i], SHUT_RDWR);
close(client_sockets[i]);
}
}
shutdown(listen_sock, SHUT_RDWR);
close(listen_sock);
vTaskDelete(NULL);
}
void app_main(void)
{
esp_err_t ret = nvs_flash_init();
if (ret == ESP_ERR_NVS_NO_FREE_PAGES || ret == ESP_ERR_NVS_NEW_VERSION_FOUND) {
ESP_ERROR_CHECK(nvs_flash_erase());
ret = nvs_flash_init();
}
ESP_ERROR_CHECK(ret);
ESP_LOGI(TAG, "ESP_WIFI_MODE_STA");
wifi_init_sta(); // Connect ESP32 to an AP
xTaskCreate(non_blocking_tcp_server_task, "nb_tcp_server", 4096*2, NULL, 5, NULL); // Increased stack for server
}
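The echo path above only logs a warning when send() would block, so the unsent bytes are lost. One way to follow the comment's suggestion is to keep the unsent tail per client and ask select() for writability only while a tail is pending. The following is a minimal sketch of that idea, not part of the original example: tx_pending_t, tx_flush(), and tx_send_or_queue() are hypothetical names, the 128-byte queue size simply mirrors rx_buffer, and the code uses only POSIX socket calls that LWIP also provides.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

#define TX_PENDING_MAX 128  /* assumption: mirrors the rx_buffer size above */

/* Hypothetical per-client record of bytes that send() could not take yet. */
typedef struct {
    char   data[TX_PENDING_MAX];
    size_t len;                 /* number of unsent bytes held in data[] */
} tx_pending_t;

/* Retry the queued bytes. Returns 0 on progress or would-block, -1 on a fatal error. */
static int tx_flush(int sd, tx_pending_t *p)
{
    while (p->len > 0) {
        ssize_t n = send(sd, p->data, p->len, 0);
        if (n < 0) {
            if (errno == EWOULDBLOCK || errno == EAGAIN) {
                return 0;       /* still full: keep sd in writefds */
            }
            return -1;
        }
        /* Drop the bytes that were accepted and keep the rest at the front. */
        memmove(p->data, p->data + n, p->len - (size_t)n);
        p->len -= (size_t)n;
    }
    return 0;
}

/* Send now if possible and queue whatever is left.
 * Returns -1 on a fatal send error or when the queue would overflow. */
static int tx_send_or_queue(int sd, tx_pending_t *p, const char *buf, size_t len)
{
    size_t sent = 0;
    if (p->len == 0) {          /* nothing queued ahead of us: try the socket directly */
        ssize_t n = send(sd, buf, len, 0);
        if (n < 0) {
            if (errno != EWOULDBLOCK && errno != EAGAIN) {
                return -1;
            }
        } else {
            sent = (size_t)n;   /* may be a partial write */
        }
    }
    size_t rest = len - sent;
    if (rest > sizeof(p->data) - p->len) {
        return -1;              /* queue full: the caller decides, e.g. drop the client */
    }
    memcpy(p->data + p->len, buf + sent, rest);
    p->len += rest;
    return 0;
}
```

In the server loop this would mean keeping one tx_pending_t per entry of client_sockets, adding a client's descriptor to a writefds set (passed as the third argument of select()) only while its len is non-zero, and calling tx_flush() when FD_ISSET() reports that descriptor as writable. Leaving idle sockets out of writefds matters, because a socket with free send-buffer space is always writable and would otherwise make select() return immediately.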
Build and Flash Instructions: (Same as Example 1)
- Set target, configure Wi-Fi, build, flash, monitor.
Observe:
- The ESP32 will connect to your Wi-Fi network and print its IP address.
- The server will start listening on port 12345.
- Use a TCP client tool (like netcat or telnet) to connect to the ESP32’s IP address on port 12345.
  - Example using netcat: nc <ESP32_IP_ADDRESS> 12345
  - Example using telnet: telnet <ESP32_IP_ADDRESS> 12345
- You can open multiple client connections. Type messages in your client terminals. The ESP32 server should log the received messages and echo them back to the respective clients.
- Observe how select() handles activity from the listening socket (new connections) and multiple client sockets (incoming data).
Variant Notes
The core concepts of non-blocking sockets and the select() call are part of the LWIP stack, which is used across all ESP32 variants (ESP32, ESP32-S2, ESP32-S3, ESP32-C3, ESP32-C6, ESP32-H2). Therefore, the API and fundamental behavior described in this chapter are consistent.
- Performance: CPU clock speed, available RAM, and the efficiency of the Wi-Fi or Ethernet peripheral can influence the maximum number of concurrent connections or data throughput. Newer variants like ESP32-S3 or ESP32-C6 might offer better performance than older ones.
- Network Interfaces:
  - All listed variants except the ESP32-H2 support Wi-Fi. The ESP32-H2 has no Wi-Fi radio; it provides IEEE 802.15.4 (Thread/Zigbee) and Bluetooth LE only.
  - Some ESP32 variants have built-in Ethernet MACs (e.g., original ESP32), while others might require an external PHY via SPI (e.g., ESP32-S2/S3 can use SPI Ethernet modules). The socket programming itself remains the same, but the underlying network interface initialization will differ. The examples use Wi-Fi. If using Ethernet, the network initialization part (wifi_init_sta) would be replaced with Ethernet initialization.
- Resource Limits: The maximum number of concurrently open sockets is limited by LWIP configuration (CONFIG_LWIP_MAX_SOCKETS in sdkconfig) and available memory. Default values are usually sufficient for many applications, but for servers handling many connections, this might need adjustment.
- ESP32-H2: Because this variant has no Wi-Fi, the Wi-Fi examples in this chapter do not run on it unchanged. The socket API can still be used over other network interfaces if an IP layer is provided (e.g., 6LoWPAN over Thread).
In general, the code examples provided should work on any Wi-Fi-capable ESP32 variant with ESP-IDF v5.x and appropriate Wi-Fi configuration, as they rely on standard POSIX socket APIs provided by LWIP.
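To make the resource-limit point concrete, the server's socket budget can be checked at compile time. This is only a sketch under assumptions: MAX_CLIENTS matches the server example, SOCKETS_NEEDED is a hypothetical name, and the fallback value for CONFIG_LWIP_MAX_SOCKETS exists solely so the fragment compiles outside an ESP-IDF build (in a real project the value comes from sdkconfig.h).

```c
#include <assert.h>   /* static_assert (C11) */

#define MAX_CLIENTS 5                 /* same value as the server example */

#ifndef CONFIG_LWIP_MAX_SOCKETS       /* normally provided by sdkconfig.h */
#define CONFIG_LWIP_MAX_SOCKETS 10    /* assumption: stand-in for host builds */
#endif

/* One listening socket plus one socket per accepted client. */
#define SOCKETS_NEEDED (1 + MAX_CLIENTS)

static_assert(SOCKETS_NEEDED <= CONFIG_LWIP_MAX_SOCKETS,
              "Raise CONFIG_LWIP_MAX_SOCKETS or lower MAX_CLIENTS");
```

The count here covers only this server task. Other components in the same firmware that open sockets draw from the same pool, so a real budget would likely need some headroom on top of SOCKETS_NEEDED.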
Common Mistakes & Troubleshooting Tips
Mistake / Issue | Symptom(s) | Troubleshooting / Solution |
---|---|---|
Forgetting to Re-initialize fd_sets | select() behaves unpredictably after the first call; misses events or reports stale ones. | Fix: Always re-initialize working fd_sets (e.g., readfds = master_read_fds; or FD_ZERO() then FD_SET() for all active FDs) before each select() call, as select() modifies them. |
Misinterpreting EWOULDBLOCK/EAGAIN or EINPROGRESS | Treating these return codes from non-blocking recv(), send(), or connect() as fatal errors and closing the socket. | Fix: These are expected non-fatal indicators that an operation would block. Use select() to wait for socket readiness before retrying the operation. For EINPROGRESS, use select() for writability and then getsockopt(SO_ERROR). |
Busy-Waiting (Polling without select()) | High CPU usage; task continuously calls non-blocking I/O functions in a tight loop. System may become unresponsive. | Fix: Use select() with a timeout to efficiently wait for socket readiness. This allows the task to sleep when no I/O is possible, freeing CPU for other tasks. |
Incorrect nfds Parameter for select() | select() may not monitor all intended sockets, or could crash/behave erratically. errno might be EBADF. | Fix: nfds must be the highest file descriptor number present in any of the sets, plus one. Track the maximum FD value used. |
Mishandling Non-Blocking connect() Outcome | Assuming connect() succeeded if it doesn’t return -1 immediately, or not checking SO_ERROR after select() indicates writability for a connecting socket. | Fix: After connect() returns EINPROGRESS, use select() to monitor for writability. If writable, must call getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &len). If err is 0, connection succeeded. Otherwise, it failed. |
Not Checking select() Return Value for Errors | Ignoring a -1 return from select() and proceeding as if sockets are ready, or not handling EINTR. | Fix: Always check if select() returned -1. If so, check errno. If errno is EINTR, the call can often be retried. Other errors may be critical. |
Forgetting to Set Socket to Non-Blocking | select() might indicate readiness, but a subsequent I/O call (accept(), recv(), send()) unexpectedly blocks. | Fix: Ensure all sockets intended for use with an event loop like select() are explicitly set to non-blocking mode using fcntl(). |
Warning: When using select(), always check its return value. If it’s -1, check errno. If errno is EINTR, the call was interrupted by a signal and can usually be retried. Other errors might be more serious.
Exercises
- Non-blocking UDP Echo Client/Server:
  - Adapt the TCP client and server examples to use UDP (SOCK_DGRAM).
  - The UDP client should send a message to the server.
  - The UDP server should use select() to wait for incoming UDP packets on its listening socket, then echo the received packet back to the sender’s address (obtained from recvfrom()).
  - Remember that UDP is connectionless, so there’s no connect() or accept() in the same way as TCP. The “server” binds to a port and uses recvfrom(). The “client” uses sendto(). Both can use select() on their respective sockets.
- TCP Client with Send Timeout:
  - Modify the non-blocking TCP client example. If send() returns EWOULDBLOCK, use select() to wait for the socket to become writable before retrying the send(). Implement a timeout for this send operation using the select() timeout. If the send cannot complete within, say, 2 seconds, report an error and close the connection.
- TCP Server with Client Timeout:
  - Extend the non-blocking TCP server. For each connected client, if no data is received from that client for a specific period (e.g., 30 seconds), the server should automatically close that client’s connection.
  - Hint: You’ll need to track the last activity time for each client. The main select() timeout can be used to periodically check these client activity timeouts.
- Simple Multi-Client Chat Server:
  - Build upon the non-blocking TCP server example.
  - When the server receives a message from one client, it should broadcast that message to all other connected clients.
  - Prefix messages with the sender’s socket descriptor (or a unique ID) so clients know who sent what.
  - Be careful when broadcasting: if a send() to one client would block, you need to handle it without blocking the broadcasts to other clients (e.g., by queuing the message for that specific client or using select() with writefds for that client).
Summary
- Non-blocking sockets allow I/O operations to return immediately, even if they cannot complete, preventing tasks from hanging.
- fcntl() with O_NONBLOCK is used to set a socket to non-blocking mode.
- Operations on non-blocking sockets may report EWOULDBLOCK or EAGAIN, which mean the operation should be retried later, or EINPROGRESS, which means a connect() is still underway and its outcome must be checked once the socket becomes writable.
- select() is a system call that efficiently monitors multiple file descriptors (sockets) for readiness (read, write, or error conditions).
- fd_set structures and macros (FD_ZERO, FD_SET, FD_CLR, FD_ISSET) are used to manage the sets of descriptors for select().
- select() is crucial for building event-driven network applications that can handle multiple connections or I/O streams concurrently within a single thread/task.
- Properly handling the return values of non-blocking calls and select(), including re-initializing fd_sets, is essential for correct operation.
- This approach is highly suitable for resource-constrained environments like ESP32, enabling responsive and efficient network applications.
Further Reading
- ESP-IDF Programming Guide – Socket API
- LWIP Wiki – Sockets API (LWIP is the underlying TCP/IP stack)
  - LWIP sockets.h documentation (or browse the specific version used by your ESP-IDF).
- POSIX select() and pselect() Specification
- Beej’s Guide to Network Programming (A classic guide, very helpful for understanding socket programming in general)
  - Beej’s Guide to Network Programming – Using Internet Sockets (See sections on blocking/non-blocking and select()).