-
Notifications
You must be signed in to change notification settings - Fork 6
Description
Summary
HTTPClient can experience socket exhaustion when polling multiple devices over extended periods (6+ hours), resulting in HTTPC_ERROR_CONNECTION_REFUSED (-1) errors. This affects IoT and home automation systems managing 5+ HTTP endpoints.
Environment
- Board: ESP32 (all variants)
- Arduino-ESP32: Latest (tested on 2.x and 3.x)
- HTTPClient version: Current main branch
- Scenario: 8 devices polled every 15 seconds over 24+ hours
Problem Analysis
Code Review of HTTPClient.cpp
Looking at the actual implementation:
Line 1109 (HTTPClient::connect()):
if (!_client->connect(_host.c_str(), _port, _connectTimeout)) {
log_d("failed connect to %s:%u", _host.c_str(), _port);
return false;
}Line 367-392 (HTTPClient::disconnect()):
void HTTPClient::disconnect(bool preserveClient) {
if (connected()) {
if (_reuse && _canReuse) {
log_d("tcp keep open for reuse");
} else {
log_d("tcp stop");
_client->stop(); // ← Socket closed here
// ... client set to nullptr ...
}
}
}Line 358-361 (HTTPClient::end()):
void HTTPClient::end(void) {
disconnect(false);
clear();
// ← Returns immediately, no delay for socket cleanup
}The Problem
The current implementation correctly calls _client->stop(), but immediately returns control. On ESP32, TCP sockets need time to transition through proper closure states (FIN, TIME_WAIT, etc.). When applications immediately create new connections:
- Old socket still in TIME_WAIT (30-120 seconds depending on network)
- New connection request creates new socket
- Over hours: socket pool exhausts (ESP32 default: ~10 sockets via
MEMP_NUM_NETCONN)
This is exacerbated when:
- Using aggressive
setConnectTimeout()values (< 1000ms) - Polling multiple devices at high frequency (< 15 seconds)
- Connection failures leave sockets in inconsistent states
User-Side Symptoms
Initial boot: All devices connect fine
After 6-8 hours:
P1 > HTTP code: -1 (HTTPC_ERROR_CONNECTION_REFUSED)
Socket 1 > HTTP error -1
Socket 2 > HTTP error -1
[... cascade failure of all HTTP requests]
After ESP32 reboot: Everything works again
Root Causes
1. No Socket Cleanup Delay in end()
Current code:
void HTTPClient::end(void) {
disconnect(false);
clear();
// Immediately returns - socket may still be closing
}ESP32's lwIP stack needs time to fully close sockets. Without delay, rapid reconnections exhaust the pool.
2. Aggressive setConnectTimeout() Creates Half-Open Sockets
When users set very short timeouts:
http.setConnectTimeout(300); // 300ms - too aggressive for WiFiFailed connections during SYN/ACK handshake can leave sockets in SYN_SENT state, consuming resources.
3. Documentation Doesn't Warn About Multi-Device Patterns
The README and examples don't address:
- Socket pool limits on ESP32
- Best practices for polling multiple endpoints
- Recommended intervals to prevent exhaustion
Proposed Solutions
Solution 1: Add Cleanup Delay to end() (Minimal Impact)
File: libraries/HTTPClient/src/HTTPClient.cpp
void HTTPClient::end(void) {
disconnect(false);
clear();
// Give lwIP time to process socket closure
// Prevents socket exhaustion in multi-device polling scenarios
// Impact: ~50ms delay per request (negligible for most applications)
delay(50);
}Pros:
- ✅ Fixes the root cause
- ✅ Minimal performance impact (50ms)
- ✅ Transparent to users
- ✅ Prevents gradual socket exhaustion
Cons:
- ❌ Adds fixed delay to all HTTPClient usage
- ❌ May not be appropriate for time-critical applications
Alternative: Make it configurable:
class HTTPClient {
public:
void setCleanupDelay(uint16_t delayMs); // Default: 50ms
private:
uint16_t _cleanupDelay = 50;
};
void HTTPClient::end(void) {
disconnect(false);
clear();
if (_cleanupDelay > 0) {
delay(_cleanupDelay);
}
}Solution 2: Warn About Aggressive Timeouts (Documentation)
File: libraries/HTTPClient/README.md
Add warning about setConnectTimeout():
### Important: Connection Timeout Considerations
The `setConnectTimeout()` method sets the TCP connection timeout.
**⚠️ WARNING:** Values below 1000ms can cause socket exhaustion over time, especially
when polling multiple devices. Failed connections may leave sockets in inconsistent
states that aren't properly cleaned up.
**Recommended values:**
- WiFi networks: 3000-5000ms
- Ethernet: 2000-3000ms
- Unreliable networks: 5000-10000ms
**Avoid:** Values < 1000ms unless you have specific timing requirements and understand
the implications for socket pool management.Solution 3: Add Multi-Device Example (Best Practices)
File: libraries/HTTPClient/examples/MultiDevicePolling/MultiDevicePolling.ino
Create example demonstrating:
- Proper polling intervals (15-30 seconds)
- Manual cleanup delays if not added to library
- Staggered device initialization
- Error handling and backoff strategies
See attached example code below.
Solution 4: Add Socket Pool Diagnostic (Developer Tool)
Optional enhancement to help developers debug:
class HTTPClient {
public:
static int getActiveSockets(); // Debug helper
};This would require cooperation with NetworkClient layer, but could help developers identify exhaustion before it becomes critical.
Recommended Implementation Priority
- High Priority: Solution 2 (Documentation) - Immediate, no code changes
- High Priority: Solution 3 (Example) - Helps developers avoid the problem
- Medium Priority: Solution 1 (Cleanup delay) - Fixes root cause but needs careful consideration
- Low Priority: Solution 4 (Diagnostics) - Nice-to-have for advanced users
Evidence / Test Results
Before Fixes (User Code Only)
Uptime: 6-8 hours before failure
Symptoms: Cascade HTTP -1 errors
Socket pool: Exhausted
Recovery: Requires ESP32 reboot
After Fixes (User Code + Manual Delays)
Uptime: 24+ hours stable
Active HTTP requests: ~23,000 over 24h
Socket pool: 3-4/10 in use (sustainable)
Errors: Zero socket-related failures
RAM: Stable at 164KB
Key change in user code:
http.end();
client.stop();
delay(100); // Manual cleanup delayThis proves that socket cleanup delay solves the issue.
Multi-Device Polling Example
/**
* MultiDevicePolling.ino
*
* Demonstrates reliable HTTP polling of multiple devices over extended periods.
* Prevents socket exhaustion through proper cleanup and timing patterns.
*
* Tested stable for 24+ hours with 8 devices.
*/
#include <WiFi.h>
#include <HTTPClient.h>
const char* ssid = "your-ssid";
const char* password = "your-password";
// Configuration
const int NUM_DEVICES = 8;
const unsigned long POLL_INTERVAL = 15000; // 15 seconds
const uint16_t HTTP_TIMEOUT = 5000; // 5 seconds
// Device URLs
const char* deviceURLs[NUM_DEVICES] = {
"http://192.168.1.101/api/status",
"http://192.168.1.102/api/status",
"http://192.168.1.103/api/status",
"http://192.168.1.104/api/status",
"http://192.168.1.105/api/status",
"http://192.168.1.106/api/status",
"http://192.168.1.107/api/status",
"http://192.168.1.108/api/status"
};
unsigned long lastPoll[NUM_DEVICES] = {0};
void setup() {
Serial.begin(115200);
WiFi.begin(ssid, password);
while (WiFi.status() != WL_CONNECTED) {
delay(500);
Serial.print(".");
}
Serial.println("\nWiFi Connected!");
// Stagger initial polls to prevent simultaneous requests
for (int i = 0; i < NUM_DEVICES; i++) {
lastPoll[i] = millis() - POLL_INTERVAL + (i * 2000);
}
}
void loop() {
unsigned long now = millis();
for (int i = 0; i < NUM_DEVICES; i++) {
if (now - lastPoll[i] >= POLL_INTERVAL) {
pollDevice(i);
lastPoll[i] = now;
}
}
delay(10); // Prevent tight loop
}
void pollDevice(int index) {
// IMPORTANT: Use LOCAL instances per request
NetworkClient client;
HTTPClient http;
// Set reasonable timeouts
http.setTimeout(HTTP_TIMEOUT);
// DON'T use aggressive setConnectTimeout() - causes socket issues
http.setReuse(false); // Disable keep-alive for simpler cleanup
if (!http.begin(client, deviceURLs[index])) {
Serial.printf("Device %d: Connection failed\n", index);
client.stop();
delay(100); // Socket cleanup - CRITICAL for preventing exhaustion
return;
}
int httpCode = http.GET();
if (httpCode == HTTP_CODE_OK) {
String payload = http.getString();
Serial.printf("Device %d: %s\n", index, payload.c_str());
} else {
Serial.printf("Device %d: HTTP error %d\n", index, httpCode);
}
// Proper cleanup sequence
http.end();
client.stop();
// CRITICAL: Allow TCP socket to fully close
// Without this delay, rapid polling exhausts ESP32's socket pool (default: ~10 sockets)
// This delay can be removed if HTTPClient::end() is enhanced to include it
delay(100);
}Impact
This issue affects:
- IoT monitoring systems (polling sensors/devices)
- Home automation (managing smart home devices)
- Industrial applications (equipment monitoring)
- Any ESP32 project polling 5+ HTTP endpoints over extended periods
Proper documentation and/or code fixes would prevent the common pattern of:
"Works great for hours, then mysteriously fails, requires reboot"
Thank you for maintaining this excellent library! The goal is to help other developers avoid the debugging journey we went through. 🙏