Currently, when multiple nodes run on the same machine with identical REST API ports, only one can bind successfully. The others start but run without accessible REST APIs, causing:
- Heartbeat confusion: Both nodes report with different UniqueIDs, causing alternating producer records
- Silent failures: Nodes run without REST API access and therefore cannot be monitored
- Wasted resources: Multiple nodes running unnecessarily in parallel
Discovery:
Through monitoring analysis, I’ve observed numerous misconfigurations where operators accidentally run multiple node instances (master + fallback, or even more) on the same port, leading to unpredictable behavior.
Proposed solution:
Add an explicit port binding check at startup, before any other node initialization. If the port is already in use, the node should exit immediately with a clear error message. This forces operators to configure distinct ports (e.g., 8080, 8081) and prevents silent misconfigurations.
Example:
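package main

import (
	"fmt"
	"log"
	"net"
)

func main() {
	port := config.RestAPIPort // config and startNode come from the existing node code
	// Attempt to bind the REST API port before doing any other startup work
	listener, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
	if err != nil {
		// log.Fatalf logs and exits with status 1, so no separate os.Exit is needed.
		// The underlying error is included because a bind can fail for other reasons too.
		log.Fatalf("FATAL: Cannot bind REST API port %d (is another instance already running on this machine?): %v", port, err)
	}
	log.Printf("REST API listening on port %d", port)
	startNode(listener)
}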