A Jenkins agent goes offline when it cannot open or hold the remoting channel back to the controller. The fix depends on which link in that chain broke: the transport (TCP agent port closed, wrong controller URL, a reverse proxy not forwarding the agent port), the secret (wrong or regenerated, so the handshake is refused), the JDK (controller and agent on incompatible Java versions), the agent host (process died, remote root disk full, network lost), a label with no online agent to satisfy it, or a cloud/Kubernetes pod that never provisioned. Open Manage Jenkins → Nodes, click the offline agent, and read its launch log — the error string points at exactly one of these. This is distinct from executor starvation, where agents are online but every executor is busy; see executor starvation.
Offline vs executor-starved (get this right first)
These look similar in the queue and are fixed in opposite ways:
- Offline (this page). The agent cannot connect or keep its channel open. Its executors are removed from the pool. Manage Jenkins → Nodes shows a red x. You fix the connection — transport, secret, JDK, host, or provisioning.
- Executor-starved. Agents are online and reachable, but all executors are busy, or no online agent carries the job's label. You fix capacity or labels, not the connection. See executor starvation.
The single tell: a red, disconnected node icon is offline; a green node whose slots are all occupied is starvation.
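The distinction above can be written down as a tiny decision function. This is only a sketch: the function name and its three inputs are hypothetical simplifications of what the node page shows, not fields of any Jenkins API.

```python
def classify_node(online: bool, busy_executors: int, total_executors: int) -> str:
    """Separate an offline agent from an executor-starved one."""
    if not online:
        # Channel is down: executors are out of the pool, so fix the connection.
        return "offline"
    if total_executors > 0 and busy_executors >= total_executors:
        # Channel is up but every slot is occupied, so fix capacity or labels.
        return "starved"
    return "available"
```

A node that is not online is classified as offline regardless of its executor counts, which mirrors the rule that connection problems are fixed before capacity problems.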
How agents connect, in one paragraph
Jenkins Remoting is the library that implements the agent ⇔ controller channel (Jenkins Remoting). An inbound agent (formerly "JNLP") dials out to the controller; an SSH agent is launched by the controller over SSH. For inbound agents there are two transports. The classic path uses a separate TCP agent port — disabled by default since Jenkins 2.0, exposed as 50000 in the Docker images, and configurable as fixed or random in Manage Jenkins → Security (Exposed Services and Ports). Since Jenkins 2.217 an inbound agent can instead use WebSocket transport, which needs no extra TCP port and no special security configuration because it rides the existing HTTP(S) port (Exposed Services and Ports). A typical inbound launch is:
java -jar agent.jar \
-url https://jenkins.example.com/ \
-secret <hex-secret> \
-name build-agent-01 \
-workDir /var/jenkins
# add -webSocket to use the WebSocket transport instead of the TCP agent port
The -secret is a long string of hex digits the client needs to establish the
connection (Inbound agent — jenkinsci/remoting).
With -webSocket, only a single connection is made; without it, the agent first
connects over HTTP(S) to retrieve connection info, then opens the TCP agent port
(Inbound agent — jenkinsci/remoting).
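For hosts where the launch command is generated by a script, a minimal sketch of assembling it might look like the following. It uses only the options shown in the launch command above; the function name, its parameters, and the default jar path are assumptions made for illustration.

```python
from typing import List

def build_agent_command(url: str, secret: str, name: str, work_dir: str,
                        web_socket: bool = False, jar: str = "agent.jar") -> List[str]:
    """Return the argv list for launching an inbound agent (illustrative helper)."""
    argv = ["java", "-jar", jar,
            "-url", url,
            "-secret", secret,
            "-name", name,
            "-workDir", work_dir]
    if web_socket:
        # One connection over the existing HTTP(S) port; no TCP agent port needed.
        argv.append("-webSocket")
    return argv

cmd = build_agent_command("https://jenkins.example.com/", "<hex-secret>",
                          "build-agent-01", "/var/jenkins", web_socket=True)
print(" ".join(cmd))
```

Returning a list rather than a single string keeps the secret and paths from being re-split by a shell when the list is handed to a process launcher.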
Diagnose it
Three logs tell you almost everything; read them in this order.
- Manage Jenkins → Nodes, then click the agent. The node page shows whether it is connected and, when offline, the cause (a monitor threshold, a launch failure, or a manual disconnect) (Managing nodes).
- The agent's launch / log output: the console where agent.jar runs, or the node's log page. This is where transport, secret, and JDK errors surface verbatim.
- The controller log (Manage Jenkins → System Log) for the matching rejection or provisioning entry.
For a Kubernetes agent, add kubectl against the agent pod:
kubectl get pods -n jenkins-agents
# NAME                   READY   STATUS   RESTARTS   AGE
# build-agent-01-7q4kx   0/1     Error    0          12s
kubectl describe pod build-agent-01-7q4kx -n jenkins-agents # events: scheduling, image pull, exit code
kubectl logs build-agent-01-7q4kx -c jnlp -n jenkins-agents # the remoting/connect-back output
The error string maps almost one-to-one to a cause below.
Causes, each end to end
Transport: inbound TCP agent can't reach the controller
The agent reaches the controller over HTTP(S) but then cannot open the TCP agent port — because the port is disabled, firewalled, or a reverse proxy in front of Jenkins forwards only HTTP and drops the agent port.
- What it is. The inbound TCP port is disabled by default since Jenkins 2.0 and must be enabled in Manage Jenkins → Security; a random port changes on reboot and is hard to firewall, while a fixed port is stable (Exposed Services and Ports). A reverse proxy that terminates HTTP usually does not forward the raw TCP agent port unless explicitly configured.
- Diagnose. The agent log shows the HTTP step succeeding, then a timeout or refusal on the agent port, for example:

    INFO: Locating server among [https://jenkins.example.com/]
    INFO: Agent discovery successful
    INFO: Agent address: jenkins.example.com
    INFO: Agent port: 50000
    SEVERE: Failed to connect to jenkins.example.com:50000
    java.net.ConnectException: Connection refused (Connection refused)

  From a shell on the agent host, confirm the port is actually reachable with nc -vz jenkins.example.com 50000 (a refusal or timeout here is the proof).
- Fix. Pick one transport and make it consistent end to end. Easiest behind a proxy: switch the agent to WebSocket (-webSocket), which rides the HTTP(S) port and needs no extra port or firewall rule (Exposed Services and Ports). If you keep TCP, set a fixed agent port, open it through the firewall, and forward it at the proxy. Tradeoff: WebSocket simplifies networking but pins you to the HTTP path's proxy timeouts; a fixed TCP port is direct but is one more rule to manage and one more thing a proxy can silently drop.
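Where nc is not installed on the agent host, the same reachability check can be approximated with the Python standard library. This is a sketch under assumptions: the function name and the default timeout are illustrative, and a successful TCP connect only shows the port is open, not that the remoting handshake will succeed.

```python
import socket

def agent_port_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

Calling agent_port_reachable("jenkins.example.com", 50000) from the agent host and getting False corresponds to the refusal or timeout that nc -vz would report.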
Transport: wrong controller URL
The agent's -url does not match how the controller advertises itself, so the
agent either can't find the controller or is handed an address it can't route to.
- Diagnose. The log fails at "Locating server" / "Agent discovery", or discovery succeeds but hands back an internal hostname the agent can't resolve. Check the -url value and the Jenkins URL under Manage Jenkins → System.
- Fix. Set -url to the externally reachable controller URL, and set the Jenkins URL in system config to the same address agents actually use. Mismatched internal-vs-external hostnames are the usual culprit behind a proxy.
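A quick way to spot an internal-vs-external mismatch is to compare the two URLs after normalizing them. The helper below is a hypothetical sketch, not part of Jenkins; it treats two URLs as the same endpoint when scheme, host, effective port, and path agree.

```python
from urllib.parse import urlsplit

def same_endpoint(agent_url: str, jenkins_url: str) -> bool:
    """Compare the agent's -url with the configured Jenkins URL (illustrative)."""
    def key(raw: str):
        u = urlsplit(raw)
        default_port = {"http": 80, "https": 443}.get(u.scheme)
        return (u.scheme,
                (u.hostname or "").lower(),
                u.port or default_port,
                u.path.rstrip("/"))
    return key(agent_url) == key(jenkins_url)
```

A False result for, say, an https external name against an http internal name on port 8080 is the mismatch described above.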
Secret: wrong or expired agent secret → handshake refused
The agent connects but the controller rejects it because the secret does not match — commonly after the node was deleted and recreated, or the secret was regenerated, leaving the agent launching with a stale value.
- What it is. The secret is the hex string the client must present to establish the connection (Inbound agent — jenkinsci/remoting).
- Diagnose. The log gets past discovery, then the controller closes the channel during the handshake:

    INFO: Connecting to jenkins.example.com:50000
    INFO: Trying protocol: JNLP4-connect
    SEVERE: The server rejected the connection: build-agent-01 is already connected or the secret did not match
    java.io.IOException: The server rejected the connection

- Fix. Re-copy the current secret from the node page (Manage Jenkins → Nodes → the agent shows the exact launch command and secret) and relaunch the agent with it. Note the remoting guidance: if a secret is compromised, do not reuse the agent name on that controller (Inbound agent — jenkinsci/remoting).
JDK: Java version mismatch between controller and agent
Remoting requires a compatible JVM on both ends. Since Jenkins 2.357 (and LTS 2.361.1) both the controller JVM and the agent JVM must run Java 11 or newer (Jenkins requires Java 11).
- Diagnose. A controller on Java 11 with an agent on Java 8 throws, on the agent side:

    Exception in thread "main" java.lang.UnsupportedClassVersionError: hudson/remoting/Launcher has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0

  Class file 55.0 is Java 11; 52.0 is Java 8 (Jenkins requires Java 11). Confirm with java -version on the agent host.
- Fix. Run the agent JVM (the process executing agent.jar/remoting.jar) on Java 11 or newer (Jenkins requires Java 11). This is separate from the JDK your builds use: you can still build with Java 8 via Global Tool Configuration; only the agent process itself must meet the minimum.
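The class file numbers in that error translate to Java releases by a fixed offset (major version minus 44, valid from Java 5 onward). A small sketch of decoding the message follows; the function names are illustrative and the regular expression assumes the message wording shown above.

```python
import re

def java_release(class_major: int) -> int:
    """Map a class file major version to its Java release (52 -> 8, 55 -> 11)."""
    if class_major < 49:
        raise ValueError("offset rule only holds for class file version 49 (Java 5) and later")
    return class_major - 44

def parse_version_error(message: str):
    """Return (java_the_jar_needs, java_the_agent_runs), or None if the text does not match."""
    m = re.search(r"class file version (\d+)\.\d+\).*?up to (\d+)\.\d+", message, re.S)
    if m is None:
        return None
    return java_release(int(m.group(1))), java_release(int(m.group(2)))
```

Applied to the error above, the first number is the Java release the remoting classes were compiled for and the second is the newest release the agent's runtime understands.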
Agent host: process died, disk full, or lost network
The agent was healthy and then dropped because something on the host failed: the
agent.jar process exited, the remote root filled up, or the network went away.
Jenkins also takes a node offline on its own when a monitor threshold is crossed.
- What it is. Jenkins monitors each node for disk space, free temp space, free swap, clock difference, and response time, and takes the node offline if any value crosses its threshold (Managing nodes).
- Diagnose. The node page names the offline cause directly (e.g. "Disk space is too low" or "Free Swap Space is too low" for a monitor trip, versus a connection-lost cause for a dead process or network drop). On the host, check the agent process is alive and df -h the agent's remote root.
- Fix. For a monitor trip, clear the underlying condition (free disk on the remote root, fix clock skew with NTP) and bring the node back online; do not just disable the monitor. For a dead process, restart agent.jar (run it under a supervisor/systemd so it restarts on exit). For a network drop, the agent reconnects once connectivity returns.
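The df -h check can also be scripted as a pre-flight test on the agent host. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the threshold is whatever you pass in, not a value read from Jenkins (the real threshold is configured on the Nodes page).

```python
import shutil

def remote_root_ok(path: str, min_free_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has at least min_free_bytes free."""
    return shutil.disk_usage(path).free >= min_free_bytes
```

For example, remote_root_ok("/var/jenkins", 1024**3) asks whether at least 1 GiB is free on the filesystem holding the remote root used in the launch command earlier.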
Label: no online agent carries the requested label
A build pinned to a label can run only on an agent with that label. If the only agent carrying it is offline, the job waits — but the cause here is the offline agent, not capacity.
- Diagnose. The queue reason names the label: "There are no nodes with the label '...'" when no node carries it at all, or "All nodes of label '...' are offline" when labelled agents exist but are down. Confirm whether an agent with that label exists and is merely offline (this page) versus online-but-busy. If the labelled agent is offline, fix its connection using the cause above that matches its log.
- Differentiation. If an agent with the label is online and the job still waits, that is not an offline problem — it is executor starvation (all slots busy, or label/usage restrictions). Offline = the channel is down; starved = the channel is up but no slot is free.
Cloud / Kubernetes: pod or instance never provisioned
With the Kubernetes plugin, an agent is a pod created per build from a pod
template; the pod's jnlp container is launched as an inbound agent and connects
back using injected JENKINS_URL, JENKINS_SECRET, and JENKINS_AGENT_NAME
environment variables
(kubernetes-plugin). It shows
offline when the pod never reaches Running, or starts and immediately exits.
- Diagnose. kubectl get pods for the agent namespace, then describe the pod for events (scheduling failures, image pull errors, missing required fields) and logs -c jnlp for the connect-back output. The plugin's troubleshooting guidance is to check pod status via kubectl and raise the controller log level for org.csanchez.jenkins.plugins.kubernetes (kubernetes-plugin). A pod that scheduled but exits points at the container; one stuck Pending points at the cluster (resources, image).
- Common cause: image JRE. The pod image must have a JRE compatible with the Java version the controller requires (kubernetes-plugin); an incompatible or missing JRE produces the same UnsupportedClassVersionError as the JDK case above, or a jnlp container that exits immediately.
- Common cause: connect-back transport. If the pod can't reach the agent port, enable WebSocket on the cloud configuration so agents connect over HTTP(S) rather than the TCP service port, the documented fix when the controller sits behind a proxy (kubernetes-plugin).
- Fix. Correct whatever the events/logs name: fix the pod template (image, required fields, resource requests so it can schedule), align the image JRE with the controller's Java version, and set the connect-back transport/URL the pod can actually reach. Tradeoff: podRetention such as onFailure() keeps failed pods around so you can inspect them, at the cost of leftover pods you must clean up (kubernetes-plugin).
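The Pending-versus-exited split described above can be applied mechanically to the JSON that kubectl get pod <name> -o json prints. The sketch below reads only standard pod status fields (status.phase and containerStatuses[].state.terminated.exitCode); the function name and the returned labels are hypothetical.

```python
def triage_agent_pod(pod: dict, container: str = "jnlp") -> str:
    """Point at the cluster or the container, based on a pod's status block."""
    status = pod.get("status", {})
    for cs in status.get("containerStatuses", []):
        if cs.get("name") != container:
            continue
        terminated = cs.get("state", {}).get("terminated")
        if terminated and terminated.get("exitCode", 0) != 0:
            # Scheduled and started, then exited: look at the image, JRE, connect-back.
            return "container"
    phase = status.get("phase")
    if phase == "Pending":
        # Never started: look at scheduling, resources, image pull.
        return "cluster"
    if phase == "Running":
        return "running"
    return "unknown"
```

Feed it the parsed output of json.loads on the kubectl JSON; the result says which half of the Diagnose step to read first.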
Fix it (order of operations)
- Manage Jenkins → Nodes, open the offline agent, read the offline cause.
- Read the agent launch log. Match the error string: connection refused on the agent port (transport), discovery/URL failure (URL), "server rejected the connection" (secret), UnsupportedClassVersionError (JDK).
- Confirm it is offline, not starved. A red node is offline; an online node with all slots busy is executor starvation.
- Transport: prefer WebSocket behind a proxy; otherwise fix the fixed TCP port + firewall + proxy forwarding. Align -url with the system Jenkins URL.
- Secret: re-copy the current secret from the node page and relaunch.
- JDK: put the agent JVM on Java 11+; this is separate from the build JDK.
- Host: clear the tripped monitor (disk, clock), restart a dead agent.jar under a supervisor, restore network.
- Kubernetes: kubectl describe / logs the pod; fix the template, image JRE, resources, and connect-back transport.
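The error-string matching in the second step can be sketched as a small classifier. The substrings come from the example logs on this page, except the "timed out" match, which is an assumption about how a timeout is worded; real agent output varies by remoting version, so treat this as a starting point rather than a complete rule set.

```python
def classify_launch_log(log: str) -> str:
    """Map an agent launch log to one of the causes on this page (illustrative)."""
    text = log.lower()
    if "unsupportedclassversionerror" in text:
        return "jdk"
    if "server rejected the connection" in text:
        return "secret"
    if "locating server" in text and "agent discovery successful" not in text:
        # Never got past discovery: the controller URL is the first suspect.
        return "url"
    if "connection refused" in text or "timed out" in text:
        # Discovery worked (or was not logged) but the agent port did not answer.
        return "transport"
    return "unknown"
```

The checks run from most specific to least specific, so a log that fails discovery is reported as a URL problem even if it also contains a connection-refused line.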
How Intellira diagnoses this
Intellira is read-only: it never restarts an agent, deletes a pod, or edits a
node — it correlates evidence and names the cause. It reads the Jenkins MCP node
state and offline cause, the agent and controller logs, and — for Kubernetes
agents — the agent pod's status, events, and jnlp container logs via the
Kubernetes MCP server. It then classifies the failure: a connection-refused on
the agent port reads as a transport/firewall problem; a rejected handshake reads
as a stale secret; an UnsupportedClassVersionError reads as a JDK mismatch; a
monitor-tripped node reads as a host condition (disk/clock); a pod that never
reaches Running reads as a provisioning/template fault. Critically, it
distinguishes offline (channel down, executors gone) from executor starvation
(channel up, slots busy) so the remediation it surfaces matches the actual link
that broke, rather than reporting a generic "agent unavailable."
Sources
- Jenkins Remoting — jenkins.io
- Exposed Services and Ports — jenkins.io
- Managing nodes — jenkins.io
- Jenkins requires Java 11 or newer — jenkins.io
- Inbound agent — jenkinsci/remoting (GitHub)
- Kubernetes plugin — jenkinsci/kubernetes-plugin (GitHub)
By Intellira Engineering. AI-assisted draft; claims cited inline; last verified 2026-06-02. Pending technical review.