Deployment Failures¶
This page explains how to diagnose and recover from failures that occur during the deployment pipeline, from preflight checks through rollout. For UNC/SMB path problems that prevent preflight from ever starting, see Connectivity Issues.
Background: Every deployment runs a fail-closed preflight check pipeline before any artifact is written to disk. The pipeline is documented in full at Deployment Preflight Reference. The sections below map each failure mode to its pipeline stage.
Prerequisites¶
- ntkDeploy installed and configured — see First Launch.
- A deployment has been attempted and produced an error. Check the Audit Log for failure codes.
Preflight Blocked: Policy Connectivity Gate Closed¶
Symptom¶
The Deployment Wizard halts at the preflight step with failure code
policy_connectivity_gate_closed or policy_connectivity_gate_unavailable. No device
receives a configuration.
Cause¶
The Policy Connectivity Gate is a fail-closed circuit breaker that closes when the policy server becomes unreachable and stays closed until the connectivity monitoring service confirms recovery.
| Code | Meaning |
|---|---|
| policy_connectivity_gate_closed | Gate was closed by a previous connectivity failure |
| policy_connectivity_gate_unavailable | Gate service was never initialised (app startup issue) |
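The gate behaviour described above can be pictured as a small state machine. The following is a minimal sketch only, not ntkDeploy's implementation: the class and function names are hypothetical, and it assumes the "neutral" state after a restart behaves like an open gate.

```python
from enum import Enum
from typing import Optional


class GateState(Enum):
    OPEN = "open"
    CLOSED = "closed"


class ConnectivityGate:
    """Illustrative fail-closed gate (hypothetical, not the product's code)."""

    def __init__(self) -> None:
        # Assumption: a freshly initialised gate has recorded no failure yet.
        self.state = GateState.OPEN

    def record_failure(self) -> None:
        # A connectivity failure closes the gate.
        self.state = GateState.CLOSED

    def confirm_recovery(self) -> None:
        # Only a recovery confirmation from the monitor re-opens the gate.
        self.state = GateState.OPEN


def preflight_gate_code(gate: Optional[ConnectivityGate]) -> Optional[str]:
    """Return a failure code, or None when the deployment may proceed."""
    if gate is None:
        # The gate service was never initialised.
        return "policy_connectivity_gate_unavailable"
    if gate.state is GateState.CLOSED:
        return "policy_connectivity_gate_closed"
    return None
```

In this model a successful manual connection test does not call `confirm_recovery`, which is why the gate can stay closed even though the endpoint is reachable again.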
Resolution¶
- Test the policy server connection. Go to Settings → Policy Server and click Test connection. A success result does not directly re-open the gate, but it confirms whether the endpoint is currently reachable.
- Verify endpoint reachability from the command line:

      Test-NetConnection -ComputerName policy.example.com -Port 8443

- Restart ntkDeploy. On restart the connectivity gate service reinitialises and the gate re-evaluates from a neutral state. If the server is reachable, the gate will open automatically.
- For policy_connectivity_gate_unavailable specifically, a restart is the only remedy: the gate service is only available after the service locator initialises at boot.
- If the gate remains closed after restart, check the Audit Log for policy_connectivity_gate events; the metadata field records the exact failure outcome.
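Where PowerShell is not available, the same reachability check can be approximated with a plain TCP connect. This is a rough sketch: `tcp_reachable` is a hypothetical helper, and a successful TCP connect only shows that the port accepts connections, not that the policy service itself is healthy.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts the connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Host and port are the placeholder values used in the step above.
    print(tcp_reachable("policy.example.com", 8443))
```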
Further reading: Settings Reference → Policy Server • Preflight step 6 — connectivity gate
Preflight Blocked: Device Owner Not Assigned¶
Symptom¶
Preflight fails with code device_owner_unassigned for one or more device keys in the
target device group.
Cause¶
The per-device policy preflight (step 7 of the pipeline) requires that every device key has a person assigned as its owner. Without an owner, ABAC attribute lookups cannot be performed and the deployment is blocked.
Resolution¶
- Navigate to Device Groups, open the affected group, and identify which device keys lack owner assignments.
- Go to Policies → People and confirm the relevant person records exist. If not, create them first — see Managing Policies.
- Return to the Device Group detail and assign a person to each un-owned device key using the Assign Owner control.
- Re-run the preflight from the Deployment Wizard.
Further reading: Preflight code table: device_owner_unassigned • Device Enrollment
Preflight Blocked: ABAC Assignment Missing or Expired¶
Symptom¶
Preflight fails with assignments_missing or assignments_expired for specific devices.
Cause¶
The profile includes ABAC attributes that require valid policy assignments. The policy server found that either no assignment exists or an existing assignment has passed its expiry date.
| Code | Meaning |
|---|---|
| assignments_missing | No valid assignment exists for a required ABAC attribute |
| assignments_expired | An assignment exists but its expiry date has passed |
Resolution¶
- Open Policies → Assignments and filter by the affected person or device key.
- For assignments_missing: create a new assignment for the required attribute (see Managing Policies).
- For assignments_expired: renew the assignment (update the expiry date) or create a replacement assignment.
- Re-run the Deployment Wizard after updating assignments.
Tip: The assignments_expired code is also raised if the device's system clock is significantly ahead of the policy server. Confirm clock synchronisation if you see unexpected expiry failures.
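The distinction between the two codes, and the effect of clock skew, can be illustrated with a short sketch. It assumes each assignment carries a single expiry timestamp and that `None` means "never expires"; both the function name and that convention are hypothetical, not taken from ntkDeploy.

```python
from datetime import datetime
from typing import Iterable, Optional


def assignment_code(expiries: Iterable[Optional[datetime]],
                    now: datetime) -> Optional[str]:
    """Classify the assignments held for one required ABAC attribute.

    expiries: one entry per assignment (None = no expiry, an assumption).
    now: the clock value the expiry dates are compared against.
    """
    expiries = list(expiries)
    if not expiries:
        return "assignments_missing"
    if any(exp is None or exp > now for exp in expiries):
        # At least one assignment is still valid.
        return None
    return "assignments_expired"
```

Passing a `now` value that runs ahead of the real time makes a still-valid assignment look expired, which is the clock-skew case mentioned in the tip.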
Preflight Blocked: Schema Validation Failure¶
Symptom¶
Preflight halts at the schema validation stage (steps 1–5) before policy checks are even
reached. The wizard shows a code such as invalid_state, invalid_json,
invalid_format, or schema_validation.
Cause¶
The profile version being deployed failed the schema validation pipeline. Each code maps to a distinct check:
| Code | Failed check |
|---|---|
| invalid_state | Profile version status is not valid |
| invalid_json | settingsJson contains malformed JSON or failed re-encode |
| invalid_format | JSON root is not an object ({}) |
| schema_validation | One or more fields violate schema v1.0 rules |
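As a rough illustration of how these checks could run in sequence, the sketch below returns the first code that applies. It is an assumption-laden model: the set of deployable statuses, the function name, and the `schema_errors` callback are hypothetical, and the real pipeline's ordering and rules are defined in the Deployment Preflight Reference.

```python
import json
from typing import Callable, List, Optional

# Hypothetical: the real set of deployable statuses is not given on this page.
DEPLOYABLE_STATUSES = {"valid"}


def schema_preflight(status: str, settings_json: str,
                     schema_errors: Callable[[dict], List[str]]) -> Optional[str]:
    """Return the first failing schema code, or None if all checks pass."""
    if status not in DEPLOYABLE_STATUSES:
        return "invalid_state"
    try:
        parsed = json.loads(settings_json)
        json.dumps(parsed)  # re-encode check
    except (TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError.
        return "invalid_json"
    if not isinstance(parsed, dict):
        # The JSON root must be an object, not a list or scalar.
        return "invalid_format"
    if schema_errors(parsed):
        return "schema_validation"
    return None
```

For example, a `settings_json` value of `[]` parses cleanly but fails the root-object check, so it would surface as invalid_format rather than invalid_json.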
Resolution¶
- Open the profile in the editor. The status badge at the top shows the current validation state.
- Press Ctrl+Enter (or the Validate button) to trigger re-validation and see the field-level error list.
- For invalid_json or invalid_format, switch to the Raw JSON tab and correct the syntax. The editor highlights the offending line.
- Once all errors are resolved and the status badge shows Valid, return to the Deployment Wizard and retry.
Further reading: Profile Schemas Reference • Creating a Profile
Preflight Blocked: Non-Actionable Policy Preflight Failure¶
Symptom¶
Preflight fails with the generic code preflight_failed. The wizard may show limited
detail in the UI.
Cause¶
The deployment pipeline wraps certain per-device preflight codes into the generic
preflight_failed code before returning them. The underlying raw code is attached as
details. Raw codes wrapped this way include:
| Raw code | Meaning |
|---|---|
| server_unreachable | Policy server did not respond to the ping |
| fetch_failed | Could not retrieve assignments or attribute definitions |
| attribute_not_found | A required ABAC attribute definition is missing on the server |
| missing_attribute_key | An assignment is present but attributeKey is empty |
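The wrapping can be pictured as follows. This is a sketch under stated assumptions: the outer field names failureCode, message, and details match the Audit Log metadata described later on this page, but the inner `code` key and the function name are hypothetical.

```python
from typing import Any, Dict

# Raw per-device codes that this page lists as wrapped by the pipeline.
WRAPPED_RAW_CODES = {
    "server_unreachable",
    "fetch_failed",
    "attribute_not_found",
    "missing_attribute_key",
}


def wrap_preflight_result(raw_code: str, message: str) -> Dict[str, Any]:
    """Wrap selected raw codes in the generic preflight_failed code."""
    if raw_code in WRAPPED_RAW_CODES:
        return {
            "failureCode": "preflight_failed",
            "message": message,
            # Hypothetical shape: the raw result travels along as details.
            "details": {"code": raw_code, "message": message},
        }
    # Other codes are returned unchanged.
    return {"failureCode": raw_code, "message": message}
```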
Resolution¶
- Open the Audit Log, find the failed rollout event (rollout_failed), and inspect the Metadata column; the raw code and message are recorded there.
- Address the underlying cause based on the raw code:
  - server_unreachable / fetch_failed → Check policy server connectivity (Connectivity Issues).
  - attribute_not_found → Verify attribute IDs in the profile match definitions on the policy server.
  - missing_attribute_key → Regenerate device credentials or contact your policy administrator.
- Retry the deployment once the underlying issue is resolved.
Deployment Failed: Write Error at UNC Path¶
Symptom¶
The Deployment Wizard reports a mid-rollout failure with code deploy_write_failed. Some
device groups may have received the configuration while others did not.
Cause¶
A write error (deploy_write_failed) occurred when ntkDeploy tried to write the JSON payload to the target UNC path. The connectivity pre-check passed, but the write operation itself failed — typically because of a transient network interruption or a permission change between the preflight check and the write.
Resolution¶
- Check the share is still reachable: Run a connectivity check from the Device Group detail page and confirm the path shows success.
- Check write permissions: A write permission probe creates and deletes a temporary file at the target path. If it returns permission_denied (OS error 5), the account running ntkDeploy lacks write access to the share. Work with your storage team to grant write permissions.
- Check available disk space on the remote share. A full volume returns a FileSystemException with a message referencing insufficient space.
- Once the issue is resolved, re-run the full deployment; there is no partial-device retry. Devices that received the configuration in a partial rollout will keep the new version; the rest will be updated on the next run.
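A write permission probe of the kind described above can be reproduced by hand to test a share from the same account. The sketch below is an approximation, not ntkDeploy's probe: the function name, the probe file name, and the `write_failed` label are hypothetical, while `success` and `permission_denied` mirror the statuses named on this page.

```python
import os
import uuid


def probe_write(directory: str) -> str:
    """Create and delete a temporary file to test write access."""
    # Random suffix avoids clashing with an existing file.
    probe = os.path.join(directory, f".write-probe-{uuid.uuid4().hex}.tmp")
    try:
        # Mode "x" fails rather than overwriting if the file already exists.
        with open(probe, "x") as handle:
            handle.write("probe")
        os.remove(probe)
    except PermissionError:
        return "permission_denied"
    except OSError as exc:
        # Any other OS-level failure: missing path, full volume, and so on.
        return f"write_failed: {exc}"
    return "success"
```

Run it against the target directory, for instance `probe_write(r"\\server\share\configs")`, using the account that runs ntkDeploy so that the result reflects the same permissions.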
Backup behaviour: Before overwriting an existing config file, ntkDeploy creates a timestamped backup at
<original-path>.bak.<ISO8601-timestamp>. These backups can be used for manual rollback — see Rollback Guidance below.
Rollback Guidance¶
ntkDeploy does not have an automatic rollback UI for partially failed rollouts, but the backups it creates before overwriting provide the files needed for manual recovery.
What Is Backed Up¶
When a deployment runs with backup enabled (the default) and the target file already exists, a copy is written at:
<target-directory>\<config-file>.bak.<ISO8601-timestamp>
For example:
\\server\share\configs\ntk.json.bak.2026-03-03T14-22-10.000000
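The naming pattern can be reproduced with a short sketch, which is handy when searching a share for backups from a known time. The timestamp layout is inferred from the single example above (colons replaced by hyphens, six fractional digits); whether ntkDeploy uses UTC or local time is not stated here, and `backup_path` is a hypothetical helper.

```python
from datetime import datetime


def backup_path(target: str, when: datetime) -> str:
    """Build a backup file name in the pattern shown above."""
    # %f always renders six digits of microseconds, e.g. 000000.
    stamp = when.strftime("%Y-%m-%dT%H-%M-%S.%f")
    return f"{target}.bak.{stamp}"
```

Because every field in the timestamp has a fixed width, backup names for the same target sort chronologically when sorted as plain strings.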
What Is Not Backed Up¶
- Devices that did not yet have a configuration file receive no backup (there was nothing to back up).
- Devices whose write failed before the backup step are unaffected — their original file is intact.
Performing a Manual Rollback¶
- Identify the backup path from the Audit Log: the rollout_succeeded event metadata includes backupPath.
- Copy the .bak.* file back to its original name manually via File Explorer or PowerShell:

      Copy-Item "\\server\share\configs\ntk.json.bak.2026-03-03T14-22-10.000000" `
        "\\server\share\configs\ntk.json"

- To confirm recovery, re-run a preflight check on the device group; connectivity and write-permission status should return success.
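When several devices need restoring, the copy step can be scripted. The sketch below is one possible approach, not a supported ntkDeploy tool: instead of reading backupPath from the Audit Log, it assumes the newest `.bak.*` file next to the target is the one to restore, and `restore_latest_backup` is a hypothetical name. Check that assumption against the Audit Log before relying on it.

```python
import glob
import shutil
from typing import Optional


def restore_latest_backup(target: str) -> Optional[str]:
    """Copy the newest '<target>.bak.*' file over target.

    Returns the backup path used, or None if no backup exists.
    """
    # glob.escape protects any wildcard characters in the path itself.
    backups = sorted(glob.glob(glob.escape(target) + ".bak.*"))
    if not backups:
        return None
    # Fixed-width timestamps sort chronologically as plain strings.
    latest = backups[-1]
    shutil.copy2(latest, target)
    return latest
```

The backup file is copied rather than moved, so it remains available if the restore has to be repeated.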
Rollback Failure Codes¶
If a manual restore attempt fails, the following error codes may appear in the Audit Log:
| Code | Meaning | Remediation |
|---|---|---|
| backup_not_found | The .bak.* file no longer exists at the recorded path | Locate the file manually; it may have been removed by artifact cleanup |
| restore_failed | FileSystemException during the copy back | Check write permissions on the target path |
Reading Error Codes in the Audit Log¶
Every failed rollout writes a rollout_failed entry to the Audit Log with structured
metadata. To inspect it:
- Navigate to Audit Log in the sidebar.
- Locate the rollout_failed row (shown in red). The Metadata column contains:
  - failureCode: the primary error code (e.g., deploy_write_failed, preflight_failed)
  - message: a human-readable description
  - details: nested raw result if the code was wrapped by the deployment pipeline
- Compare failureCode against the tables in this guide and in Deployment Preflight Reference.
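If the metadata is copied out of the Audit Log as JSON, the lookup above can be automated. This is a sketch under assumptions: it presumes the metadata is available as a JSON string and that a wrapped result stores its raw code under a `code` key inside details. Neither is confirmed on this page, and `effective_failure` is a hypothetical helper.

```python
import json
from typing import Tuple


def effective_failure(metadata_json: str) -> Tuple[str, str]:
    """Return (code, message), unwrapping preflight_failed when possible."""
    meta = json.loads(metadata_json)
    code = meta.get("failureCode", "unknown")
    details = meta.get("details")
    if code == "preflight_failed" and isinstance(details, dict):
        # Assumed shape: the raw per-device code sits under "code".
        code = details.get("code", code)
    return code, meta.get("message", "")
```

The returned code is the one to look up in the tables on this page: the raw code for wrapped failures, and failureCode itself otherwise.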
Tip: You can filter the Audit Log by entity to see all events for a specific assignment or rollout — see Audit Log Reference.
Retry Strategies¶
| Scenario | Recommended strategy |
|---|---|
| Preflight failure (any code) | Fix the underlying issue, then re-run the full Deployment Wizard |
| Partial write failure (deploy_write_failed) | Fix write access, then re-run the full deployment (safe to repeat; a new backup is created) |
| Policy server temporarily unreachable | Wait for the server to recover, restart ntkDeploy to reset the gate, then retry |
| Schema validation failure | Edit and re-validate the profile, then re-deploy |
Full re-deployment is safe because ntkDeploy creates a backup before overwriting an existing file, provided backup is enabled (the default). There is no incremental-only retry path in the current release.