Deployment Failures

This page explains how to diagnose and recover from failures that occur during the deployment pipeline, from preflight checks through rollout. For UNC/SMB path problems that prevent preflight from ever starting, see Connectivity Issues.

Background: Every deployment runs a fail-closed preflight check pipeline before any artifact is written to disk. The pipeline is documented in full at Deployment Preflight Reference. The sections below map each failure mode to its pipeline stage.


Prerequisites

  • ntkDeploy installed and configured — see First Launch.
  • A deployment has been attempted and produced an error. Check the Audit Log for failure codes.

Preflight Blocked: Policy Connectivity Gate Closed

Symptom

The Deployment Wizard halts at the preflight step with failure code policy_connectivity_gate_closed or policy_connectivity_gate_unavailable. No device receives a configuration.

Cause

The Policy Connectivity Gate is a fail-closed circuit breaker that closes when the policy server becomes unreachable and stays closed until the connectivity monitoring service confirms recovery.

Code | Meaning
policy_connectivity_gate_closed | Gate was closed by a previous connectivity failure
policy_connectivity_gate_unavailable | Gate service was never initialised (app startup issue)
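
The difference between the two codes can be illustrated with a minimal sketch of a fail-closed gate. This is not ntkDeploy's implementation: the class and function names are hypothetical, and the sketch assumes that a freshly initialised gate does not block until a failure is recorded.

```python
from enum import Enum
from typing import Optional


class GateState(Enum):
    OPEN = "open"
    CLOSED = "closed"


class ConnectivityGate:
    """Hypothetical fail-closed gate, for illustration only."""

    def __init__(self) -> None:
        # Assumption: a newly initialised gate starts without a recorded failure.
        self.state = GateState.OPEN

    def record_failure(self) -> None:
        # The policy server became unreachable: block deployments.
        self.state = GateState.CLOSED

    def record_recovery(self) -> None:
        # Connectivity monitoring confirmed recovery: allow deployments again.
        self.state = GateState.OPEN


def preflight_gate_check(gate: Optional[ConnectivityGate]) -> Optional[str]:
    """Return a failure code, or None when the deployment may proceed."""
    if gate is None:
        # The gate service was never initialised, so nothing can vouch for
        # connectivity. Fail closed.
        return "policy_connectivity_gate_unavailable"
    if gate.state is GateState.CLOSED:
        return "policy_connectivity_gate_closed"
    return None
```

In this model a successful manual connection test changes nothing, because only `record_recovery` (standing in for the monitoring service) re-opens the gate.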

Resolution

  1. Test the policy server connection. Go to Settings → Policy Server and click Test connection. A success result does not directly re-open the gate, but it confirms whether the endpoint is currently reachable.
  2. Verify endpoint reachability from the command line:
    Test-NetConnection -ComputerName policy.example.com -Port 8443
    
  3. Restart ntkDeploy. On restart the connectivity gate service reinitialises and the gate re-evaluates from a neutral state. If the server is reachable, the gate will open automatically.
  4. For policy_connectivity_gate_unavailable specifically, a restart is the only remedy — the gate service is only available after the service locator initialises at boot.
  5. If the gate remains closed after restart, check the Audit Log for policy_connectivity_gate events — the metadata field records the exact failure outcome.

Further reading: Settings Reference → Policy Server; Preflight step 6: connectivity gate


Preflight Blocked: Device Owner Not Assigned

Symptom

Preflight fails with code device_owner_unassigned for one or more device keys in the target device group.

Cause

The per-device policy preflight (step 7 of the pipeline) requires that every device key has a person assigned as its owner. Without an owner, ABAC attribute lookups cannot be performed and the deployment is blocked.

Resolution

  1. Navigate to Device Groups, open the affected group, and identify which device keys lack owner assignments.
  2. Go to Policies → People and confirm the relevant person records exist. If not, create them first — see Managing Policies.
  3. Return to the Device Group detail and assign a person to each un-owned device key using the Assign Owner control.
  4. Re-run the preflight from the Deployment Wizard.
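
As a rough illustration of the owner check, the sketch below lists the device keys that would block a deployment. The mapping of device key to owner is a hypothetical data shape, not the format ntkDeploy stores internally.

```python
from typing import Dict, List, Optional


def find_unowned_device_keys(owners: Dict[str, Optional[str]]) -> List[str]:
    """Return device keys with no owner assigned, sorted for stable output."""
    return sorted(key for key, owner in owners.items() if not owner)


def owner_preflight(owners: Dict[str, Optional[str]]) -> Optional[dict]:
    """Return a failure record if any device key lacks an owner, else None."""
    unowned = find_unowned_device_keys(owners)
    if unowned:
        # Hypothetical result shape; only the code name comes from this guide.
        return {"code": "device_owner_unassigned", "deviceKeys": unowned}
    return None
```

A single un-owned key is enough to fail the check for the whole group, which is why every key must be assigned before the preflight is re-run.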

Further reading: Preflight code table: device_owner_unassigned; Device Enrollment


Preflight Blocked: ABAC Assignment Missing or Expired

Symptom

Preflight fails with assignments_missing or assignments_expired for specific devices.

Cause

The profile includes ABAC attributes that require valid policy assignments. The policy server found that either no assignment exists or an existing assignment has passed its expiry date.

Code | Meaning
assignments_missing | No valid assignment exists for a required ABAC attribute
assignments_expired | An assignment exists but its expiry date has passed

Resolution

  1. Open Policies → Assignments and filter by the affected person or device key.
  2. For assignments_missing: create a new assignment for the required attribute — see Managing Policies.
  3. For assignments_expired: renew the assignment (update the expiry date) or create a replacement assignment.
  4. Re-run the Deployment Wizard after updating assignments.

Tip: The assignments_expired code is also raised if the device's system clock is significantly ahead of the policy server. Confirm clock synchronisation if you see unexpected expiry failures.
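
The effect of clock skew on expiry can be shown with a small sketch. The record shape (`attributeKey`, `expiresAt`) and the function name are assumptions made for illustration; the actual evaluation happens inside ntkDeploy and the policy server.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def classify_assignment(assignment: Optional[dict], now: datetime) -> Optional[str]:
    """Return a preflight code for one required attribute, or None if valid."""
    if assignment is None:
        return "assignments_missing"
    expires_at = assignment.get("expiresAt")  # None means "never expires" here
    if expires_at is not None and now > expires_at:
        return "assignments_expired"
    return None


server_now = datetime(2026, 3, 3, 12, 0, tzinfo=timezone.utc)
assignment = {"attributeKey": "region", "expiresAt": server_now + timedelta(days=1)}

# Evaluated against the server's clock, the assignment is still valid.
print(classify_assignment(assignment, server_now))
# Evaluated against a clock running two days ahead, it appears expired.
print(classify_assignment(assignment, server_now + timedelta(days=2)))
```

The same assignment yields different results depending only on the `now` value used, which is why an unexpected `assignments_expired` is worth checking against time synchronisation before renewing anything.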


Preflight Blocked: Schema Validation Failure

Symptom

Preflight halts at the schema validation stage (steps 1–5) before policy checks are even reached. The wizard shows a code such as invalid_state, invalid_json, invalid_format, or schema_validation.

Cause

The profile version being deployed failed the schema validation pipeline. Each code maps to a distinct check:

Code | Failed check
invalid_state | Profile version status is not valid
invalid_json | settingsJson contains malformed JSON or failed re-encode
invalid_format | JSON root is not an object ({})
schema_validation | One or more fields violate schema v1.0 rules
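
The order of these checks matters, because the first failing check determines the code you see. The sketch below mirrors that order under stated assumptions: the deployable status name (`"valid"`) and the `schema_check` callback are hypothetical stand-ins, since the real status values and schema v1.0 rules are defined in the Profile Schemas Reference.

```python
import json
from typing import Callable, List, Optional, Sequence


def validate_profile(
    status: str,
    settings_json: str,
    schema_check: Callable[[dict], List[str]],
    deployable_statuses: Sequence[str] = ("valid",),  # hypothetical status name
) -> Optional[str]:
    """Return the first failing check's code, or None if all checks pass."""
    if status not in deployable_statuses:
        return "invalid_state"
    try:
        parsed = json.loads(settings_json)
        json.dumps(parsed)  # round-trip to confirm the value re-encodes
    except (TypeError, ValueError):
        return "invalid_json"
    if not isinstance(parsed, dict):
        # Valid JSON, but the root is an array, string, number, etc.
        return "invalid_format"
    if schema_check(parsed):
        # schema_check returns a list of field-level errors (empty when clean).
        return "schema_validation"
    return None
```

For example, `validate_profile("valid", "[1, 2]", lambda s: [])` returns `invalid_format`: the JSON parses, but its root is not an object.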

Resolution

  1. Open the profile in the editor. The status badge at the top shows the current validation state.
  2. Press Ctrl+Enter (or the Validate button) to trigger re-validation and see the field-level error list.
  3. For invalid_json or invalid_format, switch to the Raw JSON tab and correct the syntax. The editor highlights the offending line.
  4. Once all errors are resolved and the status badge shows Valid, return to the Deployment Wizard and retry.

Further reading: Profile Schemas Reference; Creating a Profile


Preflight Blocked: Non-Actionable Policy Preflight Failure

Symptom

Preflight fails with the generic code preflight_failed. The wizard may show limited detail in the UI.

Cause

The deployment pipeline wraps certain per-device preflight codes into the generic preflight_failed code before returning them. The underlying raw code is attached as details. Raw codes wrapped this way include:

Raw code | Meaning
server_unreachable | Policy server did not respond to the ping
fetch_failed | Could not retrieve assignments or attribute definitions
attribute_not_found | A required ABAC attribute definition is missing on the server
missing_attribute_key | An assignment is present but attributeKey is empty

Resolution

  1. Open the Audit Log, find the failed rollout event (rollout_failed), and inspect the Metadata column — the raw code and message are recorded there.
  2. Address the underlying cause based on the raw code:
    • server_unreachable / fetch_failed → Check policy server connectivity (Connectivity Issues).
    • attribute_not_found → Verify attribute IDs in the profile match definitions on the policy server.
    • missing_attribute_key → Regenerate device credentials or contact your policy administrator.
  3. Retry the deployment once the underlying issue is resolved.
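
If you export or copy the metadata of a failed rollout, unwrapping the raw code can be sketched as follows. The `failureCode` and `details` field names come from this guide; the assumption that the raw code sits under a `code` key inside `details` is illustrative only, so check a real entry before relying on it.

```python
REMEDIATION = {
    "server_unreachable": "Check policy server connectivity (Connectivity Issues).",
    "fetch_failed": "Check policy server connectivity (Connectivity Issues).",
    "attribute_not_found": "Verify attribute IDs in the profile match definitions on the policy server.",
    "missing_attribute_key": "Regenerate device credentials or contact your policy administrator.",
}


def effective_code(metadata: dict) -> str:
    """Return the raw code behind preflight_failed, if one is attached."""
    code = metadata.get("failureCode", "")
    details = metadata.get("details")
    # Assumption: the wrapped result is a mapping with the raw code under "code".
    if code == "preflight_failed" and isinstance(details, dict):
        return details.get("code", code)
    return code


def remediation_for(metadata: dict) -> str:
    """Look up the suggested fix for a failed rollout's metadata."""
    return REMEDIATION.get(effective_code(metadata), "See the tables in this guide.")
```

Codes that are not wrapped, such as `deploy_write_failed`, pass through unchanged and fall back to the generic hint.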

Deployment Failed: Write Error at UNC Path

Symptom

The Deployment Wizard reports a mid-rollout failure with code deploy_write_failed. Some device groups may have received the configuration while others did not.

Cause

A write error (deploy_write_failed) occurred when ntkDeploy tried to write the JSON payload to the target UNC path. The connectivity pre-check passed, but the write operation itself failed — typically because of a transient network interruption or a permission change between the preflight check and the write.

Resolution

  1. Check the share is still reachable: Run a connectivity check from the Device Group detail page and confirm the path shows success.
  2. Check write permissions: A write permission probe creates and deletes a temporary file at the target path. If it returns permission_denied (OS error 5), the account running ntkDeploy lacks write access to the share. Work with your storage team to grant write permissions.
  3. Check available disk space on the remote share. A full volume returns a FileSystemException with a message referencing insufficient space.
  4. Once the issue is resolved, re-run the full deployment — there is no partial-device retry. Devices that received the configuration in a partial rollout will keep the new version; the rest will be updated on the next run.
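
To reproduce the write-permission probe outside ntkDeploy, a rough equivalent is to create and delete a temporary file in the target directory. This is a sketch, not the tool's own probe: the file prefix and the `success` / `os_error:<n>` result strings are hypothetical, and only `permission_denied` corresponds to a code named in this guide.

```python
import os
import tempfile


def probe_write_access(target_dir: str) -> str:
    """Create and delete a temporary file in target_dir; report the outcome."""
    try:
        fd, path = tempfile.mkstemp(prefix=".write-probe-", dir=target_dir)
        os.close(fd)
        os.remove(path)
    except PermissionError:
        # On Windows, "Access is denied" (OS error 5) surfaces as PermissionError.
        return "permission_denied"
    except OSError as exc:
        # Anything else: missing path, full volume, network interruption, etc.
        return f"os_error:{exc.errno}"
    return "success"
```

Run it as the same account that runs ntkDeploy, for example `probe_write_access(r"\\server\share\configs")`; a result from a different account says nothing about the deployment's permissions.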

Backup behaviour: Before overwriting an existing config file, ntkDeploy creates a timestamped backup at <original-path>.bak.<ISO8601-timestamp>. These backups can be used for manual rollback — see Rollback Guidance below.


Rollback Guidance

ntkDeploy does not have an automatic rollback UI for partially failed rollouts, but the timestamped backups it writes before overwriting a config file are enough for a manual recovery.

What Is Backed Up

When a deployment runs with backup enabled (the default) and the target file already exists, a copy is written to:

<target-directory>\<config-file>.bak.<ISO8601-timestamp>
Example:
\\server\share\configs\ntk.json.bak.2026-03-03T14-22-10.000000
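
The naming pattern can be reproduced with a short sketch, which is useful when searching a share for the backup of a given run. The timestamp format string is inferred from the example above (colons replaced by hyphens, six fractional digits) and is an assumption, not a documented guarantee.

```python
from datetime import datetime


def backup_path_for(target_path: str, when: datetime) -> str:
    """Build <target>.bak.<timestamp> in the style of the example above."""
    # Colons are not allowed in Windows file names, hence the hyphens.
    stamp = when.strftime("%Y-%m-%dT%H-%M-%S.%f")
    return f"{target_path}.bak.{stamp}"


print(backup_path_for(r"\\server\share\configs\ntk.json", datetime(2026, 3, 3, 14, 22, 10)))
```

Because every field is zero-padded, sorting these names alphabetically also sorts them chronologically.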

What Is Not Backed Up

  • Devices that did not yet have a configuration file receive no backup (there was nothing to back up).
  • Devices whose write failed before the backup step are unaffected — their original file is intact.

Performing a Manual Rollback

  1. Identify the backup path from the Audit Log — the rollout_succeeded event metadata includes backupPath.
  2. Copy the .bak.* file back to its original name manually via File Explorer or PowerShell:
    Copy-Item "\\server\share\configs\ntk.json.bak.2026-03-03T14-22-10.000000" `
              "\\server\share\configs\ntk.json"
    
  3. To confirm recovery, re-run a preflight check on the device group — connectivity and write-permission status should return success.
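
When the Audit Log entry is not at hand, the most recent backup can also be located by name. The sketch below assumes the `.bak.<timestamp>` suffix sorts chronologically (zero-padded fields) and that restoring means copying the backup over the original. The function names are hypothetical; try it on a copy of the share before touching production files.

```python
import glob
import os
import shutil
from typing import Optional


def latest_backup(target_path: str) -> Optional[str]:
    """Return the newest <target>.bak.* file, or None if there is none."""
    candidates = glob.glob(glob.escape(target_path) + ".bak.*")
    # Same prefix plus zero-padded timestamps: the largest name is the newest.
    return max(candidates, default=None)


def restore_latest(target_path: str) -> str:
    """Copy the newest backup over target_path and return the backup used."""
    backup = latest_backup(target_path)
    if backup is None:
        raise FileNotFoundError(f"no backup found for {target_path}")
    shutil.copy2(backup, target_path)  # the backup file itself is left in place
    return backup
```

The newest backup is not always the right one. If the deployment was re-run after the failure, the latest backup may already contain the configuration you are trying to roll back, so compare the timestamp with the rollout you want to undo.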

Rollback Failure Codes

If a manual restore attempt fails, the following error codes may appear in the Audit Log:

Code | Meaning | Remediation
backup_not_found | The .bak.* file no longer exists at the recorded path | Locate the file manually; it may have been removed by artifact cleanup
restore_failed | FileSystemException during the copy back | Check write permissions on the target path

Reading Error Codes in the Audit Log

Every failed rollout writes a rollout_failed entry to the Audit Log with structured metadata. To inspect it:

  1. Navigate to Audit Log in the sidebar.
  2. Locate the rollout_failed row (shown in red). The Metadata column contains:
    • failureCode: the primary error code (e.g., deploy_write_failed, preflight_failed)
    • message: a human-readable description
    • details: the nested raw result, if the code was wrapped by the deployment pipeline
  3. Compare failureCode against the tables in this guide and in Deployment Preflight Reference.
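
When many rollouts have failed, tallying entries by failureCode shows which problem to fix first. The sketch assumes you have the entries available as a list of records with `event` and `metadata` keys. That shape is hypothetical, since this guide does not describe an export format.

```python
from collections import Counter
from typing import Dict, Iterable


def failure_summary(entries: Iterable[dict]) -> Dict[str, int]:
    """Count rollout_failed entries by their failureCode."""
    counts: Counter = Counter()
    for entry in entries:
        if entry.get("event") != "rollout_failed":
            continue  # ignore successes and unrelated events
        metadata = entry.get("metadata") or {}
        counts[metadata.get("failureCode", "unknown")] += 1
    return dict(counts)
```

A summary dominated by `preflight_failed` points at the policy server or its data. One dominated by `deploy_write_failed` points at the share.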

Tip: You can filter the Audit Log by entity to see all events for a specific assignment or rollout — see Audit Log Reference.


Retry Strategies

Scenario | Recommended strategy
Preflight failure (any code) | Fix the underlying issue, then re-run the full Deployment Wizard
Partial write failure (deploy_write_failed) | Fix write access, then re-run the full deployment (safe to repeat; a backup is created again)
Policy server temporarily unreachable | Wait for the server to recover, restart ntkDeploy to reset the gate, then retry
Schema validation failure | Edit and re-validate the profile, then re-deploy

Full re-deployment is always safe because ntkDeploy creates a backup before overwriting. There is no incremental-only retry path in the current release.


Next Steps