AI-augmented security auditing, for the in-house engineer.
A reproducible workflow for running AI-assisted IT security audits on your own infrastructure. Kali Linux in a Docker container, the official MCP server as the tool seam, and GitHub Copilot agent mode in VS Code as the driver — with every tool call gated behind your explicit approval.
Who this is for
Two readers, both served by every section.
| Reader | Prior knowledge | What they want |
|---|---|---|
| Primary — the operator. SME in-house IT/security engineer; technically capable owner-operator. | Comfortable with the CLI, basic networking, an IDE. No prior penetration-testing training assumed. | A reproducible method they can run on their own kit. Exact commands, verifiable steps, cross-platform parity. |
| Secondary — the sponsor. Non-specialist IT leadership; learners new to AI-augmented security work. | General IT literacy; no command-line fluency required. | Enough understanding of what an AI-assisted audit looks like to sponsor, oversee, or sit alongside it. |
What you will be able to do
- L1 State the legal and ethical preconditions for an internal audit in the UK (anchored on the Computer Misuse Act 1990) and produce a one-page written authorisation against a real target.
- L2 Stand up Kali Linux in a Docker container with the official MCP server running as a systemd-managed service on Linux, Windows, or macOS (a minimal sketch of the container setup follows this list).
- L3 Wire VS Code and GitHub Copilot Chat (in agent mode) to that MCP server, with the operator approval boundary intact on every tool call.
- L4 Run an end-to-end audit session against a scoped target: framing, reconnaissance, CVE verification, authentication and exposure analysis, and a written report.
- L5 Recognise the new risks an agentic-AI toolchain introduces — prompt injection, excessive agency, tool-call abstraction, supply-chain risk, data leakage — and threat-model their own organisation's adoption of it.
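For a first taste of the manual path, a minimal sketch of the container setup is below. The image tag is the official kalilinux/kali-rolling; the container name, the keep-alive command, and the omission of the extra flags needed to run systemd (and therefore the MCP server as a service) inside the container are illustrative assumptions. Section 04 explains every flag the real setup uses.

```bash
# Minimal sketch, not the guide's exact setup. The image is the official Kali
# rolling image; the container name and keep-alive command are assumptions,
# and the extra flags needed to run systemd inside the container are omitted.
docker run -d --name kali-audit --hostname kali-audit \
  kalilinux/kali-rolling sleep infinity

# Open an interactive shell inside the running container.
docker exec -it kali-audit /bin/bash
```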
How the guide is built
Four teaching commitments shape every section.
- Legal before technical. Authorisation, scope, and disclosure are taught in Section 02, before any tool is introduced. The same Nmap scan is professional work or a Computer Misuse Act offence depending on whether the operator can produce written authorisation.
- Two install paths. The Docker setup is taught manually first (every flag visible), then automated with Docker Compose (a compose-file sketch follows this list). You learn what each flag does before letting a compose file hide it.
- Example prompts, not a guided audit. Section 07 gives a handful of prompt templates you can adapt — framing, CVE triangulation, report drafting — rather than walking through one specific engagement step by step.
- The assistant proposes; the operator approves. Every tool call in agent mode is a proposal you must accept. This approval boundary is the load-bearing safety property of the whole workflow.
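To show what the compose path condenses, here is the shape such a file might take. The service name, the container name, and the first-boot entrypoint script are placeholder assumptions; Section 05 gives the real file.

```yaml
# docker-compose.yml sketch under assumptions: the service name, container
# name, and first-boot entrypoint script are placeholders, not the guide's file.
services:
  kali:
    image: kalilinux/kali-rolling
    container_name: kali-audit
    tty: true
    volumes:
      - ./entrypoint.sh:/entrypoint.sh:ro   # hypothetical first-boot script
    entrypoint: ["/bin/bash", "/entrypoint.sh"]
```

With a file like this in place, `docker compose up -d` brings the lab up in one command.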
The eight sections
02 Legal & ethical foundations
Authorisation in writing, scope drift, and the Computer Misuse Act 1990 applied to a port scan.
03 The toolchain at a glance
The four-part architecture and the two trust boundaries that hold it together.
04 Kali in Docker — Method 1
Manual step-by-step setup with docker run and docker exec. Every flag explained.
05 Kali in Docker — Method 2
The same lab in one command using Docker Compose and a first-boot entrypoint script.
06 Wiring VS Code to the MCP server
Configure mcp.json, enable agent mode, and verify the approval boundary (a minimal mcp.json sketch follows this list).
07 Running the workflow
A five-stage session structure plus three example prompts you can adapt to your own engagement.
08 Threat-modelling agentic AI
The new risks the toolchain itself introduces — prompt injection, excessive agency, and four more.
G Glossary & key
Every acronym, tool, and standard cited anywhere in the guide.
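Ahead of Section 06, this is the general shape of a VS Code mcp.json entry. The server key, the transport type, and the URL/port are assumptions, not the values the guide uses for the official MCP server.

```jsonc
// .vscode/mcp.json sketch: the server key "kali-mcp", the SSE transport, and
// the port are assumptions; Section 06 gives the real values.
{
  "servers": {
    "kali-mcp": {
      "type": "sse",
      "url": "http://localhost:8000/sse"
    }
  }
}
```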
What a session actually feels like
Section 07 walks through the shape of a working session: how to frame the engagement, how to force the assistant to triangulate a candidate finding, and how to constrain the report draft at the end. The prompts are deliberately generic — you supply the target, the scope, and the constraints; the prompts give you the scaffolding.
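As an illustration of that generic scaffolding (this is not one of Section 07's templates, only the shape a framing prompt might take):

```text
You are assisting with an authorised internal security audit.
In scope: [the hosts or subnets named in the written authorisation]
Out of scope: everything else; do not propose tool calls against it.
Constraints: non-destructive enumeration only. Propose one tool call at a
time and wait for my approval before proposing the next.
First task: summarise what you can establish about the target without any
active scanning, then propose the least intrusive scan that would help.
```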
One lesson worth previewing here: in the kinds of audits this workflow is built for, the headline risk is usually deployment posture, not unpatched CVEs. Firmware patching is necessary but not sufficient. The highest-leverage remediation is almost always configuration-level — what is exposed, on which port, with which encryption, behind which authentication — and an AI-augmented audit gets you to those conclusions faster than working unaided.
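To make "configuration-level" concrete, checks of that kind tend to look like the following (the target address is a placeholder; the exact ports and scripts depend on your scope):

```bash
# What is exposed, and on which ports? (192.0.2.10 is a placeholder target)
nmap -sV -p- 192.0.2.10

# With which encryption? Enumerate TLS versions and cipher suites on a web port.
nmap --script ssl-enum-ciphers -p 443 192.0.2.10

# Behind which authentication? e.g. check whether FTP allows anonymous login.
nmap --script ftp-anon -p 21 192.0.2.10
```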
What this guide deliberately does not cover
- Penetration-testing certification material. OWASP WSTG, NIST SP 800-115, and the CREST CRT body of knowledge are referenced but not replaced.
- Unauthorised access of any kind. Every technique is taught inside a lawful scope. Section 02 sits ahead of every technical section by design.
- Adjacent disciplines. Web application testing, mobile, cloud, and social engineering are out of scope — this guide stays focused on internal IT and infrastructure audits.
Check yourself
Three questions on what this guide is for
Pick the option you think is right and expand it to see the verdict and a one-sentence explanation. These aren't trivia — getting any of them wrong means you've misunderstood something the rest of the guide will assume you've got.
Who is this guide intended for?
- Professional penetration testers preparing for a certification exam.
- An SME in-house IT or security engineer auditing infrastructure they are authorised to test.
- Anyone who wants to learn how to break into IP cameras.
What does the workflow's "the assistant proposes; the operator approves" grammar describe?
- A polite convention for talking about the AI in writing.
- The load-bearing safety property of the whole workflow: every tool call is a proposal that must be approved.
- An indication that the AI is restricted to read-only commands.
Does the architecture in this guide replace your judgement about which targets are in scope?