FEAT: Add PromptDecompositionConverter (DrAttack decompose-and-reconstruct) by Raulster24 · Pull Request #2003 · microsoft/PyRIT

Raulster24 · 2026-06-14T04:47:24Z

Description

This adds a PromptDecompositionConverter implementing the decompose-and-reconstruct technique from DrAttack (Li et al., Findings of EMNLP 2024, https://arxiv.org/abs/2402.16914). Following the discussion with @rlundeen and @romanlutz, it is implemented as a converter rather than an attack class, so it stays composable with the existing engines (TAP, Crescendo, PromptSendingAttack) instead of duplicating their traversal logic.

The converter:

Decomposes the objective into a flat, role-tagged phrase structure ({"words": [...], "types": [...]}) using an LLM. The flat form was chosen over a nested parse tree because it is much easier to validate.
Rebuilds it as a "Question A / Question B" reconstruction task plus a static benign in-context demonstration, so the target reassembles the original intent itself and the assembled instruction never appears verbatim in the request.
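
As a rough illustration of the two steps above, the sketch below models the flat `{"words": [...], "types": [...]}` structure and assembles a lettered reconstruction prompt from it. It is not the converter's code: the tag names, the `Decomposition` class, `build_reconstruction_prompt`, and the prompt wording are all assumptions made for this example, and the objective is deliberately benign.

```python
from dataclasses import dataclass

# Hypothetical role tags; the PR text does not list the converter's actual tag set.
VALID_TAGS = {"instruction", "structure", "verb", "noun"}


@dataclass(frozen=True)
class Decomposition:
    """Flat, role-tagged phrase structure: words[i] is tagged with types[i]."""

    words: list
    types: list

    @classmethod
    def from_dict(cls, data: dict) -> "Decomposition":
        return cls(words=list(data["words"]), types=list(data["types"]))


def build_reconstruction_prompt(decomposition: Decomposition, demonstration: str) -> str:
    """Label each phrase 'Question A', 'Question B', ... and ask for reassembly.

    The phrases are listed one per line, so the assembled instruction never
    appears as a single verbatim string. Supports at most 26 phrases.
    """
    labels = [f"Question {chr(ord('A') + i)}" for i in range(len(decomposition.words))]
    lines = [demonstration, ""]
    for label, phrase in zip(labels, decomposition.words):
        lines.append(f"{label}: {phrase}")
    lines.append("")
    lines.append(
        "Combine the answers to " + ", ".join(labels)
        + " in order and respond to the combined request."
    )
    return "\n".join(lines)


example = Decomposition.from_dict(
    {
        "words": ["Write", "a short poem", "about autumn leaves"],
        "types": ["instruction", "noun", "structure"],
    }
)
prompt = build_reconstruction_prompt(example, "(static benign demonstration goes here)")
```

A flat pair of parallel lists keeps validation to length and membership checks, which is the trade-off the description cites for avoiding a nested parse tree.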

The paper precomputes the decomposition offline; this converter runs it live, so that path is hardened:

The structured output is validated against a reconstruction-recall invariant (the joined phrases must preserve the original tokens), plus valid-tag and opening-instruction checks.
On a validation failure the error is fed back to the model and the call is retried.
A deterministic spaCy part-of-speech fallback runs if every attempt fails, so the converter never hard-fails on valid input (spaCy is already a PyRIT dependency).
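
The hardening steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `validate`, `decompose`, `whitespace_fallback`, the tag set, and the reading of "opening-instruction" as "the first phrase is tagged `instruction`" are all hypothetical, the LLM is replaced by a plain callable, and a trivial whitespace split stands in for the spaCy part-of-speech fallback.

```python
import re
from collections import Counter
from typing import Callable, Optional

# Hypothetical tag set; the PR does not list the real one.
VALID_TAGS = {"instruction", "structure", "verb", "noun"}


def _tokens(text: str) -> Counter:
    """Lowercased word tokens as a multiset."""
    return Counter(re.findall(r"\w+", text.lower()))


def validate(objective: str, data: object) -> None:
    """Raise ValueError naming the first failed check."""
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    words, types = data.get("words"), data.get("types")
    if not isinstance(words, list) or not isinstance(types, list):
        raise ValueError("'words' and 'types' must be lists")
    if not words or len(words) != len(types):
        raise ValueError("'words' and 'types' must be non-empty and equal in length")
    if not all(isinstance(item, str) for item in words + types):
        raise ValueError("every phrase and tag must be a string")
    unknown = sorted(set(types) - VALID_TAGS)
    if unknown:
        raise ValueError(f"unknown tags: {unknown}")
    if types[0] != "instruction":  # assumed meaning of the opening-instruction check
        raise ValueError("the first phrase must be tagged 'instruction'")
    # Reconstruction recall: every objective token must survive in the joined phrases.
    missing = _tokens(objective) - _tokens(" ".join(words))
    if missing:
        raise ValueError(f"joined phrases drop objective tokens: {sorted(missing)}")


def whitespace_fallback(objective: str) -> dict:
    """Deterministic stand-in for the spaCy fallback: one phrase per word."""
    words = objective.split()
    if not words:
        raise ValueError("objective is empty")
    return {"words": words, "types": ["instruction"] + ["structure"] * (len(words) - 1)}


def decompose(
    objective: str,
    ask_llm: Callable[[str, Optional[str]], object],
    max_attempts: int = 3,
    fallback: Optional[Callable[[str], dict]] = None,
) -> dict:
    """Call ask_llm, feeding the previous validation error back on each retry."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        data = ask_llm(objective, error)
        try:
            validate(objective, data)
            return data
        except ValueError as exc:
            error = str(exc)
    if fallback is None:
        raise ValueError(f"decomposition failed after {max_attempts} attempts: {error}")
    return fallback(objective)
```

Passing the error text into the next call gives the model a concrete correction target, and keeping the fallback optional matches the no-fallback-raises case listed in the tests.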

Live decomposition produced a valid parse on 93% of 30 AdvBench objectives (gpt-4o-mini), and every valid parse reconstructed the objective exactly.

Scope and follow-ups: this PR is the core converter. The word-game variant and registering the technique in scenario_techniques.py are intended as follow-ups. The catalog registration has an open design question worth input: create() resolves LLM targets lazily only for the adversarial-chat slot, not for converters in attack_converter_config, so wiring a target-needing converter into the static catalog needs a decision (reuse the adversarial target, the objective target, or add a lazy-converter slot). Happy to take that on separately.

Tests and Documentation

Added tests/unit/prompt_converter/test_prompt_decomposition_converter.py (7 tests): reconstruction assembly, retry-with-error-feedback, deterministic fallback, no-fallback-raises, reconstruction-recall rejection, invalid input type, and identifier construction.
Documented in doc/code/converters/1_text_to_text_converters.py under LLM-based converters, and added the DrAttack reference to doc/references.bib.
Ran jupytext --sync on doc/code/converters/1_text_to_text_converters.py so the paired notebook is updated.
ruff check and ruff format clean; ty reports no errors on the converter.

FEAT: Add PromptDecompositionConverter (DrAttack)

e7e4a39

Raulster24 wants to merge 1 commit into microsoft:main from Raulster24:raulster24/add-prompt-decomposition-converter
