FEAT: atomic attack identifier #1446

Merged
rlundeen2 merged 16 commits into Azure:main from rlundeen2:users/rlundeen/2026_03_04_atomic_identifier
Mar 10, 2026

@rlundeen2
Contributor

We need a way to uniquely identify attack techniques. If a technique works, we want to see how well it works and which changes affect its success. The identifying information comes from two places: the attack object and the general_technique seeds.

This PR adds a ComponentIdentifier (via build_atomic_attack_identifier) that nests the attack strategy, its sub-components (objective target, adversarial chat, converters, scorer), and general-technique seeds.
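
To illustrate the nesting described above, here is a minimal sketch. It is not the PR's implementation: the `Identifier` class, its fields, the `digest` method, and `build_identifier_sketch` are all hypothetical stand-ins for `ComponentIdentifier` and `build_atomic_attack_identifier`, and the class names passed in are made up for the example.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Identifier:
    """Hypothetical nested identifier; not PyRIT's ComponentIdentifier."""

    class_name: str
    params: dict[str, Any] = field(default_factory=dict)
    # Maps a child name to one Identifier or a list of Identifiers.
    children: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        def convert(child: Any) -> Any:
            if isinstance(child, list):
                return [c.to_dict() for c in child]
            return child.to_dict()

        return {
            "class_name": self.class_name,
            "params": self.params,
            "children": {k: convert(v) for k, v in self.children.items()},
        }

    def digest(self) -> str:
        # Canonical JSON (sorted keys) so equal content gives an equal hash.
        canonical = json.dumps(self.to_dict(), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_identifier_sketch(attack, objective_target, adversarial_chat,
                            converters, scorer, seeds) -> Identifier:
    """Hypothetical builder: nests sub-components and seeds under the attack."""
    return Identifier(
        class_name=attack,
        children={
            "objective_target": objective_target,
            "adversarial_chat": adversarial_chat,
            "request_converters": converters,
            "objective_scorer": scorer,
            "seeds": seeds,
        },
    )


def example(seed_text: str) -> Identifier:
    # All class names and parameter values below are invented for illustration.
    return build_identifier_sketch(
        attack="ExampleAttack",
        objective_target=Identifier("ExampleTarget", {"temperature": 0.7}),
        adversarial_chat=Identifier("ExampleChat", {"model_name": "adv-1"}),
        converters=[Identifier("ExampleConverter")],
        scorer=Identifier("ExampleScorer", {"threshold": 0.5}),
        seeds=[Identifier("ExampleSeed", {"value": seed_text})],
    )


first = example("seed text")
second = example("seed text")
changed = example("different seed text")
```

In this sketch, `first.digest()` and `second.digest()` match because their content is identical, while `changed.digest()` differs because a nested seed changed.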

To evaluate metrics, AtomicAttackEvaluationIdentity computes a behavioral equivalence hash by selectively filtering children: the objective_target keeps only temperature; the adversarial_chat keeps model_name, temperature, and top_p; the objective_scorer is excluded entirely; and all other children (request/response converters, seeds) are included in full. As a result, two attack configurations that differ only in deployment details (endpoints, rate limits, scorer thresholds, or the objective target's model name) produce the same eval hash, while any difference in behavioral parameters (temperature settings, adversarial model choice, converter pipeline, or seed content) produces a distinct hash. This enables consistent evaluation grouping across deployments.
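
The filtering rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `EVAL_RULES`, `filter_children`, and `eval_hash` are hypothetical names, children are modeled as plain dicts and lists, and SHA-256 over canonical JSON is assumed as the hashing scheme.

```python
import copy
import hashlib
import json

# Hypothetical per-child rules mirroring the description above.
# A set means "keep only these params"; None means "drop the child".
# Children not listed here are kept in full.
EVAL_RULES = {
    "objective_target": {"temperature"},
    "adversarial_chat": {"model_name", "temperature", "top_p"},
    "objective_scorer": None,
}


def filter_children(children):
    """Return only the parts of each child that count as behavioral."""
    filtered = {}
    for name, value in children.items():
        if name not in EVAL_RULES:
            filtered[name] = value  # converters, seeds: included in full
            continue
        allowed = EVAL_RULES[name]
        if allowed is None:
            continue  # objective_scorer: excluded entirely
        filtered[name] = {k: v for k, v in value.items() if k in allowed}
    return filtered


def eval_hash(attack_class, children):
    """Hash the attack class plus the filtered children (assumed scheme)."""
    payload = {"attack": attack_class, "children": filter_children(children)}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Invented configuration values, for illustration only.
base = {
    "objective_target": {"model_name": "model-a", "endpoint": "https://a.example", "temperature": 0.7},
    "adversarial_chat": {"model_name": "adv-1", "endpoint": "https://adv.example",
                         "temperature": 1.0, "top_p": 0.9},
    "objective_scorer": {"threshold": 0.5},
    "request_converters": [{"class_name": "ExampleConverter"}],
    "seeds": ["seed text"],
}

# Differs only in deployment details: endpoint, target model name, scorer threshold.
redeployed = copy.deepcopy(base)
redeployed["objective_target"]["endpoint"] = "https://b.example"
redeployed["objective_target"]["model_name"] = "model-b"
redeployed["objective_scorer"]["threshold"] = 0.9

# Differs in a behavioral parameter: the objective target's temperature.
hotter = copy.deepcopy(base)
hotter["objective_target"]["temperature"] = 1.2

same_group = eval_hash("ExampleAttack", base) == eval_hash("ExampleAttack", redeployed)
new_group = eval_hash("ExampleAttack", base) != eval_hash("ExampleAttack", hotter)
```

Under these assumptions, `same_group` and `new_group` are both true: the redeployed configuration lands in the same evaluation group, while the temperature change produces a new one.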

@rlundeen2 rlundeen2 merged commit d1ce4ce into Azure:main Mar 10, 2026
35 checks passed
4 participants