
Raw data

Everything needed to reproduce the numbers on the main page: five JSONL files (one per Qwen3 base model) containing the top-20 next tokens, ranked by logprob, and their probabilities for every (scenario × priming) cell, plus the Python source for the scenarios and the word-list categorization.

Probe results

file | contents | size
probe_Qwen3-0_6B-Base.jsonl | 0.6B base model · all scenarios × all primings | 407K
probe_Qwen3-1_7B-Base.jsonl | 1.7B base model | 404K
probe_Qwen3-4B-Base.jsonl | 4B base model | 402K
probe_Qwen3-8B-Base.jsonl | 8B base model | 429K
probe_Qwen3-14B-Base.jsonl | 14B base model | 427K

Scenarios & categorization

file | contents | size
probe_scenarios.py | The 28 scenario definitions, pro/anti word lists per scenario, priming templates, and the prefix-match categorize_token function | 22K
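
To give a rough idea of what prefix-match categorization can look like, here is a minimal sketch. It is not the code from probe_scenarios.py: the function name categorize_token_sketch, the labels "pro", "anti", and "other", and the matching rule (the stripped, lowercased token must be a non-empty prefix of a listed word) are all assumptions made for illustration. Check the real categorize_token before relying on any of these details.

```python
def categorize_token_sketch(token, pro_words, anti_words):
    """Label one token as 'pro', 'anti', or 'other' (hypothetical labels).

    Assumed rule: the token, stripped of whitespace and lowercased, must be
    a non-empty prefix of some word in the list. The real rule may differ.
    """
    piece = token.strip().lower()
    if not piece:
        # Whitespace-only tokens carry no signal under this rule.
        return "other"
    if any(word.lower().startswith(piece) for word in pro_words):
        return "pro"
    if any(word.lower().startswith(piece) for word in anti_words):
        return "anti"
    return "other"
```

Matching on prefixes rather than whole words matters because a tokenizer may emit only the first piece of a longer word, often with a leading space. Note that under this sketch a token that is a prefix of words in both lists is counted as "pro", since that list is checked first.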

JSONL format

One JSON object per line. Each line is one (scenario × priming) cell:

{
  "model": "Qwen/Qwen3-14B-Base",
  "scenario_id": "phishing_email",
  "priming": "neutral",
  "prompt": "...",
  "tokens": [" refuse", " help", " write", ...],   // top-20 by logprob
  "probs":  [0.12, 0.08, 0.07, ...],                 // matched to tokens
  "top_k_mass": 0.91                                  // share of total mass in top-20
}
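
A minimal sketch of reading one of these files in Python, assuming each line in the actual files is plain JSON (the // annotations and "..." above are for illustration only). The helper names parse_cells, load_cells, index_cells, and token_probs are hypothetical and do not come from probe_scenarios.py.

```python
import json


def parse_cells(lines):
    """Parse an iterable of JSONL lines into a list of cell dicts."""
    cells = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        cells.append(json.loads(line))
    return cells


def load_cells(path):
    """Read one probe file; each returned dict is one (scenario x priming) cell."""
    with open(path, encoding="utf-8") as fh:
        return parse_cells(fh)


def index_cells(cells):
    """Map (scenario_id, priming) to its cell for direct lookup."""
    return {(c["scenario_id"], c["priming"]): c for c in cells}


def token_probs(cell):
    """Pair each top-20 token with its probability, keeping rank order."""
    return dict(zip(cell["tokens"], cell["probs"]))
```

For example, index_cells(load_cells("probe_Qwen3-14B-Base.jsonl"))[("phishing_email", "neutral")] would return the cell shown above, provided that file is in the working directory.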

Reproducing the headline numbers

For each (scenario × priming) cell: