Shadow Messages: Plausible Deniability in JPEG Steganography

Steganography hides the existence of a message. But what happens when someone suspects you’re hiding something and compels you to reveal the passphrase?

This is the rubber-hose problem — named after the observation that no amount of mathematical sophistication protects you if an adversary can simply coerce you into decrypting. You either reveal the passphrase and expose the message, or you refuse and face consequences. The cryptography is irrelevant. The human is the weak link.

VeraCrypt solved this for disk encryption with hidden volumes: an outer volume with an innocuous decoy, a hidden volume with the real data, and — critically — no way to prove the hidden volume exists. The outer volume’s “free space” looks identical whether or not it contains a hidden volume. Under coercion, you reveal the outer passphrase. The adversary sees the decoy. The hidden volume remains undetectable.

Shadow messages bring this concept to steganography.

The Channel Separation Insight

JPEG images store pixel data in YCbCr color space — three independent channels:

Y (luminance): brightness information
Cb (blue-difference chrominance): blue-yellow color information
Cr (red-difference chrominance): red-green color information
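
For reference, the three channels can be computed from RGB with the standard JFIF conversion. This is a minimal sketch for background only: the article does not say which conversion Phasm's JPEG codec applies, and the function name `rgb_to_ycbcr` is made up for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range JFIF YCbCr (unrounded floats)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# A neutral gray carries no color information: both chroma values sit at 128.
gray = rgb_to_ycbcr(128, 128, 128)

# Pure blue pushes Cb above the midpoint and Cr below it.
blue = rgb_to_ycbcr(0, 0, 255)
```

Brightness ends up almost entirely in Y, which is why luminance and chrominance can be treated as separate embedding domains.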

Phasm’s primary Ghost mode operates exclusively on the Y channel: the J-UNIWARD cost function computes wavelet distortion for luminance coefficients, and the STC Viterbi algorithm embeds the message by modifying luminance DCT coefficients at minimum cost.

Shadow messages exploit a simple observation: the Cb and Cr channels are completely untouched by the primary embedding. In JPEG, each color component has its own independent grid of DCT coefficients. Luminance typically has its own quantization and Huffman tables, while Cb and Cr commonly share a chrominance set. Modifying a Cb coefficient has zero effect on any Y coefficient: at the coefficient level, the channels are independent.

This means the chrominance channels are free real estate — a second, independent embedding domain that the primary encoder never touches. Shadow messages live here.

The Pipeline

The encoding order is deliberate:

Embed all shadow layers in Cb+Cr (repetition coding)
Compute J-UNIWARD costs on Y (wavelet distortion)
Run STC Viterbi on Y (minimum-cost embedding)
Write JPEG

Shadows are embedded first, but since they modify only Cb+Cr and the primary STC reads only Y (dct_grid(0)), the primary encoding is completely unaffected. The primary message decodes identically whether or not shadow layers exist. This is not a design goal achieved through careful balancing — it is a structural guarantee. The channels do not interact.

Repetition Coding in Chrominance

Shadow layers use repetition coding with factor $R = 7$: each payload bit is embedded into 7 separate chrominance coefficient positions. During extraction, the bit is recovered by majority vote.

Encoding

For each payload bit $b_i$ and repetition index $r \in \{0, 1, \ldots, 6\}$, the index into the layer’s permuted list of chrominance positions is:

$$\text{position index} = i \cdot R + r$$

At each position, the encoder reads the absolute-value LSB of the chrominance coefficient and, if it doesn’t match $b_i$, applies the nsF5 modification: decrease $|\text{coeff}|$ by 1, with the special case that $\pm 1 \to \mp 1$ (sign flip instead of creating a zero).
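
The modification rule can be sketched as below. One assumption is worth stating loudly: the text describes an absolute-value LSB, but a $\pm 1 \to \mp 1$ flip leaves $|\text{coeff}|$ unchanged, so this sketch assumes the sign-aware LSB convention from F5 (negative coefficients read as the inverted parity) so that the sign flip actually changes the bit read back. Whether Phasm uses this convention is not stated, and the names `lsb` and `embed_bit` are illustrative.

```python
def lsb(coeff: int) -> int:
    """F5-style sign-aware LSB (an assumption of this sketch).

    Positive coefficients carry |c| mod 2, negative ones carry the inverse,
    so +1 reads as 1 and -1 reads as 0.
    """
    if coeff == 0:
        raise ValueError("zero coefficients are not usable positions")
    parity = abs(coeff) % 2
    return parity if coeff > 0 else 1 - parity


def embed_bit(coeff: int, bit: int) -> int:
    """Return the coefficient after embedding `bit`, never producing a zero."""
    if lsb(coeff) == bit:
        return coeff                      # already carries the bit: no change
    if abs(coeff) == 1:
        return -coeff                     # +/-1 becomes -/+1 instead of 0
    return coeff - 1 if coeff > 0 else coeff + 1   # |coeff| shrinks by 1


examples = {c: (embed_bit(c, 0), embed_bit(c, 1)) for c in (-3, -1, 1, 2)}
```

Under this convention every non-zero coefficient can carry either bit value with a change of at most one quantization step, and no coefficient is ever driven to zero.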

Decoding

For each payload bit position $i$, count the number of 1-valued LSBs across the $R = 7$ repetitions:

$$\hat{b}_i = \begin{cases} 1 & \text{if } \sum_{r=0}^{6} \text{LSB}(|\text{coeff}_{i \cdot R + r}|) \geq 4 \\ 0 & \text{otherwise} \end{cases}$$

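The majority vote correctly recovers the bit as long as at most $\lfloor R/2 \rfloor = 3$ of the 7 repetitions are corrupted. This bound applies to each group of 7 positions individually, not to the average error rate over the whole frame. Within a group, it tolerates a bit error rate of up to:

$$\text{BER}_{\max} = \frac{3}{7} \approx 42.9\%$$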
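
A small round trip makes the 3-of-7 limit concrete. This is a sketch at the level of LSB values only (no JPEG coefficients, no permutation), with illustrative function names that are not taken from Phasm's code.

```python
R = 7  # repetition factor from the text


def encode_repetition(bits, r=R):
    """Repeat every payload bit r times: bit i occupies slots i*r .. i*r + r - 1."""
    return [b for b in bits for _ in range(r)]


def decode_majority(lsbs, r=R):
    """Recover each payload bit by majority vote over its r slots."""
    threshold = r // 2 + 1  # 4 of 7
    return [
        1 if sum(lsbs[i:i + r]) >= threshold else 0
        for i in range(0, len(lsbs) - r + 1, r)
    ]


payload = [1, 0, 1, 1]
slots = encode_repetition(payload)

# Corrupt 3 of the 7 slots of every payload bit: still decodable.
damaged = slots[:]
for i in range(len(payload)):
    for k in range(3):
        damaged[i * R + k] ^= 1
recovered = decode_majority(damaged)

# A fourth flip in the first group pushes that bit past the limit.
broken = damaged[:]
broken[3] ^= 1
first_bit_after_four_flips = decode_majority(broken)[0]
```

With three flips per group the payload comes back intact. The fourth flip in a group turns the majority and the bit is decoded incorrectly.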

Why Not STC?

The primary Ghost message uses Syndrome-Trellis Codes to minimize distortion. Why don’t shadow layers use STC too?

Three reasons:

STC requires a meaningful cost function. J-UNIWARD computes directional wavelet distortion — designed for luminance. Computing wavelet costs for chrominance is wasteful because human vision is far less sensitive to color changes. The perceptual benefit of cost optimization is minimal in Cb/Cr.
Repetition coding tolerates interference. When multiple shadow layers share the same chrominance pool, their randomly permuted positions may overlap. Repetition coding’s 42.9% BER tolerance absorbs this interference. STC’s near-optimal coding has no such margin.
Simplicity is a security property. The shadow embedding code is 262 lines of Rust — straightforward enough to audit in an afternoon. The STC implementation (with segmented Viterbi, back-pointer packing, and message-block shifts) is 600+ lines. For a secondary layer that carries short messages, the simpler implementation is the better one.

Shadow Capacity

Each shadow layer’s capacity is determined by the number of non-zero AC coefficients in the Cb and Cr channels:

$$C_{\text{shadow}} = \left\lfloor \frac{P_{\text{chroma}}}{R \cdot 8} \right\rfloor - F_{\text{overhead}}$$

where $P_{\text{chroma}}$ is the count of usable chrominance positions (non-zero AC coefficients in Cb+Cr combined), $R = 7$ is the repetition factor, and $F_{\text{overhead}} = 50$ bytes is the frame overhead (salt, nonce, authentication tag, CRC, and length field).

With standard 4:2:0 chroma subsampling, Cb and Cr each have one quarter as many samples as Y (half the resolution in each dimension). A 12 MP photo might have ~50,000 usable chrominance positions, yielding:

$$C = \lfloor 50{,}000 / 56 \rfloor - 50 = 892 - 50 = 842 \text{ bytes}$$

For comparison, the primary Ghost message on the same image can carry several kilobytes via STC. Shadow capacity is modest — designed for short text messages, not files.

Image resolution	Approx. chroma positions	Shadow capacity (1 layer)
640 x 480	~10,000	~128 bytes
1920 x 1080	~30,000	~485 bytes
4032 x 3024 (12 MP)	~50,000	~842 bytes
8192 x 6144 (50 MP)	~200,000	~3,521 bytes

These are approximate — actual capacity depends on the image content (specifically, how many non-zero chrominance AC coefficients exist after quantization).
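
The capacity figures above can be reproduced with a few lines. This is a sketch of the formula as written, not Phasm's code. The function name is illustrative, and clamping at zero for images with too few positions is an assumption.

```python
def shadow_capacity_bytes(chroma_positions, repetition=7, frame_overhead=50):
    """Payload bytes for one shadow layer: floor(P / (R * 8)) - overhead."""
    raw_bytes = chroma_positions // (repetition * 8)
    # Assumption: capacity is reported as 0 rather than negative.
    return max(0, raw_bytes - frame_overhead)


table = {
    "640 x 480": 10_000,
    "1920 x 1080": 30_000,
    "4032 x 3024": 50_000,
    "8192 x 6144": 200_000,
}
capacities = {name: shadow_capacity_bytes(p) for name, p in table.items()}
```

Because of the fixed 50-byte frame overhead, an image needs at least 2,856 usable chrominance positions (51 × 56) before a single payload byte fits.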

The Cryptographic Isolation

Each shadow layer has its own independent cryptographic chain, derived solely from its passphrase:

1. Structural key derivation:

$$K_{\text{shadow}} = \text{Argon2id}(\text{passphrase}, \text{SHADOW\_SALT})$$

The shadow salt (phasm-shdw-v1) is distinct from the primary Ghost salt (phasm-ghost-v1), Armor salt (phasm-armor-v1), and Fortress salt (phasm-fort-v1). Even the same passphrase produces completely different keys for different modes.

2. Position permutation:

The structural key seeds a ChaCha20 PRNG that drives a Fisher-Yates shuffle over the chrominance positions. Each passphrase produces a unique permutation — a different shadow passphrase reads completely different coefficient positions in a completely different order.

3. Payload encryption:

$$\text{ciphertext} = \text{AES-256-GCM-SIV}(\text{payload}, K_{\text{enc}}, \text{nonce})$$

where $K_{\text{enc}} = \text{Argon2id}(\text{passphrase}, \text{random\_salt})$. The random salt and nonce are stored in the frame alongside the ciphertext. The authentication tag (16 bytes) rejects any incorrect decryption attempt — wrong passphrase → wrong key → authentication failure → no plaintext leakage.

4. Duplicate passphrase guard:

The encoder enforces that every passphrase (primary + all shadows) is unique. Two shadow layers with the same passphrase would share the same permutation and overwrite each other, destroying both messages. This is a safety check, not a security requirement: the system simply rejects ambiguous configurations.
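
Steps 1, 2, and 4 can be sketched with the standard library alone, under two loudly stated substitutions: `hashlib.scrypt` stands in for Argon2id, and Python's `random.Random` (a Mersenne Twister, not cryptographically secure) stands in for the ChaCha20 PRNG. The salt strings and the `DuplicatePassphrase` name come from the article. The function names and KDF parameters are illustrative, and payload encryption (step 3) is left out because AES-GCM-SIV is not in the standard library.

```python
import hashlib
import random

SHADOW_SALT = b"phasm-shdw-v1"
GHOST_SALT = b"phasm-ghost-v1"


class DuplicatePassphrase(ValueError):
    """Raised when two layers would use the same passphrase."""


def derive_structural_key(passphrase: str, salt: bytes) -> bytes:
    # Stand-in KDF: scrypt, NOT Argon2id. Parameters are illustrative only.
    return hashlib.scrypt(passphrase.encode("utf-8"), salt=salt,
                          n=2**12, r=8, p=1, dklen=32)


def permute_positions(count: int, key: bytes) -> list:
    # Stand-in PRNG: random.Random, NOT ChaCha20.
    rng = random.Random(int.from_bytes(key, "big"))
    order = list(range(count))
    # Fisher-Yates shuffle, walking from the last index down to 1.
    for i in range(count - 1, 0, -1):
        j = rng.randrange(i + 1)
        order[i], order[j] = order[j], order[i]
    return order


def check_unique(primary: str, shadows: list) -> None:
    all_passphrases = [primary, *shadows]
    if len(set(all_passphrases)) != len(all_passphrases):
        raise DuplicatePassphrase("every layer needs its own passphrase")


shadow_key = derive_structural_key("correct horse", SHADOW_SALT)
ghost_key = derive_structural_key("correct horse", GHOST_SALT)
shadow_order = permute_positions(100, shadow_key)
```

The same passphrase yields unrelated keys under the two salts, and each key deterministically reproduces its own permutation, which is what lets the decoder find the positions again with nothing stored in the image.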

The Deniability Argument

What can an adversary observe?

A JPEG image with normal-looking chrominance coefficients
If they have the primary passphrase: the primary message from luminance

What can they not determine?

Whether any shadow layers exist. Chrominance coefficient LSBs are naturally pseudorandom — they vary based on color content, quantization, and JPEG encoding. Shadow modifications (nsF5: $|\text{coeff}| \to |\text{coeff}| - 1$) are statistically subtle and use only existing non-zero coefficients. No new non-zero coefficients are created.
How many shadow layers exist. Each layer uses an independent permutation derived from its passphrase. There is no header, no marker, no count field. The image either decrypts successfully with a given passphrase or it doesn’t.
Whether the image was encoded with shadow support at all. The ghost_encode_with_shadows function produces a valid JPEG that is structurally identical to one produced by ghost_encode. There is no flag, no metadata, no detectable difference in the file format.

The smart_decode function tries all modes automatically: Armor first, then Ghost primary, then Ghost shadow. The user just enters a passphrase. If it matches the primary passphrase, they get the primary message. If it matches a shadow passphrase, they get that shadow. The decoder doesn’t know in advance which type of message it’s looking for — it tries everything and reports whichever succeeds.

An Honest Caveat

The deniability of shadow messages rests on the practical difficulty of distinguishing shadow-modified chrominance from natural chrominance — not on a mathematical proof of undetectability. The primary Ghost message benefits from J-UNIWARD’s carefully optimized cost function, which minimizes the statistical footprint across wavelet subbands. Shadow layers use simpler uniform-cost embedding. A sophisticated steganalysis attack targeting chrominance statistics could, in theory, detect that modifications have occurred — though it could not determine the content or even confirm that the modifications carry a message rather than being compression artifacts.

For practical deniability against non-expert adversaries and standard forensic tools, the channel separation and encryption provide strong protection. Against a nation-state adversary with custom steganalysis capabilities, the primary Ghost message is the safer channel.

Multi-Shadow Interference

Multiple shadow layers share the same chrominance coefficient pool. Each layer’s positions are determined by an independent permutation (different passphrase → different ChaCha20 seed), but their modification regions can overlap statistically.

The Interference Model

Shadows are embedded sequentially. When shadow $B$ (embedded second) modifies a position that shadow $A$ (embedded first) had already modified, shadow $B$’s value overwrites shadow $A$’s. During extraction, shadow $A$ reads positions in its own permutation order, some of which now carry shadow $B$’s data instead of shadow $A$’s.

The probability that a given position of shadow $A$ was overwritten by shadow $B$ is:

$$P(\text{collision}) = \frac{B_B}{P_{\text{chroma}}}$$

where $B_B = |\text{frame\_bits}_B| \times R$ is the number of positions used by shadow $B$. Each collision has a 50% chance of changing the LSB value (since the encrypted payloads are independent), so the expected BER for shadow $A$ due to shadow $B$ is:

$$\text{BER}_{A \leftarrow B} \approx \frac{B_B}{2 \cdot P_{\text{chroma}}}$$

For $k$ shadows total, the worst-case BER (for the first shadow embedded, which can be overwritten by all subsequent shadows) is:

$$\text{BER}_{\text{worst}} \approx \frac{\sum_{j=2}^{k} B_j}{2 \cdot P_{\text{chroma}}}$$

The last shadow embedded has 0% interference BER — its writes are never overwritten.
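
The two-shadow estimate can be checked with a quick Monte Carlo run. This sketch models only LSB values and uses `random.sample` as a stand-in for the passphrase-derived permutations. The function name and the seed are arbitrary choices for illustration.

```python
import random


def simulate_interference(p_chroma, positions_a, positions_b, seed=0):
    """Fraction of shadow A's positions whose LSB no longer matches after B."""
    rng = random.Random(seed)
    # Independent permutations: each shadow takes the first N slots of its own shuffle.
    slots_a = rng.sample(range(p_chroma), positions_a)
    slots_b = rng.sample(range(p_chroma), positions_b)

    written_a = {pos: rng.getrandbits(1) for pos in slots_a}
    lsb = dict(written_a)
    for pos in slots_b:               # B is embedded second and overwrites
        lsb[pos] = rng.getrandbits(1)

    errors = sum(1 for pos, bit in written_a.items() if lsb[pos] != bit)
    return errors / positions_a


measured = simulate_interference(50_000, 14_000, 14_000)
predicted = 14_000 / (2 * 50_000)     # B_B / (2 * P_chroma)
```

The measured rate lands close to the predicted value and varies slightly from seed to seed.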

Safety Threshold

The app enforces a conservative BER threshold of 35%, well below the theoretical 42.9% tolerance of $R = 7$ majority voting. This provides a safety margin for cases where the random permutations happen to produce more collisions than the expected value.

For two shadows of 200 bytes each on a 12 MP photo (~50,000 chroma positions):

$$B = (200 + 50) \times 8 \times 7 = 14{,}000 \text{ positions per shadow}$$

$$\text{BER}_{\text{worst}} = \frac{14{,}000}{2 \times 50{,}000} = 14\%$$

Comfortably below the 35% threshold. Both shadows decode reliably.
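
The same arithmetic generalizes to any number of layers. This is a sketch of the worst-case formula with the quoted 35% threshold. How the app actually applies the threshold (for example, strict versus inclusive comparison) is not described, so `within_threshold` is an assumption, as are the function names.

```python
BER_THRESHOLD = 0.35   # safety threshold quoted in the text
R = 7                  # repetition factor
FRAME_OVERHEAD = 50    # bytes of frame overhead per layer


def positions_used(message_bytes):
    """Chrominance positions one shadow occupies: frame bits times R."""
    return (message_bytes + FRAME_OVERHEAD) * 8 * R


def worst_case_ber(message_sizes, p_chroma):
    """Expected BER of the first-embedded shadow, which every later one may overwrite."""
    later = sum(positions_used(n) for n in message_sizes[1:])
    return later / (2 * p_chroma)


def within_threshold(message_sizes, p_chroma):
    return worst_case_ber(message_sizes, p_chroma) <= BER_THRESHOLD


two = worst_case_ber([200, 200], 50_000)
three = worst_case_ber([200, 200, 200], 50_000)
four_ok = within_threshold([200, 200, 200, 200], 50_000)
```

With 200-byte messages on ~50,000 positions, a third layer doubles the first layer's expected error rate and a fourth pushes it past the threshold.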

The Decode Chain

Phasm’s unified smart_decode function tries every mode automatically, now including shadows:

Armor Fortress (BA-QIM on block averages)
Armor STDM + Phase 3 DFT geometric recovery
Ghost primary (J-UNIWARD + STC on luminance)
Ghost shadow (repetition coding on chrominance)

On the native parallel path (iOS/Android with Rayon), all four run concurrently:

Fortress ‖ STDM+Phase3 ‖ Ghost primary ‖ Ghost shadow

The user enters a single passphrase. Whichever mode produces a successful decryption (valid AES-GCM-SIV authentication tag) returns the result. No mode selector, no channel selector, no “is this a shadow?” checkbox. The decode side is completely symmetric — the user’s experience is identical whether they’re decoding a primary message or a shadow.
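
The try-everything logic can be sketched as follows. The name `smart_decode` comes from the article, but the signature, the `Decoder` type, and the `fake_decoder` stand-ins are hypothetical. A real decoder would signal failure through the AES-GCM-SIV authentication check rather than a string comparison, and this version runs sequentially rather than in parallel.

```python
from typing import Callable, Optional, Sequence

# A decoder takes (jpeg_bytes, passphrase) and returns the plaintext,
# or None when authentication fails. This signature is hypothetical.
Decoder = Callable[[bytes, str], Optional[bytes]]


def smart_decode(jpeg: bytes, passphrase: str,
                 decoders: Sequence[Decoder]) -> Optional[bytes]:
    """Try every mode in order and return the first authenticated result."""
    for decode in decoders:
        plaintext = decode(jpeg, passphrase)
        if plaintext is not None:
            return plaintext
    return None


def fake_decoder(accepted: Optional[str], message: bytes) -> Decoder:
    """Stand-in for a real mode: succeeds only for one passphrase."""
    def decode(jpeg: bytes, passphrase: str) -> Optional[bytes]:
        return message if passphrase == accepted else None
    return decode


chain = [
    fake_decoder(None, b""),                                 # Armor Fortress: nothing embedded
    fake_decoder(None, b""),                                 # Armor STDM: nothing embedded
    fake_decoder("decoy-pass", b"holiday photos attached"),  # Ghost primary
    fake_decoder("real-pass", b"meet at the north gate"),    # Ghost shadow
]

decoy = smart_decode(b"", "decoy-pass", chain)
real = smart_decode(b"", "real-pass", chain)
nothing = smart_decode(b"", "wrong-pass", chain)
```

From the caller's side, a wrong passphrase and an image with no message at all look the same: both return nothing.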

Limitations

Color images only. Grayscale JPEGs have no Cb/Cr channels — shadow capacity is 0.
Modest capacity. With 4:2:0 subsampling, each chrominance channel has one quarter as many samples as luminance, and $R = 7$ repetition divides the usable positions by a further factor of 7. Shadows are for short text messages.
Ghost mode only. Armor mode uses chrominance differently (for STDM robustness). Shadow messages are available only with Ghost encoding.
No recompression survival. Like Ghost primary, shadow messages are destroyed by JPEG recompression. Send the file directly.
Multi-shadow interference. Each additional shadow layer increases the BER for previously-embedded shadows. Practical limit is 2-3 shadows for typical images.

The scenario shadow messages are built for is specific and serious: a journalist crosses a border checkpoint. Their phone contains a photo — seemingly innocuous. If compelled to reveal the steganographic passphrase, they reveal the primary message: something plausible but harmless. The shadow message, with its separate passphrase shared only with the intended recipient, remains invisible. No amount of inspection of the primary decode proves the shadow exists.

One image. Multiple secrets. No proof.

Frequently Asked Questions

Can an adversary detect that shadow layers exist without the passphrase?

Not through the decode API: a wrong passphrase produces an AES-GCM-SIV authentication failure, identical to “no message exists.” Statistical analysis of chrominance coefficients could in theory detect that modifications have occurred, but it could not determine the content or confirm that the modifications carry a message rather than being compression artifacts. The adversary would need to prove that the chrominance carries a structured payload, not just that it looks slightly unusual.

Does adding a shadow reduce the primary message capacity?

No. The primary Ghost STC operates on Y (luminance); shadows operate on Cb+Cr (chrominance). The channels are entirely independent — different DCT grids, different quantization tables, different coefficient pools. Adding ten shadow layers would not change the primary capacity by a single byte.

Can I use the same passphrase for the primary and a shadow?

No. The encoder rejects duplicate passphrases with a DuplicatePassphrase error. Between two shadows, a shared passphrase would produce the same permutation, and the layers would overwrite each other. Between the primary and a shadow, the different structural salts would still yield different permutations, but it would be ambiguous which message the user intended to decode, so the encoder rejects that case as well.

How many shadow layers can one image hold?

Limited by chrominance capacity and multi-shadow interference. Each additional shadow consumes positions from the shared Cb+Cr pool and increases the BER for earlier shadows. For a 12 MP photo with short messages (~100 bytes each), 2-3 shadows are practical. The app shows a BER estimate and prevents encoding if interference exceeds the safety threshold.

Can shadow messages carry file attachments?

Yes. Shadow layers use the same payload format as the primary message: Brotli-compressed text with optional file attachments. However, chrominance capacity is limited, so attachments must be very small.

References

Fridrich, J. (2009). Steganography in Digital Media: Principles, Algorithms, and Applications. Cambridge University Press.
Filler, T., Judas, J., & Fridrich, J. (2011). Minimizing additive distortion in steganography using syndrome-trellis codes. IEEE Transactions on Information Forensics and Security, 6(3), 920–935. doi:10.1109/TIFS.2011.2134094
Holub, V., Fridrich, J., & Denemark, T. (2014). Universal distortion function for steganography in an arbitrary domain. EURASIP Journal on Information Security, 2014(1). doi:10.1186/1687-417X-2014-1
Anderson, R., Needham, R., & Shamir, A. (1998). The steganographic file system. Information Hiding: Second International Workshop (IH ’98), LNCS 1525, Springer. (Early deniability concepts in digital systems.)
VeraCrypt Documentation. Hidden volume. veracrypt.fr/en/Hidden%20Volume.html
