Paste any text to inspect every character at the Unicode level. See code points, names, categories, and detect invisible or control characters instantly.
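As a rough illustration of what such an inspection involves, here is a minimal Python sketch built on the standard `unicodedata` module. The function name `inspect_text` and the record layout are assumptions made for this example, not the tool's actual implementation.

```python
import unicodedata

def inspect_text(text):
    """Return one record per code point: character, code point, name, category."""
    rows = []
    for ch in text:
        rows.append({
            "char": ch,
            "code_point": f"U+{ord(ch):04X}",
            # Some characters (most control characters, for example) have no
            # name, so a fallback is supplied instead of letting name() raise.
            "name": unicodedata.name(ch, "<unnamed>"),
            "category": unicodedata.category(ch),
        })
    return rows

# The middle character below is a zero-width space and is invisible on screen.
for row in inspect_text("a\u200bb"):
    print(row["code_point"], row["category"], row["name"])
```

The two-letter category codes are Unicode general categories, for example `Ll` for a lowercase letter and `Cf` for a format character.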
Detect Unicode characters that are invisible to the naked eye. These include zero-width spaces, joiners, bidi controls, and other formatting characters that attackers use to hide data or manipulate text rendering.
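One possible way to detect such characters is to filter by Unicode general category. The sketch below treats categories `Cf` (format) and `Cc` (control) as invisible and lets ordinary line breaks and tabs through. The category choice, the allow-list, and the name `find_invisible` are assumptions for illustration.

```python
import unicodedata

# Categories treated as "invisible" in this sketch:
# Cf = format characters (zero-width space, joiners, bidi controls),
# Cc = control characters.
INVISIBLE_CATEGORIES = {"Cf", "Cc"}

# Ordinary whitespace controls that are usually legitimate.
ALLOWED = {"\n", "\r", "\t"}

def find_invisible(text):
    """Return (index, code point, name) for each invisible character found."""
    hits = []
    for i, ch in enumerate(text):
        if ch in ALLOWED:
            continue
        if unicodedata.category(ch) in INVISIBLE_CATEGORIES:
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits

# A zero-width space sits inside the word and an RLO is appended at the end.
print(find_invisible("pay\u200bpal\u202e"))
```

A stricter checker might also flag unusual space characters (category `Zs` other than U+0020), which this sketch leaves alone.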
Encode secret messages into invisible Unicode characters hidden within normal-looking text. The carrier text appears unchanged, but contains a hidden payload.
Each character of your secret message is converted to binary (7 bits per character, which limits the message to ASCII). Each bit is encoded as either U+200C (Zero Width Non-Joiner = 0) or U+200D (Zero Width Joiner = 1). These invisible characters are inserted after each visible character in the carrier text. The result looks identical to the original, but carries a hidden payload.
Decode reverses this: it extracts the zero-width characters, reconstructs the binary, and converts back to text.
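The encode and decode steps can be sketched in Python roughly as follows. The bit mapping and the 7-bit width come from the description above. How the bits are divided among carrier characters is not specified, so this sketch assumes an even split into equal-sized chunks. The function names are hypothetical, and the sketch treats every code point of the carrier as a visible character.

```python
import math

ZERO = "\u200c"    # Zero Width Non-Joiner encodes bit 0
ONE = "\u200d"     # Zero Width Joiner encodes bit 1
BITS_PER_CHAR = 7  # 7 bits per character, so the secret must be ASCII

def encode(carrier, secret):
    """Hide `secret` inside `carrier` as zero-width characters."""
    if not carrier:
        raise ValueError("carrier text must not be empty")
    if any(ord(c) > 0x7F for c in secret):
        raise ValueError("secret must be 7-bit ASCII")
    bits = "".join(format(ord(c), "07b") for c in secret)
    payload = bits.replace("0", ZERO).replace("1", ONE)
    # Assumption: spread the payload in equal chunks, one after each
    # carrier character, until the payload runs out.
    chunk = math.ceil(len(payload) / len(carrier))
    out = []
    for i, ch in enumerate(carrier):
        out.append(ch)
        out.append(payload[i * chunk:(i + 1) * chunk])
    return "".join(out)

def decode(stego):
    """Extract the zero-width characters and rebuild the hidden text."""
    bits = "".join("0" if ch == ZERO else "1"
                   for ch in stego if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % BITS_PER_CHAR  # drop any partial group
    return "".join(chr(int(bits[i:i + BITS_PER_CHAR], 2))
                   for i in range(0, usable, BITS_PER_CHAR))

stego = encode("Hello world", "hi")
print(len("Hello world"), len(stego))  # the carrier grows by 14 code points
print(decode(stego))
```

A carrier that already contains U+200C or U+200D would corrupt the payload on decode, so a more careful version would reject or escape those characters first.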
Find characters from different scripts that look identical to Latin letters. Attackers use these in phishing URLs, fake filenames, and supply chain attacks.
A URL like аррӏе.com looks identical to apple.com but uses Cyrillic letters such as а (U+0430), р (U+0440), and ӏ (U+04CF). The two names can render identically, yet they are different domains and resolve to completely different servers.
This technique powers IDN homograph attacks — one of the oldest and most effective phishing methods in Unicode-based systems.
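One way to catch a lookalike name is to map each suspicious character to the Latin letter it imitates and compare the result with a trusted name. The sketch below uses a deliberately tiny, hand-written table. The table contents and the function names are assumptions for illustration. A real checker would more likely draw on the confusables data published with Unicode Technical Standard #39.

```python
import unicodedata

# Hypothetical, deliberately tiny lookalike table for illustration only.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u04cf": "l",  # CYRILLIC SMALL LETTER PALOCHKA
}

def skeleton(text):
    """Replace each known lookalike with the Latin letter it imitates."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in text)

def find_lookalikes(text):
    """Return (index, code point, name, imitated letter) for each lookalike."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), CONFUSABLES[ch])
        for i, ch in enumerate(text)
        if ch in CONFUSABLES
    ]

def is_spoof_of(candidate, trusted):
    """True if candidate differs from trusted but collapses to the same skeleton."""
    return candidate != trusted and skeleton(candidate) == trusted

# Built from escapes so the Cyrillic code points are explicit.
spoof = "\u0430\u0440\u0440\u04cf\u0435.com"
print(is_spoof_of(spoof, "apple.com"))
for hit in find_lookalikes(spoof):
    print(hit)
```

Comparing skeletons rather than checking for mixed scripts matters here, because a label can be written entirely in one non-Latin script and still imitate a Latin name.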
Unicode bidirectional (BiDi) controls let text switch direction mid-string. Attackers abuse this to make code look like one thing while actually being something else — the "Trojan Source" vulnerability class.
Here's how a BiDi attack can hide dangerous code. The RLO character (U+202E) forces everything that follows it to display right to left, until a matching PDF (U+202C) or the end of the paragraph.
In real attacks, the RLO character is invisible. A line can appear to contain only a harmless comment while the compiler sees code that grants access to everyone. This vulnerability (CVE-2021-42574) affected C/C++, Go, Rust, and many other languages.
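Because the control character itself is never drawn, one way to see what is really stored is to replace each format character with a visible tag. The sketch below does this for every character in general category `Cf`. The function name `reveal` and the sample line are hypothetical.

```python
import unicodedata

def reveal(text):
    """Replace every format character (category Cf) with a visible <U+XXXX> tag."""
    return "".join(
        f"<U+{ord(ch):04X}>" if unicodedata.category(ch) == "Cf" else ch
        for ch in text
    )

# Hypothetical source line in stored (logical) order. An editor that honors
# the RLO would display the characters after it right to left.
line = "check()  # \u202e admin only"
print(reveal(line))  # check()  # <U+202E> admin only
```

The output shows the text in the order a compiler reads it, which is the order that matters when reviewing code.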
- Strip BiDi controls from source code during CI/CD pipelines
- Use editors that highlight or warn about invisible Unicode characters
- Configure code review tools to flag files containing U+200E–U+200F, U+202A–U+202E, U+2066–U+2069
- Normalize text to NFC/NFKC form before processing
- Use browser dev tools to inspect an element and see the actual character sequence, not the rendered result
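The flagging and stripping steps in the list above could be sketched in Python as follows, using the code point ranges the list names. The function names are hypothetical, and how such a check is wired into a CI/CD pipeline or review tool is left open.

```python
# Bidi controls from the ranges listed above.
BIDI_CONTROLS = (
    {chr(cp) for cp in range(0x200E, 0x2010)}    # U+200E-U+200F: LRM, RLM
    | {chr(cp) for cp in range(0x202A, 0x202F)}  # U+202A-U+202E: LRE, RLE, PDF, LRO, RLO
    | {chr(cp) for cp in range(0x2066, 0x206A)}  # U+2066-U+2069: LRI, RLI, FSI, PDI
)

def find_bidi_controls(source):
    """Return (line, column, code point) for each bidi control, both 1-based."""
    hits = []
    for line_no, line in enumerate(source.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits

def strip_bidi_controls(source):
    """Return the source with every listed bidi control removed."""
    return "".join(ch for ch in source if ch not in BIDI_CONTROLS)

sample = "a = 1\nb\u202e = 2\n"
print(find_bidi_controls(sample))         # [(2, 2, 'U+202E')]
print(repr(strip_bidi_controls(sample)))  # 'a = 1\nb = 2\n'
```

Flagging is the safer default: stripping silently can alter legitimate right-to-left text in strings or comments, so a pipeline might fail the build and report the positions instead.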