Character Inspector

Paste any text to inspect every character at the Unicode level. See code points, names, categories, and detect invisible or control characters instantly.

📥 Try These Examples

Click to load sample text with hidden Unicode characters:

Text with invisible zero-width characters 🕶️ 3 hidden chars
BiDi override attack (Trojan Source) ↔️ LTRAL + RLO
Homoglyph phishing URL 👥 Cyrillic substitutions
🔍 Invisible Character Scanner

Detect Unicode characters that are invisible to the naked eye. These include zero-width spaces, joiners, BiDi controls, and other formatting characters that attackers use to hide data or manipulate text rendering.

📖 What Gets Detected
U+200B — Zero Width Space
U+200C — Zero Width Non-Joiner
U+200D — Zero Width Joiner
U+FEFF — Zero Width No-Break Space (BOM)
U+202A–202E — BiDi Embedding/Override
U+2066–2069 — BiDi Isolate Controls
U+00AD — Soft Hyphen
U+2060 — Word Joiner
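
As a rough illustration of the list above, the following Python sketch scans a string for exactly those code points. The `INVISIBLE` table and the function name `find_invisible` are assumptions made for this example; they are not taken from the tool's own implementation.

```python
import unicodedata

# Code points from the list above. The table itself is an illustrative
# assumption, not the tool's real detection table.
INVISIBLE = {0x200B, 0x200C, 0x200D, 0xFEFF, 0x00AD, 0x2060}
INVISIBLE |= set(range(0x202A, 0x202F))  # U+202A to U+202E
INVISIBLE |= set(range(0x2066, 0x206A))  # U+2066 to U+2069


def find_invisible(text):
    """Return (index, 'U+XXXX', name) for each listed character in text."""
    hits = []
    for index, ch in enumerate(text):
        if ord(ch) in INVISIBLE:
            name = unicodedata.name(ch, "UNKNOWN")
            hits.append((index, f"U+{ord(ch):04X}", name))
    return hits
```

Calling `find_invisible("pass\u200bword")` would report one hit at index 4. A fuller scanner would probably also consult the Unicode general category (for example `Cf`, format characters) instead of a fixed table.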
🕶️ Zero-Width Steganography

Encode secret messages into invisible Unicode characters hidden within normal-looking text. The carrier text appears unchanged, but contains a hidden payload.

ℹ️ How It Works

Each character of your secret message is converted to binary (7 bits per character, so the message is limited to ASCII). Each bit is encoded as either U+200C (Zero Width Non-Joiner = 0) or U+200D (Zero Width Joiner = 1). These invisible characters are inserted after the visible characters of the carrier text. The result looks identical to the original, but carries a hidden payload.

Decode reverses this: it extracts the zero-width characters, reconstructs the binary, and converts back to text.
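
The encode and decode steps described above can be sketched in Python as follows. The text does not say how the bit groups are distributed across the carrier, so this sketch assumes one 7-bit group after each carrier character, with any leftover groups appended at the end. The function names `encode` and `decode` are hypothetical.

```python
ZERO = "\u200c"  # Zero Width Non-Joiner encodes bit 0
ONE = "\u200d"   # Zero Width Joiner encodes bit 1


def encode(carrier, secret):
    """Hide an ASCII secret inside carrier using zero-width characters."""
    if not carrier:
        raise ValueError("carrier must not be empty")
    if any(ord(c) > 127 for c in secret):
        raise ValueError("7 bits per character only covers ASCII")
    # One group of 7 invisible characters per secret character.
    groups = [
        "".join(ONE if bit == "1" else ZERO for bit in format(ord(c), "07b"))
        for c in secret
    ]
    out = []
    for i, ch in enumerate(carrier):
        out.append(ch)
        if i < len(groups):
            out.append(groups[i])
    # Assumption: groups that do not fit go after the last carrier character.
    out.extend(groups[len(carrier):])
    return "".join(out)


def decode(stego):
    """Extract the zero-width characters and rebuild the secret."""
    bits = "".join("1" if ch == ONE else "0" for ch in stego if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 7  # ignore a trailing partial group
    return "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, usable, 7))
```

For instance, `decode(encode("Hello world", "hi"))` returns `"hi"`. This sketch assumes the carrier does not already contain U+200C or U+200D; if it does, decoding picks up those characters as payload bits.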

👥 Homoglyph Explorer

Find characters from different scripts that look identical to Latin letters. Attackers use these in phishing URLs, fake filenames, and supply chain attacks.

⚠️ Why This Matters

A URL like аррӏе.com looks nearly identical to apple.com, but it uses Cyrillic а (U+0430), р (U+0440), ӏ (U+04CF), and е (U+0435). Depending on the font and the browser's IDN display policy, the two may render identically, yet they resolve to completely different servers.

This technique powers IDN homograph attacks — one of the oldest and most effective phishing methods in Unicode-based systems.
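
One simple way to surface such substitutions is to report every letter in a hostname label that is not Latin. The Python sketch below guesses the script from the first word of each character's Unicode name, which is a rough heuristic and an assumption of this example (a production check would more likely use the Unicode Script property or a confusables table). The helper names `script_of` and `non_latin_letters` are hypothetical.

```python
import unicodedata


def script_of(ch):
    """Rough script guess: the first word of the character's Unicode name."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def non_latin_letters(label):
    """Return (char, 'U+XXXX', script) for each letter that is not Latin."""
    return [
        (ch, f"U+{ord(ch):04X}", script_of(ch))
        for ch in label
        if ch.isalpha() and script_of(ch) != "LATIN"
    ]
```

Running `non_latin_letters("\u0430\u0440\u0440\u04cf\u0435.com")` flags the five Cyrillic letters, while `non_latin_letters("apple.com")` returns an empty list. The heuristic would also flag legitimate non-Latin domains, so it is better suited to highlighting than to blocking.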

↔️ Bidirectional Text Attack Demo

Unicode bidirectional (BiDi) controls let text switch direction mid-string. Attackers abuse this to make code look like one thing while actually being something else — the "Trojan Source" vulnerability class.

🔬 Trojan Source: Real-World Example

Here's how a BiDi attack can hide dangerous code. The RLO character (U+202E) reverses the visual order of the text that follows it, until a Pop Directional Formatting character (U+202C) or the end of the line:

    // This looks like a safe check
    if (user.isAdmin()) {
        U+202EgrantAccess("everyone"); // ← RLO makes this look like a comment!
    }

    // What you SEE:
    // if (user.isAdmin()) {
    //     grantAccess("everyone"); // <-- RLO reverses display
    // }
    // The semicolon and closing brace appear BEFORE the function call visually

    // What it ACTUALLY executes:
    // if (user.isAdmin()) {
    //     ;grantAccess("everyone")
    // }

In real attacks, the RLO character is invisible. The code appears to be a harmless comment, but actually grants access to everyone. This vulnerability (CVE-2021-42574) affected C/C++, Go, Rust, and many other languages.

🛡️ Defense Strategies
  • Strip BiDi controls from source code during CI/CD pipelines
  • Use editors that highlight or warn about invisible Unicode characters
  • Code review tools should flag files containing U+200E–U+200F, U+202A–U+202E, U+2066–U+2069
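  • Normalize text to NFC or NFKC before comparing or processing it; note that normalization folds compatibility forms but does not remove BiDi controls or map cross-script homoglyphs to Latin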
  • Use browser dev tools to inspect an element and see the actual character sequence, not the rendered result
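
The first and third strategies above could be wired into a pipeline with a check along these lines. This is a minimal Python sketch under stated assumptions: the names `BIDI_CONTROLS`, `scan_source`, and `strip_bidi` are hypothetical, and the set covers only the ranges listed in the bullets.

```python
# Ranges from the bullet list above: U+200E to U+200F, U+202A to U+202E,
# U+2066 to U+2069.
BIDI_CONTROLS = (
    {"\u200e", "\u200f"}
    | {chr(c) for c in range(0x202A, 0x202F)}
    | {chr(c) for c in range(0x2066, 0x206A)}
)


def scan_source(text):
    """Return (line, column, 'U+XXXX') for each BiDi control, 1-based."""
    findings = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                findings.append((lineno, col, f"U+{ord(ch):04X}"))
    return findings


def strip_bidi(text):
    """Return text with every listed BiDi control removed."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)
```

A CI job might read each changed file as UTF-8, call `scan_source` on its contents, and fail the build when the result is non-empty. Flagging is usually safer than silently calling `strip_bidi`, because some files (for example localized strings in right-to-left languages) contain these characters legitimately.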