BiDi Swap: How Bidirectional Unicode Is Being Used to Make Fake URLs Appear Legitimate
What the BiDi Swap trick is and why it matters
Security researchers at Varonis have documented a renewed phishing technique they call “BiDi Swap,” in which attackers abuse Unicode bidirectional (BiDi) control characters to make malicious URLs display as if they belong to a trusted domain. The approach leverages well‑known features of the Unicode Bidirectional Algorithm to reverse or reorder visible text without changing the underlying link target, allowing an attacker to craft a URL that looks correct to a human reader while the actual destination points elsewhere.
This is significant because users rely on visible link text and addresses to decide whether a link is safe; when that text can be manipulated, credential theft, malware downloads and other phishing outcomes become more likely. The Varonis writeup frames BiDi Swap as a revival of a decade‑old browser weakness in how BiDi controls are rendered and displayed to end users, and it highlights the need for organizations to treat Unicode control characters as a distinct threat vector in link and filename handling.
Background and history: BiDi controls and prior abuses
Unicode includes explicit control characters that change the visual ordering of text to support scripts that read right‑to‑left, such as Arabic and Hebrew. Examples include RIGHT‑TO‑LEFT OVERRIDE (U+202E) and the related embedding, override and isolate controls (U+202A–U+202E and U+2066–U+2069). These are legitimate and necessary for correct rendering of mixed‑direction text, but they also let adversaries change how a string is displayed while the characters after the control stay in their original stored (logical) order.
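As a quick reference, the short Python sketch below lists the characters in the two ranges named above together with their official Unicode names and bidirectional classes. It uses only the standard `unicodedata` module; the `BIDI_CONTROLS` name is chosen here for illustration.

```python
import unicodedata

# Explicit directional formatting characters in the two ranges named above:
# U+202A..U+202E (embeddings, overrides, pop) and U+2066..U+2069 (isolates).
BIDI_CONTROLS = [chr(cp) for cp in range(0x202A, 0x202F)] + \
                [chr(cp) for cp in range(0x2066, 0x206A)]

for ch in BIDI_CONTROLS:
    # unicodedata.bidirectional() returns the bidi class, e.g. "RLO" for U+202E.
    print(f"U+{ord(ch):04X}  {unicodedata.bidirectional(ch):<4} {unicodedata.name(ch)}")
```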
Abuse of BiDi controls is not new. Over the past decade attackers have used these characters to hide file extensions (making an executable appear as a harmless document), to spoof email addresses and filenames, and as one element in broader homograph and IDN (Internationalized Domain Name) phishing attacks. The BiDi Swap technique revisits that class of abuse with a specific focus on URLs and the DOM and UI rendering decisions made by browsers, email clients and chat applications.
How the technique works: practitioner summary
- Attackers insert one or more Unicode BiDi control characters into a displayed URL or file name. A common example is U+202E (RIGHT‑TO‑LEFT OVERRIDE), which causes subsequent characters to be displayed in reverse order.
- The visual string shown to the user is rearranged so that an untrained observer reads a familiar, trusted domain or path, while the underlying href or target bytes still point to a malicious host or resource.
- Because what is displayed differs from the underlying link target, standard link‑hover checks or casual inspection may not reveal the discrepancy, increasing the likelihood of a click.
- The attack can be combined with other techniques (vanity subdomains, URL shortening, or IDN homograph tricks) to further obscure the true target.
BiDi Swap exploits the gap between visual rendering and logical string content: an address can look authentic while its real destination is not.
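The gap between stored and displayed text can be illustrated with a small Python sketch. It is a deliberate simplification and not an implementation of the full Unicode Bidirectional Algorithm: the hypothetical helper `naive_rlo_render` only models the effect of U+202E by reversing whatever follows it, up to a U+202C terminator or the end of the string. The filename is an invented example.

```python
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING (ends the override)

def naive_rlo_render(logical: str) -> str:
    """Approximate what a reader sees for a string containing RLO.

    Simplification: text after RLO is shown reversed until PDF or the end
    of the string. Real renderers apply the full bidi algorithm.
    """
    out = []
    i = 0
    while i < len(logical):
        if logical[i] == RLO:
            end = logical.find(PDF, i + 1)
            if end == -1:
                end = len(logical)
            out.append(logical[i + 1:end][::-1])  # reversed run
            i = end + 1                           # skip past the terminator
        else:
            out.append(logical[i])
            i += 1
    return "".join(out)

stored = "invoice" + RLO + "fdp.exe"   # what the system actually handles
shown = naive_rlo_render(stored)       # what the user is likely to read

print(repr(stored))            # the escape \u202e is visible in the repr
print(shown)                   # invoiceexe.pdf
print(stored.endswith(".exe")) # True: the real extension is unchanged
```

The same mismatch applies when the manipulated string is a link label or a URL path rather than a filename: software acts on the stored order, while the user judges the displayed order.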
Risks and operational implications
BiDi Swap raises practical concerns across several dimensions:
- End‑user deception: Users trained to check a visible domain name may be misled if UI components display manipulated text without warning or normalization.
- Bypassing filters and rules: Detection controls and URL filters that operate on visible text instead of canonical targets can be evaded if they do not normalize or inspect hidden control characters.
- Supply chain and impersonation: Brand impersonation via convincing URLs increases the risk of credential harvesting, fraudulent invoices, and targeted spear‑phishing aimed at executives and partners.
- Cross‑platform inconsistency: Different browsers, mail clients and messaging apps may render or sanitize BiDi controls differently, resulting in inconsistent exposure across environments and complicating both detection and user guidance.
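To make the point about filters concrete, the following Python sketch inspects the real `href` of each link rather than trusting its visible text. It is an illustrative audit under stated assumptions, not a production filter: the `LinkCollector` class, the `audit_links` function and the two reason labels are hypothetical names, and the check only covers the U+202A–U+202E and U+2066–U+2069 ranges discussed earlier.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit

BIDI_CONTROLS = set("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")

class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []
        self._href = None   # None means "not currently inside an <a>"
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None

def strip_bidi(text: str) -> str:
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)

def audit_links(html: str):
    """Return (href, text, reasons) for links that look manipulated."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    findings = []
    for href, text in collector.links:
        reasons = []
        # Check raw and percent-decoded href (e.g. %E2%80%AE is U+202E).
        if BIDI_CONTROLS & set(unquote(href)) or BIDI_CONTROLS & set(text):
            reasons.append("bidi-control")
        # If the visible text is itself a URL, compare its host to the real one.
        shown_host = urlsplit(strip_bidi(text).strip()).hostname
        real_host = urlsplit(strip_bidi(href).strip()).hostname
        if shown_host and real_host and shown_host != real_host:
            reasons.append("host-mismatch")
        if reasons:
            findings.append((href, text, reasons))
    return findings

sample = '<a href="https://evil.example/x">https://bank.example/\u202elogin</a>'
for href, text, reasons in audit_links(sample):
    print(href, reasons)
```

The design choice here is to report findings rather than silently rewrite the message, so that a gateway or reviewer can decide whether to strip, quarantine or flag the link.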
Practical detection, mitigation and recommendations for practitioners
The technical community has a set of practical defenses to reduce BiDi Swap risk. Below are recommendations aimed at developers, security teams and administrators.
- Normalize and canonicalize URLs server‑side and in logging: When processing user‑supplied URLs or displaying links in UIs, normalize Unicode and strip or escape BiDi control characters before rendering. Use Unicode normalization forms (for example NFKC) combined with explicit removal or escaping of characters in the U+202A–U+202E and U+2066–U+2069 ranges.
- Display canonical/secure representations: Consider always rendering domain names in Punycode or another explicit canonical form in security‑sensitive contexts (password reset emails, admin consoles, account‑related messages) so that the displayed domain cannot be visually transformed.
- Harden linkification and parsers: Libraries that convert plaintext into clickable links should be updated to detect and either reject or visibly mark strings that contain BiDi control characters. Sanitize inputs used in filenames, email subjects, chat messages and web UIs.
- SIEM and detection rules: Add rules that flag and alert on traffic, logs or incoming messages containing BiDi control characters. These characters are uncommon in benign telemetry and are a useful indicator of potential obfuscation attempts.
- Email and gateway controls: Configure mail gateways and web proxies to strip or quarantine messages that contain suspicious Unicode control characters in URLs or file attachments. Apply additional scrutiny to messages that also exhibit typical phishing signals (unexpected urgency, credential prompts, external attachments).
- End‑user guidance and training: Train users to hover over links and check the full target URL in the browser or client UI where possible, and to treat unexpected links with caution. Encourage the use of bookmarks and direct navigation for sensitive sites.
- Encourage vendor mitigations: Ask browser, email client and collaboration tool vendors to treat BiDi control characters as rendering hazards in address bars and message previews, and to visibly mark or forbid such characters in security contexts.
- Multifactor authentication: Assume some users will be tricked; reduce impact with robust MFA and conditional access policies that limit the value of captured credentials.
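Several of the recommendations above (stripping or escaping BiDi controls, showing a canonical domain form, and flagging occurrences for detection) can be sketched in a few lines of Python. This is a minimal illustration rather than a vetted sanitizer: the function names are hypothetical, only the two code point ranges mentioned above are covered, and the standard library's `idna` codec implements the older IDNA 2003 rules, so a production system may prefer a dedicated IDNA library.

```python
import unicodedata
from urllib.parse import urlsplit

# Code points U+202A..U+202E and U+2066..U+2069, as listed in the text above.
BIDI_CONTROLS = {chr(cp) for cp in (*range(0x202A, 0x202F), *range(0x2066, 0x206A))}

def strip_bidi(text: str) -> str:
    """Remove BiDi control characters before rendering."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)

def escape_bidi(text: str) -> str:
    """Make BiDi controls visible, e.g. for logs or warnings: '<U+202E>'."""
    return "".join(f"<U+{ord(ch):04X}>" if ch in BIDI_CONTROLS else ch for ch in text)

def find_bidi(text: str):
    """Return (index, code point) pairs, usable as a simple detection signal."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in BIDI_CONTROLS]

def display_host(url: str) -> str:
    """Return the hostname in ASCII (Punycode) form for sensitive contexts."""
    host = urlsplit(strip_bidi(url)).hostname or ""
    host = unicodedata.normalize("NFKC", host)   # normalize the host only
    return host.encode("idna").decode("ascii")   # stdlib codec: IDNA 2003

suspicious = "https://example.com/\u202elogin"
print(escape_bidi(suspicious))   # https://example.com/<U+202E>login
print(find_bidi(suspicious))     # [(20, 'U+202E')]
print(display_host(suspicious))  # example.com
```

Normalization is applied to the hostname only, because NFKC on a whole URL can turn look-alike characters into ASCII delimiters. Where policy allows, rejecting an input that contains BiDi controls may be safer than silently stripping it.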
Comparable cases and broader context
BiDi Swap is part of a long lineage of visual‑spoofing attacks. IDN homograph attacks, in which visually similar international characters replace ASCII letters, have been used to register domains that appear genuine. Likewise, file‑name trickery using U+202E to hide file extensions has been a staple of social engineering and malware campaigns for years.
More generally, phishing and credential theft remain among the most common initial compromise vectors for breaches. Whether an attacker uses BiDi Swap, homograph domains, malicious attachments or credential‑harvesting forms, the core problem is an attacker convincing a user to take an unsafe action. Addressing BiDi Swap therefore reduces one concrete avenue of deception within a broader anti‑phishing strategy.
Conclusion
BiDi Swap is a reminder that the gap between how text is stored and how it is rendered can be exploited for deception. The technique revives an old class of Unicode‑based tricks and demonstrates that even subtle rendering controls are a meaningful attack surface.
Key takeaways for defenders:
- Treat Unicode control characters as suspicious in URLs, filenames and message previews.
- Normalize and sanitize inputs and outputs; consider displaying canonical domain forms in security contexts.
- Update linkification and parsing libraries, add SIEM detection for BiDi controls, and maintain defensive layers such as MFA and email gateway filtering.
Source: www.bleepingcomputer.com







