How can normal files supported in the Windows operating system hide a virus?
Sometimes opening a .txt, .jpg or .docx file leads to running a virus. How is this possible?
This is possible thanks to OLE (Object Linking and Embedding). OLE is intended for sharing information between applications that run on the Microsoft Windows operating system. Mainly, it allows objects to be embedded in documents.
The official Microsoft documentation explains the benefits of OLE. But like any other concept, it can be used with nefarious intent.
Maybe the easiest way to explain this is with an example: drag and drop the executable icon of Notepad onto an open WordPad document and save it. Double-clicking the WordPad file will then open the Notepad application. The figure below illustrates the idea:
You can save the document in .txt format. It will look like a normal, benign file. However, if you check its properties, you will find that its real extension is SHS. You can imagine adding any commands to the file, such as formatting some disk partitions.
To add to the answer to a similar question (thanks for finding it, Tcholas!):
You are correct in thinking that a virus in and of itself is harmless. A virus sitting in a file somewhere is no immediate threat to your computer. But when you open a file that contains a virus, you are actually running a program to open that file.
When you ask your operating system (say, Windows) to open a file, it basically does the following:
Check what kind of file it is. This can be by reading the extension (.txt, .doc) or by reading some of the data (zip files have "PK" in the first bytes).
Knowing the file's type, it finds the 'default program' to open it with. This association is stored in a database on your computer (on Windows, the registry).
Execute the program and load the data file in it.
So when you double-click "mynotes.txt", Windows looks up the default program for .txt files, probably "notepad.exe", and then executes "notepad.exe" and makes it load "mynotes.txt". When you double-click a JPEG image, it'll load an image viewer. Even something as simple as plugging a USB pen drive into your laptop will run some code.
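The three steps above can be sketched roughly as follows. This is only an illustration, not how Windows implements it: the tables `EXTENSIONS`, `MAGIC` and `DEFAULT_PROGRAMS` and the program name "imageviewer.exe" are made-up stand-ins for the real association database.

```python
import os

# Hypothetical lookup tables, standing in for the system's association database.
EXTENSIONS = {".txt": "text", ".zip": "zip", ".jpg": "jpeg"}
MAGIC = [(b"PK", "zip"), (b"\xff\xd8\xff", "jpeg")]
DEFAULT_PROGRAMS = {
    "text": "notepad.exe",
    "zip": "explorer.exe",
    "jpeg": "imageviewer.exe",  # invented name, for illustration only
}


def detect_type(filename, first_bytes=b""):
    """Step 1: guess the type from the extension, else from the leading bytes."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in EXTENSIONS:
        return EXTENSIONS[ext]
    for magic, kind in MAGIC:
        if first_bytes.startswith(magic):
            return kind
    return None


def build_command(filename, first_bytes=b""):
    """Steps 2 and 3: find the default program and build the command to run."""
    kind = detect_type(filename, first_bytes)
    program = DEFAULT_PROGRAMS.get(kind)
    if program is None:
        raise LookupError("no default program for " + filename)
    # The data file is handed to the program as an argument; any parsing bugs
    # live in that program, not in the data file.
    return [program, filename]
```

Calling `build_command("mynotes.txt")` gives the notepad command line from the example above. The point is that the file is never "run"; a program is, with the file as its input.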
The devil is in any bugs/errors that this loading program may have. Viruses use these 'holes' to trick the operating system into executing different code and modify/subvert part of your computer. When the software maintainers hear about the vulnerability, they (hopefully) fix it and issue a security update.
This also means that viruses tend to target specific programs. Something that works on MS Word will probably not work on OpenOffice. Same with Acrobat Reader and Evince, Chrome and Internet Explorer, Thunderbird and Outlook, and so on.
tl;dr - When you open a file, you are really opening a program which then opens that file. It's vulnerabilities in that program that allow viruses a shot at doing their thing.
One possibility is exploiting overflow vulnerabilities. When the image is opened, a bug in the software lets data from the file spill into memory regions it should never reach, and the system may then execute it.
Symantec has a description of a vulnerability in Internet Explorer that was exploited in this way.
Also, this question was answered on Stack Overflow.
One possibility is Unicode shenanigans.
Unicode supports displaying many languages, including those written left to right and those written right to left. One way it does this is with special control characters, including [U+202E] (right-to-left override (RLO)).
Windows supports Unicode, including in filenames.
You see a file on your desktop: evilexe.txt. It looks like a text file, but it's not. It's really named evil[U+202E]txt.exe.
It can have an icon set to make it look like a text file or JPEG file, but it's really an EXE file.
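A small sketch of the trick, and of one way to spot it. The helper names here are invented for illustration, and `naive_display` is only a crude approximation of real bidirectional text rendering: it simply reverses everything after the override character.

```python
import os

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE

# Explicit bidirectional control characters that rarely belong in a file name.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def real_extension(name):
    """The extension the operating system acts on, whatever the display shows."""
    return os.path.splitext(name)[1].lower()


def naive_display(name):
    """Crude model of rendering: text after the RLO is shown reversed."""
    head, sep, tail = name.partition(RLO)
    return head + tail[::-1] if sep else name


def is_suspicious(name):
    """Flag names that contain any bidirectional control character."""
    return any(ch in BIDI_CONTROLS for ch in name)


name = "evil" + RLO + "txt.exe"
print(naive_display(name))   # what the user roughly sees
print(real_extension(name))  # what actually decides how the file is opened
print(is_suspicious(name))
```

A mail filter or file manager could reject or visibly escape any file name for which `is_suspicious` is true; that is an assumption about a reasonable defence, not a description of what Windows does.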
Buffer overflow.
A buffer overflow happens when a program reserves a fixed amount of memory but then writes more data than fits, so the excess overflows into the memory next to it.
For example (this is pure fiction, by the way): a program like OpenOffice Writer (a word processor like MS Word) has a limit on how large a paragraph can be, let's say 65535 characters. A malicious hacker creates a document with a paragraph that is 65555 characters long. The program doesn't check how large a paragraph is; it just loads it into memory. At the 65536th character, instead of an actual character, the hacker puts in the byte code for a jump instruction to somewhere later in the hacker's document, or to anywhere else that the hacker knows and controls. When the program accidentally executes the jump instruction, the malicious document controls the processing flow of the application. It's a virus, and you have just been infected.
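The fictional scenario above can be modelled as a toy simulation. Python itself checks bounds, so this does not overflow anything for real; it only imitates a memory layout in which a small buffer sits directly in front of a value that decides what runs next. The layout, `BUF_SIZE` and the handler names are all invented for the sketch.

```python
BUF_SIZE = 8  # tiny stand-in for the 65535-character limit
HANDLERS = ["render_paragraph", "attacker_code"]


def make_memory():
    """Layout: bytes 0..7 hold the paragraph, byte 8 selects the next handler."""
    mem = bytearray(BUF_SIZE + 1)
    mem[BUF_SIZE] = 0  # 0 means the normal handler
    return mem


def unchecked_load(mem, data):
    """Deliberately buggy: copies without comparing len(data) to BUF_SIZE."""
    for i, byte in enumerate(data):
        mem[i] = byte


def checked_load(mem, data):
    """The fix: refuse input that does not fit in the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("paragraph too long")
    mem[:len(data)] = data


def next_handler(mem):
    """What the program would run after loading the paragraph."""
    return HANDLERS[mem[BUF_SIZE]]


mem = make_memory()
unchecked_load(mem, b"A" * BUF_SIZE + b"\x01")  # one byte too many
print(next_handler(mem))
```

With the unchecked copy, the ninth byte lands on the handler selector and the "attacker" handler is chosen; `checked_load` rejects the same input before anything is written.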
There are many different types of overflow-able components in a computer so watch out!
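If this sounds similar to the Heartbleed exploit, that's because the two are closely related: Heartbleed was a buffer over-read, where the program reads past the end of a buffer rather than writing past it.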
Let's continue.
Microsoft's Office platform is extremely extensible. It allows you to add both compiled and interpreted code to the program and to documents. The interpreted code, the macro, has a lot of power for productivity purposes, but up until about a decade ago it was also used heavily for propagating malicious code. Not so much any more, but it still pops up. Often a malicious document would have a bit of code that would write a file to your hard drive and then execute it when you opened up the document. Of course most of these guys came packaged in a PowerPoint presentation containing photos of cute cats and the like so it wasn't a total loss when you got pwned.
Many larger software packages are extensible in the same way and can suffer from similar exploits.
Oftentimes a hacker will create a file and present that file as an image or as plain text when it isn't. A website author's responsibility is to escape, encapsulate and sandbox any data that they may receive from the internet. It's a hard job.
So a hacker may upload a file called "NotA.phpfile.jpg" through a PHP website's form. It presents itself as an image/jpeg file, but in reality it's an application/php file. Now when the hacker loads the URL that the "image" was uploaded to, they may gain control of the website.
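One partial defence is to check that the uploaded bytes actually look like the type they claim to be. A minimal sketch, assuming a hypothetical `accept_upload` hook that receives the claimed content type and the raw bytes; checking the leading bytes alone is not sufficient in practice, since a file can start like a JPEG and still contain code further in.

```python
JPEG_MAGIC = b"\xff\xd8\xff"  # the bytes every JPEG file starts with


def looks_like_jpeg(data):
    """True if the content begins with the JPEG signature."""
    return data.startswith(JPEG_MAGIC)


def accept_upload(claimed_type, data):
    """Accept only uploads whose content matches the claimed image/jpeg type.

    Neither the file name nor the client-supplied content type is trusted.
    """
    return claimed_type == "image/jpeg" and looks_like_jpeg(data)


fake = b"<?php echo 'owned'; ?>"
real = JPEG_MAGIC + b"\xe0" + b"\x00" * 16
print(accept_upload("image/jpeg", fake))
print(accept_upload("image/jpeg", real))
```

A server would normally also store uploads under a name it generates itself and in a location where nothing is executed.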
Similarly, the website must escape text or it could suffer from SQL injection and data can be stolen.
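To make the SQL injection point concrete, here is a small sketch using Python's built-in sqlite3 module. The `users` table and both lookup functions are invented for the example; the contrast between string concatenation and a parameterised query is the point.

```python
import sqlite3


def setup():
    """Create an in-memory database with two made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    """Vulnerable: the input is pasted straight into the SQL text."""
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, name):
    """Parameterised: the input is passed as data, never parsed as SQL."""
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = setup()
payload = "x' OR '1'='1"
print(lookup_unsafe(conn, payload))  # the WHERE clause is now always true
print(lookup_safe(conn, payload))    # no user has this literal name
```

The unsafe version leaks every row because the payload closes the quote and adds its own condition; the safe version treats the whole payload as a name and finds nothing.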
Let's look at some technology:
In short, the OP is asking why clicking on a .txt file runs a virus. If a virus changes the program associated with an extension, then your computer may start mad_notepad_that_also_runs_a_virus.exe instead of notepad.exe when you click on a .txt file.
But simply clicking on i_love_you.txt.extension_like_that_will_obviously_make_this_computer_explode.exe, however naive that is, can also happen.