
I have a question about how to detect modification of a program/binary, assuming the program is able to communicate with a remote validation server. I'm asking specifically about Android APK files, but the question applies to any other program as well.

I imagine the following hypothetical scenario:

There is an APK installation file that the user downloads and installs on his device. After being installed and used, the application sends "some information" about its integrity to a remote server. The server validates whether the program is intact or has been modified by hackers.

Given that scenario, and the fact that APKs are written in Java, what are some ways a developer can generate a hash value of his application files, so that this hash can later be sent to a remote server that validates it: a hash over file sizes, or some other kind of structure?

I am interested in some ways to generate that "some information" which is sent to the validation server.
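To make the question concrete, here is a minimal sketch of the kind of "some information" I have in mind: a SHA-256 digest over raw bytes. (The file names and byte strings are placeholders; on Android the bytes could be read from the installed APK itself, e.g. the path returned by `Context.getPackageCodePath()`.)

```java
import java.security.MessageDigest;

public class IntegrityHash {
    // Compute a SHA-256 digest over raw bytes and render it as hex.
    static String sha256Hex(byte[] data) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] original = "pretend these are the APK bytes".getBytes("UTF-8");
        byte[] patched  = "pretend these are the APK Bytes".getBytes("UTF-8");
        // Even a one-byte change produces a completely different digest:
        System.out.println(sha256Hex(original));
        System.out.println(sha256Hex(patched));
        System.out.println(sha256Hex(original).equals(sha256Hex(patched))); // false
    }
}
```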

Please correct me if the question is not properly asked; this scenario is not entirely clear to me, and maybe there are other ways to achieve this more easily. Thanks a lot!

--- EXTENSION ---

Hello and thank you very much for the answers!

I am a little surprised by the answers, because I hadn't thought too deeply about the issue and assumed there would be some easy mechanism for this :) But now I hear that even TPMs cannot guarantee 100% protection...

As a layman on the topic, I would like to ask something more. I don't want to protect the program from redistribution, like a copy-protection mechanism. I want the program to be tested from time to time, via a validation server, to confirm that it has not been modified.

This is the scenario:

The server that distributes the application computes a hash of the program with some obscure algorithm, if such a thing exists. Later, the program uses the same algorithm to compute a hash of itself and sends it to the server, which compares the two hashes. If someone has modified the file, a single JMP instruction for example, the hash should differ and the app be considered hacked. What are the flaws in this scenario? That someone will be able to disassemble the hash algorithm from the program and later modify the program to send the same value to the server?
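For concreteness, here is a sketch of the scheme with one refinement added (my own illustration, not a recommendation): the server sends a random nonce and the client must answer SHA-256(nonce || programBytes). A recorded answer no longer helps against a fresh nonce, although, as asked above, an attacker who keeps an unmodified copy of the program can still compute valid answers.

```java
import java.security.MessageDigest;
import java.util.Arrays;

public class ChallengeResponse {
    // Client-side response: hash the server's nonce together with the
    // program's own bytes, so each answer is tied to a fresh challenge.
    static byte[] respond(byte[] nonce, byte[] programBytes) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(nonce);
        md.update(programBytes);
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] program = "original program bytes".getBytes("UTF-8");
        byte[] nonce1 = {1, 2, 3, 4};   // fixed here for illustration;
        byte[] nonce2 = {5, 6, 7, 8};   // a real server would use SecureRandom

        byte[] expected = respond(nonce1, program);  // computed server-side
        byte[] answer   = respond(nonce1, program);  // computed by the client
        System.out.println(Arrays.equals(expected, answer));   // true

        // Replaying the old answer against a fresh nonce fails:
        System.out.println(Arrays.equals(respond(nonce2, program), answer)); // false
    }
}
```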

And even if that is the case, do you think this is still a lot of work for someone to do, and may not be worth it?

luben

5 Answers


The scenario you describe is very similar to the concept of "remote attestation". There has been a lot of research on this and there are two major results:

  • You need a trust anchor, such as the TPM or a trusted system service, to securely measure your app and report the results to the server. Otherwise you can always build a simulator that generates the correct responses[1].

  • Once you deploy and use all the trusted computing infrastructure, you still can't prevent or even detect exploitation of vulnerabilities in your app with any more assurance than by deploying today's standard anti-buffer overflow technology.

So, if you want to deploy this today, your best option is code obfuscation. Basically, you want to implement a copy protection mechanism, same as it has been implemented and broken for decades.

[1] There have been some very cool advancements that exploit computation and communication limits of the client platform, but this is still sci-fi.

pepe

What are the flaws in this scenario?

The basic flaw here is that you are assuming that the remote party is adhering to your rules.

You have a server with a program.

You receive a download request.

At this point you have no idea who or what the remote downloader is.

Once downloaded, an adversary can save it wherever they please. They can attempt to run your program in any one of a number of debuggers, emulators, or physical hardware. It is quite trivial for the adversary to prevent your program from communicating with any remote system while they debug, disassemble, run, modify, or analyze.

That someone will be able to disassemble the algorithm to compute the hash from the program and later modify the program to send the same value to the server?

The basic problem with sending the hash to the server is that there is no incentive to allow the program to send that hash.

Imagine a standard device with an application firewall. The user downloads and runs your program unmodified. When your program attempts to send the hash to the server, the application firewall pops up a dialogue asking 'Application YourApp is attempting to connect to the internet, do you want to allow this?' The user has no incentive to say yes.

Now imagine an adversary who has modified your program and runs it on a standard device with an application firewall. When the firewall pops up its dialogue, the adversary has every incentive not to allow the program to communicate with the server.

do you think that this is still a lot of work for someone to do, and it may not be worth it?

The problem of remotely verifying software is a fundamental problem that has been worked on for decades. There are approaches that work well for specific scenarios and potential attackers. However there is no general solution.

For your scenario, where some remote users have complete control over the target hardware platform, the application binary is easy to disassemble, and there is no incentive for a user to allow your application to communicate with your server, at best you will detect very few of the total modifications. At worst every single modification will go unreported, and the unmodified reports will give you a false sense that no one is modifying your software.

this.josh

You can make it harder for an attacker to modify the files, but you cannot prevent modification of anything you give away to the attacker in the end.

That assumes the attacker has full access to his computer. There is some work being done by the Trusted Computing Group and other vendors to restrict the abilities of the owner. Those trusted computing modules are mostly used in game consoles and smartphones. But as soon as this protection is moved out of the way (e.g. the phone is "rooted"), the above paragraph applies.

Those modifications include the checksum sent to the server. To give a noteworthy example: in the Netherlands, the CEO of Nedap claimed that their voting computers were dedicated special-purpose machines that could be used for elections and nothing else. WVSN and the CCC ported an open-source chess program to them to prove the CEO wrong. As a special protection feature, the voting computers have a button which calculates the checksum of the program and displays it, in order to detect manipulations. The chess program displayed the same number. (Heise article in German).

You can make it more difficult by using many different checksums, and by not using them in plain comparisons but as part of calculations: Skype used checksums to calculate the destinations of JMPs as an anti-debugging technique. A standard breakpoint modifies the debugged program and thus causes it to jump to the wrong places. Skype was eventually reverse engineered by running a second, unmodified copy as an oracle. (Silver Needle in the Skype)
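The Skype trick above can be sketched in Java rather than x86 (purely illustrative; names are made up): the checksum is never compared against an expected value, it is used directly in a computation, here selecting an index into a dispatch table. A patched byte, such as a debugger's breakpoint opcode, silently sends execution to the wrong handler instead of failing an obvious test.

```java
import java.util.zip.CRC32;

public class ChecksumDispatch {
    // CRC32 over a region of "code" bytes.
    static long checksum(byte[] codeRegion) {
        CRC32 crc = new CRC32();
        crc.update(codeRegion, 0, codeRegion.length);
        return crc.getValue();
    }

    // The checksum feeds the control flow instead of a plain comparison.
    static int dispatchIndex(byte[] codeRegion, int tableSize) {
        return (int) (checksum(codeRegion) % tableSize);
    }

    public static void main(String[] args) {
        byte[] code = "some protected code bytes".getBytes();
        long clean = checksum(code);
        code[0] = (byte) 0xCC;              // a software-breakpoint patch
        long patched = checksum(code);
        // A single-byte change always alters a CRC32 over same-length input:
        System.out.println(clean != patched); // true
        System.out.println(dispatchIndex(code, 256)); // now selects some wrong handler
    }
}
```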

Another example, which probably fits your goals better, is Second Life. Second Life is an online game in which players can create and sell their own content. Despite its DRM, every type of information that is sent to the client (textures, animations, sound) has been illegally copied. Only the information that is kept on the server (user scripts) is secure. I went into more detail on this at Are there DRM techniques to effectively prevent pirating?

Hendrik Brummermann

Regarding extension:

The scheme you propose is also easy to break: I record the hash that your software sends to the server, then modify the app and replay that hash whenever the server asks for it. Even if you use a challenge-response-style protocol (to provide freshness), I can debug your app to find out how you generate the response. This attack is always possible in general, and your security depends entirely on the amount of obfuscation and anti-debugging techniques. You may fool 90% of your clients with moderate security, but remember that it only takes one guy to publish a "crack" for your program, and then everyone can use a modified version that sends the correct hashes to your server.
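The record-and-replay attack can be shown in a few lines (a toy model with made-up names, not real attack code): the client is supposed to hash itself, but a modified client simply resends a previously recorded honest hash, and the server cannot tell the difference.

```java
import java.security.MessageDigest;
import java.util.Arrays;

public class ReplayAttack {
    // The "honest" self-check the protocol expects the client to run.
    static byte[] selfHash(byte[] appBytes) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(appBytes);
    }

    public static void main(String[] args) throws Exception {
        byte[] originalApp = "original app bytes".getBytes("UTF-8");
        byte[] recorded = selfHash(originalApp); // attacker records one honest report

        // The app is then modified; it never calls selfHash() again and
        // simply sends the recorded value:
        byte[] reported = recorded;

        // The server's check still passes:
        byte[] serverExpected = selfHash(originalApp);
        System.out.println(Arrays.equals(reported, serverExpected)); // true
    }
}
```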

The keywords for current research on this topic are "software attestation" or "secure mobile agents". These mechanisms are only secure under some rather strong requirements that are hard to achieve in practice (google "on the difficulty of software attestation" and "PUF-based attestation").

The best way to solve such problems in practice, here and now, is to move significant parts of your service to the server. If the actual value comes from communicating with the server, there is much less incentive to cheat. Consider online games. Maybe you can have some "social features" like rankings, chat rooms or other services that you provide from the server?

pepe

You may be interested in the approach Microsoft has taken with Authenticode and .NET.

Link

makerofthings7