Encrypting text data by replacing characters?

Question

I have been thinking about this method for a while but never really looked into it, because I assumed it was a stupid idea: I had never heard of anyone using it, and it seems pretty obvious. Today I gave it some more thought.

Suppose I want to speak securely with a mate over the Internet, using a program I wrote specially for this and which only the two of us have. The program makes my computer connect to his (or vice versa) and transfers data.

Now the thing is, to make it simple, let's say we only use English language without any symbols, just lower-case characters.

We can decide, for example, that a maps to gH5S and so on, so that we end up with a long string of gibberish. On top of that, the result is RSA-encrypted before it is sent out.

I understand that a hacker who has decrypted the RSA layer can split the text into repeating pieces such as gH5S. But many different pieces are in use, so how can he know which piece stands for which character? He would have to guess words and somehow confirm them, and there are tons of possible ways to build words.

Here is a simple example program in Java to illustrate this:

public class Test {
    // Substitution table: index 0 is the token for 'a', index 25 the token for 'z'.
    static final String[] replacements = {
            "abK98", "HGD3x", "aRfXZ", "hdZdgb", "eDzfh", "aieSZ3", "iLz5",
            "zH4", "ab98", "abK2398", "a5568", "ACz98", "loW91", "ZmKuJ",
            "azS6D", "ZcfZS", "dFXze", "FszXF", "rXzFttX", "fdXRS", "52aF",
            "ZaWRQ", "qPweQ", "dWtQY", "puEz", "ZdeA"
    };

    // Token that stands for a space.
    static final String space = "lKi89";

    public static void main(String[] args) {
        StringBuilder encrypted = new StringBuilder();
        String text = "hello i really like to program and be on stackexchange";

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == ' ') {
                encrypted.append(space);
                continue;
            }

            // The input is assumed to contain only 'a'..'z' and spaces.
            int idx = c - 'a';
            encrypted.append(replacements[idx]);
        }

        System.out.println(encrypted);
    }
}

The replacements array is in alphabetical order: the first entry stands for a and the last one for z. This is the output:

zH4eDzfhACz98ACz98azS6DlKi89ab98lKi89FszXFeDzfhabK98ACz98ACz98puEzlKi89ACz98ab98a5568eDzfhlKi89fdXRSazS6DlKi89ZcfZSFszXFazS6DiLz5FszXFabK98loW91lKi89abK98ZmKuJhdZdgblKi89HGD3xeDzfhlKi89azS6DZmKuJlKi89rXzFttXfdXRSabK98aRfXZa5568eDzfhdWtQYaRfXZzH4abK98ZmKuJiLz5eDzfh

Is that a bad method? Can this be easily decrypted by spying on the data? How?

Why not pass the data through GnuPG or something similar instead, before transmitting and after receiving? — user, Jun 23 '16 at 16:31
are you asking about the security of using both RSA and a home grown solution, or just home grown? — Neil Smithline, Jun 23 '16 at 18:23
If you really must do this, obfuscate *after* RSA. You add a **limited** (see @CaffeineAddiction 's answer) obfuscation that might delay a hopefully doomed RSA attack. On the other hand, the attacker might invest more resources up front, wrongly believing the composed crypto to be weaker than RSA; so many resources, in fact, that in the end he'll prefer a rubberhose attack to conceding defeat. — LSerni, Sep 28 '16 at 14:16

score 17 · Accepted Answer · edited Jun 26 '16 at 04:29

Is that a bad method? YES.
Can this be easily decrypted by spying on the data? YES.
How? With standard cryptanalysis of the kind that was done in WW2 and in earlier wars.

Essentially, what you have is a variation on the Caesar cipher (a simple substitution cipher). Simple statistical analysis will quickly identify your 'letters', and once that is done it is only a matter of capturing enough data from the line to do pattern recognition on words (identifying the vowels and so on). Then it comes down to educated guesses about certain words and their meaning, and voilà, you have broken the cipher. As has been stated over and over again: do not roll your own crypto. Cryptography is really, really hard, and even the experts get it wrong (OFTEN).
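
To make the "simple statistical analysis" concrete, here is a minimal sketch of one way it could start, not the only way. It slides a fixed-size window over the ciphertext from the question and counts how often each substring occurs. The class name NGramCount, the helper names and the window length of 5 are assumptions chosen for illustration; a real attacker would not know the token length and would simply try several window sizes.

```java
import java.util.HashMap;
import java.util.Map;

public class NGramCount {
    // Ciphertext copied from the question (split only for readability).
    static final String CIPHERTEXT =
            "zH4eDzfhACz98ACz98azS6DlKi89ab98lKi89FszXFeDzfhabK98ACz98ACz98puEzlKi89"
          + "ACz98ab98a5568eDzfhlKi89fdXRSazS6DlKi89ZcfZSFszXFazS6DiLz5FszXFabK98loW91lKi89"
          + "abK98ZmKuJhdZdgblKi89HGD3xeDzfhlKi89azS6DZmKuJlKi89"
          + "rXzFttXfdXRSabK98aRfXZa5568eDzfhdWtQYaRfXZzH4abK98ZmKuJiLz5eDzfh";

    // Counts every substring of length n, moving the window one character at a time.
    static Map<String, Integer> countNGrams(String text, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + n <= text.length(); i++) {
            counts.merge(text.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    // Returns the entry with the highest count (an arbitrary one if there is a tie).
    static Map.Entry<String, Integer> mostFrequent(Map<String, Integer> counts) {
        Map.Entry<String, Integer> best = null;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (best == null || e.getValue() > best.getValue()) {
                best = e;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countNGrams(CIPHERTEXT, 5);
        Map.Entry<String, Integer> top = mostFrequent(counts);
        System.out.println(top.getKey() + " occurs " + top.getValue() + " times");
    }
}
```

On the question's ciphertext this reports lKi89, which occurs 9 times and is the token for the space. Once the word separator is known, the text falls apart into words, and the remaining tokens can be ranked the same way.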

edited Jun 26 '16 at 04:29 by atk
answered Jun 23 '16 at 14:34 by LvB

I was thinking the same, but I wanted to know how difficult it is if you make your patterns long and permutations of each other. How difficult is it to identify a pattern like that? – Limit Jun 23 '16 at 14:49
I meant: what if you replace a with one 5-character pattern and b with another 5-character pattern, and both use the same characters in a different order? Couldn't that confuse the pattern analyser? – Limit Jun 23 '16 at 14:50
Sorry for the DV, but I completely disagree with your answer. I feel like you answered the wrong question, or just missed the fact that the OP is doing his method AND RSA encryption. – TTT Jun 23 '16 at 15:35
This code fragment does not actually do any RSA, and just saying you use it does not mean you actually do. My answer is completely based on the code given. For RSA to work as intended it must be implemented properly, and there must first be a secure key negotiation. That is not present in the example, and therefore I simply assume the cipher used for RSA is "NONE", so the data is plaintext. So yes, I did not address this in the answer itself; I thought it was obvious. Also, the question is about what happens after the attacker has decrypted the RSA message. So are you sure you read it fully? – LvB Jun 23 '16 at 15:49

YMMV, but I thought that the code fragment was simply demonstrating the obfuscation idea, and is not the code for the entire messaging application. :D – TTT Jun 23 '16 at 17:25
As for OP's statement of "after decrypting the RSA data", I still believe what he did would be *slightly* better than just having plaintext there. – TTT Jun 23 '16 at 17:29

score 7 · Answer 2 · edited Mar 17 '17 at 13:14

Is that a bad method?

Yes, this method is bad for several reasons. First, it does not gain you anything: if you are already using RSA, layering your own scheme on top adds no real security. And if you are not using RSA or another publicly accepted encryption scheme, then rolling your own is a terrible idea.

can this be easily decrypted by spying on the data?

Yes

How?

As LvB already stated, what you have is a variation of the Caesar cipher, which is one of the simplest and most widely known encryption techniques. Your variation may seem complex to you, but it is pretty basic: the encrypted string you provided has very obvious patterns, even without your source code. If you ran a variation of the LZW compression algorithm on your encrypted text, the generated dictionary would yield some very interesting statistics about how often each pattern occurs.

id, count, pattern
04, x9, lKi89
01, x6, eDzfh
02, x5, ACz98
07, x5, abK98
03, x4, azS6D
14, x3, ZmKuJ
06, x3, FszXF
00, x2, zH4
05, x2, ab98
09, x2, a5568
10, x2, fdXRS
12, x2, iLz5
08, x1, puEz
13, x1, loW91
15, x1, hdZdgb
16, x1, HGD3x
17, x1, rXzFttX
11, x1, ZcfZS
18, x1, aRfXZ
19, x1, dWtQYaRfXZ

 0  1  2  3  4  5  4  6  1  7  2  8  4  2  5  9  1  4  10  3  4  11  6  3  12  6  7  13  4  7  14  15  4  16  1  4  3  14  4  17  10  7  18  9  1  19  0  7  14  12  1

Once you identify some of the most commonly used patterns, you can compare them to the twelve most common letters in the English language (e t a o i n s r h l d c) and fill in the blanks with what's left.

hello i reall_ li_e to _ro_ra_ and _e on stac_e_chan_e
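
The partially decrypted line above can be reproduced with a short sketch. Two shortcuts here are assumptions made for illustration: the ciphertext is taken as already split into tokens, and the attacker's guesses for the twelve common letters are taken to be correct (they are read straight from the question's table, whereas a real attacker would reach them by frequency ranking and trial). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartialDecode {
    // Substitution table from the question, 'a' first and 'z' last.
    static final String[] REPLACEMENTS = {
            "abK98", "HGD3x", "aRfXZ", "hdZdgb", "eDzfh", "aieSZ3", "iLz5",
            "zH4", "ab98", "abK2398", "a5568", "ACz98", "loW91", "ZmKuJ",
            "azS6D", "ZcfZS", "dFXze", "FszXF", "rXzFttX", "fdXRS", "52aF",
            "ZaWRQ", "qPweQ", "dWtQY", "puEz", "ZdeA"
    };
    static final String SPACE = "lKi89";

    // Produces the token stream; stands in for the attacker's segmentation step.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (char c : text.toCharArray()) {
            tokens.add(c == ' ' ? SPACE : REPLACEMENTS[c - 'a']);
        }
        return tokens;
    }

    // Partial key: only the space and the letters guessed so far are known.
    static Map<String, Character> guessedKey(String guessedLetters) {
        Map<String, Character> key = new HashMap<>();
        key.put(SPACE, ' ');
        for (char c : guessedLetters.toCharArray()) {
            key.put(REPLACEMENTS[c - 'a'], c);
        }
        return key;
    }

    // Decodes known tokens and writes '_' for every token not yet identified.
    static String decode(List<String> tokens, Map<String, Character> key) {
        StringBuilder out = new StringBuilder();
        for (String token : tokens) {
            out.append(key.getOrDefault(token, '_'));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> tokens =
                tokenize("hello i really like to program and be on stackexchange");
        System.out.println(decode(tokens, guessedKey("etaoinsrhldc")));
    }
}
```

With only those twelve letters and the space filled in, the output already reads as English with a few gaps, which is why the remaining tokens are easy to guess from context.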

After this you could try mapping different patterns to different letters and search for the most common words in the English language. This doesn't work very well on your short example string, but once you have a paragraph or two of encrypted text it starts to work quite well.

So yes, not only can your encryption be broken, it could be broken by hand with pencil and paper by a child with an interest in patterns and code breaking (I used to do this for fun in grade school).

Good practical example. Keep having fun :-) – LSerni Sep 28 '16 at 14:12

Luis Casillas · Answer 3 · 2016-06-25T20:46:16.070

4

To add to the excellent answers already here, I'll focus on this point:

If I want to securely speak with a mate on the Internet with a program I made specially for this, which only we have.

This is problem #1. Somebody else could acquire a copy of your program. Even worse, they could do so without your knowledge. This is why in cryptography there's a difference between techniques and keys:

Techniques are hard to develop and change, and need to be safe to publish.
Keys must be kept secret, but they're easy to change.

So with proper cryptography, if you learn that your key may have been leaked then you can just change the key, which is much easier than changing your encryption program. In fact, it's standard cryptographic practice to rotate keys—to change them periodically—in order to limit the damage done when somebody steals your key without you knowing it. This goes as far as designing protocols to use ephemeral keys, that are used for one communication and then thrown away immediately.
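
As a rough sketch of that split between technique and key, the example below uses the standard Java javax.crypto API with AES-GCM. The code is the same for every message and could be published; only the SecretKey is secret, and "rotating" it is a single call. The class name, the helper names and the choice of AES-GCM with a 256-bit key are assumptions for illustration, and the sketch leaves out how the two parties would agree on the key in the first place.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class KeyRotationSketch {
    private static final int IV_BYTES = 12;   // usual nonce size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generating (or rotating) a key is cheap: the algorithm does not change.
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        return generator.generateKey();
    }

    // Returns a fresh random IV followed by the ciphertext and tag.
    static byte[] encrypt(SecretKey key, String message) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        RANDOM.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] body = cipher.doFinal(message.getBytes(StandardCharsets.UTF_8));
        byte[] out = Arrays.copyOf(iv, iv.length + body.length);
        System.arraycopy(body, 0, out, iv.length, body.length);
        return out;
    }

    // Reads the IV from the front of the data; throws if the key or data is wrong.
    static String decrypt(SecretKey key, byte[] data) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, data, 0, IV_BYTES));
        byte[] plain = cipher.doFinal(data, IV_BYTES, data.length - IV_BYTES);
        return new String(plain, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey oldKey = newKey();
        byte[] sealed = encrypt(oldKey, "hello");
        System.out.println(decrypt(oldKey, sealed));

        SecretKey rotatedKey = newKey(); // rotation: new key, same program
        try {
            decrypt(rotatedKey, sealed);
        } catch (GeneralSecurityException e) {
            System.out.println("message sealed under the old key is rejected");
        }
    }
}
```

Compare this with the substitution table in the question: there the "key" is baked into the program, so a leaked copy of the program means writing and redistributing a new one.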

edited Jun 25 '16 at 20:46
answered Jun 23 '16 at 19:36 by Luis Casillas

To add to the last point, it goes as far as to make *periodic rekeying a standard feature in some software implementations*. Look at OpenSSH's `RekeyLimit` configuration option, for example: in recent versions, you can specify the maximum number of bytes *and* the maximum length of time a (symmetric session) key should be used for, and once either criterion is met, a new key is automatically negotiated, transparently to the user. (Earlier versions only allowed you to specify the number of bytes a single key can be used for, but also allowed for transparent re-keying during a session.) – user Jun 26 '16 at 16:40
(...) Keep in mind here that the SSH symmetric session key is different from the host key, which is used to authenticate the host and secure the key exchange. The client also likewise can offer up an asymmetric key for authentication, in public-key authentication methods. – user Jun 26 '16 at 16:41

TTT · Answer 4 · 2016-06-23T15:34:27.033

Even though an answer is already accepted, I'm going to add another because I actually believe the exact opposite of the accepted answer.

Basically, you are rolling your own obfuscation, and then using RSA encryption on top of that.

Is that a bad method? No, but it's pointless. Strong RSA encryption is sufficient without your additional obfuscation.
Can this be easily decrypted by spying on the data? No, though not because of what you're doing: it is because you are using strong RSA encryption.

If someone manages to decrypt strong RSA encryption, then they'll probably crack your algorithm shortly after, so there is barely any benefit to what you're doing.

Conclusion: what you are doing is slightly better than using RSA encryption by itself. But if you are considering using only your obfuscation, without RSA encryption as well, then that would definitely be (to use your own words) "a bad method". In fact, I would completely agree with LvB's answer if you were considering only your obfuscation without RSA encryption.
