In short, yes: UTF-8 characters can be used in an attack chain. A few cases have crossed my path. From what I have read about this method, it is typically the last step of an attack chain, used to "drop a shell" on a native system. A quick Google search turned up this article:
"Using UTF-8 Encoding to Bypass Validation Logic"
The article goes on to explain exactly how this particular method is used. Here's the executive summary.
"This attack is a specific variation on leveraging alternate encodings to bypass validation logic. This attack leverages the possibility to encode potentially harmful input in UTF-8 and submit it to applications not expecting or effective at validating this encoding standard making input filtering difficult. UTF-8 (8-bit UCS/Unicode Transformation Format) is a variable-length character encoding for Unicode. Legal UTF-8 characters are one to four bytes long. However, early version of the UTF-8 specification got some entries wrong (in some cases it permitted overlong characters). UTF-8 encoders are supposed to use the "shortest possible" encoding, but naive decoders may accept encodings that are longer than necessary. According to the RFC 3629, a particularly subtle form of this attack can be carried out against a parser which performs security-critical validity checks against the UTF-8 encoded form of its input, but interprets certain illegal octet sequences as characters."
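To make the "overlong" point in the summary concrete, here is a minimal Python sketch. It is my own illustration, not code from the article. The names `filter_allows` and `naive_decode` are hypothetical: the first stands in for a filter that checks raw bytes, the second for a permissive decoder that never rejects longer-than-necessary encodings. The payload encodes "/" as the two bytes 0xC0 0xAF instead of the single byte 0x2F.

```python
def filter_allows(data: bytes) -> bool:
    """Hypothetical validation step: looks for '../' in the raw bytes only."""
    return b"../" not in data


def naive_decode(data: bytes) -> str:
    """Hypothetical permissive UTF-8 decoder that accepts overlong forms."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                 # 0xxxxxxx: one byte
            cp, n = b, 1
        elif b >> 5 == 0b110:        # 110xxxxx: two bytes
            cp, n = b & 0x1F, 2
        elif b >> 4 == 0b1110:       # 1110xxxx: three bytes
            cp, n = b & 0x0F, 3
        elif b >> 3 == 0b11110:      # 11110xxx: four bytes
            cp, n = b & 0x07, 4
        else:
            raise ValueError("invalid lead byte")
        if i + n > len(data):
            raise ValueError("truncated sequence")
        for c in data[i + 1:i + n]:
            if c >> 6 != 0b10:       # continuation bytes must be 10xxxxxx
                raise ValueError("invalid continuation byte")
            cp = (cp << 6) | (c & 0x3F)
        # Missing on purpose: no check that cp needed n bytes (overlong check).
        out.append(chr(cp))
        i += n
    return "".join(out)


# "/" written as the overlong pair 0xC0 0xAF instead of 0x2F.
payload = b"..\xc0\xafetc\xc0\xafpasswd"

passed_filter = filter_allows(payload)   # the byte-level check sees no "../"
decoded = naive_decode(payload)          # the permissive decoder rebuilds it

try:
    payload.decode("utf-8")              # Python's built-in codec is strict
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True
```

Under these assumptions the payload passes the byte-level filter, the permissive decoder turns it into `../etc/passwd`, and Python's own strict decoder refuses it outright. That gap between what is validated and what is later interpreted is what the quoted summary describes.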
This subject has come up before, although I can't locate the use case at the time of writing. If I find it later, I'll add it as a comment.
What I remember is an attack chain that had gained entry to a native system and sat dormant. It got in through the firewall but was denied any further access. The program then transformed itself into a UTF-8 encoded file and waited until it could get past the security software. Once that hole opened, the character encoding let the Python program open a command prompt, or "drop a shell". The attacker then had full root access, and the program opened a gateway for another part of the malware lying in wait.
It's very similar to the article "Using UTF-8 Encoding to Bypass Validation Logic" in the way it masks itself and becomes unreadable to the security software. The methods of attack are much the same.
Methods of Attack
1. Injection
2. Protocol Manipulation
3. API Abuse
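For the injection case in particular, the usual counter is to decode strictly first and validate the decoded text afterwards, so the check and the consumer see the same characters. The sketch below is my own assumption about how that might look, not something taken from the article. `safe_path_component` is a hypothetical helper, and it relies on the fact that Python's built-in UTF-8 codec rejects overlong sequences such as 0xC0 0xAF (an overlong "/").

```python
def safe_path_component(raw: bytes) -> str:
    """Hypothetical helper: decode strictly, then validate the decoded text."""
    # Strict decoding raises UnicodeDecodeError on overlong or malformed input.
    text = raw.decode("utf-8", errors="strict")
    # Validation runs on the characters the rest of the program will see.
    if text in ("", ".", "..") or "/" in text or "\\" in text:
        raise ValueError("path separators and dot components are not allowed")
    return text


def is_accepted(raw: bytes) -> bool:
    """Convenience wrapper for the examples below."""
    try:
        safe_path_component(raw)
        return True
    except ValueError:  # UnicodeDecodeError is a subclass of ValueError
        return False


plain_ok = is_accepted(b"report.txt")
traversal_ok = is_accepted(b"../etc")
overlong_ok = is_accepted(b"..\xc0\xafetc")
```

Here the plain name is accepted and both the ordinary and the overlong traversal attempts are refused. The ordering is the point: validating the bytes before a permissive decode is what leaves the gap.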
In the case I read about, the objective of the program was to penetrate as deep as possible. If blocked, it transformed itself into UTF-8 and carried out its attacks in that prescribed manner. If successful, it opened a gateway to another part of the malware lying farther down the attack chain.
I've got to say, it's interesting to think about. The way I see it, if you can write code that outperforms the limited scope of another system, the attack will succeed. If the defending code has constraints, and the attacking code has choices and options that were never programmed into the defender's scope, then it's an obvious loss, especially when the attacker can be an AI or machine learning system.
Credit for the article goes to the CAPEC Content Team, The MITRE Corporation (2014-06-23, Internal_CAPEC_Team).