It depends on the context in which the injection happens.
Obviously, if the injection happens in the context of an element’s content like this one:
<p>Your search for "❌" has returned the following results:</p>
then a < is required to switch from text to markup. But even here, if you echo user input inside a <script> element, e.g.:
<script>var search = "❌";</script>
You’ll need to look for other characters, as you’re no longer in a plain text context. Here you have to take care of characters that are special inside JavaScript string literals, as well as certain sequences that can denote the script element’s end tag, e.g. </script> or </script/.
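As a rough sketch of the script case, assuming the page is generated server-side in Python (the question does not say which language is used, and `js_string_literal` is a made-up helper name), you could build the string literal with `json.dumps` and then escape the characters that could form an end tag:

```python
import json


def js_string_literal(value: str) -> str:
    """Return a quoted JavaScript string literal for use inside an inline <script>."""
    # json.dumps adds the surrounding double quotes and escapes ", \ and
    # control characters; with the default ensure_ascii=True it also escapes
    # non-ASCII characters such as U+2028 and U+2029.
    literal = json.dumps(value)
    # json.dumps leaves <, > and & untouched, so "</script>" would survive.
    # Replacing them with \uXXXX escapes keeps the string value identical
    # while making sure no end tag or comment opener appears in the markup.
    return (
        literal.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


search = '</script><script>alert(1)</script>'
print("<script>var search = " + js_string_literal(search) + ";</script>")
```

Note that the helper emits the quotes itself, so the template must not add its own around the value.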
Similarly, if you print user input in an element’s attribute value:
<input type="text" name="search" value="❌">
Here you have to take care of characters that are special in double-quoted attribute values, i.e. the delimiting " quote. If you’re using single quotes or no quotes at all, different rules apply.
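To illustrate the difference between the element content and the attribute value context, here is a minimal sketch using Python’s standard `html.escape` (Python is only an assumption here, and the helper names `html_text`, `html_attr` and `render_search_input` are invented for the example):

```python
from html import escape


def html_text(value: str) -> str:
    """Encode a value for element content: only &, < and > are replaced."""
    return escape(value, quote=False)


def html_attr(value: str) -> str:
    """Encode a value for a quoted attribute: also replaces " and '."""
    return escape(value, quote=True)


def render_search_input(term: str) -> str:
    # The attribute value is always wrapped in double quotes here; this
    # encoding is not sufficient for unquoted attribute values.
    return f'<input type="text" name="search" value="{html_attr(term)}">'


print(html_text('<b>bold</b>'))
print(render_search_input('" onfocus="alert(1)'))
```

With the quote encoded as `&quot;`, the input can no longer close the attribute value and start a new attribute.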
In addition, you have to look not just at syntax but also at semantics. For example, javascript: and data: URIs can be used for XSS. Or there may be JavaScript code that uses part of the user-supplied data for some evaluation, or to retrieve additional script code, etc. There are hundreds of examples.
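For the URI case, encoding alone does not help, because javascript:alert(1) contains no special HTML characters at all. One possible approach, sketched here in Python with a hypothetical `is_safe_link` helper and an allowlist that is purely an assumption, is to accept only schemes you explicitly expect:

```python
from urllib.parse import urlsplit

# Assumed policy for this sketch: only absolute http(s) links are accepted.
ALLOWED_SCHEMES = {"http", "https"}


def is_safe_link(url: str) -> bool:
    """Return True only if the URL has an explicitly allowed scheme."""
    # Browsers ignore surrounding whitespace and drop tabs and newlines
    # inside a URL, so normalise those before looking at the scheme.
    cleaned = "".join(ch for ch in url.strip() if ch not in "\t\r\n")
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        # urlsplit raises ValueError on some malformed URLs; reject them.
        return False
    return scheme in ALLOWED_SCHEMES


for candidate in ("https://example.com/", "javascript:alert(1)", "data:text/html,x"):
    print(candidate, is_safe_link(candidate))
```

This deliberately rejects relative URLs as well, since they have no scheme; whether that is acceptable depends on your application. The accepted URL still has to be encoded for the attribute it ends up in.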
So always take into account the context in which you want to put user-supplied data, and encode that data accordingly.