Problem
Given an HTTPS API with an endpoint, e.g. POST data, how can we check that the data received is not malicious or otherwise harmful?
To consider (please read)
- POST data receives 2 payload parameters: key and data. key can be any valid string (even one with an extension, e.g. malicious.bat), and data can be a string or multipart/form-data.
- GET data returns the data as a downloadable file, where the file name is the key parameter and the Content-Type is set to application/octet-stream.
- This allows a client to post something like key: danger.bat, data: !#/bin/bat\ndb_destroy; and, when it is retrieved, a file named danger.bat is downloadable.
- In the current implementation (for whatever reason X) I can only implement a new POST data endpoint from scratch, so as not to break current API clients. On that side I am quite free, but let's say I can't do much on the GET data side.
- A virus scan is out of the question due to the high volume of requests to this API.
- It's clear that (unless someone knows otherwise) there is no guarantee the API can be fully secured. What we need here instead is the best prevention and solution on a best-effort basis.
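Since the new POST data endpoint can be designed freely, one best-effort step would be to constrain the key before it can ever become a download file name. The following is only a minimal sketch: the function name validate_key, the character allowlist and the extension blocklist are all assumptions made for illustration, not part of the existing API.

```python
import re
from pathlib import PurePosixPath

# Hypothetical policy: the extension list is illustrative, not exhaustive.
BLOCKED_EXTENSIONS = {".bat", ".cmd", ".exe", ".sh", ".js", ".html", ".htm"}
# Allowlist: must start with a letter or digit, then letters, digits, ".", "_", "-".
SAFE_KEY = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,127}")


def validate_key(key: str) -> bool:
    """Return True if the key is acceptable as a later download file name."""
    # Rejects path separators, spaces, control characters and empty keys.
    if not SAFE_KEY.fullmatch(key):
        return False
    # Look at every suffix so that names like "report.bat.txt" are caught too.
    suffixes = {s.lower() for s in PurePosixPath(key).suffixes}
    return not (suffixes & BLOCKED_EXTENSIONS)
```

With this policy danger.bat and ../etc/passwd would be rejected while report.pdf would pass. An allowlist of permitted extensions would be stricter than the blocklist shown here, at the cost of breaking more clients.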
Question
- What is the best way to validate and scan data for possible malware, viruses, XSS (cross-site scripting) and other malicious content sent by a client via HTTP through an API, and how would you do it?
- What are the possible dangers of letting a client create/read any type of string in a database without checking its content?
Scenario
With the API, a client can connect to the endpoint /postdata, which currently accepts any Content-Type. The data can be sent as form-data or as urlencoded base64: any type of string or bytes. Once the request is received, there is no check; the data is only converted from a string into base64 bytes and then inserted as bytes into a database, more specifically a Couchbase database. The user also specifies a name for the data.
In another endpoint called /getData the client can retrieve that data.
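To make the flow concrete, here is a rough sketch of the store-and-return behaviour described above. A plain dict stands in for the Couchbase bucket, and the names store, load and fake_bucket are hypothetical. The point it shows is that base64 encoding is reversible and neutralizes nothing: the client gets back exactly the bytes that were sent.

```python
import base64
from typing import Dict, Union

# Stand-in for the Couchbase bucket; a dict is used purely for illustration.
fake_bucket: Dict[str, bytes] = {}


def store(name: str, data: Union[str, bytes]) -> None:
    """Mimic the described behaviour: no inspection, just encode and insert."""
    raw = data.encode("utf-8") if isinstance(data, str) else data
    fake_bucket[name] = base64.b64encode(raw)


def load(name: str) -> bytes:
    """Return the original bytes exactly as the client sent them."""
    return base64.b64decode(fake_bucket[name])


store("virus.html", '<script>alert("Virus!!!!")</script>')
print(load("virus.html"))  # the script comes back unchanged
```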
So a sending client can send stuff like (as strings):
- HTML file with possible JavaScript in it:
  - Name: virus.html
  - Data: <!DOCTYPE html><html lang="en"><head></head><body><script>alert("Virus!!!!")</script></body></html>
- Shell scripts
  - Name: virus.bat
  - Data: #!/bin/bash echo "Virus";
- Images
  - Simply converted into base64 if sent as a URL parameter or as a field in form-data (multipart/form-data)
- Executables as bytes.
It is also worth pointing out that once this data is in the database, it is not processed or used, only released to the client when it requests it via the /getData endpoint.
The challenge I'm facing is that there is currently no check on the string sent by the client, so it could technically send malicious data. However, how can you check a string for all the different possible dangers?
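For reference, the download side as described could be sketched like this. The function download_headers is hypothetical. The Content-Type and the use of the key as file name come from the description above; the percent-encoded filename* form and the X-Content-Type-Options header are assumptions about what could be added if the GET side can be touched at all.

```python
from urllib.parse import quote


def download_headers(key: str) -> dict:
    """Build response headers for the download as the question describes it."""
    return {
        # From the description: generic binary type, key used as the file name.
        "Content-Type": "application/octet-stream",
        # Percent-encoding the key keeps quotes and line breaks out of the header.
        "Content-Disposition": f"attachment; filename*=UTF-8''{quote(key, safe='')}",
        # Assumption, not in the current API: ask browsers not to guess the type.
        "X-Content-Type-Options": "nosniff",
    }
```

Forcing an attachment with a generic type means the browser should save the file rather than render it, but it does not stop the user from opening the saved file afterwards.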
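One cheap, best-effort way to look at the payload without a full virus scan is to inspect only its first bytes for well-known signatures. The sketch below is an illustration under stated assumptions: the function sniff and its small signature table are hypothetical, the table is far from complete, and a two-byte match such as MZ can produce false positives on ordinary text.

```python
# Hypothetical signature table; real-world detection needs far more entries.
SIGNATURES = [
    (b"MZ", "windows-executable"),
    (b"\x7fELF", "elf-executable"),
    (b"#!", "script-with-shebang"),
    (b"\x89PNG\r\n\x1a\n", "png-image"),
    (b"\xff\xd8\xff", "jpeg-image"),
]
HTML_MARKERS = (b"<!doctype html", b"<html", b"<script")


def sniff(data: bytes) -> str:
    """Best-effort guess of what a payload is, from its first bytes only."""
    # Only the first 512 bytes are examined, after dropping leading whitespace.
    head = data[:512].lstrip()
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    lowered = head.lower()
    if any(marker in lowered for marker in HTML_MARKERS):
        return "html"
    return "unknown"
```

A check like this only classifies the obvious cases; anything it labels unknown can still be harmful, so it can narrow the problem but not close it.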
Example
schroeder actually gave a good example in the comments:
how is the risk different from hosting an html file on OneDrive, Google Drive, Dropbox, etc.?
Now, the back-end service where I'm facing this challenge is not like the apps mentioned above. However, a similar use case would be Google Drive, with you as the security developer:
- user_a puts virus.html in his drive and shares it on the web.
- user_b clicks on the link to virus.html, downloads it without any problem, and his PC blows up, or whatever malicious thing virus.html was meant to do happens.
Where is the risk?
I can see two:
- How would you implement a check (if any is possible) for user_a when it uploads virus.html? Can you prevent it somehow?
- How would you implement a check (if any is possible) for user_b when it downloads virus.html?
What's the risk?
For example, someone can send data in the form of HTML (like the example above) and also set its name to a file name with an .html extension.
If you then request that data, the file is downloaded.
Structure
- API Back-end: Python
- Database: Couchbase with Bytes data type
Related
I've already read through these related posts; however, they are more specific to files and to adding a file to a folder. In this case we are adding a file as an array of bytes to a database, to be retrieved later.
- Content-type validation in REST APIs
- Uploading Executable Files
- Use PHP to check uploaded image file for malware?
- What are the security risks of letting the users upload content to my site?
- Is it necessary to scan users' file uploads by antivirus?
- Antivirus for scanning anonymous file uploads
- What steps should be taken to validate user uploaded images within an application?
- What are security risks of serving user uploaded files without Content-Disposition?
- Why should I restrict the content type of files be uploaded to my site?
- Is it safe to store and replay user-provided mime types?
- Is it safe to serve any user uploaded file under only white-listed MIME content types?
- Using file extension and MIME type (as output by file -i -b) combination to determine unsafe files?
 
     
    