Please forgive the vagueness of the question title.

I am currently designing an opaque storage service for immutable files. The purpose of the service is simple: storing files and being able to retrieve them by their identifier.

As the service is explicitly designed for storing immutable data, I had the idea of making each file's identifier dependent on the contents of the file itself.

The formula is as follows:

file_id = file-size + SHA-256(file-content)

With that design, each file gets a deterministic id that is unique for all practical purposes: two different files would need both the same size and the same SHA-256 digest to collide.
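A minimal sketch of how such an identifier could be computed, in Python. The function name `compute_file_id`, the chunk size, and the `<size>-<hex digest>` encoding are assumptions made for illustration; the formula above only says that the size and the digest are combined.

```python
import hashlib
import io


def compute_file_id(stream, chunk_size=64 * 1024):
    """Return '<size in bytes>-<hex SHA-256>' for a binary stream.

    The textual encoding (decimal size, hyphen, hex digest) is an
    assumption; any unambiguous concatenation would work as well.
    """
    digest = hashlib.sha256()
    size = 0
    # Read in chunks so large files do not have to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
        size += len(chunk)
    return f"{size}-{digest.hexdigest()}"


# The same content always yields the same identifier.
example_id = compute_file_id(io.BytesIO(b"hello"))
print(example_id)
```

Because the identifier depends only on the bytes, uploading the same content twice maps to the same id, which is what makes the store naturally deduplicating.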

I thought about how to protect the data with typical access control (the usual HTTP Authorization strategies come to mind), but the more I think about it, the more it feels unnecessary.

My rationale is as follows: the valid identifier space is extremely large. So large that the only ways an attacker could fetch a specific file are:

  1. Knowing the exact file content beforehand (which allows computing the identifier).
  2. Gaining access to an existing identifier in some other system.

But...

  1. is... well. Idiotic: if you have the content of the file already, you don't need to access my storage service to fetch it.
  2. Is a security design problem for the other systems: regardless of whether they store the real data or a reference to it, they need to be secure by themselves anyway.
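To put "extremely large" into numbers, here is a rough back-of-the-envelope sketch. The number of stored files and the attacker's guess rate are hypothetical figures chosen only for illustration, and it pessimistically assumes every stored file has the size the attacker is guessing for.

```python
# Hypothetical figures, chosen only for illustration.
ID_SPACE = 2 ** 256            # possible SHA-256 digests for one given file size
STORED_FILES = 10 ** 12        # assumed number of files in the store
GUESSES_PER_SECOND = 10 ** 12  # assumed attacker request rate
SECONDS_PER_YEAR = 365 * 24 * 3600

# Chance that a single uniformly random guess hits any stored file.
hit_probability = STORED_FILES / ID_SPACE

# Expected time until the first hit (expected guesses = 1 / hit_probability).
expected_years = ID_SPACE / (STORED_FILES * GUESSES_PER_SECOND * SECONDS_PER_YEAR)

print(f"{hit_probability:.1e} per guess, about {expected_years:.1e} years to a first hit")
```

Even with these generous assumptions for the attacker, blind guessing is not a realistic way to find a valid identifier.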

Should I still bother with an additional layer of security on top of my API, or is that design secure enough in and of itself?

ereOn

1 Answer

This depends on the features of your API and on how your storage is intended to be used.

Steffen Ullrich