Should a company protect the DB schema at the same level as the data itself?

Question

Suppose a company holds databases that different users have access to, and it has to decide how much to invest in protecting the data itself versus the DB schema (the structure and the relations between tables and fields). Should the company invest only in protecting the data itself? Is there any threat from the schema being exposed? Does the schema have any value of its own, or can exposing it make the DB more vulnerable to future attacks?

Having a schema makes it a lot easier to pull out useful data from a database if you can gain access. It can also be useful when identifying what data might be available, in the case of table level restrictions (e.g. can see there is a relationship to x table, but not the contents). However, if someone can access either unintentionally, it's not good! — Matthew, Sep 06 '16 at 16:02
I'd treat the database structure similar to the source code. As an IP issue, not as a security issue. — CodesInChaos, Sep 06 '16 at 16:16
If you can automate obfuscating the production schema and it doesn't kill your productivity, then might as well — Neil McGuigan, Sep 06 '16 at 17:35
Ha, **Information Security in its original, information theoretic sense**. I like it! See my answer below. — Marcus Müller, Sep 06 '16 at 21:19

score 6 · Answer 1 · answered Sep 06 '16 at 20:14

No

Protection of the database schema is a type of security by obscurity.

Your software should be designed and configured in such a manner that none of its security is compromised when a hacker learns its design. In fact, in the best case your design is made public so that security professionals can critique it and you can continuously improve it.

score 3 · Accepted Answer · answered Sep 06 '16 at 21:24

@John-Wu's is the right answer: investment should not prioritize protecting the schema above protecting the data.

Data matters most and is what attackers actually want to compromise. Schema exposure is, at most, an information leak.

That said, databases themselves typically treat schema metadata at a different and higher privilege level than the data, not without reason. As @marcus-muller says, schema provides attackers with useful hints. And a user with privileges to see schema can almost always see data, while users with privileges to see data may not be able to see schema.

So the schema is still worth protecting. But the data is most important.

answered Sep 06 '16 at 21:24 by Jonah Benton

I wouldn't agree with you on this: *while users with privileges to see data may not be able to see schema.* Users with access to data can very often infer schema. – Marcus Müller Sep 06 '16 at 21:28
Right, I was, perhaps poorly, making a more abstract point. Concretely, it is certainly difficult to prevent a user with a database client from being able to see or infer schema. However, a user who simply has access to a webapp that renders the data cannot necessarily infer the schema. That is, I think the circumstances in which data protection has to be considered vastly outnumber those in which schema exposure may be an issue, even though schema can still be considered privileged information. – Jonah Benton Sep 06 '16 at 22:02

score 1 · Answer 3 · answered Sep 06 '16 at 21:18

Although I agree with @JohnWu's answer that Security By Obscurity is a bad concept, there's one thing:

Yes. In a way.

Complex database schemas typically have complex relations. But let me come up with a simple example:

You've got a table:

+----+------------+-------+
| ID | CustomerNo | Order |
+----+------------+-------+
|  0 |          1 |  Nuts |
|  1 |          1 | Bolts |
|  2 |          2 |  Nuts |
|  3 |          1 |  Wire |
+----+------------+-------+

Here, ID is an auto-incrementing primary key, CustomerNo refers to the primary key of your customer table, and Order refers to the primary key of your order table. This way, you map your orders to customers.

You, of course, only let each customer see the rows that "belong" to themselves.

Still, do you see the problem?

Exactly: through the linearly increasing ID field, customer 1 has an easy way of figuring out when the competition places its orders, simply by not acting like a "normal" customer who orders in bulk, and instead placing automated orders every minute.

That way, they figure out that the competition orders every Tuesday at 4:30pm. Now, knowing the market for hardware very well, they put a big sell option on the market every Tuesday at 4:20pm, thus increasing the market price of nuts, harming their competitor, and also profiting financially from knowledge they should never have gotten in the first place.
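The probing described above can be sketched in a few lines. This is only an illustration under assumptions: it uses an in-memory SQLite table, the helper names `place_order` and `foreign_orders_between_probes` are made up for this example, and the `Order` column is called `item` because `ORDER` is a reserved word in SQL. The only thing customer 1 ever sees is the IDs of their own rows.

```python
import sqlite3


def place_order(conn, customer_no, item):
    """Insert one order and return the auto-assigned ID shown to the customer."""
    cur = conn.execute(
        "INSERT INTO orders (customer_no, item) VALUES (?, ?)",
        (customer_no, item),
    )
    return cur.lastrowid


def foreign_orders_between_probes(own_ids):
    """Each gap between consecutive own IDs counts rows inserted by others."""
    return [later - earlier - 1 for earlier, later in zip(own_ids, own_ids[1:])]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "customer_no INTEGER NOT NULL, "
    "item TEXT NOT NULL)"
)

# Customer 1 places one small probe order per time step. Customer 2 places
# two orders between the third and the fourth probe.
probe_ids = []
for minute in range(5):
    probe_ids.append(place_order(conn, 1, "Nuts"))
    if minute == 2:
        place_order(conn, 2, "Nuts")
        place_order(conn, 2, "Bolts")

gaps = foreign_orders_between_probes(probe_ids)
print(probe_ids)  # [1, 2, 3, 6, 7]
print(gaps)       # [0, 0, 2, 0]
```

The non-zero gap tells customer 1 in which interval somebody else ordered, and how many rows they inserted, without customer 1 ever reading a row that belongs to anyone else.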

This is, in principle, a question of data cleanliness. A monotonic primary key gives you (partial) information you shouldn't have. Hence, your customer API needs to make sure the information leakage is zero. And I mean that mathematically: information has a pseudo-unit, the bit, and there's a whole mathematical discipline called information theory that describes how much information is contained in the observation of a random event (for example, if there are more than two customers in that table, you can't know which of your competitors placed the order at 4:30pm, but you still get probabilities out of this, and hence non-zero information).
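To put rough numbers on "non-zero information", here is a small sketch using the standard definitions of self-information and Shannon entropy. The probabilities are invented for illustration (a 1-in-64 chance that a competitor order lands in any given minute, and competitors that are equally likely to be the one ordering); they are assumptions, not measurements.

```python
import math


def self_information_bits(p):
    """Information gained by observing an event of probability p, in bits."""
    return -math.log2(p)


def entropy_bits(probabilities):
    """Shannon entropy of a discrete distribution, in bits."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


# Assumed prior: a competitor order falls into any given minute with p = 1/64.
# Seeing an ID gap in exactly that minute then reveals 6 bits about timing.
timing_bits = self_information_bits(1 / 64)

# What the observer still does NOT know is who placed the order.
# With a single other customer there is no uncertainty left at all;
# with four equally likely competitors, 2 bits of uncertainty remain.
who_one_competitor = entropy_bits([1.0])
who_four_competitors = entropy_bits([0.25, 0.25, 0.25, 0.25])

print(timing_bits, who_one_competitor, who_four_competitors)
```

More competitors make the "who" harder to pin down, but the timing information leaks either way, which is the point about the leakage being non-zero.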

It's very unlikely you can hide such information from your customers if they have access to the database schema. In fact, that schema doesn't help them unless you also give them info they probably shouldn't be getting. In that sense, your schema doesn't need to be secret, but some of the data that you probably consider merely "relational" within a customer's accessible records should, in fact, be kept secret from the customer.
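One possible way to keep the counter secret, sketched under assumptions: keep the sequential key for internal use only and hand customers a random reference instead. The `public_ref` column and the helpers `place_order` and `orders_for` are hypothetical names for this example. This only closes the ID side channel; other signals, such as visible timestamps, would need separate treatment.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "  # internal only, never returned
    "public_ref TEXT NOT NULL UNIQUE, "       # random token shown to customers
    "customer_no INTEGER NOT NULL, "
    "item TEXT NOT NULL)"
)


def place_order(conn, customer_no, item):
    """Store the order and return only the random public reference."""
    public_ref = secrets.token_urlsafe(16)  # 16 random bytes, URL-safe text
    conn.execute(
        "INSERT INTO orders (public_ref, customer_no, item) VALUES (?, ?, ?)",
        (public_ref, customer_no, item),
    )
    return public_ref


def orders_for(conn, customer_no):
    """Return a customer's own orders without exposing the sequential ID."""
    rows = conn.execute(
        "SELECT public_ref, item FROM orders WHERE customer_no = ? ORDER BY id",
        (customer_no,),
    )
    return rows.fetchall()


first = place_order(conn, 1, "Nuts")
second = place_order(conn, 2, "Nuts")
third = place_order(conn, 1, "Wire")

# Customer 1 sees two unrelated tokens and cannot tell how many rows
# were inserted in between.
print(orders_for(conn, 1))
```

The schema here could be published without harm; what matters is that the API never returns the `id` column.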

The values of a primary key are part of the data, not the schema. So the need to keep the primary key secret (or random) to stop enumeration attacks is unrelated to whether or not to keep the schema secret. — Anders, Sep 06 '16 at 22:57
