57

Are there any programming languages that are designed to be robust against hacking?

In other words, an application can be hacked due to a broken implementation, even though the design is perfect. I'm looking to reduce the risk of a developer incorrectly implementing a specification.

For example

Question

  • Is there a language that addresses many or most of these issues? It's acceptable for the language to be scoped for a particular use-case such as WebApps, Desktop, Mobile, or Server usages.

Edit: A lot of people addressed the buffer-overflow issue, or say that the programmer is responsible for security. I'm just trying to get an idea if there exist languages whose main purpose was to lend itself to security as much as possible and reasonable. That is, do some languages have features that make them clearly more (or less) secure than most other languages?

TruthOf42
  • 5
    Would you accept any variant of Pig Latin as a valid answer? – MDMoore313 Apr 14 '14 at 19:37
  • @BigHomie that made me chuckle, but that sounds a little bit like you don't like my answer selection, if so please rebut – TruthOf42 Apr 14 '14 at 19:40
  • 1
    Well, I don't have experience w/ Ada, however I don't believe *any* programming language would have prevented heartbleed, and am prepared to defend that statement. – MDMoore313 Apr 14 '14 at 19:48
  • 3
    @BigHomie You seem to be contradicting Tom Leek. Care to explain why a language with automatic array bounds checking and memory management wouldn't have prevented Heartbleed? – Brilliand Apr 14 '14 at 20:24
  • 1
    @Brilliand I did read his answer. To clarify I mean any language that allows the developer to access dynamic memory (allocate, copy, etc), which is what that function does. Simply put, the vulnerability is *not* a buffer overflow, neither due to *array* bounds checking. This is because the function that sends the heartbeat uses memcpy() and takes a pointer to the supposed 'payload' sent from the client, which in reality can contain *anything*, as we've seen. – MDMoore313 Apr 14 '14 at 20:52
  • 1
    Also, rereading your question, this was *not* the result of buffer overflow, while they are close, I don't know if there is a term for this. – MDMoore313 Apr 14 '14 at 20:53
  • 1
    Most programming languages don't exist to lend themselves to security, because the purpose of most programs isn't to be secure. The purpose of, say, a banking application is primarily to monitor and transfer your money and secondarily to be secure. So a programming language whose main purpose is to be secure isn't too terribly useful, because, with the exception of AV, security is never the main purpose of whatever it is you're programming – KnightOfNi Apr 14 '14 at 23:37
  • 1
    @KnightOfNi "security is never the main purpose of whatever it is you're programming" - that's a pretty bold statement. I would say OpenSSL's main purpose IS security, as well any communication protocol whose messages are intended to be secret – TruthOf42 Apr 15 '14 at 12:31
  • @KnightOfNi is right, it's always functionality *first*, not to branch too far off topic. A program is written to solve a particular problem, in OpenSSL's case, that problem is *the implementation* of a security protocol. How secure that *implementation* is, is an extremely close second to its primary function. – MDMoore313 Apr 15 '14 at 12:42
  • 1
    regardless of the language, ultimately you are at the mercy of the compiler http://cm.bell-labs.com/who/ken/trust.html – Colin Cassidy Apr 15 '14 at 13:10
  • At one time Java sort of fit this description, but (beginning roughly with "reflections") it got to be too complex to verify as being "secure" in any non-trivial context. – Hot Licks Apr 15 '14 at 15:48
  • How can it be that nobody brought up dependently typed languages, which can statically enforce every safety feature I know from other languages and much more – Niklas B. Apr 15 '14 at 16:40
  • 1
    Offloading the responsibility of security from programmers who are experts in security to compiler/interpreter/VM/OS developers who are not seems like a bad idea. – Anthony Apr 15 '14 at 21:15
  • 1
    There are no safe programming languages, there are safe programmers. Anybody can do stupid (unsafe) things regardless of the language. The fact is that some languages make it easier to do stupid things. – Manu343726 Apr 15 '14 at 22:02
  • The problem with designing a language with safety in mind is that it could impose some restrictions or create future problems. Consider Java: it was designed to be easy to use and hard for its users to break (bounds checking, exceptions, etc.), but as the language's complexity increased, it became harder to keep its usage easy, simple, and safe. In fact, complex Java code bases nowadays are neither simple, nor easy to maintain, nor secure. – Manu343726 Apr 15 '14 at 22:05
  • @BigHomie Maybe I don't fully understand heartbleed, but it seems to be the result of a buffer "over-read". There are languages whose runtime system throws an error when this sort of thing occurs, rather than allowing it to go through. Wouldn't that have prevented it? – David Apr 16 '14 at 00:59
  • @DavidYoung I'm going to have to pose that question on one of these sites to see if there is a name for it. That wouldn't have prevented it, sadly, because the only way to *know* how much was supposed to be returned is relying upon the data from the user. Without checking the presented length with the *actual* length, you have, in this case, *Heartbleed*. – MDMoore313 Apr 16 '14 at 12:36
  • @DavidYoung I've proposed an edit to this question, the most well defined term for this to date is a [Buffer Over-Read](http://cwe.mitre.org/data/definitions/126.html). – MDMoore313 Apr 16 '14 at 13:43
  • @Truth - I made some edits, and added what I thought were interesting links. Please revise as to your intent. – makerofthings7 Mar 03 '15 at 23:52
  • I clarified the question with a new last sentence: "That is, do some languages have features that make them clearly more (or less) secure than most other languages?" And wonder if it can be re-opened in its current form. – Tom Au Mar 04 '15 at 00:22

11 Answers

56

Actually most languages are "secure" with regard to buffer overflows. What it takes for a language to be "secure" in that respect is the conjunction of: strict types, systematic array bound checks, and automatic memory management (a "garbage collector"). See this answer for details.

A few old languages are not "secure" in that sense, notably C (and C++), and also Forth, Fortran... and, of course, assembly. Technically, it is possible to write an implementation of C which would be "safe" and still formally conform to the C standard, but at a steep price (for instance, you have to make free() a no-operation, so allocated memory is allocated "forever"). Nobody does that.

"Secure" languages (with regards to buffer overflows) include Java, C#, OCaml, Python, Perl, Go, even PHP. Some of these languages are more than efficient enough to implement SSL/TLS (even on embedded systems -- I speak from experience). While it is possible to write secure C code, it takes (a lot of) concentration and skill, and experience repeatedly shows that it is hard, and that even the best developers cannot pretend that they always apply the required levels of concentration and competence. This is a humbling experience. The assertion "don't use C, it is dangerous" is unpopular, not because it would be wrong, but, quite to the contrary, because it is true: it forces developers to face the idea that they might not be the demigods of programming that they believe to be, deep in the privacy of their souls.

Note, though, that these "secure" languages don't prevent the bug: a buffer overflow is still unwanted behaviour. But they contain the damage: the memory beyond the buffer is not actually read from or written to; instead, the offending thread triggers an exception, and is (usually) terminated. In the case of Heartbleed, this would have prevented the bug from becoming a vulnerability, and it might have helped to prevent the full-scale panic that we observed in the last few days (nobody really knows what makes a random vulnerability go viral like a YouTube video featuring a Korean invisible horse; but, "logically", if it had not been a vulnerability at all, then this ought to have avoided all this tragicomedy).
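
To make the "contain the damage" point concrete, here is a minimal Python sketch of a heartbeat-style echo. It is not OpenSSL's code: the names `heartbeat_reply` and `unchecked_reply` and the byte-by-byte copy are made up for illustration. The second function forgets the length check, which is the bug; the runtime's bounds check turns the over-read into an exception rather than a leak.

```python
def heartbeat_reply(payload: bytes, claimed_length: int) -> bytes:
    """Echo `claimed_length` bytes of `payload` (hypothetical handler)."""
    # The fix: compare the attacker-supplied length with the real one.
    if claimed_length < 0 or claimed_length > len(payload):
        raise ValueError("claimed length exceeds actual payload")
    return payload[:claimed_length]


def unchecked_reply(payload: bytes, claimed_length: int) -> bytes:
    """The same handler with the length check forgotten (the bug)."""
    out = bytearray()
    for i in range(claimed_length):
        # Indexing past the end raises IndexError; nothing beyond
        # the buffer is ever read.
        out.append(payload[i])
    return bytes(out)
```

With `payload = b"bird"` and a claimed length of 65535, `unchecked_reply` raises `IndexError` after four bytes. (A plain slice, `payload[:65535]`, would silently return just the four bytes; either way no adjacent memory is disclosed.)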


Edit: since it was abundantly discussed in the comments, I thought about the problem of safe memory management for C, and there is a kind-of solution which still allows free() to work, but there is a cheat.

One can imagine a C compiler which produces "fat pointers". For instance, on a 32-bit machine, make pointers 96-bit values. Each allocated block will be granted a unique 64-bit identifier (say, a counter), and an internal memory structure (hashtable, balanced tree...) is maintained which references all blocks by ID. For each block, its length is also recorded in the structure. A pointer value is then the concatenation of the block ID, and an offset within that block. When a pointer is followed, the block is located by ID, the offset is compared with the block length, and only then is the access performed. This setup solves double-free and use-after-free. It also detects most buffer overruns (but not all: a buffer may be a part of a bigger structure, and the malloc()/free() management only sees the outer blocks).

The "cheat" is the "unique 64-bit counter". This is true only as long as you don't run out of 64-bit integers; beyond that, you must reuse old values. 64 bits ought to avoid that issue in practice (it would take years to "wrap around"), but a smaller counter (e.g. 32 bits) could prove to be a problem.

Also, of course, the overhead for memory accesses may be non-negligible (quite a few physical reads for each access, although some cases may be optimized away), and doubling pointer size implies higher memory usage, too, for pointer-rich structures. I am not aware of any existing C compiler which applies such a strategy; it is purely theoretical right now.
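
The fat-pointer scheme can be modelled in a few lines of Python. This is only a toy simulation of the bookkeeping, not a C compiler: `CheckedHeap`, `FatPointer`, `PointerError` and `advance` are names invented for the sketch, and Python's unbounded integers stand in for the 64-bit counter (so the wrap-around "cheat" never shows up here).

```python
import itertools
from dataclasses import dataclass


class PointerError(Exception):
    """Raised instead of letting an invalid access go through."""


@dataclass(frozen=True)
class FatPointer:
    block_id: int  # unique ID of the allocated block
    offset: int    # offset within that block


class CheckedHeap:
    def __init__(self):
        self._ids = itertools.count(1)  # the "unique counter"
        self._blocks = {}               # block ID -> storage (its length is the block length)

    def malloc(self, size):
        block_id = next(self._ids)
        self._blocks[block_id] = bytearray(size)
        return FatPointer(block_id, 0)

    def free(self, ptr):
        # IDs are never reused, so a second free finds nothing to remove.
        if self._blocks.pop(ptr.block_id, None) is None:
            raise PointerError("double free or invalid pointer")

    def _block_for(self, ptr):
        block = self._blocks.get(ptr.block_id)
        if block is None:
            raise PointerError("use after free or invalid pointer")
        if not 0 <= ptr.offset < len(block):
            raise PointerError("offset outside block")
        return block

    def load(self, ptr):
        return self._block_for(ptr)[ptr.offset]

    def store(self, ptr, value):
        self._block_for(ptr)[ptr.offset] = value


def advance(ptr, n):
    """Pointer arithmetic: same block, new offset."""
    return FatPointer(ptr.block_id, ptr.offset + n)
```

Every `load` and `store` pays for a dictionary lookup plus a comparison, which is the per-access overhead mentioned above.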

Tom Leek
  • 3
    The big problem with "don't use C" is interoperability with other languages. Pretty much any language supports writing a wrapper for C libraries. But you can't easily use say a Java library from C# or Python. – CodesInChaos Apr 14 '14 at 15:05
  • Well you can interface with (C)Python more or less as reliably as with C. I think that the issue is that you'd be vulnerable to any bug in the VM this secure language is running on... – jb. Apr 14 '14 at 15:08
  • 1
    From what I've seen, poor coding can make _any_ language insecure _in the practical sense_. Good coding can make most languages secure, even inherently "unsecure" ones. I'll also risk being flamed for suggesting that the higher level the language, the less secure it is as you are relying more on routines written by others, which can be subject to more subtle bugs. Or to put it another way, I've never had a box pop up on my computer informing me that my version of C is out of date & needs patching to fix security issues... – John U Apr 14 '14 at 18:56
  • 3
    But you had that kind of box about the "standard C library"... which is formally part of the language. Similarly, you never had a popup about the _Java compiler_ being insecure, only the _runtime support code_. – Tom Leek Apr 14 '14 at 18:58
  • I think one could make a conforming safe C even with freeing memory, but it would require significant runtime overhead, and probably significant memory overhead as well. (WRT `free`, the answer would revolve around inserting pointer checks before pointers are dereferenced to ensure the range is valid, and to signal/abort if it's invalid.) – Mooing Duck Apr 14 '14 at 21:33
  • 3
    @JohnU I've also never had a box pop up and inform me that code I have written myself needs patching to fix security issues. In other words, that box doesn't mean, "Wow, another instance of my platform being unsafe!" It means, "Hey, part of my code just got fixed, and _I didn't even have to do anything_!" – Two-Bit Alchemist Apr 14 '14 at 21:39
  • @MooingDuck: Consider a C program which allocates a pointer to a gig of memory, puts some stuff there, shows a 16-digit hex representation on the screen, frees it, and repeats that process 2,000 times/second. If someone enters a 16-digit number that has been previously displayed, the program shows the data it put in that space. If someone only copies down the address of the 517th pointer displayed, all of the allocations other than the 517th may be freed, but there's no way a program could possibly know that. – supercat Apr 14 '14 at 21:55
  • 2
    @JohnU I don't know about flaming for it, but I do think your assertion that "written by others makes it more subject to subtle bugs" is flat out wrong. Code written by experts in the field that code pertains to is orders of magnitude less likely to have subtle bugs than stuff you wrote yourself. It's why you hear things like "Don't write your own crypto" "Don't reimplement standard library data structures". – corsiKa Apr 14 '14 at 21:58
  • @MooingDuck: For a language to be secure, it must either forbid conversions between at least some kinds of pointers and integers, or else it must overtly declare that nothing important should be stored in areas of memory accessible to such pointers. – supercat Apr 14 '14 at 21:59
  • @supercat: Your example makes little sense, as it relies on undefined behavior. What that means is the program _should not work_ period. My suggestion was that with some overhead, one could make a conforming C compiler that would signal/abort when you tried to do any undefined behavior such as you describe. (If I misunderstood, you may have to clarify your example) – Mooing Duck Apr 14 '14 at 22:07
  • @MooingDuck: If one were to get rid of the `free` statements and abandon the pointers without freeing them, then the behavior of the program would be perfectly defined, but it would never be able to reclaim any memory. If one includes the `free`, then the behavior is undefined, but a secure language should affirmatively promise to trap errant behavior rather than allow it to cause nasal demons--a bit hard to do if memory can get recycled while references exist. – supercat Apr 14 '14 at 22:14
  • @supercat: hard certainly. But not impossible. In fact, many languages do it. – Mooing Duck Apr 14 '14 at 22:29
  • @MooingDuck: The ability to convert a sequence of numbers to a legitimate pointer makes security essentially impossible, since there's no way of ensuring that a pointer produced by picking numbers out of a hat won't alias some existing object. – supercat Apr 14 '14 at 22:41
  • C is younger than some secure languages. – Kaz Apr 14 '14 at 22:42
  • @supercat: If it doesn't alias any existing object, it should segfault. If it aliases an object of the correct type, that's desired behavior. All you have to do is affirm that the pointer is pointing at an actual object of the correct type, which is hard, but not impossible. (absurdly and impractically hard, to be sure) – Mooing Duck Apr 14 '14 at 22:49
  • @MooingDuck: The only thing that should alias a pointer to an existing object is a pointer that was somehow derived from that object. A pointer derived from numbers drawn out of a hat should never alias to anything. I suppose if one used a large enough pointer type, one could have each pointer encode a sequence number, base address, offset, and cryptographic checksum so that arbitrarily-generated byte sequences would be recognizable as bogus, and no pointers would ever get reused. If the compiler required that the base address for each pointer contain a pointer to the last address... – supercat Apr 14 '14 at 23:03
  • ...in the same allocated object, then recycled pointers would be recognizable as such. The resulting pointer objects would seem a bit unwieldy, however. Having an "object identifier" type could make things much more efficient. – supercat Apr 14 '14 at 23:05
  • Proof it's hard to fully cover a subject in a comments box ;) My meaning was simply that some languages put you more in control at a lower level than others, you can use that for good or evil. You have the _choice_ to import a known & trusted library, or to roll your own if you know better. When you don't have that control, you have no choice but to rely on the work of others and hope they got all the bugs out. Some cash machines run Windows XP, a better OS than most of us could write on our own, but that doesn't make it a brilliant idea for that application. – John U Apr 15 '14 at 08:02
  • C is dangerous, but that doesn't mean it shouldn't be used. C is for super speed and/or low-level hackery. C is not the optimal language for security and most C programmers would be the first to admit that. All good programmers know - use the right tool for the right job. – Pharap Apr 15 '14 at 15:44
  • 1
    @supercat @ TomLeek Memory safe C compilers are *not* only theoretical (although they are admittedly more in the class of proof of concepts and disallow the int to pointer casting you have identified as problematic, supercat). In this talk Andreas Bogk explains his LLVM plugin: [Bug class genocide](https://www.youtube.com/watch?v=2ybcByjNlq8) [pdf](http://blog.andreas.org/static/30c3-buffer-overflows.pdf). It only introduces a runtime overhead of 100%. – Perseids Apr 15 '14 at 16:16
  • @Perseids: Does it allow for functions like qsort that can operate on arbitrary kinds of data, or does it require that any operations involving structure copying and manipulation must have compile-time knowledge of the structures being manipulated? – supercat Apr 15 '14 at 16:20
  • @supercat I haven't watched the whole talk recently, but as far as I remember casting to void* should not pose a problem. You mustn't lose the information that it is a pointer though. So casting char*->int->char* is not possible (afair). – Perseids Apr 15 '14 at 20:33
  • @Perseids: If a void* can be cast to a double-indirect pointer or a pointer to a structure containing a pointer, that will allow an arbitrary sequence of numbers stored in a char[] to be interpreted as a pointer. – supercat Apr 15 '14 at 20:42
  • @supercat: Haven't thought about that, right. Does it often happen in normal C code? Also - thinking about their design - they use a shadow map that maps a pointer to its bounds. So it might actually be okay to lose the information that this was a pointer, as with false addresses you end up with an index error in the lookup. – Perseids Apr 15 '14 at 21:02
  • @Perseids: C code commonly needs to store structures containing pointers into memory received via `malloc()` or `calloc()`. There's no mechanism to ensure that a particular allocation won't get cast to different pointer types, some of which have pointers in places that others have numbers. – supercat Apr 15 '14 at 21:10
55

The Ada language is designed to prevent common programming errors as much as possible and is used in critical systems where a system bug might have catastrophic consequences.

A few examples where Ada goes beyond the typical built-in security provided by other modern languages:

  • Integer range type allows specifying an allowed range for an integer. Any value outside of this range will throw an exception (in languages that do not support a range type, a manual check would have to be performed).

  • := for assignment, = for equality checks. This avoids a common pitfall in languages that use = for assignment and == for equality: accidentally assigning when an equality check was meant (in Ada, an accidental assignment would not compile).

  • in and out parameters that specify whether a method parameter can be read or written

  • avoids problems with statement group indentation levels (e.g. the recent Apple SSL bug) due to the use of the end keyword

  • contracts (since Ada 2012, and previously in the SPARK subset) allow methods to specify preconditions and postconditions that must be satisfied
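
For readers who don't know Ada, the integer range type from the first bullet looks roughly like the "manual check" below when emulated in a language without it. This is a Python sketch, not Ada: `RangedInt` is a made-up class, and where Ada raises `Constraint_Error` this sketch raises `ValueError`.

```python
class RangedInt:
    """Integer constrained to [low, high]; checked on every construction."""

    def __init__(self, value, low, high):
        if not low <= value <= high:
            raise ValueError(f"{value} is outside {low}..{high}")
        self.value, self.low, self.high = value, low, high

    def __add__(self, other):
        # The result is re-checked, so arithmetic cannot leave the range.
        return RangedInt(self.value + int(other), self.low, self.high)

    def __int__(self):
        return self.value


percent = RangedInt(95, 0, 100)
full = percent + 5   # still in range
# percent + 6 would raise ValueError
```

The difference is that in Ada the range is part of the type declaration, so nobody has to remember to route every value through a wrapper like this.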

There are more examples of how Ada was designed for security provided in the Safe and Secure Booklet (PDF).

Of course, many of these issues can be mitigated through proper coding style, code review, unit tests, etc. but having them done at the language level means that you get it for free.
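
As an illustration of what "mitigating through coding style" means for contracts, here is a hand-rolled precondition/postcondition decorator in Python. It is a sketch under assumptions: `contract`, `ContractViolation` and `largest_magnitude` are invented names, and the checks run only at call time, whereas Ada 2012 contracts are part of the language and SPARK can attempt to prove them statically.

```python
import functools


class ContractViolation(Exception):
    """Raised when a precondition or postcondition does not hold."""


def contract(pre=None, post=None):
    """Attach an optional precondition and postcondition to a function."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ContractViolation(f"precondition of {func.__name__} violated")
            result = func(*args, **kwargs)
            if post is not None and not post(result):
                raise ContractViolation(f"postcondition of {func.__name__} violated")
            return result
        return wrapper
    return decorate


@contract(pre=lambda items: len(items) > 0, post=lambda result: result >= 0)
def largest_magnitude(items):
    return max(abs(x) for x in items)
```

Calling `largest_magnitude([])` fails the precondition before `max` ever sees the empty list.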

It is also worth adding that despite the fact that a language designed for security such as Ada removes many classes of bugs, there is still nothing stopping you from introducing business logic bugs that the language doesn't know anything about.

Pixel Elephant
  • 1
    +1 for referencing the Safe and Secure Booklet – TruthOf42 Apr 14 '14 at 19:28
  • 26
    And yet, all of Ada's safety didn't prevent the Ariane 5 failure. – Robert Harvey Apr 14 '14 at 20:45
  • 9
    This is a nitpick but it's not having two different operators for assignment/comparison that allows Ada to do that. After all, `=` and `==` are different, too. Python distinguishes `=` (assignment) and `==` (comparison), and "won't compile" (`SyntaxError` exception) when they are misused. I think the main point of using `:=`/`=` rather than `=`/`==` is to prevent typos. – Two-Bit Alchemist Apr 14 '14 at 21:35
  • @Two-BitAlchemist I'm not totally sure which case you are referring to, but `foo == "Bar"` doesn't produce a `SyntaxError`. – Chris Down Apr 15 '14 at 03:56
  • 3
    @ChrisDown Of course not. That's simple comparison. `if 'a' = 'b':` on the other hand... – Two-Bit Alchemist Apr 15 '14 at 13:17
  • @RobertHarvey http://www.adapower.com/index.php?Command=Class&ClassID=FAQ&CID=328 - not to segue, but neither ADA nor the programming was at fault; it was a hardware design fault - i.e. the system was never designed for the Ariane 5 flightpath – TruthOf42 Apr 15 '14 at 14:59
  • @TruthOf42: The Ariane 5 failed because a 16 bit numeric value overflowed. The programmers overrode safety mechanisms in ADA that would have prevented that, because they believed that the number would *never* overflow. – Robert Harvey Apr 15 '14 at 15:18
  • 2
    @RobertHarvey Doesn't matter, the engineers used a hardware system (happened to use ADA), designed for the Ariane 4 flightpath, where it worked just fine, and then tried to use it for something it was NOT designed for (Ariane 5 flightpath) - also the fact that they had to override anything just proves more to the point that they were aware of what they were doing and had to take active steps (because of the language) to do bad things – TruthOf42 Apr 15 '14 at 15:22
  • @TruthOf42 I guess I'm merely pointing out that type safety is, to a certain degree, a crutch. Unless you know what you're doing, you can still make a very expensive rocket assume the aerodynamic properties of a toaster. – Robert Harvey Apr 15 '14 at 15:25
  • @RobertHarvey I'm not advocating that if you just use the right language all your problems go away, but that it makes sense to use a particular type of language for particular types of situations, security being one of those situations – TruthOf42 Apr 15 '14 at 15:30
  • Using = for assignment and == for equality comparison is not a pitfall. Most programmers are used to using == for equality checks and = for assignment. The end keyword is also not unique, most languages have a way of defining the start and end of a block, most use either curly braces or end (i.e. Lua). The other three are completely valid (though in/out is debatable), but those two aren't really unique to Ada. – Pharap Apr 15 '14 at 15:39
  • For the only ADA project I've ever seen, all the built-in checking was used for debug, testing, and integration, but when the product shipped, **all those built-in checks were turned off** because they were "too slow" for end-user usage. Talk about testing how you run... – kmort Apr 16 '14 at 20:29
  • RobertHarvey says they turned off a safety-check, shit happened, and therefore the language's safety goals are BS. Lol. I think the Ariane failure says more about the importance of good requirements analysis and proper tool use than anything else. Meanwhile, Ada + SPARK are still rocking high integrity development plus showed it was a good investment for future-proofing software. – Nick P Jan 11 '15 at 17:59
16

Most programming languages higher level than C are much more secure when it comes to programming errors like Heartbleed's. Examples that primarily compile to machine code include D, Rust and Ada. It's not interesting to talk about just memory safety, in my opinion.

Here is a list of additional programming language features that (I think) make it much harder to write unsafe code. The first five features expand the compiler's capabilities in reasoning about your code, so you, a human being prone to error making, don't have to*. In addition, these features also should make it easier for a fellow human being, an auditor, to reason about your code. OpenSSL's source code is often described as a mess and a language stricter than C could have helped to make it easier to reason about. The last two features are about context issues that affect security as well.

  • A strict type system: Makes it easier to reason about program correctness. Eliminates certain input attacks.
  • Immutable by default: having immutable values as the primary data container means it is much easier to reason about the state of your program.
  • Disabled or restricted unsafety: Don't allow scary things such as pointer arithmetic (e.g. Go), or, at least only allow it if wrapped in big fat warnings (Rust). Note that a language lacking in pointer arithmetic completely is excluded for use in a huge number of applications that require low level access.
  • Compile time taint checking: expand the type system to allow identifying tainted values: values that depend in some way on input. The compiler could then (conditionally) forbid operations with a tainted value that leak information to outside observers, such as branching on such a value. This could prevent or at least mitigate certain classes of timing attacks. As far as I know, these checks are only available in static code analysis tools, and not in compilers themselves.
  • Dependent types: dependent types are a means to tell the compiler that "here is an Int whose values are between 2 and 87" or "here is a String of maximum length 12 containing only alphanumeric characters". Failure to meet these requirements results in compilation failure, and not a runtime failure with likely unsecure results. This feature is available in Idris and some theorem prover languages.
  • Absence of garbage collection: Garbage collection is a big problem for language safety - it creates garbage collection pauses in your program. These pauses leak information about the state of your program and allow timing attacks to happen. For a developer, however, it is impossible (or at best incredibly hard) to predict when the garbage collector will be invoked, and the timing can change hugely with even the smallest code change.
  • Performance, portability & interoperability: It may be fine if you need a secure and slow program that only runs on the PowerPC platform, but don't expect anyone else to use it for a cross-platform TLS library. OpenSSL is popular precisely because it's fast and runs everywhere from obscure MIPS-based routers to massively parallel SPARC servers and everything in between. Furthermore any program or runtime in the world can interface with OpenSSL as a library because it uses C calling conventions.
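
To give a feel for the taint-checking idea in the list above, here is a runtime approximation in Python. It is only a sketch: the feature described is a compile-time check, which Python cannot express, and `Tainted` with its `sanitize` method is a hypothetical wrapper, not an existing library.

```python
class Tainted:
    """Wraps a value that depends on untrusted input."""

    def __init__(self, value):
        self._value = value

    def sanitize(self, validator):
        """Hand out the raw value only if `validator` accepts it."""
        if not validator(self._value):
            raise ValueError("tainted value failed validation")
        return self._value

    def __bool__(self):
        # Branching on a tainted value is the operation being forbidden.
        raise TypeError("cannot branch on a tainted value")

    def __repr__(self):
        return "Tainted(<hidden>)"


claimed_length = Tainted(65535)  # e.g. a length field read from the network
# `if claimed_length:` would raise TypeError
# claimed_length.sanitize(lambda n: 0 <= n <= 16) would raise ValueError
```

A real implementation would reject the offending branch when the program is built, instead of raising when it runs.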

From my limited knowledge of languages, no language does all of these. Rust is an example of a language that covers many - it is strict, immutable by default, has restricted unsafety, does not require garbage collection and is quite performant and portable. Compile time taint checking and dependent types presently appear to be exotic features that require either additional static code analysis tools or new languages, unfortunately.
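
"Immutable by default" is easy to show even in a language where it is opt-in. In the Python sketch below, a frozen dataclass rejects in-place changes, so any change of state has to be an explicit new value. `Session` and its fields are invented for the example; in Rust the same effect is the default, and mutation is what requires the annotation.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class Session:
    user: str
    is_admin: bool = False


session = Session("alice")

# State changes produce a new value; the original stays untouched.
elevated = replace(session, is_admin=True)

try:
    session.is_admin = True  # in-place mutation is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```

An auditor reading code that receives `session` can rely on it not being flipped to admin somewhere else behind their back.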

* See also: formal verification

DCKing
  • That's a good list. Dependent types are a big deal. Haskell, with appropriate extensions (like Type Families) covers most of the list. The execution model is not so great for preventing timing attacks, though. Lazy evaluation is a "mirror image" of garbage collection. – nomen Apr 16 '14 at 05:44
  • Not all garbage collectors are stop-the-world types. There are incremental and concurrent garbage collectors. – Doval Apr 16 '14 at 11:47
  • My garbage collection comment is the most relevant example of creating unpredictability in your programs. It is unwise to create behaviour in your program that is nondeterministic or unpredictable for you as a developer. The characteristics of this behaviour might give an attacker too much information. You can make your program unpredictable with garbage collection (whichever type of GC you use), lazy evaluation, concurrency and more things. In most garbage collected languages it is in fact unavoidable, which is bad. – DCKing Apr 16 '14 at 12:43
  • The garbage collector comment is unfair because GC languages prevent huge amounts of common vulnerabilities while side-channel attacks due to GC's are almost non-existent in real world. I've discouraged GC's for crypto or anonymity systems due to that weakness but they solve a lot of problems in regular apps. Not to mention you get plenty of solutions to problems you mention typing real-time garbage collection into Google incl commercial Java products. – Nick P Jan 11 '15 at 17:56
  • 1
    @NickP No. *Memory safety* is what prevents huge amounts of common vulnerabilities, not garbage collection. It just so happens that many garbage collected languages are memory safe: but garbage collection does not necessarily imply memory safety, nor does memory safety necessarily imply garbage collection. Furthermore 'realtime garbage collection' is not a solution to any problem I mentioned, as even realtime garbage collectors create complex unpredictability within your programs. I agree that memory safe garbage collected languages are preferable over non-safe, non-GC'ed languages. – DCKing Jan 15 '15 at 15:54
  • Good point on memory safety. Wrong on realtime garbage collection. The whole point is, regardless of their technique, they leave execution predictable. Most are not tweaked for covert channel mitigation because most developers don't give a shit about that. You can tweak them to mitigate timing channels. I just made sure secret-driven operations always concluded in a fixed amount of time. If I wanted more, I could use priority-aware, asynchronous execution. Measurements show it works and an academic even recently built a processor to do same thing. Solution is there but not applied. Common. – Nick P Jan 22 '15 at 19:43
  • As I said, though, I discourage use of even real-time GC's if covert channels *really* matter. They usually only matter at interface, though, whose timing can be controlled even with RT GC. Can and has. – Nick P Jan 22 '15 at 19:44
  • @NickP Alright - I would say that garbage collection is somewhat of a nitpick. Even though you can control its unpredictability, it *is* an additional covert channel to worry about albeit a minor one. As an additional point, garbage collection is quite an obstruction for the performance and portability point: nobody is going to write a security library in a GC'ed language with mostly slower performance and a large associated runtime that must be loaded for its use. So GC'ed languages are of more limited use. – DCKing Mar 06 '15 at 16:24
  • I agree it's still a covert channel: said as much about Freenet using Java. Developers of system software will often avoid it for reasons you mention. All true. Good news is exceptions are slowly increasing. – Nick P Mar 13 '15 at 21:53
6

In the general spirit of what you're asking, I think the E language (the "secure distributed pure-object platform and p2p scripting language") is pretty interesting, in that it is attempting to securely offer features/computation models not generally available.

moe
  • 61
  • 1
  • 2
    could you provide a bit more detail ala the ada answer? thanks! – lofidevops Apr 15 '14 at 10:29
  • E is a language that is safe by design against many things, combined with the object-capability security model. That model can be used to implement many types of security policies and systems in a way that supports POLA. It also has support for distributed computing. Its backers, esp. Combex, have thrown together a secure chat in (100 lines?), made a secure browser, and made a secure desktop prototype immune to many problems Windows has. So, I'd say it more than qualifies as a secure language. The implementation is where the dragons will be, esp. reliance on Java. – Nick P Jan 11 '15 at 18:04
5

All current (meaning still updated) programming languages are designed to have as few inherent security flaws as possible, but at the end of the day it's (almost always) the programmer who is responsible for security flaws, not the language he's using.

EDIT: As @DCKing pointed out, not all languages are equal, and I'm not saying it's a good idea to pick one at random and try to make it work. I am saying that a (very) talented C programmer can make a program just as secure as a semantically identical program written in a higher level language. My point is that we should recognize that some languages make it easier to make mistakes, but also know that in the end it's the programmer's mistake, not the language's (with few exceptions).

KnightOfNi
  • 2,267
  • 3
  • 19
  • 23
  • 4
    That's not what's asked, though. – Ven Apr 14 '14 at 15:14
  • 4
    @user1737909 It answers the question. All programming languages are designed not to be inherently insecure, but that doesn't matter because it's how the programmer uses them that determines how secure a system is. – KnightOfNi Apr 14 '14 at 16:03
  • 7
    @KnightOfNi I feel like this answer is equivalent to saying that a castle is only as secure as the soldiers defending it, which is partly true, but there's a reason why castles were designed the way they were, so it's easier to defend and less work has to be done by soldiers – TruthOf42 Apr 14 '14 at 19:46
  • 1
    @TruthOf42 Well, "a programming language in which the programmer does less work" is just a higher level programming language. Continuing with your castle analogy, the question is "what type of rock is best for my castle walls?" The answer is that unless you're using chalk, no one is going to go for your walls (in a majority of cases), they'll exploit the fact that you (the architect) forgot to add a door to the castle or left in a secret passage. It's almost always systems that are exploited, not languages themselves (with, obviously, a few exceptions). – KnightOfNi Apr 14 '14 at 20:36
  • In my analogy the castle is the language, and the soldiers are the program. If you design a castle with a very large entryway it's harder to defend, but if you design a castle where only one person can enter at a time it makes it easier to implement a policy that ensures the person entering is authorized. What I'm trying to get at is that you can design a language in such a way that it ENCOURAGES good security practices. – TruthOf42 Apr 14 '14 at 20:46
  • @TruthOf42 I know. I changed it to show my point. I think the castle represents the entire system/program better because it has so many different things that add up to its being secure against enemies, each of which is a function. The things that make the functions secure (the skill and integrity of the guards, having a door, etc) are the elements of the language. Doing this backwards, the way you did, seems counter-intuitive to me, so I changed it around to better make my point. – KnightOfNi Apr 14 '14 at 22:12
  • It is easier to write insecure programs in C and PHP than it is to write them in Ada or Scala. Don't pretend the choice of language is arbitrary and everything is down to the programmer. Some tools are better than others. – DCKing Apr 15 '14 at 17:35
  • @DCKing I never said that, and that's not the question. The question is "[are there] languages whose main purpose [is] to lend [themselves] to security as much as possible and reasonable?" My answer was that few, if any, languages are designed to be INsecure, but that the security of a code is the responsibility of the programmer. Who said anything about all languages being equal? – KnightOfNi Apr 15 '14 at 19:56
  • @KnightOfNi To clarify, I did not necessarily read that in your answer. Your answer may be interpreted to imply that, however, and I felt that a comment was needed to make sure people didn't read it that way. It's meant as an addition, not as a correction :) – DCKing Apr 15 '14 at 20:08
  • @DCKing Oh, OK. Thanks for letting me know. I'll reference it in my answer. – KnightOfNi Apr 15 '14 at 20:09
  • I will still continue to disagree (and agree with @TruthOf42). Let's say, for example, that a language provides a good way to avoid dealing with strings, maybe an XML literal library that auto-escapes correctly depending on where in the XML AST you are (node, etc.). That would help tremendously. – Ven Apr 16 '14 at 18:10
  • @user1737909 It would indeed help tremendously. What's your point? – KnightOfNi Apr 16 '14 at 23:52
  • My point is : this is what was the question. Languages with such features. – Ven Apr 17 '14 at 09:57
  • @user1737909 I won't pretend to understand what you just said, but the question WASN'T "are some languages better/easier than others for security?" It WAS "are any languages specifically designed to be as secure as is reasonably possible?" – KnightOfNi Apr 17 '14 at 13:28
4

There is no such thing as a secure language. Whether a language provides enough security for your problem depends a lot on the problem you are trying to solve. If you are writing a web application, for example, the security of most languages used in this context (e.g. Java, PHP, JavaScript... add your favorite) is enough to prevent things like buffer overflows, but even the more strongly typed languages don't offer inherent support for web-specific things, such as making it impossible or at least hard to introduce Cross-Site-Scripting bugs or similar. And no language will protect you against a bad trust model, like trusting DNS servers (DNS rebinding etc.), the current PKI model, or including third-party (i.e. out of your control) scripts in your web application (typically ads or Google Analytics).
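
To make concrete what "inherent support" for escaping could look like, here is a minimal sketch in Rust. It is an illustration only: the `Html` type and its methods are hypothetical names, not part of any standard library or framework, and the escaping covers just one context (text placed inside an HTML element). Attributes, URLs, JavaScript and CSS each need their own rules.

```rust
/// Hypothetical wrapper for text that is already safe to place in HTML element content.
#[derive(Debug, PartialEq)]
struct Html(String);

impl Html {
    /// Escapes untrusted text for use in HTML element content.
    fn from_text(untrusted: &str) -> Html {
        let mut out = String::with_capacity(untrusted.len());
        for c in untrusted.chars() {
            match c {
                '&' => out.push_str("&amp;"),
                '<' => out.push_str("&lt;"),
                '>' => out.push_str("&gt;"),
                '"' => out.push_str("&quot;"),
                '\'' => out.push_str("&#x27;"),
                other => out.push(other),
            }
        }
        Html(out)
    }

    /// Wraps already-escaped content in a paragraph element.
    fn paragraph(body: Html) -> Html {
        Html(format!("<p>{}</p>", body.0))
    }
}

fn main() {
    let comment = "<script>alert(1)</script>";
    // Html::paragraph(comment); // rejected by the compiler: expected `Html`, found `&str`
    let page = Html::paragraph(Html::from_text(comment));
    println!("{}", page.0);
}
```

The type system only stops raw strings from reaching `paragraph` by accident. Whether the escaping itself is right for every context is still up to whoever writes `from_text`.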

So the choice of a proper language might help you a bit, but there is no magic security sword.

Steffen Ullrich
  • 190,458
  • 29
  • 381
  • 434
  • +1 for "magic security sword." And all of the other stuff too :) – KnightOfNi Apr 14 '14 at 23:43
  • There is no language that is "secure" because "secure" is never properly defined (and not definable). There *are* languages that are *more* secure than other languages, however, given some threat model, and I think that's what this question is about. – DCKing Apr 15 '14 at 17:30
  • By the way, what do you mean with "more strongly typed languages don't offer inherent support for web specific things"? I fail to see how strongly typed languages don't offer that support or even how they are worse at supporting those features! – DCKing Apr 15 '14 at 17:32
  • They are not worse, but they don't provide inherent support for the various kinds of strings (HTML, JavaScript, CSS, URL...) or make sure that everything gets escaped in the right way if you concatenate different types of strings (the rules are tricky and sometimes browser dependent). So right now most developers do it by hand or with some help from a toolkit, but often do it wrong and cause XSS attacks. You can probably implement it, but right now there is at least no inherent support in the major languages used for web development. – Steffen Ullrich Apr 15 '14 at 17:57
  • @SteffenUllrich: look up Yesod. Oh, wait, you're moving the goalposts. "Major languages used for web development" don't have strong type systems. – nomen Apr 16 '14 at 05:40
  • Yesod is a framework you have to use and not a language. And while Haskell might be a nice language, I doubt that anybody is using it for serious web development. Yes, I know there is interesting research on better inherent security, but it does not help if the results don't get used for serious stuff :( . Unfortunately you have to work with what developers currently know and can't expect them to jump on the next shiny thing, at least if the learning curve is too steep. – Steffen Ullrich Apr 16 '14 at 05:59
  • Lots of people write 'serious' web apps in Haskell, Scala and F#, all of which are very strongly typed and esoteric looking for the casual PHP programmer. I'll have you know that I'm more productive and capable in writing web apps in Scala than in PHP, so to each their own. The funny thing about your comment is that the necessity of a framework to do things makes languages somehow less capable for serious web development. I'll challenge you to find any secure serious web application that does *not* use a framework (albeit self made). – DCKing Apr 16 '14 at 13:12
  • Maybe you are right and haskell, scala etc provide secure frameworks and are used enough to be noticeable. I don't doubt that you can be very productive with them. But, I see that not even companies like ebay, google, facebook etc which definitely have very smart programmers are immune to typical web 2.0 problems like XSS or CSRF. So I guess the frameworks they use are not secure enough yet. And, the original question asked about secure languages, not secure frameworks. – Steffen Ullrich Apr 16 '14 at 17:37
  • That's not correct at all: Opa and Ur/Web are languages designed to be immune to many typical web 2.0 issues. So they exist. People just don't use them much. – Nick P Jan 11 '15 at 17:45
4

Remember that for most programming languages, you have to worry about the security of two languages. There's the language you're actually using, and then there's the language that the compiler or interpreter is written in, which is often different. (Technically, there's a third, which is the microcode of the CPU itself.) A security issue in either of those languages can make your program insecure.

Mike Scott
  • 10,134
  • 1
  • 28
  • 35
  • 1
    I thought all the cool kids were supposed to write the compiler of their language in that language. – Superbest Apr 15 '14 at 06:19
  • Most native code compiled languages are self-hosting (implemented in themselves). However, most popular languages these days are hosted in a VM which makes that impossible. You can still write a source-to-bytecode compiler in that language (e.g. javac) but the VM that interprets the bytecode has to be native code. – Nate C-K Apr 15 '14 at 14:27
  • 1
    However, I'd much rather worry about buffer overflows only in the code of the JVM itself instead of worrying about them in every single library my code depends on. – Nate C-K Apr 15 '14 at 14:29
  • Good point. Hence the importance of all the "certifying" compilation, interpretation, and memory management research. I think co-simulation is one of the better solutions. You create a model of how the source and target language do things. Run data through each one to obtain execution traces. Compare properties of traces to ensure they match in function and risk. One project is using this technique successfully for a C to MIPS optimizing compiler. – Nick P Jan 11 '15 at 17:47
3

Firstly, you do not actually write programs in programming languages. You write instructions for the compiler which describe what kind of program you want, and the compiler produces a program, in its own peculiar way, which will hopefully (if your compiler is well-designed) do the same thing that your source code describes. All programs, when they are running, are in "machine language" - they are a series of numbers that are interpreted in a certain way when loaded into RAM and fed into the CPU. Machine language is not designed with robustness to hacking in mind, so no language that is compiled can be truly "resistant" to hacking, because the actual program will be in machine language anyway. Any interpreted or VM language will still run in a native framework which is compiled ultimately to machine language, so the problem still persists.

Second, most real languages are Turing complete. This means that any task that can be accomplished by one of them can be accomplished by all. Therefore, you cannot make "hacking" impossible (if hacking means writing malicious programs); it would break the Turing completeness.

It's worth clarifying at this point what you mean by hacking. Since you mention Heartbleed, I imagine you don't mean it in Stallman's sense ("playful tinkering").

If you mean people who write programs that directly access the memory and steal data, or modify other programs (such as viruses or keyloggers) then this is not a problem a language can really deal with. A compiler can help, by having an additional function to produce obfuscated machine code when compiling, but ultimately it's still possible for a skillful memory hacker to find his way around. The solution to this problem is OS design: An operating system should sandbox programs, and not allow one program to mess with memory that belongs to another program. This is part of what UAC in Windows does (although Sandboxie is a better example).

There is a caveat here: Some languages, like C# or Java, have features (more correctly, the compiler and the VM that the programs run inside have features) that check whether any program is trying to muck about in another program's memory, and when this happens throw errors like IllegalAccessException (for example, keylogger.exe should not be able to read the Credit_card_number value from internet banking application.exe). Of course, this requires keeping track of what memory belongs to what program, which has some non-trivial performance and effort cost. Some "simpler" languages like C don't have it - this is why a lot of hacks like viruses are written in C. Nowadays you have to be clever about evading UAC, but back in the days of Windows 98 people could do all sorts of crazy things to your computer/OS by reading and writing to memory they weren't supposed to. Note that even in C# you still have the option of using normal, C-like pointers (which the language calls unsafe and requires you to mark as such in the code) if you want - although the CLR will probably contain your hack within itself, unless you find a security hole in the CLR that lets you tunnel out into the rest of the memory.

The second kind of hacking is exploiting a bug in an existing program. This is the category Heartbleed belongs to. With this, the question is whether the programmer makes a mistake or not. Obviously, if your language is something like Brainfuck or Perl that is very difficult to read, it is likely that you will make mistakes. If it is a language with many "gotchas" like C++ (see the "classic" if (i=1) vs. if (i==1), or the C obfuscation contest) then it may be difficult to catch mistakes. In this sense, designing for security is really just a trivial special case of designing to minimize programmer error.

Note that the Heartbleed bug, whether deliberate sabotage or honest mistake, was a problem with the algorithm used (the author "forgot" to check the size) - so no compiler short of an AI as intelligent as a very smart human could possibly hope to detect it; although the resulting access violation conceivably could have been caught with some clever memory management.
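
As a rough illustration of the "clever memory management" case, here is a sketch in Rust of a heartbeat-style echo. It is not OpenSSL's code. The function name `heartbeat_reply` and its parameters are hypothetical, and the protocol is reduced to "echo back `claimed_len` bytes of the payload you received". The assumption is a language where a buffer carries its own length, so a request for more bytes than were received cannot silently read neighbouring memory.

```rust
/// Builds a heartbeat reply by echoing `claimed_len` bytes of the request payload.
/// Returns `None` when the claimed length exceeds the bytes actually received.
fn heartbeat_reply(payload: &[u8], claimed_len: usize) -> Option<Vec<u8>> {
    // `get` performs the bounds check that a raw memcpy would skip.
    payload.get(..claimed_len).map(|bytes| bytes.to_vec())
}

fn main() {
    let payload = b"bird";
    // Honest request: the claimed length matches the payload.
    println!("{:?}", heartbeat_reply(payload, 4));
    // Heartbleed-style request: the claimed length is far larger than the payload.
    println!("{:?}", heartbeat_reply(payload, 64_000));
}
```

Writing `&payload[..claimed_len]` instead would still not leak memory, but it would panic on the oversized request, which trades an information leak for a possible denial of service. The missing check remains a logic error either way; the language only limits what the error can do.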

In conclusion

There are two sorts of concerns with regard to hacking:

  1. A program has been programmed erroneously, and allows you to do things that you shouldn't. Eg. Gmail server lets everyone see your emails, instead of requiring them to enter the correct username and password first, because someone made an error when developing the server software. Includes bugs, vulnerabilities, etc.

  2. A program is manipulated by the hacker's malicious program. Includes viruses, keyloggers and other malware.

(1) can be fixed by making a language more strict and explicit, so that detecting errors is easier, but ultimately only very simple errors can be detected by automated tools, and as for "tripwires" like Ada's range checking, it can be argued that recognizing the possibility of an error is necessary for you to think of adding the check in the first place, and recognizing the possibility is already the hardest part.

(2) cannot be fixed by changing the language. If you make a language in which it is very difficult to write nefarious applications hackers will simply use another language, and will have no added difficulty manipulating programs written in your language because they are ultimately run as machine code anyway. It can be fixed by making an OS that very vigilantly polices programs running in it, but then it becomes a question of (1) type problems in the source code of the OS.

Superbest
  • 1,104
  • 8
  • 21
  • 2
    Turing completeness basically means a language can *emulate* any other computer (=> program => compiler => language feature) in existence. It doesn't mean a feature you're emulating can ever be as powerful, or as exploitable, as if it were native to the language. Java being a prime example -- you could *emulate* pointers using an array and int "pointers", but you'd still have to intentionally write code to allow a buffer to overflow. And it wouldn't affect anything outside the array. – cHao Apr 15 '14 at 13:09
  • 2
    First sentence is wrong. You do write programs in a programming language. The compiler translates that program into the same program in a different language. – Nate C-K Apr 15 '14 at 14:18
  • 2
    This answer completely misunderstands what programming languages are about. Higher level programming languages exist to make the generated machine code *better* – DCKing Apr 15 '14 at 17:22
  • 1
    So many bad claims. Higher level programming languages exist to make the generated machine code *better* in some way - e.g. more secure. If your higher level language guarantees that your generated assembly doesn't allow for certain insecure conditions at runtime, your language is more secure. Secondly, as pointed out, you completely misunderstand what Turing completeness entails. Just because Turing completeness is there for a developer, it does not mean it's there for an attacker. Lastly, local access to binaries is not part of the threat model for most attacks, including those on OpenSSL. – DCKing Apr 15 '14 at 17:27
  • I'll admit that some of my claims aren't exactly true - I deliberately wrote them in this way to keep things simple and concise. I think it is easy to see which ones, and it is easy to find out the correct version. However, I think the points I made are valid: For instance, Turing-completeness is a relevant concern if you would like to forbid certain kinds of programs by design. It depends a bit on how you define hacking: If hacking is deliberately (by tampering) or accidentally (bugs) creating a dangerous program, then does Turing completeness not guarantee the possibility of such a program? – Superbest Apr 15 '14 at 18:39
  • Also consider the "possible program space" of your source code vs. the compiled assembly. Even if you designed your language to restrict this space, the restriction does not exist for assembly, so malware that tampers with memory can still make your program do things that your language was designed to disallow. – Superbest Apr 15 '14 at 18:43
  • @DCKing Edits are welcome if you think you can improve the answer, but I'll stand by the first sentence. I think fundamentally, high level languages came to be to save the programmer the bother of remembering machine instructions and messing around with a million gotos. That the compiler is now in a position to do automatic optimizations and enhance security is only a secondary benefit. The language does not exist to make generated code better: Sometimes, people may even accept slower (worse?) code because the high level language makes their job easier. – Superbest Apr 15 '14 at 18:46
  • Lastly, consider security of program (implementation) vs. security of algorithm. It should be clear why problems with the latter cannot be solved by language design. – Superbest Apr 15 '14 at 18:48
  • Turing completeness has nothing to do with this. You can conceive malware for regexes or other non-Turing-complete systems. Turing machines define an equivalence class about the theoretical computability of algorithms. It is not an equivalence class of security, which is a real-world practical concept. In fact, explain to me how you would go about 'exploiting' the classical single-taped Turing machine. It is Turing complete, so it must be exploitable, right? If you're using a term such as Turing machine in a context like this, it starts sounding like snake oil to me. – DCKing Apr 15 '14 at 20:59
  • let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/13912/discussion-between-superbest-and-dcking) – Superbest Apr 15 '14 at 21:22
3

There are many secure languages. I would say that a language with memory management and thread safety is as secure as a language can get.

However, most of these are inefficient. Garbage collection is expensive, and interpreted languages more so. And that's why large applications to this day are written in the memory-unsafe C/C++.

I've recently been playing with Rust, and to me it seems to be a "secure" language in the sense that it was partly designed for this.

It's a compiled language like C++, and it also offers pointers and concurrency. (Garbage collector not necessary)

However, it doesn't lug the memory-safety problems of pointers and concurrency along with it. Rust is a language that doesn't trust the programmer, and at compile time it checks for suspicious usage of pointers. There are multiple kinds of pointers/references (borrowed, owned, etc.), and some of them have strict rules about them. For example, one cannot:

  • take a reference to an owned pointer and then mutate the owned pointer
  • pass a reference to outside the lifetime of an object (references aren't just numbers that can be batted about like in C++)
  • move around an owned pointer and access the original variable
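
The three rules above can be sketched as follows. This is a minimal illustration written in present-day Rust syntax, which differs from the 2014 syntax in use when this answer was posted. The helper functions `consume` and `sum` are hypothetical names. Each rejected line is commented out so that the program compiles; uncommenting any of them produces a compile-time error.

```rust
// Hypothetical helper: takes ownership of the vector and returns its length.
fn consume(v: Vec<i32>) -> usize {
    v.len()
}

// Hypothetical helper: borrows the elements immutably and sums them.
fn sum(v: &[i32]) -> i32 {
    v.iter().sum()
}

fn main() {
    let mut data = vec![1, 2, 3];

    // Rule 1: no mutation while a reference to the value is still in use.
    let first = &data[0];
    // data.push(4); // rejected: cannot borrow `data` as mutable while `first` is live
    println!("first = {}", first);
    data.push(4); // accepted: the borrow `first` is not used after this point

    // Rule 2: a reference cannot outlive the value it points to.
    // let dangling: &i32 = { let local = 5; &local }; // rejected: `local` does not live long enough

    // Rule 3: after a move, the original variable can no longer be used.
    let total = sum(&data);
    let len = consume(data);
    // println!("{:?}", data); // rejected: use of moved value `data`
    println!("sum = {}, len = {}", total, len);
}
```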

There are similar rules that ensure thread safety. If one wishes, they can bypass a lot of these checks by using unsafe-marked boxes ("trust me, I know what I'm doing"), or slow garbage collected pointers. There are also more Rustic (and efficient) ways of doing this by using a combination of clones and references, which vary as the usage changes.

Manishearth
  • 8,257
  • 5
  • 35
  • 56
1

Managed type safe languages do a lot to prevent this kind of thing by providing validation of types automatically and moving code execution further from the CPU itself, however that doesn't rule out the possibility of bugs in the implementation of the system the language uses to map to the CPU (for example, the CLR in .Net or the JVM in Java). It also doesn't rule out the possibility of bugs in an application that could cause it to be vulnerable to manipulation or data leakage for itself.

They do improve the security of the system quite considerably, but they also are bulkier, slower and more limited in function due to the overhead of the execution engine they have to run through to provide that functionality.

AJ Henderson
  • 41,896
  • 5
  • 63
  • 110
  • Type-safety has nothing to do with Virtual Machines like JVM or CLR, you could easily have a compiled language with type-safety. And having a compiler enforce array bounds checks is... easy. It does add a little overhead, but then again why do you want to write assembler? And sure, compilers are also programs and can have bugs, but at least there are languages with strict formal definitions that a compiler can provably fulfill (or not). – kutschkem Apr 14 '14 at 15:16
  • @kutschkem - yes, but MANAGED type safe languages do. Managed code requires something to do the managing. By the way, I'm also not speaking against managed languages at all. I'm a C# developer. – AJ Henderson Apr 14 '14 at 15:20
  • @kutschkem: A fundamental requirement for safety is being able to ensure that no object's memory will be subject to reclamation while any reference could exist to it. Requiring that all references be stored at all times in a way that the memory manager knows about can greatly facilitate this, and using a managed VM greatly facilitates that. – supercat Apr 14 '14 at 22:05
  • A compiled language can easily manage references and do garbage collection without requiring a VM. What's more, such a language can be implemented in itself, so you don't have to worry about C-related bugs in the C code used to implement your VM. – Nate C-K Apr 15 '14 at 14:23
0

There are certainly languages that were designed to be secure, but none that are perfect. For example, Ada allows you to specify an allowable range for integer variables and throws an exception if they ever go outside that range. Sounds good, saved you having to manually check. The problem is if you don't have to manually check it is easy to set this mechanism up and then forget to consider the consequences, i.e. integer out of range exceptions. You just created a vector for denial of service attacks.
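
To illustrate the trade-off, here is a sketch in Rust rather than Ada, so it only imitates the idea of a range-constrained integer. The `Percent` type and `parse_percent` function are hypothetical names. The point is that the out-of-range case still has to be handled somewhere: in Ada an unhandled `Constraint_Error` terminates the task or program, and in this sketch the caller has to deal with the `Err` branch.

```rust
/// Hypothetical percentage type that only admits values in 0..=100,
/// loosely imitating an Ada range-constrained integer.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Percent(u8);

impl Percent {
    /// Checked constructor: the range check happens once, here.
    fn new(value: i64) -> Result<Percent, String> {
        if (0..=100).contains(&value) {
            Ok(Percent(value as u8))
        } else {
            Err(format!("{} is outside 0..=100", value))
        }
    }
}

/// Parses untrusted input; failures are returned to the caller instead of aborting.
fn parse_percent(input: &str) -> Result<Percent, String> {
    let raw = input
        .trim()
        .parse::<i64>()
        .map_err(|e| format!("not a number: {}", e))?;
    Percent::new(raw)
}

fn main() {
    for input in ["42", "250", "abc"] {
        match parse_percent(input) {
            Ok(p) => println!("accepted {:?}", p),
            Err(reason) => println!("rejected {:?}: {}", input, reason),
        }
    }
}
```

Calling `.unwrap()` on the result instead of matching on it would reproduce the problem described above: a single out-of-range input from an attacker would bring the program down.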

Security is a process. The language can help, but at best it can only reduce the chance of errors, and in the process it usually creates new and often even more subtle ones. Arguably C, by virtue of being easier to fully understand and having fully deterministic operation (no background garbage collection, for example), is ideal for writing secure code. And sure enough, when you look at a lot of security-critical code, it is written in C. To be fair to Ada, it is more about reliability than security.

user25221
  • 291
  • 1
  • 2
  • 7
  • 1
    C is far from ideal: it's too complex, has undefined behavior, and has security issues in most common constructs (eg arrays, strings). Pascal and Modula-2 are much better for analysis due to simplicity, readability, consistency, and easy compilation. – Nick P Jan 11 '15 at 17:51