This should probably be on the crypto Stack Exchange instead of here.
Regardless, the best course of action is to not touch the symmetric block cipher yourself. There are mature, cryptographer-audited security libraries like NaCl and KeyCzar that will make the correct decisions for you.
That said, if you do choose to go it alone, a generally safe choice is AES-128 in either EAX mode or GCM mode. Both of these modes authenticate your ciphertext in addition to keeping the plaintext confidential, which is crucial for avoiding things like padding oracle attacks. In order to use either of these modes, you need a cryptographic key, an initialization vector, a plaintext, and, optionally, associated authentication data.
The key must be 128 bits long and generated with a cryptographically secure random number generator. If you want a password-protected key, take the randomly-generated key and pass it through PBKDF2-HMAC-SHA-256 with the password, the key in place of the salt, a number of rounds calibrated to take as long as you're comfortable with (10_000 rounds on my laptop requires 0.5s, which is a reasonable amount of time), and a 128-bit output size. The output will be the "real" key to use, and you can re-run this with the original key and password any time you need to access the real key.
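As a rough sketch of the key handling described above, using only the Python standard library: the function names generate_key and derive_real_key are my own, and the default of 10_000 rounds simply mirrors the figure quoted above, so calibrate it on your own hardware rather than treating it as a recommendation.

```python
import hashlib
import secrets

KEY_BYTES = 16  # 128 bits


def generate_key() -> bytes:
    """Return a fresh 128-bit key from the OS's secure random source."""
    return secrets.token_bytes(KEY_BYTES)


def derive_real_key(password: str, stored_key: bytes, rounds: int = 10_000) -> bytes:
    """Derive the "real" key with PBKDF2-HMAC-SHA-256.

    The randomly generated key is passed in the salt position, as the
    paragraph above describes. The round count is an assumption to tune.
    """
    return hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),  # passwords must be bytes, too
        stored_key,
        rounds,
        dklen=KEY_BYTES,
    )
```

Calling derive_real_key again with the same password and stored key yields the same 16 bytes, which is what lets you recompute the real key on demand instead of storing it.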
The initialization vector for EAX and GCM modes must be unique across all encryptions with a given key. (This may not be enough for other modes: CBC, for example, has even stricter IV requirements.) A simple counter is considered to be sufficient. However, you need to make sure this counter guarantees uniqueness even when run across multiple invocations of the process, multiple simultaneous processes, and multiple machines. A properly implemented version 1 UUID should suffice for these purposes. This isn't in the format encouraged by RFC 5116, section 3.2, however, and someone more knowledgeable than me will have to clarify whether or not that format is important to abide by. Alternatively, a securely generated random number of appropriate size should be sufficient, but I haven't seen people do this in practice with CTR-derived modes (which GCM and EAX are based on). Again, someone more knowledgeable will have to comment on whether or not this is a good idea.
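Here is one way a counter-based IV could look, loosely following the fixed-field-plus-counter layout that RFC 5116, section 3.2 describes. Treat it as a sketch under assumptions: the NonceGenerator class, the 4-byte/8-byte split, and the idea of assigning each process or machine its own fixed field are mine. The counter lives only in memory, so it does not by itself cover restarts; you would have to persist it or assign a new fixed field on each start.

```python
import struct
import threading


class NonceGenerator:
    """Produce 96-bit IVs: a 4-byte fixed field followed by an 8-byte counter.

    The fixed field is assumed to be unique per process/machine; the counter
    makes each IV from this generator distinct.
    """

    def __init__(self, fixed_field: bytes, start: int = 0):
        if len(fixed_field) != 4:
            raise ValueError("fixed_field must be exactly 4 bytes")
        self._fixed = fixed_field
        self._counter = start
        self._lock = threading.Lock()  # safe across threads in one process

    def next(self) -> bytes:
        with self._lock:
            value = self._counter
            if value >= 2**64:
                raise OverflowError("counter exhausted; rotate the key")
            self._counter += 1
        # Big-endian unsigned 64-bit counter appended to the fixed field.
        return self._fixed + struct.pack(">Q", value)
```

Refusing to wrap around is deliberate: once the counter is exhausted, the only safe move is a new key.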
The plaintext can be anything you please, but do not fall into the trap of thinking the input of AES is "characters", "ASCII", "Unicode", or any such nonsense. The appropriate input is an array of bytes.
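To make the bytes-versus-characters point concrete, here is a tiny Python illustration. The choice of UTF-8 is an assumption; any encoding works as long as you pick one explicitly and use the same one after decryption.

```python
text = "h\u00e9llo"            # five characters, one of them non-ASCII
data = text.encode("utf-8")    # what you would actually hand to the cipher

print(len(text))   # number of characters
print(len(data))   # number of bytes, which differs from the above

# After decryption you get bytes back and decode them yourself.
assert data.decode("utf-8") == text
```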
The optional associated authentication data may also be anything you like. This is data that is bound into the ciphertext's authentication check, so that it must be presented again, unchanged, for a decryption to succeed. But it is not part of the plaintext protected inside of the ciphertext. For example, if you're encrypting data on behalf of users stored in a database, you could use the user's user_id in the database. In the event that one user found a way to copy another user's encrypted data into his account, he wouldn't be able to get your application to decrypt it, since his user_id wouldn't match the one used to encrypt the data. I may have explained this poorly, so please let me know if it was confusing or unclear.
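The user_id scenario can be sketched as follows. This assumes the third-party `cryptography` package is installed and uses its AESGCM class; the helper names encrypt_for_user and decrypt_for_user are hypothetical, and the caller is assumed to supply an IV that is unique for the key.

```python
from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def encrypt_for_user(key: bytes, user_id: int, nonce: bytes, plaintext: bytes) -> bytes:
    """Encrypt plaintext, binding it to user_id as associated data."""
    aad = str(user_id).encode("ascii")  # supplied by the application, not stored with the ciphertext
    return AESGCM(key).encrypt(nonce, plaintext, aad)


def decrypt_for_user(key: bytes, user_id: int, nonce: bytes, ciphertext: bytes) -> bytes:
    """Decrypt; raises InvalidTag if user_id differs from the one used to encrypt."""
    aad = str(user_id).encode("ascii")
    return AESGCM(key).decrypt(nonce, ciphertext, aad)


if __name__ == "__main__":
    key = AESGCM.generate_key(bit_length=128)
    nonce = b"\x00" * 11 + b"\x01"  # illustration only; must be unique per key
    blob = encrypt_for_user(key, 42, nonce, b"secret notes")
    print(decrypt_for_user(key, 42, nonce, blob))
    try:
        decrypt_for_user(key, 43, nonce, blob)  # another user's id
    except InvalidTag:
        print("decryption refused for the wrong user_id")
```

Note that the user_id is passed in by the caller on both sides and never travels with the ciphertext.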
For storage, the key must be kept secret. Key management and lifetime are beyond the scope of my answer, but they are a crucially important part of ensuring the security of your protected data. The initialization vector, ciphertext, and authentication data have no requirement on their secrecy, but the authentication data should ideally be something provided to the cryptography layer from an external source (e.g., the user_id mentioned earlier). When used, it should not be something a potential attacker can supply to you or otherwise exercise control over. Storing, transmitting, and copying the authentication data along with the other values defeats the purpose. One way of considering the authentication data is that it should provide "context" for the protected data.
That should do it. Keep in mind that this is complicated, and excruciatingly difficult to do correctly without leaking protected information. I don't recommend you go it alone, but if you do, the above should be a relatively safe point to start from. At the very least, you should be better off than 95% of the websites out there that try to implement cryptography themselves. Assuming, of course, that I haven't completely failed at describing a secure implementation approach.