I want to encrypt files in Python using a secure password hash and a secure cipher. Having used bcrypt before, I chose it to derive a key from the passphrase: the bcrypt output is passed through SHA-256 to get 32 bytes, which are then used as the AES key to encrypt or decrypt a file:
#!/usr/bin/env python
from argparse import ArgumentParser
from bcrypt import gensalt, hashpw
from Crypto.Cipher import AES
from hashlib import sha256
import os, struct, sys

def main():
    parser = ArgumentParser(description = "Encrypts or decrypts a file using " +
        "bcrypt for the password and triple AES for file encryption.")
    parser.add_argument('-p', '--passphrase', required = True,
        help = "The passphrase to use for encryption.")
    parser.add_argument('-i', '--input', required = True,
        help = "The input file for encryption / decryption.")
    parser.add_argument('-o', '--output', required = True,
        help = "The output file for encryption / decryption.")
    parser.add_argument('-r', '--rounds', default = 10,
        help = "The number of bcrypt rounds to use.")
    parser.add_argument('-s', '--salt', default = None,
        help = "The salt to use with bcrypt in decryption.")
    parser.add_argument('operation', choices = ('encrypt', 'decrypt'),
        help = "The operation to apply, whether to encrypt or decrypt data.")
    parameters = parser.parse_args()

    if parameters.operation == 'encrypt':
        encrypt(parameters.input, parameters.output, parameters.passphrase,
            parameters.rounds)
    elif parameters.operation == 'decrypt':
        decrypt(parameters.input, parameters.output, parameters.passphrase,
            parameters.salt)

def encrypt(input_file, output_file, passphrase, rounds):
    bcrypt_salt = gensalt(rounds)
    bcrypt_passphrase = hashpw(passphrase, bcrypt_salt)
    passphrase_hash = sha256(bcrypt_passphrase).digest()
    print "Salt: %s" % (bcrypt_salt, )

    iv = os.urandom(16)
    cipher = AES.new(passphrase_hash, AES.MODE_CBC, iv)

    with open(input_file, 'rb') as infile:
        infile.seek(0, 2)
        input_size = infile.tell()
        infile.seek(0)

        with open(output_file, 'wb') as outfile:
            outfile.write(struct.pack('<Q', input_size))
            outfile.write(iv)

            while True:
                chunk = infile.read(64 * 1024)
                if len(chunk) == 0:
                    break
                elif len(chunk) % 16 != 0:
                    chunk += ' ' * (16 - len(chunk) % 16)
                outfile.write(cipher.encrypt(chunk))

    return bcrypt_salt

def decrypt(input_file, output_file, passphrase, salt):
    print "Salt: %s" % (salt,)
    bcrypt_passphrase = hashpw(passphrase, salt)
    passphrase_hash = sha256(bcrypt_passphrase).digest()

    with open(input_file, 'rb') as infile:
        input_size = struct.unpack('<Q', infile.read(struct.calcsize('Q')))[0]
        iv = infile.read(16)
        cipher = AES.new(passphrase_hash, AES.MODE_CBC, iv)

        with open(output_file, 'wb') as outfile:
            while True:
                chunk = infile.read(64 * 1024)
                if len(chunk) == 0:
                    break
                outfile.write(cipher.decrypt(chunk))
            outfile.truncate(input_size)

if __name__ == "__main__":
    main()
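To make the padding behaviour easier to reason about, here is a minimal sketch of just the pad-and-truncate step with the cipher left out. `pad_chunk` and `unpad` are hypothetical helper names for illustration, not functions from the script; they are meant to mirror the space padding in `encrypt()` and the `outfile.truncate(input_size)` call in `decrypt()`.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad_chunk(chunk):
    """Pad with spaces up to the next multiple of BLOCK_SIZE, as encrypt() does."""
    remainder = len(chunk) % BLOCK_SIZE
    if remainder != 0:
        chunk += b' ' * (BLOCK_SIZE - remainder)
    return chunk


def unpad(data, original_size):
    """Drop the padding using the stored size, like outfile.truncate(input_size)."""
    return data[:original_size]


data = b'hello world'          # 11 bytes, not a multiple of 16
padded = pad_chunk(data)       # padded out to 16 bytes with spaces
restored = unpad(padded, len(data))
```

The padding can only be removed because the original size is stored next to the ciphertext; the spaces themselves carry no length information.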
What are the possible weak points of an implementation like this?
So far I have determined that an attacker can easily read the original file size, since it is stored unencrypted in the header, but that doesn't reveal much about the file. SHA-256 isn't the best hashing algorithm in the world, but since it only wraps the bcrypt output, I believe any threats there are mitigated. Because bcrypt's cost can be raised over time by increasing the number of rounds, it seems like a pretty safe bet for now.
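To illustrate the file-size point: the header is written in the clear, so anyone holding the encrypted file can read the size without the passphrase. This is a minimal sketch, assuming the layout produced by `encrypt()` above (an 8-byte little-endian size followed by the 16-byte IV). `read_header` is a hypothetical helper, and the in-memory "file" and its contents are made up for the example.

```python
import io
import os
import struct


def read_header(stream):
    """Return (original_size, iv) from the unencrypted header."""
    size_bytes = stream.read(struct.calcsize('<Q'))  # 8 bytes
    original_size = struct.unpack('<Q', size_bytes)[0]
    iv = stream.read(16)
    return original_size, iv


# Fake encrypted file: header plus 16 zero bytes standing in for ciphertext.
iv = os.urandom(16)
fake_file = io.BytesIO(struct.pack('<Q', 11) + iv + b'\x00' * 16)

size, recovered_iv = read_header(fake_file)
print(size)  # 11
```

No key material is involved at any point, which is why I consider the size public information in this design.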
Are there any gaping holes in this implementation? I'm not a cryptographer, but I do know the basics and purposes of each of the three algorithms being used here.