Earlier this week, I wrote about FreeBSD ZFS and how deduping can take considerable amounts of memory. A commenter who wishes to remain anonymous took issue with this line:
As an aside, it makes you appreciate why so many cloud vendors don’t wish to encrypt data; such processing would render even the most sophisticated deduper utterly useless.
I wrote this line from the perspective of a cloud vendor wishing to save costs, not that of their customers. If storage were constrained, it would make sense to dedupe where possible.
Encryption “breaks” deduplication (and by extension, compression). The mark of a high quality encryption algorithm is output that looks like pseudo-random noise, with as few repeating patterns as possible, and repeating patterns are exactly what dedupers and compressors rely on.
Of course, encrypting the same individual files with the same key will result in the same ciphertext, right? I suppose it comes down to implementation, such as whether a random IV or nonce is mixed in each time.
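To make both points a bit more concrete, here's a rough Python sketch. It's not how any cloud vendor or ZFS actually does things. The `keystream_xor` function is a toy stream cipher I made up from SHA-256 purely so the example runs with the standard library (it's not a vetted cipher), and the 4 KiB `BLOCK` size and the `unique_blocks` helper are likewise assumptions for illustration.

```python
import hashlib
import os
import zlib

BLOCK = 4096  # hypothetical dedup block size, chosen for illustration


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || nonce || offset).

    Illustration only, not a vetted cipher. Running it twice with the
    same key and nonce decrypts, because XOR is its own inverse.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, pad))
    return bytes(out)


def unique_blocks(data: bytes) -> int:
    """Count distinct fixed-size blocks, roughly what a block-level deduper sees."""
    return len({hashlib.sha256(data[i:i + BLOCK]).digest()
                for i in range(0, len(data), BLOCK)})


# 64 identical 4 KiB blocks: ideal input for both dedup and compression.
plaintext = (b"the same log line, over and over\n" * 128)[:BLOCK] * 64

key = os.urandom(32)
nonce = os.urandom(16)
ciphertext = keystream_xor(key, nonce, plaintext)

print("unique blocks, plaintext: ", unique_blocks(plaintext))
print("unique blocks, ciphertext:", unique_blocks(ciphertext))
print("zlib size, plaintext: ", len(zlib.compress(plaintext)))
print("zlib size, ciphertext:", len(zlib.compress(ciphertext)))

# Same key and same nonce: identical ciphertext, so whole-file dedup still works.
print("same nonce matches: ", keystream_xor(key, nonce, plaintext) == ciphertext)

# Same key but a fresh nonce: different ciphertext for the very same file.
print("fresh nonce matches:", keystream_xor(key, os.urandom(16), plaintext) == ciphertext)
```

The plaintext collapses to a single unique block and compresses to a tiny fraction of its size, while the ciphertext does neither. The last two lines are the "comes down to implementation" part: reuse the key and nonce and identical files still produce identical ciphertext, but a fresh nonce per encryption takes even that away.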