MD5 hash of a blob

Hi there, I’m parsing a blob from an API using the Capacitor Blob Writer NPM package.

I would like to hash the blob as an MD5 and compare this to the MD5 hash returned from the API to check the integrity.

There is not much guidance in this regard online or on this forum. Any help would be appreciated.

Thanks!

I would convert the blob to a string (e.g. base64) and then hash it

Assuming the API provider does the same.

I don’t generally disagree with @Tommertom, and I guess if the final sentence in his post is accurate, it would be supreme folly to do so this time.

That being said, I would be surprised if his final sentence were true, and therefore I would tentatively not agree with the overall suggestion.

If we have a blob already, running MD5 across it directly will always be faster than encoding it in Base64 and then MD5ing that result, for three reasons: the Base64 encoding pass takes time, there is about a third more data to feed MD5, and holding both the blob and its Base64 copy would typically require roughly 2.33x the blob's size in memory.
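To put rough numbers on that, here is a minimal sketch of the size arithmetic. The function names are made up for illustration, and it assumes each Base64 character costs one byte, which depends on how the encoded copy is actually stored.

```typescript
// Padded Base64 emits 4 output characters for every 3 input bytes, rounding up.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Ratio of (original blob + its Base64 copy) to the original blob alone,
// under the assumption of one byte per Base64 character.
function peakMemoryRatio(byteLength: number): number {
  return (byteLength + base64Length(byteLength)) / byteLength;
}

const blobSize = 3 * 1024 * 1024; // hypothetical 3 MiB blob
console.log(base64Length(blobSize)); // 4194304, i.e. 4 MiB of encoded text
console.log(peakMemoryRatio(blobSize).toFixed(2)); // "2.33"
```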

Now, exactly how one would go about doing this will be completely dependent on which library you use to do the hashing. Last time I checked these things (I don’t use MD5 personally for anything, so no direct experience with any of this), the JKM implementation was the most performant JavaScript implementation of MD5. It is used by the js-spark-md5 library (published on npm as spark-md5), which has some documentation in its README and test suite demonstrating how to calculate the MD5 hash incrementally. That lets you process files of any size with constant memory requirements.
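As a rough illustration of the incremental approach, here is a minimal sketch that feeds a blob to a hasher one slice at a time. The `IncrementalHasher` interface, the `hashBlobInChunks` name, and the 2 MiB default chunk size are my own assumptions for the example, not anything taken from the library. The interface is written to match the `append`/`end` shape I would expect from an incremental ArrayBuffer hasher such as spark-md5's, so check the README before relying on that.

```typescript
// Hypothetical minimal shape of an incremental hasher: feed it chunks,
// then ask for the final digest string.
interface IncrementalHasher {
  append(chunk: ArrayBuffer): unknown;
  end(): string;
}

// Reads the blob slice by slice so only one chunk is held in memory at a time.
async function hashBlobInChunks(
  blob: Blob,
  hasher: IncrementalHasher,
  chunkSize: number = 2 * 1024 * 1024, // assumed default, tune as needed
): Promise<string> {
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    // Blob.slice clamps the end offset to the blob's size.
    const chunk = await blob.slice(offset, offset + chunkSize).arrayBuffer();
    hasher.append(chunk);
  }
  return hasher.end();
}
```

If spark-md5 is installed, something like `hashBlobInChunks(blob, new SparkMD5.ArrayBuffer())` should produce a hex digest, which you can then compare with the value from the API, assuming the API also reports the MD5 of the raw bytes as hex.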
