- cross-posted to:
- [email protected]
PieFed uses PDQ hashing to generate a fingerprint of an image, and can use that fingerprint to detect other posts containing the same or a fairly similar image, for moderation purposes. Hashes are added to a block list, which stops the image from being re-posted in future.
PieFed does not generate PDQ hashes itself; it uses a separate service to do it. Several instances can share the same hashing service, which is more efficient than everyone running their own. When an image is federated, its URL will be sent to the hashing service by many different fedi instances. Only the first request will be slow, as all subsequent requests will be served from a cache.
Get the code from https://github.com/rimu/pdqhash-python
By doing a GET request for https://yourdomain.tld/pdq-hash?image_url=url_to_image_to_hash you will receive JSON like this:
{
  "pdq_hash_binary": "100100100011…",
  "quality": 100
}
The quality score (0–100) indicates how well the image content supports a reliable perceptual hash.
Higher scores mean better contrast, edges, and texture in the image. PieFed accepts anything > 70.
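For anyone wanting to call the service from their own code, a rough client sketch follows. It assumes only the endpoint and JSON fields shown above. The function names, the timeout and the `QUALITY_THRESHOLD` constant are hypothetical, and this is not how PieFed itself is written.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

QUALITY_THRESHOLD = 70  # from the post: PieFed accepts anything > 70


def build_hash_url(service_base: str, image_url: str) -> str:
    """Build the GET URL for the /pdq-hash endpoint, URL-encoding the image URL."""
    query = urlencode({"image_url": image_url})
    return f"{service_base.rstrip('/')}/pdq-hash?{query}"


def parse_hash_response(body: str) -> tuple[str, int]:
    """Extract (pdq_hash_binary, quality) from the service's JSON response."""
    data = json.loads(body)
    return data["pdq_hash_binary"], int(data["quality"])


def is_usable(quality: int, threshold: int = QUALITY_THRESHOLD) -> bool:
    """True when the quality score is high enough for the hash to be trusted."""
    return quality > threshold


def fetch_pdq_hash(service_base: str, image_url: str, timeout: float = 10.0) -> tuple[str, int]:
    """Ask the hashing service for the hash of image_url (performs a network request)."""
    with urlopen(build_hash_url(service_base, image_url), timeout=timeout) as resp:
        return parse_hash_response(resp.read().decode("utf-8"))
```

Usage would look something like `bits, quality = fetch_pdq_hash("https://yourdomain.tld", image_url)`, followed by an `is_usable(quality)` check before the hash is trusted for matching.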
This is less about de-duplication and more about blocking CSAM and spam. Once a large flood of bad images has already arrived, it also helps find every copy of them (even if those copies are slightly different from each other). Demo of it at https://piefed.social/post/751901 .
It looks like we’ll need a less fuzzy hash for de-duplication.
That’s AMAZING! Thank you so much sir! :-)