I'm building an app, and part of it ingests a feed of text and images. In the feed, the images are always named the same way: id-1.jpg, id-2.jpg, and so on. That made me wonder what happens if the managers change an image in their application: it would arrive with the same name. So I would either need to process ALL the images into multiple sizes every time I process the feed, or somehow check each photo against what I received last time. Here is what I'm thinking of doing.
When I process the feed I read each image with file_get_contents. Since the contents are fairly large, I take an md5 of them and store that in the db along with the other details about the image. The next time I process the feed, or when the images for a specific record are requested, I can compare hashes to see whether I still have the most up-to-date image.
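To make the idea concrete, here is a minimal sketch of the check, not my actual code. The function name imageHasChanged and the stand-in byte strings are made up for illustration; in the real app the bytes would come from file_get_contents and the stored hash from the db.

```php
<?php
// Hypothetical helper: returns true when the image should be (re)processed,
// i.e. there is no stored hash yet or the hash of the new bytes differs.
function imageHasChanged(string $imageBytes, ?string $storedHash): bool
{
    $currentHash = md5($imageBytes);

    return $storedHash === null || !hash_equals($storedHash, $currentHash);
}

// Stand-in for the bytes read on a previous run (assumption: in the real
// app this would be the result of file_get_contents on the feed image).
$previousBytes = "fake-jpeg-bytes-v1";
$storedHash    = md5($previousBytes); // what would have been saved in the db

var_dump(imageHasChanged($previousBytes, $storedHash));        // bool(false)
var_dump(imageHasChanged("fake-jpeg-bytes-v2", $storedHash));  // bool(true)
var_dump(imageHasChanged($previousBytes, null));               // bool(true)
```

Only images for which this returns true would need to be resized again. Note that file_get_contents returns false on failure, so the real code would have to check for that before hashing.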
I ran a test with two nearly identical jpgs, each all white but one with a single black pixel. Both the source and the md5 of that source differed between the two files, so that worked.
The curious thing is that when I got the same photos from two different feed dates (photos my system should treat as identical) and tested them, the md5 was the same but the source was not.
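For reference, this is roughly the kind of comparison I mean, as a sketch only. The helper name compareSources and the sample strings are hypothetical; in practice the two arguments would be the file_get_contents results for the same image from the two feed dates.

```php
<?php
// Hypothetical helper: reports how two byte strings compare by length,
// by strict byte-for-byte equality, and by MD5.
function compareSources(string $a, string $b): array
{
    return [
        'same_length' => strlen($a) === strlen($b),
        'same_bytes'  => $a === $b,
        'same_md5'    => md5($a) === md5($b),
    ];
}

// Stand-in data (assumption: real input would be the raw image bytes).
print_r(compareSources("all-white", "all-white"));
print_r(compareSources("all-white", "all-white-plus-one-black-pixel"));
```

Using the strict === operator on the raw strings makes the "source" comparison a byte-for-byte one, so it is measured the same way the hash is.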
Why would the md5 be the same while the source wasn't?
Does this sound like a logically sound way to go about it?