dstiles - 10:40 pm on Nov 30, 2011 (gmt 0)
You're right, but I did say "looking genuine" and "if this is genuine". Without a proper rDNS I cannot be sure it's valid except for the crawl pattern, which looks as extensive as others.
My own findings are that shopwiki does not always give a proper rDNS. Of the three ranges I get bots from, only one seems to resolve, and that is their acknowledged IP range. The two that do not are in Hurricane Electric and XO ranges; they seem to behave as expected and not like forgeries.
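For anyone wanting to script the rDNS check, here is a rough sketch of the usual forward-confirmed reverse DNS test: look up the PTR name for the IP, check it sits under a domain you trust, then resolve that name forward and see whether the original IP comes back. The function name verify_bot_ip and the "shopwiki.com" suffix are my own assumptions for illustration, not anything the bot operator has published, and the lookups are IPv4 only. As noted above, a genuine bot on a range with no PTR record will fail this test, so treat a failure as "unverified" rather than "forged".

```python
import socket


def verify_bot_ip(ip, allowed_suffixes, reverse=None, forward=None):
    """Forward-confirmed reverse DNS check (IPv4 only).

    reverse/forward are optional lookup functions so the logic can be
    exercised without touching the network; by default the standard
    socket resolver calls are used.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])

    try:
        # Step 1: PTR lookup, normalised (no trailing dot, lower case).
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        return False  # no rDNS at all: cannot be verified this way

    # Step 2: the name must be the trusted domain or a subdomain of it.
    if not any(host == s or host.endswith("." + s) for s in allowed_suffixes):
        return False

    try:
        # Step 3: the forward lookup must include the original IP.
        return ip in forward(host)
    except OSError:
        return False
```

Called as verify_bot_ip("192.0.2.10", ["shopwiki.com"]) it uses live DNS; the address here is a documentation placeholder, not a real bot IP.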
Yes, the bot crawls every day, as do other SEs that have the capacity (G, B etc). The proper way is to change the cache header times (Expires and Cache-Control) from (eg) 24 hours to 240 hours. That should fix the problem. If not, ask the SE why not (NB: it may still crawl to check the timing but should be happy with the header and not reload the complete page).
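To make the header change concrete, here is a small sketch that builds the two header values for a given lifetime in hours. The helper name cache_headers is made up for the example; how you actually attach the headers depends on your server or framework, and whether a given bot honours them is the conjectural part.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(hours, now=None):
    """Return Cache-Control and Expires values for a lifetime in hours."""
    now = now or datetime.now(timezone.utc)
    return {
        # max-age is expressed in seconds
        "Cache-Control": f"max-age={hours * 3600}",
        # Expires is an HTTP date in GMT
        "Expires": format_datetime(now + timedelta(hours=hours), usegmt=True),
    }
```

So cache_headers(240) gives max-age=864000 and an Expires date ten days ahead, where cache_headers(24) gives max-age=86400 and a date one day ahead.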
The above is partly conjectural for SEs as my sites all ask for 24 hour refresh periods. Does anyone have further info on this?