Here is a sample "normal" Slurp log entry (broke the line for reading):
72.30.134.145 - - [14/Nov/2005:02:30:03 -0800] "http://www.example.com/somepage.html" 200 1745 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
Here is a sample "PPC" Slurp log entry (broke the line for reading):
72.30.134.145 - - [14/Nov/2005:02:30:03 -0800] "http://www.example.com/somepage.html?source=YAH21034" 200 1745 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
The "source" attribute allows us to track individual PPC term performance, and every one of our 125,000+ terms in all of our campaigns has its own. It is clear from the log entries that Slurp is trolling through all of our links in our PPC campaigns ... even the links that are no longer included, like to an older domain that hasn't been in our campaigns for months. Often Slurp will run through all of our terms in all of our campaigns several times before stopping. During some sessions, Slurp requests to send an email message to itself with every other request which has the effect of disrupting our mail service, too.
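For anyone wanting to check their own logs for the same pattern, here is a minimal Python sketch that separates Slurp requests carrying a "source" value from those without one. It assumes log lines shaped like the two samples above (Apache combined format); the names LOG_RE, request_target and summarize_slurp are made up for illustration and are not part of any Yahoo or Apache tool.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumed layout, taken from the sample entries:
# ip - - [timestamp] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def request_target(request):
    """Return the URL part of the request field.

    Handles the bare URL shown in the samples as well as the more
    common 'GET /path HTTP/1.1' form.
    """
    parts = request.split()
    return parts[1] if len(parts) >= 2 else request

def summarize_slurp(lines):
    """Count Slurp hits per 'source' value, plus Slurp hits with no 'source'."""
    tagged = Counter()
    untagged = 0
    for line in lines:
        m = LOG_RE.match(line.strip())
        if not m or "Yahoo! Slurp" not in m.group("agent"):
            continue  # unparseable line or a different user agent
        query = urlsplit(request_target(m.group("request"))).query
        sources = parse_qs(query).get("source")
        if sources:
            tagged[sources[0]] += 1
        else:
            untagged += 1
    return tagged, untagged
```

Feeding it the lines of an access log (for example, an open file object) returns a Counter keyed by source value and a plain count of untagged Slurp hits, which makes it easy to see whether Slurp is requesting anything other than PPC-tagged URLs.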
This hammering will usually go on for only 3 or 4 hours, but occasionally, like today, it goes on for 12 to 18 hours or longer. The effect is that of a mild denial of service attack, as it slows down our server tremendously and we lose a lot of visitors as a result.
Have any of you experienced this?
I have asked before, but nobody responded. It's tough to believe that we are the only ones who have experienced this. I'm posting this here instead of to the Yahoo Search forum because this specifically hits our PPC ads, so I'm hoping other PPC advertisers will be able to verify it. Certainly none of our efforts to get info from Yahoo have resulted in anything at all. I even talked to an API tech guy who had no idea what might be happening.
I ban the entire Class B block of Slurp IPs whenever I have the time, for a total of 14 bans so far, but it keeps coming back with a new IP block. Neither .htaccess nor robots.txt instructions keep the hits from draining our resources, so I usually use ipchains/iptables and so forth to drop the packets and their requests entirely.
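As a rough illustration of the Class B bans described above, this Python sketch collapses a list of offending IPs into their /16 blocks and prints one iptables DROP rule per block. The function names are hypothetical, and the rule uses the plain INPUT chain as an assumption; a real firewall setup may use a different chain or table. Note that a ban like this drops all traffic from the block, ordinary Slurp crawling included.

```python
import ipaddress

def class_b_blocks(ips):
    """Collapse individual IPv4 addresses into their enclosing /16 networks."""
    # strict=False lets ip_network() accept a host address and mask it down.
    blocks = {ipaddress.ip_network(f"{ip}/16", strict=False) for ip in ips}
    return sorted(blocks)

def drop_rules(ips):
    """Build one iptables DROP command per /16 block (INPUT chain assumed)."""
    return [f"iptables -A INPUT -s {net} -j DROP" for net in class_b_blocks(ips)]

if __name__ == "__main__":
    # 72.30.134.145 is the address from the sample log entries.
    for rule in drop_rules(["72.30.134.145"]):
        print(rule)
```

The script only prints the commands; they would still need to be reviewed and run by hand with root privileges.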
This looks like regular old Slurp to me, and if this were my site I'd be checking to see how my dynamic links got 'exposed' so that Slurp finds them and spiders them.
Alternatively, disallowing /somepage.html in robots.txt should stop them - eventually. If a 'bot finds a link, and it's not disallowed, it will spider it. So where's Slurp finding your links?
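To see what such a robots.txt rule would cover, here is a small sketch using Python's standard urllib.robotparser as a stand-in for a well-behaved crawler. The "User-agent: Slurp" token and the helper name slurp_allowed are assumptions for illustration, and Slurp's own matching may differ from Python's. Because Disallow rules match by path prefix, the one rule also covers the ?source= variants.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; adjust the path to the real landing page.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /somepage.html
"""

def slurp_allowed(url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text lets 'Slurp' fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Slurp", url)

if __name__ == "__main__":
    for url in (
        "http://www.example.com/somepage.html",
        "http://www.example.com/somepage.html?source=YAH21034",
        "http://www.example.com/otherpage.html",
    ):
        print(url, "->", "allowed" if slurp_allowed(url) else "blocked")
```

This only checks what the rule says; whether a given crawler honors it, and how quickly, is a separate question.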
Jim
This simply must be Slurp running through our campaigns repeatedly. When it visits, our logs are full of Slurp entries as illustrated ... 2+ hits per second, and every one of our source attributes is included ... I've checked, even with the terms that have never generated an impression.
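A figure like "2+ hits per second" can be checked directly against the log. The sketch below is one way to do it in Python, assuming the bracketed timestamp format from the sample entries; slurp_rate is a made-up name, and the user-agent test is a plain substring match.

```python
import re
from collections import Counter
from datetime import datetime

# Pulls the first [...] field, e.g. 14/Nov/2005:02:30:03 -0800
TS_RE = re.compile(r"\[([^\]]+)\]")

def slurp_rate(lines):
    """Return (peak hits in any one second, average hits per second) for Slurp."""
    per_second = Counter()
    for line in lines:
        if "Yahoo! Slurp" not in line:
            continue
        m = TS_RE.search(line)
        if m:
            stamp = datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S %z")
            per_second[stamp] += 1
    if not per_second:
        return 0, 0.0
    # Span from first to last Slurp hit, counting both end seconds.
    span = (max(per_second) - min(per_second)).total_seconds() + 1
    return max(per_second.values()), sum(per_second.values()) / span
```

The average is taken over the whole span between the first and last Slurp hit, so quiet gaps between sessions will pull it down; the peak value is the better indicator of how hard a single burst hits.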
At first we assumed it was checking the validity of the links in our campaigns, but it has gone on for far too long and requested far too many inactive domains for this to be normal activity.
We get no visits whatsoever from Slurp requesting a normal page without the source attribute.
Perhaps coincidentally, all of our domains were dropped from the algorithmic listings on Jan. 1, 2005 when Yahoo dropped the Inktomi paid inclusion customers (like we used to be) and started their own similar service. I don't mean domains that may have shared a link or two ... but ALL of the domains where we appear in the whois info, regardless of the IP block, hosting service, age or content.
The best response we can get from Y is to sign up for Site Match and see what happens ... uh ... yuh ... sure.
It's been a very long year ... and thanks very much for commenting ... it's a first!
I'd think one or two hits spaced over a couple of seconds or so would do the job, don't you?
Nobody has seen this Slurp behavior in their logs?!?
I mean, if so, then it's clear only our server is affected ... at least out of all of Yahoo's advertisers that frequent these forums. That in itself is pretty amazing, and it gives us the basis for a conspiracy theory or something ... I don't know what to think anymore.
Yeah, it's weird, but are we really alone in this?
This page sends a 301 redirect for
mydomain/landingpage.
So no one really sees that Overture-specific URL. I am 100% certain that Slurp is getting it from the Overture database.
And it's not the verification agent. The verification process for new URLs (ads) really hits the server VERY HARD (not 2 per second): about 20-30 requests or even more per second, and that comes from some RPT-HTTPclient.
sdani