Thank you Yahoo

SlurpConfirm404

         

fiestagirl

2:59 pm on Aug 3, 2005 (gmt 0)

10+ Year Member



This should put a stop to the "what is Yahoo(Slurp) doing asking for files that don't exist?" threads.
IP: 66.196.91.*
UA: mozilla/5.0 (compatible; yahoo! slurp; [help.yahoo.com...]

I'm seeing requests of this nature:
www.mysite.com/SlurpConfirm404/file/page.htm

abates

9:29 pm on Aug 3, 2005 (gmt 0)

10+ Year Member



Still not sure why they do this when other search engines don't seem to need to...

fiestagirl

9:43 pm on Aug 3, 2005 (gmt 0)

10+ Year Member



I also see this type of testing by G and MSN. It happens so often that I know that it isn't us that made a mistake.
MSN does this type of thing:
www.mysite.com/page.ht
G does this:
www.mysite.com/bwxovjujquwhht.html

fish_eye

4:13 am on Aug 8, 2005 (gmt 0)

10+ Year Member



Are you sure it's really them?

fiestagirl

8:03 pm on Aug 8, 2005 (gmt 0)

10+ Year Member



Yes. They identify themselves by the usual IP and UA.
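
As a rough illustration of the "usual IP and UA" check, here is a minimal Python sketch. It assumes the `66.196.91.*` pattern quoted earlier in the thread can be treated as a /24 network and that the UA contains the token `yahoo! slurp`. The function name `looks_like_slurp` is hypothetical. A UA match alone proves nothing, which is why both conditions must hold.

```python
import ipaddress

# Assumption: "66.196.91.*" from the first post is read as a /24 network.
SLURP_NET = ipaddress.ip_network("66.196.91.0/24")
# Assumption: this lowercase token appears in the crawler's UA string.
SLURP_UA_TOKEN = "yahoo! slurp"


def looks_like_slurp(ip: str, user_agent: str) -> bool:
    """Return True only when both the IP and the UA match the quoted values."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a parseable IP address at all.
        return False
    return addr in SLURP_NET and SLURP_UA_TOKEN in user_agent.lower()
```

A request that claims the Slurp UA from an address outside that range would come back `False`, so it would still show up for manual review.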

fish_eye

10:42 pm on Aug 8, 2005 (gmt 0)

10+ Year Member



I am Elvis Presley and I never really died at all. I AM telling the truth - I promise - cross my heart.

Granted the IP is hard to spoof but it's not impossible (the UA is similar to my statement above).

I might send Y! an email and see what they say.

wilderness

10:58 pm on Aug 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> I am Elvis Presley and I never really died at all. I AM telling the truth - I promise - cross my heart.
> Granted the IP is hard to spoof but it's not impossible (the UA is similar to my statement above).
>
> I might send Y! an email and see what they say.

fish,
fiestagirl has provided some detailed insights in the past.

I would assume (on her behalf) that she has based this determination on both her log history and her site's content.

We have no reason to doubt what she presents.

Best of luck with Yahoo.

Don

fish_eye

11:11 pm on Aug 8, 2005 (gmt 0)

10+ Year Member



No disrespect intended (to either fiestagirl or elvis fans!) - just trying to make a point about the UA.

I have a feeling that the best I can hope for from Y! is a cryptic or boilerplate response.

wilderness

11:22 pm on Aug 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> best I can hope for from Y! is a cryptic or boilerplate response

IMO, that's the best we can hope for from any provider.
Automated responses and double-talk.
Cooperation and assistance to webmasters is rather rare.

I've lost count of how many mails I have sent to providers to report violations of their UAGs, including links to those UAGs in the process.
Only to get an automated reply suggesting a reporting procedure that I had already followed.

It's a WASTE of time.

Don

fish_eye

12:04 am on Aug 9, 2005 (gmt 0)

10+ Year Member



I did send the email - and I'll post a summary of a response if I get one - but I found this url in the faq [help.yahoo.com].

I have recently changed something that may have triggered this reaction in the bot. I used to mod_rewrite a number of old (html) pages to new (php) pages using a 301. I've recently switched some of these off (they'd been there about 6-12 months).

By default these will now be redirected to my 'directory' page with a 404 response.

Perhaps this is it - also "SlurpConfirm404" seems to tie in with a technique similar to what they're stating in the FAQ (above). Perhaps they should call it "Deliberate404" :)
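
To make the idea of a deliberate 404 check concrete, here is a minimal Python sketch of how such a probe could work. This is an assumption about the general technique, not Yahoo's actual implementation: the helper names `make_probe_path` and `classify_probe` are hypothetical, and the random-name style only mimics the URLs reported in this thread.

```python
import secrets
import string


def make_probe_path(prefix: str = "SlurpConfirm404", length: int = 14) -> str:
    """Build a path that almost certainly does not exist on the target site."""
    junk = "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))
    return f"/{prefix}/{junk}.html"


def classify_probe(status: int) -> str:
    """Interpret the status code a server returned for a nonexistent path."""
    if status in (404, 410):
        return "hard-404"   # server correctly reports the page as missing
    if 300 <= status < 400:
        return "redirect"   # missing pages are redirected somewhere else
    if 200 <= status < 300:
        return "soft-404"   # server claims success for a page that cannot exist
    return "other"
```

A crawler that sees `soft-404` for a made-up path has reason to distrust every 200 response from that server, which would explain why the check is worth a few extra requests.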

Anyway, it's not killing my site with too many requests - I'm just heavily into bot monitoring and trapping IPs lately so it caught my eye.

Have you made similar changes recently fiestagirl?

fiestagirl

5:37 pm on Aug 9, 2005 (gmt 0)

10+ Year Member



Yes, unfortunately no matter how well planned the site architecture, sometimes directories or file names change, redirects need to be put up, etc.

The point of my first post was really that I appreciate knowing that the request is a 404 check and not a broken link put up by us or someone else.

I analyze every 404 and am happy to know that I can ignore those probes by Yahoo. Makes my life easier.
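
For anyone who wants to automate that filtering, here is a minimal Python sketch. It assumes access logs in the common/combined format (request line in quotes followed by the status code). The function name `unexplained_404s` is hypothetical. It returns the paths of 404s that are not SlurpConfirm404 probes, so only the ones worth investigating remain.

```python
import re

# Matches the quoted request line and the status code that follows it.
LOG_RE = re.compile(r'"(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" (?P<status>\d{3}) ')


def unexplained_404s(lines):
    """Return paths of 404 responses, skipping SlurpConfirm404 probes."""
    paths = []
    for line in lines:
        match = LOG_RE.search(line)
        if not match or match.group("status") != "404":
            continue
        if match.group("path").lower().startswith("/slurpconfirm404"):
            # Deliberate 404 check by the crawler, not a broken link.
            continue
        paths.append(match.group("path"))
    return paths
```

It could be fed a log directly, for example `unexplained_404s(open("access.log"))`, where the file name is only a placeholder.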

jdMorgan

5:59 pm on Aug 10, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Granted the IP is hard to spoof but it's not impossible (the UA is similar to my statement above).

Just a tech-note, here:

It is possible to spoof an IP address, but if a client spoofs its IP address, then it will never receive a response from the server -- The server will send its response to the specified (spoofed) IP address -- which might be someone else's machine, or might not even exist.

So, a search-indexing robot, e-mail harvester, or site-scraper could conceivably spoof its IP, but it would never receive any of the content it was requesting, so that would be rather pointless.

Jim

fish_eye

11:09 pm on Aug 10, 2005 (gmt 0)

10+ Year Member



Thanks Jim - seems like it would not be spoofed in most cases then.

Also, I got a response from Y!

> I understand that you have a question about how our crawler Slurp checks
> that a server returns a valid 404 error.
>
> Yes, this is most likely a request to check if your server returns valid
> 404s.

Signed by a real human and all!

I guess "most likely" is as good as a service desk person will be able to give.

abates

9:45 pm on Aug 11, 2005 (gmt 0)

10+ Year Member



So what happens if I ban bots from fetching /SlurpConfirm404 in my robots.txt? :)

fish_eye

11:36 pm on Aug 11, 2005 (gmt 0)

10+ Year Member



:) I guess it depends on what the author of this part of Slurp's system wrote. The only way to find out (inconclusively) would be to try it.
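
Short of trying it, one can at least see what a standards-following parser would make of such a rule. Here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the `allowed` helper are illustrative assumptions, and whether Slurp itself consults robots.txt before sending these probes remains unknown.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt banning the probe directory for every bot.
RULES = """\
User-agent: *
Disallow: /SlurpConfirm404
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str, agent: str = "Slurp") -> bool:
    """Ask the parser whether `agent` may fetch `path` on the example host."""
    return parser.can_fetch(agent, "http://www.mysite.com" + path)
```

With these rules, `allowed("/SlurpConfirm404/file/page.htm")` is `False` while `allowed("/page.htm")` is `True`, so a compliant bot would have to skip the probe entirely.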

eeek

9:10 pm on Aug 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> It is possible to spoof an IP address, but if a client spoofs its IP address, then it will never receive a response from the server

You can spoof an IP address and receive a reply if you have access to a router in the packets' path. While not as many people have this kind of access, one should never assume it can't happen.

followgreg

2:06 am on Aug 22, 2005 (gmt 0)

10+ Year Member




It's actually amazing how many errors like this show up in my log files: not only the SlurpConfirm404 requests but a whole bunch of public_html/ / and other weird requests coming from nowhere, while my site has all its internal linking structure perfectly in place.