Forum Moderators: Robert Charlton & goodroi


WMT shows many more 404s for hotlinked images. Why are they 404?

         

dickbaker

11:17 pm on Feb 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Webmaster Tools has always shown a handful of 404 errors for my site: pages that Google still has in the index that no longer exist, or every so often a link from another site that wasn't written properly.

In the last week or two the number of 404s has increased to a couple of hundred. They're coming from hotlinks to images on my site. What's interesting is that, although they all come from different URLs, the structure of the link to my site is the same:

http://www.mysite.com/ABrandName/Model_XYZ.jpg" width="76" height="50" alt="image"/></a> </div> <div class="c0 r"><a href="/m/imgres?q=word_A+word_B

q=word_A+word_B is a query that varies with each link, just as the images on my site vary.
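To make the structure of that link easier to see, here is a rough Python sketch that splits such a reported URL into the real image address and the stray markup that got glued onto it. The function name `split_hotlink` and the list of image extensions are assumptions made for illustration; nothing here comes from Webmaster Tools itself.

```python
import re
from urllib.parse import unquote

# Assumption: images end in one of these extensions, and the junk starts
# with a quote, whitespace, or an angle bracket right after the extension.
BROKEN_LINK = re.compile(
    r'([^"\s<>]+?\.(?:jpe?g|png|gif))(["\s<>].*)',
    re.IGNORECASE | re.DOTALL,
)

def split_hotlink(url):
    """Return (image_url, leftover_markup) for a malformed hotlink.

    Returns None when nothing follows the image extension, i.e. the URL
    does not show the broken pattern.
    """
    # Reported URLs may be percent-encoded (%22 for a quote), so decode first.
    decoded = unquote(url)
    match = BROKEN_LINK.match(decoded)
    if match is None:
        return None
    return match.group(1), match.group(2)

sample = (
    'http://www.mysite.com/ABrandName/Model_XYZ.jpg" width="76" height="50" '
    'alt="image"/></a> </div> <div class="c0 r"><a href="/m/imgres?q=word_A+word_B'
)
parts = split_hotlink(sample)
print(parts[0])  # the image address on its own
print(parts[1])  # the HTML fragment that was appended to it
```

Running this over a list of reported URLs would show how many of the 404s share the same broken shape.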

So there's obviously one person or company using the same type of hotlinking script on a couple of hundred different sites to link to images on my site and tie those into another site. But they don't work. Why? I have no idea.

I can't imagine that these would hurt my site, but they can't help, either.

Any ideas as to what might be going on?

[edited by: tedster at 11:24 pm (utc) on Feb 8, 2011]
[edit reason] un-hide the example URL [/edit]

chalkywhite

8:31 pm on Mar 30, 2011 (gmt 0)

10+ Year Member



Hi Pudders, I've started seeing these 404s again since yesterday; however, they are dated early February and March 11th and 12th. It's really strange, as the image is IT related (basically a picture of how to do something in Exchange 2010), but when I go to the link it's a family's photo album site (Italian). There seems to be a footer which is RIDDLED with hundreds of links to other IT-related sites.

chalkywhite

8:33 pm on Mar 30, 2011 (gmt 0)

10+ Year Member



Sorry to add this in another post. On scanning other sites linking to my images, they all seem to be partially constructed sites with pages and pages of Lorem ipsum dolor sit amet.

innocbystr

8:58 pm on Mar 30, 2011 (gmt 0)

10+ Year Member



This is really getting strange. I've got the hotlinked images errors again plus two new errors on links/pages that don't exist on my site:

mysite.com/cgi-bin/login?login
mysite.com//l/sayit?ret=

Guess I could 301 them to the home page....

crobb305

9:06 pm on Mar 30, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I had a big jump in 404s also just since yesterday. Mostly I am seeing a sudden reporting of 404s discovered only once and last requested over a month ago. Furthermore, some of my pages being reported as 404 utilize underscores between each word in the filename, and are correctly linked on the discovery pages, but Googlebot is replacing the underscore with <hr> or hr (e.g., /file<hr>name.htm rather than /file_name.htm)

I am not sure why Googlebot is replacing the underscore that way to return 404, because that isn't how the pages are linked.
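For anyone who wants to map those mangled requests back to the real filenames, a minimal Python sketch could look like the following. It assumes the only damage is a literal `<hr>` tag (raw or percent-encoded) standing in for an underscore; the bare `hr` variant is left alone because it cannot be told apart from ordinary letters in a filename. The name `restore_underscores` is made up for this example.

```python
import re
from urllib.parse import unquote

# Matches <hr>, <HR>, <hr/> and <hr /> once the path has been decoded.
HR_TAG = re.compile(r"<hr\s*/?>", re.IGNORECASE)

def restore_underscores(path):
    """Replace each <hr> tag in a requested path with an underscore."""
    # Decode %3Chr%3E into <hr> so both forms are handled the same way.
    decoded = unquote(path)
    return HR_TAG.sub("_", decoded)

print(restore_underscores("/file<hr>name.htm"))
print(restore_underscores("/file%3Chr%3Ename.htm"))
```

Paths that contain no `<hr>` tag come back unchanged, so the function is safe to run over a whole error list.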

indyank

9:53 am on Mar 31, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So people are all now seeing the 404 errors that I had been complaining about, which have been hitting one site ever since Panda arrived.

dickbaker, I see a lot of those errors too. But I have disallowed any URLs with "?" in robots.txt, so I see them in the robots.txt blocked list.

But there are so many strange 404s that I have been seeing over the past month. There was one to a URL that never existed, and the pages reported as linking to it were, strangely, pages on the same site that carry the "noindex, follow" meta tag.

indyank

2:13 pm on Apr 1, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The GWT report is getting more and more clumsy and non-actionable. There are so many errors reported these days, and some remain for a long time even after we correct them. This makes it difficult to find the new errors. There isn't even a sort-by-date option.

If they could at least give us a way to sort them by date and to label them, it would be better. We could then label them as actioned and so on.

Or they could provide an option to export all this stuff, just like they do in Analytics.

GWT looks like a very primitive interface. Google, you had better improve the quality of the stuff that you provide.

rlange

3:29 pm on Apr 1, 2011 (gmt 0)

10+ Year Member



I'm seeing a handful of these same 404s. Some from February, some from March. I've checked out 5 of the websites. One doesn't exist anymore, another loaded but looks rather shady, the other three are reported attack sites.

My guess is that these are a result of some malicious folks scraping images from Google's image search in order to lure in victims. The broken nature might be the result of these folks using an outdated script that was written to scrape images from the old image search results (before Google introduced their "fancier" interface).

MelissaLB

3:01 pm on Apr 2, 2011 (gmt 0)

10+ Year Member



Hi there, just thought I'd add that you're not alone; we saw these errors as well when this thread was first started. They vanished and have now returned, all showing the old/original dates when they were first discovered.

rlange, we have checked out some of the sites as well, and about 70% of them don't exist anymore and the others are all similar looking and shady. They are all 'how to' lists filled with links and one section of about 6 or 8 images. The interesting part is that if you visit the domain, there is no way to navigate back to the particular page that is hotlinking. Hmmm. It's peculiar and I'm not exactly sure what to make of it.

crobb305

5:19 pm on Apr 2, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yeah, my number of 404s has jumped since Panda. I had about 5 showing up in WMT the week of Panda, now about 30. That may seem like a small number, but my site only has 110 pages, so that's a lot of 404s to suddenly show up. They are mostly clumsy requests from Gbot. Like I said, it is requesting filenames that have underscores, but replacing the underscores with <hr>. The linking sources are not linking to me that way.

mslina2002

2:46 pm on Apr 3, 2011 (gmt 0)

10+ Year Member



So from what I am reading, various people have different approaches to handling these errors. Since Friday, over 74 of these 404s have popped up in my WMT, and I am afraid they are hurting my rankings.

Opinions on this seem a bit varied:

(1) Ignore - let WMT stay a mess
(2) Redirect to some image
(3) Redirect to some url - capture backlinks (at own risk)
(4) Redirect to proper image
(5) Block image directory from Google image bot - at cost of not getting traffic via Google images

I am trying to figure which way would be least harmful. My immediate inclination would be to get rid of those damn errors right away so I am thinking (3).
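For option (5), a rough way to sanity-check a robots.txt rule before deploying it is Python's standard `urllib.robotparser`. The sketch below assumes the images live under a hypothetical `/images/` directory; note that it uses Python's own parser, which may not match Google's rule handling in every detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block only Google's image crawler from /images/.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /images/
"""

def build_parser(text):
    """Parse robots.txt text held in memory, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)
image_url = "http://www.mysite.com/images/Model_XYZ.jpg"

# The image crawler should be disallowed by the rule above.
print(parser.can_fetch("Googlebot-Image", image_url))
# No rule names the regular crawler, so this parser treats it as allowed.
print(parser.can_fetch("Googlebot", image_url))
```

The trade-off stays the same as in the list: the errors stop being reported, but so does any traffic from Google Images.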

pageoneresults

2:54 pm on Apr 3, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Redirect to proper image.


In the cases we've had, that is exactly what we've done. What we found is that Googlebot is attempting to interpret JavaScript generated links and in the process is breaking things.
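As an illustration of the "redirect to proper image" idea, here is a minimal Python WSGI sketch. It is not the setup described above, just one possible way to do it: it assumes the broken requests look like an image path followed by a quote, whitespace, or angle bracket, and that this handler is only reached for requests that would otherwise return a 404. The names `redirect_target` and `app` are placeholders.

```python
import re
from urllib.parse import unquote

# Assumption: a valid image path ends in one of these extensions, and a
# broken request has a quote, whitespace, or angle bracket right after it.
IMAGE_PREFIX = re.compile(
    r'([^"\s<>]+?\.(?:jpe?g|png|gif))["\s<>]',
    re.IGNORECASE,
)

def redirect_target(path):
    """Return the clean image path for a malformed request, else None."""
    match = IMAGE_PREFIX.match(unquote(path))
    return match.group(1) if match else None

def app(environ, start_response):
    """WSGI fallback handler: 301 to the real image, or a plain 404."""
    target = redirect_target(environ.get("PATH_INFO", ""))
    if target:
        # Permanent redirect from the malformed URL to the real image.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not Found"]
```

To try it locally, the app can be served with the standard library, for example `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`. On a production server the same mapping would more likely be done with a rewrite rule in the web server's own configuration.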

innocbystr

4:29 pm on Apr 3, 2011 (gmt 0)

10+ Year Member



Googlebot is attempting to interpret JavaScript generated links and in the process is breaking things.


I think you really hit the nail on the head there, pageoneresults.

Obviously this is something experimental and I wouldn't think that Google would actually drop a site in the rankings because of 404 errors from another site. But it would sure be nice to hear that from them.

innocbystr

4:51 pm on Apr 3, 2011 (gmt 0)

10+ Year Member



There is a thread on this in the Google Webmaster Central help forum, "Huge new eruption of 404 errors deriving from irrelevant external sites", but as of yet no response from any of the Googlers that monitor the forum.

tedster

5:26 pm on Apr 3, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There's a Google Help page (dated March 20, 2011 on the SERP) that seems to be a response to this situation:

Unexpected 404 errors

In Crawl Errors, you may occasionally see 404 errors for URLs you don't believe exist on your own site or on the web. These unexpected URLs may be generated by Googlebot trying to follow links found in JavaScript, Flash files, or other embedded content...[examples follow here]

...Google strives to detect these types of issues and resolve them so that they will disappear from Crawl Errors. In general, 404 errors won't impact your site's search performance, and you can safely ignore them if you're certain that the URLs should not exist on your site.

[google.com...]