Google WMT - Soft 404 reported for flash (swf) ?

         

omoutop

12:36 pm on Dec 12, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I noticed in Webmaster Tools that several Flash videos on our site are reported as "soft 404".

These SWF files exist and appear inside their pages.
for example:

Using the Live HTTP Headers plugin for Firefox, I see that:
the URL that loads the SWF is found with HTTP code 200
the SWF file is found with HTTP code 200
Everything seems OK: no HTTP 4xx headers anywhere, no errors in the console, no missing images, nothing I can spot.

But I cannot figure out why the SWF is declared as "soft 404" in Webmaster Tools. Any ideas?

netmeg

1:33 pm on Dec 12, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A soft 404 won't return a 4XX header.

[support.google.com...]

omoutop

2:21 pm on Dec 12, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The question is why Google WMT reports existing SWF files as "soft 404".

anim8tr

3:07 pm on Dec 12, 2014 (gmt 0)

10+ Year Member



Haven't seen this before, but was wondering if there might be links inside the SWF that are producing the Soft 404s?

phranque

8:15 pm on Dec 12, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



have you tried "fetch as googlebot" in GWT?

lucy24

9:13 pm on Dec 13, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



url that load the swf, is found with http code 200
swf file is found with http code 200
everything seems ok - no http 4xx headers anywhere, no errors in console, no missing images, nothing i can spot

Are you doing these tests with the specific URLs that GWT reports as "soft 404"? I'm assuming by now you have read up on what "soft 404" means, so you know you should be looking for 301 responses rather than 404. Now, I've never personally been guilty of a "soft 404" (she said, smugly), but I have to assume they don't start throwing around the accusation unless a whole lot of URLs start redirecting to the same page. Or just redirecting, period: read on.

Check this: In your logs, there will be periodic requests from the googlebot for URLs like
/vckjl45u89vbugoirt.html

-- that is, some garbage URL that cannot possibly exist. It seems to be programmatically triggered any time a site throws some-number-or-proportion of 301s. (I see it periodically, especially this past year as I've been rearranging subdirectories.) Do these requests get a 404 response?
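A rough way to look for these in the logs is sketched below. It assumes the server writes the common "combined" log format, and `known_paths` is a hypothetical set of paths you know are real; both are assumptions, not something stated in this thread. It only reports Googlebot requests for unknown paths that got something other than 404/410.

```python
import re

# Assumption: Apache/Nginx "combined" log format. Adjust the pattern if your
# server logs a different layout.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_non_404(lines, known_paths):
    """Yield (path, status) for Googlebot requests to paths that are not in
    known_paths and that were answered with something other than 404 or 410."""
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue  # unparseable line, or not a Googlebot user agent
        path = match.group("path")
        status = int(match.group("status"))
        if path not in known_paths and status not in (404, 410):
            yield path, status
```

Anything this yields is a request for a nonexistent URL that the server did not answer with a plain "not found", which is the situation described above. Note that the user-agent string can be faked, so this is only a first pass.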

I don't know whether the Googlebot also tries it with other extensions that a specific site uses. So check both ways, looking in logs and also by personal experiment, for something like
/cvkjte4iubfoierng.swf

and make sure it's getting a 404 response.
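The "personal experiment" part could look something like the sketch below. It builds a random garbage path with a given extension and requests it without following redirects (`http.client` never follows them), so a 301 shows up as a 301 rather than as the final 200. The function names and the verdict labels are made up for illustration.

```python
import http.client
import random
import string
from urllib.parse import urlsplit


def garbage_path(extension, length=18, rng=random):
    """Build a path such as /cvkjte4iubfoierng.swf that should not exist."""
    alphabet = string.ascii_lowercase + string.digits
    name = "".join(rng.choice(alphabet) for _ in range(length))
    return "/" + name + "." + extension.lstrip(".")


def classify(status):
    """Interpret the status returned for a URL that is known not to exist."""
    if status in (404, 410):
        return "ok"        # the server says "not found", as it should
    if 300 <= status < 400:
        return "redirect"  # bad URLs are being redirected somewhere
    if 200 <= status < 300:
        return "found"     # bad URLs are being served as if they existed
    return "other"


def probe(base_url, extension, timeout=10):
    """Request one garbage URL; return (path, status, verdict)."""
    parts = urlsplit(base_url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = garbage_path(extension)
    try:
        conn.request("HEAD", path)
        status = conn.getresponse().status
    finally:
        conn.close()
    return path, status, classify(status)
```

Calling `probe("http://www.example.com", "swf")` and `probe("http://www.example.com", "html")` against your own host (example.com is a placeholder) should ideally give an "ok" verdict both times. Some servers answer HEAD differently from GET, so it may be worth repeating the check with GET.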

TheMadScientist

9:15 pm on Dec 13, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



but i cannot figure out why the swf is declared as "soft 404" in webmaster tools. Any ideas?

Do you have a number of URLs redirecting to the .swf page(s)?

incrediBILL

9:37 pm on Dec 13, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Redirects cause soft 404s, whether it's an apache rewrite internally doing it or a 301, 302 or meta redirect.

Basically a soft 404 is a page being displayed where other content should be, typically redirected pages, and it's possible even the way you're loading the flash is triggering the soft 404 issue.

If the page is dynamic with varying parameters on the URL, each set of parameters is considered a different page so something like that could also make it seem like the flash was a soft 404.
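To get a feel for how many parameter variants of the same page are floating around, something like this sketch can group URLs that differ only in parameter order or in parameters that don't change the content. The names in `ignored_params` are hypothetical examples; which parameters are safe to ignore depends entirely on the site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize(url, ignored_params=("sessionid", "ref")):
    """Lower-case the host, drop ignored parameters, sort the rest, and
    remove any fragment."""
    parts = urlsplit(url)
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored_params
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(query), "")
    )


def group_variants(urls, ignored_params=("sessionid", "ref")):
    """Return {normalized_url: [original urls]} for groups with 2+ variants."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url, ignored_params)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}
```

Each group it returns is a set of URLs that a crawler would treat as separate pages even though they presumably show the same thing.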

If it's not a bunch of redirects causing this, try adding a canonical link element (rel="canonical") to the page and see if that helps.

phranque

10:02 pm on Dec 13, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



a soft 404 is when the server responds with a non-404/410 status code and google decides the requested resource doesn't actually exist.

i've seen many soft 404s reported in GWT that did not involve an internal rewrite or an external redirect.

if you decide to specify a canonical url for this (.swf) resource, you will have to use the link rel canonical HTTP header.
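For reference, the header in question has the form `Link: <url>; rel="canonical"`. A minimal sketch of building it is below; how the header actually gets attached to the .swf response depends on your server or framework and is not shown, and the URL in the usage note is a placeholder.

```python
def canonical_link_header(canonical_url):
    """Return the (name, value) pair for an HTTP Link header that declares
    canonical_url as the canonical URL of the resource being served."""
    # Characters that would break the header syntax must be percent-encoded
    # in the URL before it gets here.
    if any(ch in canonical_url for ch in '<>"\r\n '):
        raise ValueError("URL must be percent-encoded before use in a header")
    return ("Link", '<%s>; rel="canonical"' % canonical_url)
```

For example, `canonical_link_header("http://www.example.com/media/tour.swf")` gives the pair `("Link", '<http://www.example.com/media/tour.swf>; rel="canonical"')`, which can then be added to the response for the SWF file.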

lucy24

11:17 pm on Dec 13, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



whether it's an apache rewrite internally doing it

Seems like an internal rewrite, without URL change, would instead lead to a charge of Duplicate Content.

omoutop

6:57 am on Dec 15, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sorry for the late reply.
No, there are no links inside the SWF; it's a plain 360 animation.

Yes, there are .htaccess redirects (301) from old URL patterns to new ones... but the new URLs also contain new paths to media files (jpg/swf) that did not exist in earlier versions.

This is indeed a dynamic page creating the content. It creates about 300 URLs (paths like /folder1/folder2/folder3/title-of-page.htm), and I get the soft 404 report for 50 SWF files (out of the 300).

As I said earlier, I cannot detect any errors in the headers of the page, either for the page/URL as a whole or for each item inside the page (media files, CSS, JS, etc.): no 301, 302 or 404. Everything that these pages load gives me an HTTP response of 200.

lucy24

8:13 am on Dec 15, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Everything that these pages load gives me an http response of 200

Oh, whoops, that's the other "soft 404" variant. Does everything in .swf lead to a 200 response? Or only the 300 URLs that correspond to actual content?

The "soft 404" expression can be interpreted as "We tried hard to elicit a 404 response, but didn't succeed".

omoutop

9:41 am on Dec 15, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Only for these 300 URLs and their included files.
SWF, images, CSS, JS: each returns a 200 response separately.
The URL/page returns a 200 response.
The pages/URLs that lead to these 300 (parent pages) also return 200.

I decided to leave it there for a couple of days and see how this ends.
Just in case, I added a canonical meta (although these 300 URLs are unique content).

lucy24

10:14 am on Dec 15, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I meant it the other way around: Make sure that bad URLs -- like "cjkbrtjoir.swf" -- return a 404 response.

omoutop

10:26 am on Dec 15, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I see. OK, I will try this approach as well.
Thanks for the tip.