Forum Moderators: Robert Charlton & goodroi


Was googlebot experiencing gzip problems on my site?

         

Karma

3:45 pm on Sep 23, 2009 (gmt 0)

10+ Year Member



As some of you may have seen, I have had a problem with my site for the past few months, which I thought was a penalty. The strange part was that some of my pages remained in the #1 position whereas others seemed to disappear from the index altogether.

I happened to mention this problem to someone at my work, and I gave him two URLs from my site: one that remained at #1 and one that had disappeared. Opening the pages in Firefox, he immediately saw around 800 characters of pure gibberish. This wasn't a rendering issue, as it was there in the HTML source too.

This 'gibberish' only showed in his browser; I've checked these pages multiple times in different browsers and operating systems (including the same OS and Firefox version as my colleague).

The pages that disappeared from the index did seem to match those pages with the 'gibberish' problem.

Between us we found that disabling the gzip compression stopped the problem. Disabling his firewall (Outpost) also stopped it.
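For anyone following along, there are two common ways compression and a filtering firewall can interact badly: if the firewall alters or truncates the compressed body, the client can't decompress it, and if it strips the Content-Encoding header, the browser renders the raw gzip bytes as text, i.e. gibberish. A minimal sketch of both failure modes (the page content is a made-up placeholder, not anything from this site):

```python
import gzip

page = b"<html><body>" + b"Hello, world! " * 50 + b"</body></html>"
compressed = gzip.compress(page)

# An intact gzip stream round-trips cleanly.
assert gzip.decompress(compressed) == page

# A firewall that truncates the compressed body leaves the client with
# a stream that ends before the gzip end-of-stream marker is reached.
try:
    gzip.decompress(compressed[:-5])
except EOFError:
    print("truncated stream: decompression failed")

# If the Content-Encoding header is stripped instead, the browser shows
# the raw bytes; a gzip body always starts with the magic bytes
# 0x1f 0x8b, and the rest reads as gibberish when rendered as text.
print(compressed[:2].hex())  # → "1f8b"
```

Either mechanism would fit the symptoms described: the HTML source full of gibberish, and the problem vanishing when either gzip or the firewall was turned off.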

Anyone have any idea what the hell was going on here?

tedster

12:19 pm on Sep 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This has the sound of a technical mystery. But if disabling the firewall also fixed the issue, then I'd lean toward the firewall as being the key part of the problem. This is especially so if the Google cached page is clean.

I'd still consider experimenting with gzip to figure it out. The one factor I can't quite wrap my head around is why gzip would mess up only some of the time.

Karma

3:07 pm on Sep 24, 2009 (gmt 0)

10+ Year Member



Technical mystery indeed. I've been playing around with the gzip code I was using and found that if the first character was blank, the page loaded fine. Anyway, I'm now using "php_value zlib.output_compression 16386" in my .htaccess, which is the preferred method and works fine.
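Stray output ahead of the compressed stream is a classic way hand-rolled gzip code goes wrong. What exactly happens depends on the setup (a leading byte can also prevent compression from engaging at all, which may be why the blank first character made the page load fine), but if a byte slips in front of an already-compressed body, clients can no longer decompress it, because the body no longer begins with the gzip magic bytes. A rough Python sketch of that corrupted-stream case, not the actual PHP code involved here:

```python
import gzip

page = b"<html><body>It works</body></html>"
compressed = gzip.compress(page)

# Any byte emitted before the compressed stream (say, a stray space
# echoed before the compression handler starts) means the body no
# longer begins with the gzip magic bytes 0x1f 0x8b.
corrupted = b" " + compressed

assert gzip.decompress(compressed) == page
try:
    gzip.decompress(corrupted)
except OSError as exc:  # gzip.BadGzipFile on Python 3.8+
    print("client cannot decompress:", exc)
```

Handing compression to zlib.output_compression, as above, sidesteps this class of bug because PHP starts buffering before any script output is sent.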

I'm still not sure if this has fixed the problem; I guess time will tell. My concern was that the Google crawler was using something similar in its firewall.

Fingers crossed my site will pick back up now.

[edited by: tedster at 3:49 pm (utc) on Sep. 24, 2009]