.htaccess blocking Google? (newbie)

         

ang099

1:49 pm on Sep 19, 2007 (gmt 0)

10+ Year Member



We had our website redesigned (in Joomla) and immediately started
having problems with Google. In a drastic move, we completely deleted
our site from the Google index (via Webmaster Tools) and resubmitted it.
We are still not indexed, and Webmaster Tools shows ALL
of our pages with 404 (not found) errors.

link:www.example.com shows nothing, although we have over a thousand
inbound links.
info:www.example.com also shows nothing

The only things that come up are articles about us, not our site.

I had previously found a problem with the server response codes
returning a 404 [using an online headers checker]. I fixed the
problem (settings in Joomla) and now we are getting a 200. (This was
found via Google Webmaster tools) But... In running different tests it
looks like I'm missing something else, because our site isn't being
found.

I've run through all the tools [on the online tool site], and it seems fine!

Now, I set up Analytics and added the code - but when I check the
status it tells me:

Tracking Unknown (Last checked: 2007-09-19 6:04 AM PST.)
The Google Analytics tracking code has not been detected on your
website's home page. For Analytics to function, you or your web
administrator must add the code to each page of your website.

It's like Google isn't seeing us... Help!

Part of .htaccess:

##optional - see notes##
RewriteCond %{REQUEST_URI} ^(/components/option,com) [NC,OR]
RewriteCond %{REQUEST_URI} (/|/*.*\.(htm|php|html))$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php

Redirect 301 /index.html http://www.example.com/index.php
Redirect 301 /p_download_reg.htm http://www.example.com/registration.html

CheckSpelling On
<Files 403.shtml>
order allow,deny
allow from all
</Files>

Any ideas?

[edited by: jdMorgan at 4:06 pm (utc) on Sep. 19, 2007]
[edit reason] example.com [/edit]

jimbeetle

2:42 pm on Sep 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Not sure if this is it, but your robots.txt file is a bit funky. Try running it through a validator.

ang099

2:49 pm on Sep 19, 2007 (gmt 0)

10+ Year Member



Funky how? I ran it through the Google Webmaster tools and it checked out OK - I'll try others...

jimbeetle

2:58 pm on Sep 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Insert a blank line between the Sitemap and User-agent directives, then delete the blank line between the User-agent line and the Disallow lines that follow it. Be sure to keep the blank line at the end of the record.

Though Googlebot should be sophisticated enough not to get confused by this, it's one less possible problem to look at.

jdMorgan

4:01 pm on Sep 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



robots.txt: There must be no blank line after "User-agent: *" and there must be a blank line after the last "Disallow:" line.

However, Google is very forgiving of errors, and I doubt that's the problem.

Jim

ang099

4:04 pm on Sep 19, 2007 (gmt 0)

10+ Year Member



I just changed the robots.txt, thanks for the input on that.

I did just find something: we also own example.biz, and THAT seems to be indexed, but with a frame! We are registered with GoDaddy, not this [other registrar].

The first thing I thought of was the Google Proxy Hack, but since we control that domain that doesn't make sense.

[edited by: jdMorgan at 4:07 pm (utc) on Sep. 19, 2007]
[edit reason] No URLs, please. See terms of service. [/edit]

lOptimiseur

10:21 pm on Oct 2, 2007 (gmt 0)

10+ Year Member



Did you go back into Google Webmaster Tools and "reinclude" the URLs or directories that you had previously blocked?

g1smd

10:48 pm on Oct 2, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> In a drastic move we completely deleted our site from the Google index, (via webmaster tools) <<

Unless things have recently changed, using that method completely removes the site from the index for six months.

lOptimiseur

4:16 pm on Oct 5, 2007 (gmt 0)

10+ Year Member



This is from the Google Webmaster Help Center:

"Content removed with this tool will be excluded from the Google index for a minimum of 90 days, regardless of whether the content becomes available to our crawler during that time."

g1smd

7:25 pm on Oct 5, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK. 90 days. It always used to be 180 days. Whatever, it is gone from the index for months.

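Additionally, I would not mix RewriteRule and Redirect directives in one .htaccess file. They are handled by different modules (mod_rewrite and mod_alias), so the order in which they run is not determined by their order in the file. I would use only one type, to guarantee the order in which they are processed.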
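
As a rough sketch of the "one type only" approach, the two Redirect lines from the posted .htaccess could be expressed as RewriteRule directives and placed ahead of the internal rewrite to index.php. This assumes mod_rewrite is enabled and that the file lives in the site root. The RewriteEngine line is my assumption, since only part of the file was posted. Note that Redirect matches by URL-path prefix, while the anchored patterns below match those two paths exactly, so treat this as an illustration rather than a drop-in replacement:

```apache
# Assumption: not shown in the posted excerpt, but required for the rules to run.
RewriteEngine On

# External 301 redirects first, handled by mod_rewrite instead of mod_alias.
# In .htaccess the pattern is matched without the leading slash.
RewriteRule ^index\.html$ http://www.example.com/index.php [R=301,L]
RewriteRule ^p_download_reg\.htm$ http://www.example.com/registration.html [R=301,L]

# Internal rewrite to the Joomla front controller last (conditions as posted).
RewriteCond %{REQUEST_URI} ^(/components/option,com) [NC,OR]
RewriteCond %{REQUEST_URI} (/|/*.*\.(htm|php|html))$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php
```

With everything in mod_rewrite, the rules are evaluated top to bottom in the order written, so the external redirects are always tested before the catch-all rewrite.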