Google Ban

Forum Moderators: DixonJones

visa666

7:26 pm on Feb 6, 2008 (gmt 0)

I have a site that has probably been banned from Google. I am trying to get it reindexed, but this is as far as it gets:

HEAD / HTTP/1.1" 200 254

Does anybody know what this means?

Thank you

robsoles

8:00 am on Feb 7, 2008 (gmt 0)

Hi visa666,

what do you mean by [this is how far it goes:

HEAD / HTTP/1.1" 200 254]?

To explain why I ask:

{
HEAD / HTTP/1.1
Host: www.example.com
User-Agent: Loony with Putty or similar telnet style program using HTTP specific settings.
Accept: text/xml,text/html,text/plain
Accept-Charset: ISO-8859-1,utf-8
Connection: close

}

is enough of a 'request' to get the header details from a server using that protocol; if you exchange 'HEAD' for 'GET', you are asking for the document proper. Incidentally, '200' is the server's response code saying that the request succeeded and it can deliver that page.
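To see the HEAD/GET difference for yourself without touching a live site, here is a minimal Python sketch. It assumes nothing about visa666's server: it starts a throwaway server on localhost, and the names `DemoHandler`, `fetch`, `demo` and `BODY` are made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<html><body>hello</body></html>"  # stand-in document


class DemoHandler(BaseHTTPRequestHandler):
    """Answers HEAD and GET on any path with the same headers."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()  # headers only, no document

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)  # headers plus the document proper

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(method, port):
    """Send one request and return (status, Content-Length header, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/", headers={"Connection": "close"})
    resp = conn.getresponse()
    body = resp.read()
    result = (resp.status, resp.getheader("Content-Length"), body)
    conn.close()
    return result


def demo():
    """Run HEAD then GET against a local server on a free port."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("HEAD", port), fetch("GET", port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for label, result in zip(("HEAD", "GET"), demo()):
        print(label, result)
```

Both requests come back with status 200 and the same Content-Length header, but only the GET carries the document in its body.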

When you type 'http://www.example.com/this_url/thanks.html' into your browser's address bar it 'looks up' the IP address of www.example.com, connects to the HTTP service at that IP address and sends a request starting with

{
GET /this_url/thanks.html HTTP/1.1
Host: www.example.com
User-Agent: <whatever you are using>
etc
etc

}
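As a rough illustration of that step, this Python sketch splits a URL into the host to look up and the path to ask for, then builds the request text. The function name `request_for` and the default User-Agent string are hypothetical, and a real browser sends many more headers.

```python
from urllib.parse import urlsplit


def request_for(url, user_agent="example-client/0.1"):
    """Return (host, port, raw request text) for a plain GET of `url`."""
    parts = urlsplit(url)
    host = parts.hostname  # the name whose IP address gets looked up
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"  # an empty path means the site root
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {parts.netloc}",
        f"User-Agent: {user_agent}",
        "Connection: close",
        "",  # a blank line ends the headers
        "",
    ]
    return host, port, "\r\n".join(lines)


if __name__ == "__main__":
    host, port, request = request_for("http://www.example.com/this_url/thanks.html")
    print(host, port)
    print(request)
```

Only the path goes on the request line; the host name travels separately in the Host header, which is why the server logs show `/` or `/this_url/thanks.html` rather than the full URL.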

So I really can't figure out where you are reading that string from, unless it's in your httpd logs. In that case, if you are saying that the log entries with Googlebot as the 'cs-user-agent' only ever 'HEAD' documents off your server, then yup: until Googlebot uses 'GET' on your documents you are almost certainly out of the SERPs.
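If the string does come from the access log, one way to check the "only ever HEAD" theory is to tally request methods for entries whose user agent mentions Googlebot. This is a sketch only: it assumes the Apache "combined" log format (an IIS log with a cs-user-agent column would need a different pattern), the `SAMPLE` lines are invented for the example, and a user-agent string can be faked, so it does not prove the visitor really was Google.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Invented sample entries, not taken from anyone's real log.
SAMPLE = [
    '192.0.2.10 - - [06/Feb/2008:19:20:01 +0000] "GET /robots.txt HTTP/1.1" 200 26 "-" "%s"' % GOOGLEBOT_UA,
    '192.0.2.10 - - [06/Feb/2008:19:20:02 +0000] "HEAD / HTTP/1.1" 200 254 "-" "%s"' % GOOGLEBOT_UA,
    '198.51.100.7 - - [06/Feb/2008:19:21:40 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows; U)"',
]


def googlebot_methods(lines):
    """Count request methods for entries whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            counts[match.group("method")] += 1
    return counts


if __name__ == "__main__":
    print(googlebot_methods(SAMPLE))
```

A tally with HEAD entries and no GETs beyond robots.txt would match what visa666 describes.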

Regards,
robsoles.

visa666

1:23 pm on Feb 7, 2008 (gmt 0)

Thanks robsoles

What I mean is: the Google robot won't go further than that.
All it asks for is robots.txt and the header.

What does 254 represent? (HEAD / HTTP/1.1" 200 254)

Thank you much

phranque

1:50 pm on Feb 7, 2008 (gmt 0)

welcome to WebmasterWorld [webmasterworld.com], visa666!

the 254 should be the content length in bytes of the response that would have been sent if it had been an HTTP GET request rather than an HTTP HEAD request.
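For reference, the fragment splits into fields like this. The sketch below is a small Python parser with a made-up name (`parse_fragment`); it only labels the pieces. Exactly what the last number counts (bytes actually sent, or the length a GET would have returned) depends on the server and its log format, so it is worth checking against the server's own log configuration.

```python
import re

# Matches the tail of an access-log entry such as: HEAD / HTTP/1.1" 200 254
FRAGMENT = re.compile(
    r'(?P<method>[A-Z]+) (?P<path>\S+) (?P<protocol>HTTP/[\d.]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_fragment(text):
    """Split a log fragment into method, path, protocol, status and size."""
    match = FRAGMENT.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # Some servers log "-" when no byte count is recorded.
    fields["size"] = None if fields["size"] == "-" else int(fields["size"])
    return fields


if __name__ == "__main__":
    print(parse_fragment('HEAD / HTTP/1.1" 200 254'))
```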

visa666

7:08 pm on Feb 7, 2008 (gmt 0)

Thank you phranque

robsoles

3:14 am on Feb 8, 2008 (gmt 0)

Hey visa666,

if your robots.txt doesn't read as below then could you please post it?

{
User-Agent: *
Disallow:

}

just worried you've banned yourself with a 'poor' directive in your robots.txt file is all.

Regards,
robsoles.

visa666

12:26 pm on Feb 8, 2008 (gmt 0)

I took it out a few times, modified it a few times....
It didn't work!

User-agent: *
Disallow: /cgi-bin/
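A quick way to sanity-check rules like these is Python's standard `urllib.robotparser`. The sketch below is only an approximation: Googlebot's own interpretation of robots.txt may differ in details, and the helper name `allowed` and the example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The rules posted in this thread.
POSTED_RULES = """\
User-agent: *
Disallow: /cgi-bin/
"""

# An empty Disallow allows everything.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# The kind of directive that would lock every crawler out.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    for name, rules in (("posted", POSTED_RULES), ("allow all", ALLOW_ALL), ("block all", BLOCK_ALL)):
        print(name, allowed(rules, "Googlebot", "http://www.example.com/"))
```

Under this parser the posted rules block only paths beginning with /cgi-bin/, so the home page stays fetchable.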

robsoles

12:55 pm on Feb 8, 2008 (gmt 0)

Fair enough, it's a fine and fair robots.txt file, so it can only be that Google has banned the site. I've not heard of Googlebot only HEADing documents off servers, but then none of my mates have been blacklisted (or have mentioned it to me, anyway).

If I were in your situation, I would find the appropriate contact form buried away somewhere at Google and ask them what I could do.

Regards,
robsoles.

visa666

5:00 pm on Feb 8, 2008 (gmt 0)

Since March 2007 I have bugged them almost every month with reinclusion requests or asking for a clue about the ban. Still in the dark.

Thank you for your suggestions robsoles

robsoles

12:30 am on Feb 9, 2008 (gmt 0)

Hey visa666,

now I'm fascinated - I haven't read the rules of this place well enough to be positive but I think you must be allowed to sticky-mail me the domain name so I can throw a few tools at it to see if I can reveal why Googlebot hates it.

I'll tell you if I see anything I know breaks Google-TOS or anything else that makes it look bad.

Regards,
robsoles.

smallcompany

4:18 am on Feb 9, 2008 (gmt 0)

robsoles:

Are those tools something that could be of use to anyone, regardless of a site's index status?

Thanks

robsoles

5:05 am on Feb 9, 2008 (gmt 0)

Hi smallcompany,

Yes, tools like these can be found, at some level of implementation, all over the internet. Many of them give only a certain level of detail to draw you in to buy the product behind them, but there are a few gems out there, like w3, that simply host tools of great worth without seeming to ask more than that you use them to make a decent job of your site.

I'm writing a crawler of my own because it seems that, to get everything I feel one program should be able to detail for you, you currently have to buy two programs on the market and use a few of the free hosted tools as well. It is terribly useful already, but I am not ready to release it; it needs to do plenty more in my opinion.

Among other great search terms for finding free hosted tools, this is probably the best one:

seo tools

Regards,
robsoles.

visa666

4:48 pm on Feb 9, 2008 (gmt 0)

Hi rabsoles

Guess What?

Googlebot fetched one page yesterday and one more today! This is the first time in months that it has gone past robots.txt.
Probably they just heard I was talking to you.
I'll pm you the domain.

Thanks again

robsoles

1:25 am on Feb 10, 2008 (gmt 0)

visa666,

I threw a few tools at the domain. Yahoo has info about your site and Google does not, which is almost invariably a bad sign. The 'GET's that Googlebot is now performing on pages on your site may indicate the end of the problem of not even being in Google's index, but Googlebot may just be confirming whatever it disliked enough to switch to only 'HEAD'ing documents off your site in the first place.

in Google:
site:<insert-your-domain>

in Yahoo:
linkdomain:<insert-your-domain> -site:<insert-your-domain>

A general domain checking service detailed a few areas of concern I'd fix if I was running your show. One or two statistical sites couldn't cope with requests about your domain.

I found an email address in the contact page of the site and will send my crawler's output with some explanation in the email and links to whatever else I find of interest about your site to that address.

Regards,
robsoles.

robsoles

1:53 am on Feb 10, 2008 (gmt 0)

Actually visa666, it has occurred to me that maybe that's not your email address on that contact page, and it could even be rude of me to send my email there. I'll reply to your SM with a request for your preferred email address.

robsoles.

visa666

2:44 am on Feb 10, 2008 (gmt 0)

Sorry, didn't mean to... Typing mistake.

You can use that e-mail, but I'll PM you a different one.

Thank you again rObsoles

robsoles

3:01 am on Feb 10, 2008 (gmt 0)

Hey visa666,

Don't stress, although...

The details my crawler pulled out of the host worry me a bit, but while I was composing the email I am sending, a simple enough Google search occurred to me:

blacklist "<insert-domain-here>"

Typed into the Google search box, it brings the site up in a blacklist context. It's a clickable link in the email.

You are welcome, and Good Luck!
robsoles.
