2003-01-04 04:38:53 216.239.46.172 - 216.119.105.20 GET /robots.txt - 200 299 395 16 HTTP/1.0 www.XXXXXX.com Googlebot/2.1+(+http://www.googlebot.com/bot.html) - -
2003-01-04 04:38:53 216.239.46.172 - 216.119.105.20 GET /index.cfm - 200 15164 385 94 HTTP/1.0 www.XXXXXX.com Googlebot/2.1+(+http://www.googlebot.com/bot.html) - -
I dunno, but if I were you, I'd convert those page names to HTML instead of CFM and make sure there isn't so much whitespace at the top of the source code. Those extra blank lines might be killing you off, too.
Your ROBOTS meta tag might be messing things up. I didn't go to my handy-dandy guide, but I don't think "Robots" Content="all" is valid. You'll have to check that one.
You've also got some DIV "layers" in there. Not sure how the bots like those. Someone who uses them may be able to shed some light on that one for ya.
Hope that helps and at least gets you rolling.
G.
Like page.cfm?id=1
Example of indexed .cfm URLs in Fast [alltheweb.com]
Also, I thought it was interesting how many .gov sites had .cfm pages [alltheweb.com]
The Robots.txt file for webmasterworld [webmasterworld.com]
Threads about Robots.txt
Robots.txt [webmasterworld.com]
How important is the Robots.txt file now? [webmasterworld.com]
Robots.txt Tutorial [searchengineworld.com]
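If you want to rule out robots.txt as the thing keeping Googlebot off your inner pages, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content and the example.com URLs below are made up for illustration (they are not the file from this thread), and the helper name allowed() is my own.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Hypothetical robots.txt, NOT the poster's actual file.
EXAMPLE = """\
User-agent: *
Disallow: /cgi-bin/
"""

print(allowed(EXAMPLE, "Googlebot", "http://www.example.com/index.cfm"))
print(allowed(EXAMPLE, "Googlebot", "http://www.example.com/cgi-bin/search.cfm"))
```

Paste in your own robots.txt text and a few of your deep URLs. If they all come back allowed, the file is probably not what is stopping the crawl.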
Thanks in advance for your help!
Please don't make me look up the right values for you. I'm WAY too lazy for that. :)
G.
No. Google crawls the ODP directly. I've seen actual evidence of that from the backlinks. In fact, look at link:www.webmasterworld.com:
dmoz.org/Computers/Internet/Web_Design_and_Development/Authoring/Webmaster_Resources/Chats_and_Forums/
Is the first listed. If Google doesn't spider a site, it won't show on the backlinks.
HTTP/1.1 400 Bad Request
Server: Microsoft-IIS/5.0
Date: Sun, 05 Jan 2003 03:50:23 GMT
Content-Type: text/html
Content-Length: 87
(This server violates the HTTP standards by returning content after the header in a HEAD request:)
<html><head><title>Error</title></head><body>The parameter is incorrect. </body></html>
That's the error I got. Your server isn't responding properly to a HEAD request (it's sending back both the server headers and the page contents). Oddly enough, on every tool I ran, the last line of HTML was missing from your home page. The only time I've seen this happen is when the server gave an incorrect Content-Length. Viewing from a normal browser doesn't cause the error to happen. Some very weird stuff is going on there.
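For anyone who wants to repeat this check, here is a rough sketch of one way to do it in Python. It is not the tool used above. The function names fetch_head_raw() and inspect_head_response() are my own, and the sample bytes are just the response quoted in this thread retyped by hand. A HEAD response should stop after the blank line that ends the headers, so any bytes after that point are the problem described here.

```python
import socket

def fetch_head_raw(host, path="/", port=80, timeout=10):
    """Send a bare HTTP/1.0 HEAD request and return every byte the server sends."""
    request = f"HEAD {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)

def inspect_head_response(raw):
    """Split a raw response into headers and whatever (wrongly) follows them."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    declared = headers.get("content-length", "")
    return {
        "status": lines[0],
        "declared_length": int(declared) if declared.isdigit() else None,
        "body_bytes": len(body),
        "body_after_head": len(body) > 0,  # True means the server sent a body
    }

# The response quoted in this thread, retyped as bytes for illustration.
SAMPLE = (
    b"HTTP/1.1 400 Bad Request\r\n"
    b"Server: Microsoft-IIS/5.0\r\n"
    b"Date: Sun, 05 Jan 2003 03:50:23 GMT\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 87\r\n"
    b"\r\n"
    b"<html><head><title>Error</title></head><body>"
    b"The parameter is incorrect. </body></html>"
)

print(inspect_head_response(SAMPLE))
```

Against a live server you would call inspect_head_response(fetch_head_raw("www.example.com")), with your own host in place of the placeholder. Comparing declared_length with the size of a normal GET body is a separate check that this sketch does not do.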
"ALL" is equivalent to "follow,index" so your robots meta tag is not your problem.
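To spell that out, here is a small sketch of how the content value of a robots meta tag boils down to two yes/no answers. The function name robots_directives() is mine, and it only covers the classic values (all, none, index, noindex, follow, nofollow). How a particular engine treats unknown or conflicting values is not something this sketch claims to know, so it simply ignores unknown tokens and lets the last one win.

```python
def robots_directives(content):
    """Reduce a robots meta 'content' string to (index, follow) booleans."""
    index, follow = True, True  # the default when nothing is specified
    for token in content.lower().split(","):
        token = token.strip()
        if token == "all":
            index, follow = True, True
        elif token == "none":
            index, follow = False, False
        elif token == "index":
            index = True
        elif token == "noindex":
            index = False
        elif token == "follow":
            follow = True
        elif token == "nofollow":
            follow = False
        # Unknown tokens are skipped (an assumption of this sketch).
    return index, follow

print(robots_directives("all"))
print(robots_directives("index,follow"))
print(robots_directives("NOINDEX, follow"))
```

The first two calls give the same answer, which is the whole point: "all" does not restrict the bot in any way.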
This is the 2nd month in a row that the deep crawl has come by and only fetched my index page and robots.txt.
GET /robots.txt - 200 299 395 16 HTTP/1.0 www.XXXXXX.com Googlebot/2.1+(+http://www.googlebot.com/bot.html) - -
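To confirm exactly which URLs Googlebot asked for over a whole month, you can tally them from the full log lines. This is a rough sketch that assumes every line has the same space-separated layout as the two full lines quoted at the top of the thread (client IP in the 3rd field, method in the 6th, path in the 7th, status in the 9th, user agent in the 15th). Your server's field order may differ, and the function names are my own.

```python
from collections import Counter

def parse_line(line):
    """Pull the few fields we care about out of one space-delimited log line."""
    if line.startswith("#"):  # header/comment lines in the log file
        return None
    parts = line.split()
    if len(parts) < 15:  # not a full request line in the assumed layout
        return None
    return {
        "client_ip": parts[2],
        "method": parts[5],
        "path": parts[6],
        "status": parts[8],
        "user_agent": parts[14],
    }

def googlebot_paths(lines):
    """Count how many times each path was requested by a Googlebot user agent."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and "googlebot" in record["user_agent"].lower():
            counts[record["path"]] += 1
    return counts

# The two full lines quoted earlier in this thread.
SAMPLE = [
    "2003-01-04 04:38:53 216.239.46.172 - 216.119.105.20 GET /robots.txt - 200 299 395 16 "
    "HTTP/1.0 www.XXXXXX.com Googlebot/2.1+(+http://www.googlebot.com/bot.html) - -",
    "2003-01-04 04:38:53 216.239.46.172 - 216.119.105.20 GET /index.cfm - 200 15164 385 94 "
    "HTTP/1.0 www.XXXXXX.com Googlebot/2.1+(+http://www.googlebot.com/bot.html) - -",
]

for path, hits in googlebot_paths(SAMPLE).items():
    print(path, hits)
```

Feed it open("yourlog.log") instead of SAMPLE to run it over a real file. Matching on the user agent alone will also count anything that merely claims to be Googlebot, so check the client IPs as well if that matters.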
Are you sure this isn't an AdWords editor checking your site after an ad is placed? I've seen this single-page crawling on my sites after placing an ad... -aV-