Forum Moderators: goodroi

Is robots.txt needed to be crawled by Google?

bot leaves after not finding one

   
2:35 pm on Jan 22, 2005 (gmt 0)

10+ Year Member



I don't use robots.txt simply because I'm not really sure how to use it yet. I noticed in my logs that every time Googlebot visits my site, it looks for robots.txt and leaves after not finding it (I assume). My site has hundreds of static pages, and Google crawls only a few, about 20 pages, and stops at robots.txt.

Am I correct in assuming that Google needs robots.txt to crawl my entire site?

Any advice on what to do to have my entire site crawled?

Thank you very much.

2:49 pm on Jan 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello GP:

To the best of my knowledge, no robots.txt is needed to be spidered by Google or any other engine.

However, as a relative newcomer, I had the same vague fears you express.
What I did was set up the shortest possible robots.txt, one that would not exclude much of anything.
In effect, my robots.txt told all engines to stay out of my cgi-bin directory,
one I never mess with in the first place.

It read something like this (and don't quote me):

robots *
exclude /cgi.bin

The actual code is better, I'm too full of beer to look it up now.

Check elsewhere [burp!] and give it a try, maybe you will feel better.
Then you can take a fresh look for the REAL reasons the engines don't crawl your site more.
- Larry

2:58 pm on Jan 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To address the issue of your whole site not being indexed: there are lots of reasons why Googlebot is not indexing all of your site, but none relate to the robots.txt file.

The correct format of the robots.txt file is:

User-agent: *
Disallow: /cgi-bin/

You don't need to have the file, and it will make no difference to Googlebot not spidering all of your site. Having one will, however, stop the 404 errors that appear in your site's log file each time the file is requested.
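If you want to convince yourself what those two lines do before uploading them, here is a minimal sketch using Python's standard urllib.robotparser module. The example.com URLs and the helper name build_parser are made up for illustration, and this only shows how one parser reads the rules, not how Googlebot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt rules from a string (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ordinary pages stay crawlable; only /cgi-bin/ is off limits.
print(parser.can_fetch("Googlebot", "http://www.example.com/index.html"))       # True
print(parser.can_fetch("Googlebot", "http://www.example.com/cgi-bin/form.pl"))  # False

# A rule set with no entries at all blocks nothing.
print(build_parser("").can_fetch("Googlebot", "http://www.example.com/index.html"))  # True
```

The last line is the point of this thread: with no rules to obey, nothing is excluded.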

For more information, see the link below:
[searchengineworld.com...]

Hope this helps

3:21 pm on Jan 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks ncw:

I never try to code when I get into that second 6-pack of European beer.
Robots and languages simply do not allow one to paraphrase. [burp!] - Larry

3:41 pm on Jan 22, 2005 (gmt 0)

10+ Year Member



Thank you Larry and Ncw,

I don't know if it's just a coincidence, but many times Googlebot has visited my site, crawled some pages, and then, as I said, left after going to /robots.txt (which I don't have). That made me think it has something to do with it.

Please confirm again if I am really wrong.

Thanks again!

4:01 pm on Jan 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello again GP:

If I don't respond again, it's because I collapsed under the desk.

My experience is that spidering and listings (as in SERPs) can differ by months.
MSN was spidering my site from top to bottom for months.
Then! at long last it was listed with good high placement.

Beyond the usual truisms about quality content (highly valid of course!)
I would suggest taking existing pages and improving them.
Take them one at a time, and change something. Always think of the reader.
Would this or that wording be better?

Change one page a day, always improving what your potential visitors see.

If all else fails, you could spam a few guestbooks.
They almost never use "nofollow" gimmicks.

Try to say something nice about the cornbread recipe or whatever. - Larry

4:12 pm on Jan 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't think any of us will ever understand the mind of a robot, but I see the same thing: for a few days on end, Googlebot will visit and request only the robots.txt file, and then it will do a deep spider.
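If you would rather check the pattern than guess at it, a rough sketch like the one below tallies what Googlebot asked for in an access log. It assumes the Apache "combined" log format; the sample lines, the IP addresses (taken from the documentation range), and the function names are all invented for illustration, so adjust the pattern to whatever your server actually writes.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)


def googlebot_requests(lines):
    """Yield (path, status) for requests whose user agent mentions Googlebot."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and "googlebot" in match.group("agent").lower():
            yield match.group("path"), int(match.group("status"))


def summarize(lines):
    """Count Googlebot requests per path and per HTTP status code."""
    paths, statuses = Counter(), Counter()
    for path, status in googlebot_requests(lines):
        paths[path] += 1
        statuses[status] += 1
    return paths, statuses


# Hypothetical sample lines, not taken from any real log.
SAMPLE = [
    '192.0.2.10 - - [22/Jan/2005:14:01:10 +0000] "GET /robots.txt HTTP/1.0" 404 209 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.10 - - [22/Jan/2005:14:01:12 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.10 - - [22/Jan/2005:14:01:15 +0000] "GET /about.html HTTP/1.0" 200 3071 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.77 - - [22/Jan/2005:14:02:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

paths, statuses = summarize(SAMPLE)
print(paths.most_common())
print(statuses.most_common())
```

Run it over a few weeks of logs and you can see whether the robots.txt request really ends each visit or simply happens to be one request among many.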

Bit off topic for this forum, but I will post it anyway.
What you do need are links in to your site: the more links the better, ideally from sites on the same theme as yours. You also need good internal navigation within your pages, making good use of anchor text.

Don't call your pages page1, page2, etc. Use a name associated with what the page is about, with a good page title and a description taken from the body text. While you are doing this, add keywords; these are not used by Google at present, but other search engines do use them.
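As a small illustration of the naming advice, here is one possible way to turn a page title into a descriptive file name. The function name page_filename and the sample title are made up, and the hyphenated lowercase style is just one common convention, not something any search engine requires.

```python
import re
import unicodedata


def page_filename(title: str, extension: str = ".html") -> str:
    """Turn a page title into a lowercase, hyphen-separated file name."""
    # Strip accents and drop anything that is not plain ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    # Fall back to a generic name only if nothing usable is left.
    return (slug or "page") + extension


print(page_filename("Easy Cornbread Recipe"))  # easy-cornbread-recipe.html
```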

That's it in a nutshell. It's up to you to do a lot of hard work getting your pages just right.

Read up on Brett's 26 steps to 15k a Day site and then read it again and again:

[searchengineworld.com...]
Did I say read it again and again? (You get the message.)

This gives you the basics for building a better site. It's up to you what heights you take it to with the amount of work you put in: the harder you work, the more rewards you should see, even if the rewards are just more visitors to your site.

4:25 pm on Jan 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Oh, a couple more thoughts.

Show your pages to your wife, girlfriend, sister, some female.

Women are half your potential audience, never forget that.
If your color scheme sucks, they will warn you.
The same goes for poorly worded stuff.
Listen to them and fix your site.

You can go golfing with the toilet seat up and/or down later. - Larry