How do I create a crawlable website?

My site not crawlable by robots

         

crizz_king

1:56 pm on Sep 10, 2005 (gmt 0)

10+ Year Member



I wonder if anyone can help.
I have recently designed and written my website using HTML and Notepad.
It all works perfectly on the server and comes up on Yahoo.
I'm trying to get it on Google but have had no luck, except once for a couple of days.
I am trying to create a site map and, whilst using an automatic sitemap generator, have discovered that the pages don't crawl. I believe the robots meta tag is OK and the robots.txt file is also OK.
I wonder why I'm having this problem. Do you think that each page reference URL has to contain the complete http:// pathway?

NB: I did notice that the Yahoo cache contains only index.html and no other pages, which confirms the non-crawlable nature of my site.

Many thanks, crizz.

[edited by: jdMorgan at 2:00 am (utc) on Sep. 12, 2005]
[edit reason] No URLS. Please see TOS. [/edit]

DanA

2:10 pm on Sep 10, 2005 (gmt 0)

10+ Year Member



You should read this:
[webmasterworld.com...]
Your HTML doesn't validate and Mozilla cannot display your pages as intended, but I can see no reason why your site would be uncrawlable.
You can test that with any offline browser, or try a different sitemap generator.
The problem may simply be that Google may or may not come to crawl and index your site, and that it takes a few weeks before a site appears.
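As a rough way to run the offline test suggested above, and to look at the earlier question about complete http:// paths, here is a minimal Python sketch that pulls the links out of a page the way a simple crawler might. The sample HTML and the www.example.com address are made up for illustration, and `LinkCollector` and `extract_links` are hypothetical names. It only shows that relative links resolve against the page's own address, so a full http:// path in every href should not normally be needed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin leaves absolute URLs alone and resolves relative ones
                # against the address of the page they were found on.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in html as an absolute URL."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content and address, for illustration only.
sample_html = """
<html><body>
<a href="about.html">About</a>
<a href="/gallery/index.html">Gallery</a>
<a href="http://www.example.com/contact.html">Contact</a>
</body></html>
"""
links = extract_links(sample_html, "http://www.example.com/index.html")
for link in links:
    print(link)
```

If a script like this finds no links on a real page, the likely culprits are navigation built from scripts, forms, or image maps rather than plain anchor tags, or markup broken badly enough to confuse the parser.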

Matt Probert

2:38 pm on Sep 10, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As generic advice, try viewing your site with the free text browser 'Lynx'. This will give an indication of what a web spider will detect.

Regarding your own site, there seems to be very little text content on the pages. Try adding content that describes each page.
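For anyone without Lynx to hand, the idea can be approximated with a few lines of Python. This is only a rough stand-in under stated assumptions, not a replica of Lynx (which, for instance, also shows image alt text): it strips the tags, ignores script and style blocks, and returns whatever plain text is left. The sample page and the names `TextOnly` and `visible_text` are hypothetical.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keep the text of a page, ignoring script and style blocks."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Record non-empty text that is not inside a skipped block.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the page text joined into a single string."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page, for illustration only.
sample = """<html><head><title>Holiday photos</title>
<style>body { color: red; }</style></head>
<body><img src="beach.jpg"><script>var x = 1;</script>
<p>Photos from the beach.</p></body></html>"""

text = visible_text(sample)
print(text)
```

A page that comes back nearly empty from a check like this gives a spider very little to index, which is the point of the advice about adding descriptive content.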

Matt

crizz_king

1:03 am on Sep 12, 2005 (gmt 0)

10+ Year Member



Many thanks Matt and Dan and the moderator.
I've spent all day, literally, validating all the pages in my sites. They all validate except 2 pages where some of the text I require to be visible comes up as errors when the validator reaches an apostrophe (even using PRE), but I'm working on that. It was surprising how many mistakes I hadn't noticed. Part of the problem was that I didn't specify the doctype at the top of all the pages. It's strange that none of the tutorials that I learnt HTML with mentioned that important fact!
According to my sitemap generator program, I've now created site maps that crawl properly when the robots file is turned off, so I am wondering if the robots file is at fault, although Google has verified the sitemaps I submitted.
I just added a couple of exclusions as I read it's best not to leave the Disallow statement empty.

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/

So here's hoping everything runs properly now.
Thanks again for all your assistance, and I will try all the suggestions.
crizz in Newcastle, England.
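The robots.txt rules quoted above can be checked offline before blaming them for a failed crawl. The sketch below uses Python's standard `urllib.robotparser` module; the www.example.com address, the page paths, and the `allowed` helper are assumptions made up for illustration. It suggests that these rules block only the two listed directories, and that an empty Disallow line simply permits everything.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the post above.
rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(path, agent="Googlebot"):
    """Return True if the given agent may fetch the path under these rules."""
    # www.example.com is a placeholder host, not the poster's site.
    return parser.can_fetch(agent, "http://www.example.com" + path)


for path in ["/index.html", "/about.html", "/images/logo.gif", "/cgi-bin/form.pl"]:
    print(path, allowed(path))

# For comparison: an empty Disallow line means nothing is blocked.
open_rules = RobotFileParser()
open_rules.parse(["User-agent: *", "Disallow:"])
print(open_rules.can_fetch("Googlebot", "http://www.example.com/images/logo.gif"))
```

If ordinary pages come back as allowed here but a sitemap generator still refuses to crawl them, the fault is more likely in how that tool reads the file than in the rules themselves.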

essayswap

6:36 pm on Oct 4, 2005 (gmt 0)

10+ Year Member



I found a nice free tool that creates sitemaps online. It's at:
[neuroticweb.com]