Forum Moderators: open
Just a quick question about Google's spidering. Google spiders/crawls me daily, but only about 3 pages (out of about 300), and always the same ones. Every once in a while, however, it will spider more pages (40+), but never all of them.
The other funny thing is that one of the pages it crawls daily is default.asp. I don't have a page called default.asp and never have. Why would it spider a page I don't have? Also, the pages it DOES spider every day are very irrelevant (to me, anyway): things like "contact us" and some other minor pages. I would like it to spider/crawl my PRODUCT pages, which are surely more relevant than "contact us".
Any ideas? As far as I can see, my directory structure/layout should enable any spider to easily go through the whole site: no broken links, etc.
Thanks,
MBF
Changing your link structure on your site is a great way to suggest to Googlebot which pages you think are the most important. If every one of your pages points to the "contact us" page, you can see where we'd crawl that earlier instead of later.
So I'd be thinking about how to take the visits you do get and try to guide them in the direction you want. Something to think about...
Thanks for this. I was thinking about exactly that, but I was/am scared that Google somehow frowns upon it.
To tell you the truth, all the things you read about Google banning you and "negative effects" on PR and position scare me so much that I never know what I can and cannot do.
I will however take your advice and give it a go.
Thank you very much.
MBF
I venture to say Google has NEVER banned a site for linking to itself. dmoz.org has approximately 6 MILLION links to itself: you'd think if this were a bannable offense, we'd be the first to go. Yahoo! probably has 2 to 4 million. I have a quite nicely spidered personal subsite with 4 times as many internal links as pages, and I think I need more internal links.
That said, someone has experienced a penalty for a site whose navigation system/site map links to itself (apparently treated as keyword stuffing!), and this may be part of the way Google has cracked down on directory sites that link to their own SERPs. (These sites typically have hundreds of site maps to get their thousands of pages indexed.)
I know this question of mine is very lame, but I want to ask something.
How would I know which pages Googlebot has crawled?
If I'm on the web server (Linux), how would I know that Googlebot is crawling my website(s)?
The logs?
thanks
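Yes, the logs. Googlebot identifies itself in the User-Agent string of each request, so on a Linux server you can grep the web server's access log for it. A minimal sketch, assuming an Apache-style combined log format; the log path below is an assumption and will differ depending on your distribution and virtual-host setup:

```shell
# Show which URLs Googlebot has requested, most-crawled first.
# Assumed default Apache log path; yours may be e.g. /var/log/httpd/access_log
# or a per-vhost file.
LOG=/var/log/apache2/access.log

grep -i 'googlebot' "$LOG" |      # keep only lines whose User-Agent mentions Googlebot
  awk '{print $7}' |              # field 7 is the request path in the combined log format
  sort | uniq -c | sort -rn |     # count requests per URL, highest first
  head -20                        # top 20 crawled pages
```

Note that anyone can fake the Googlebot User-Agent, so treat this as a quick overview rather than proof; it will, however, immediately show you whether those "contact us" pages really dominate the crawl.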