Googlebot hits

How to get them to deep crawl

         

EAHunt

11:33 pm on Sep 25, 2002 (gmt 0)

10+ Year Member



What makes Googlebot stay and deep crawl your site? I seem to get hit, but then he leaves.

I used Brett's spider simulator and here is my question:

In my navigation bar I have links that look like this:

<div style="position:absolute; top:100px;">
<div id="whatever" style="position:absolute; width:131px; height:22px; z-index:1; left:0px; top:0px;"><a href="../about.htm" onMouseOut="MM_swapImgRestore()" onMouseOver="MM_swapImage('about','','../navimages/navabout2.gif',1)"><img name="about" border="0" src="../images/navabout.gif" width="130" height="25" alt="whatever"></a></div>
</div>

But the spider simulator reports an error grabbing the URL: 500
[about.htm...]

Is this what Googlebot is seeing, relative as opposed to absolute?

If so, then I am coding all my navigation incorrectly.

bobmark

11:44 pm on Sep 25, 2002 (gmt 0)

10+ Year Member



Your edit wasn't up when I replied.
If the spider sim doesn't like it, Googlebot won't either. You are referencing a page one level up (the ../), but the spider is looking for a page in the home directory.
For example, if you were in home/folder1/page.html that would be the correct ref; if it is on your home page, it is not.
Assuming you don't have something like a ROBOTS "noindex,nofollow" meta tag, or something prohibiting Googlebot in your robots.txt or an .htaccess file, your pages should be accessible to Googlebot. That is, the robot follows links, so if links are available off your home page (text or image links for sure; PHP links, etc., if you set them up correctly), Googlebot will follow them and deep crawl.
You can also use an "index,follow" meta tag, but Googlebot should follow the links anyway, so it isn't strictly necessary.
Often with a new site, Googlebot will come once or twice just to the homepage and then return later for a deep crawl, so it may be only that, depending on how new you are to Googlebot.
Hope this helps.
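
The robots.txt part of that check can be sketched in a few lines of Python. This is only an illustration: the robots.txt contents and the /private/ directory below are made up, the www.mysite.com address is the placeholder used elsewhere in this thread, and the standard-library parser is not necessarily how Googlebot itself reads the file. It also does not look at meta tags or .htaccess rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for this example.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
"""


def googlebot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text lets Googlebot fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


if __name__ == "__main__":
    for url in (
        "http://www.mysite.com/about.htm",
        "http://www.mysite.com/private/notes.htm",
    ):
        print(url, "allowed" if googlebot_allowed(ROBOTS_TXT, url) else "blocked")
```

If a page you expect to be crawled comes back as blocked, the robots.txt rule is worth fixing before looking at the links themselves.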

miuser

1:27 am on Sep 26, 2002 (gmt 0)



I also just used the spider sim, and all the linked pages show as:

[about.htm...]
[company.htm...]
etc...

Is this really how the spider sees my links? In the HTML of the page, each link reads something like <A href="about.htm">

Do I need to change all my links?

jatar_k

1:41 am on Sep 26, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Relax, everyone.

EAHunt, that URL is fine as long as the page it is on is one level below the directory that contains about.htm. If not, change the link.
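
To see where "../about.htm" ends up from pages at different depths, here is a small sketch using Python's standard urljoin, which follows the RFC 3986 resolution rules (Python 3.5 or later). The folder names are hypothetical and www.mysite.com is just the placeholder domain from this thread; a real crawler may resolve links with its own code.

```python
from urllib.parse import urljoin

# The relative link from the navigation bar posted above.
LINK = "../about.htm"

# Hypothetical page locations at three different depths.
PAGES = [
    "http://www.mysite.com/index.htm",
    "http://www.mysite.com/folder1/page.htm",
    "http://www.mysite.com/folder1/folder2/page.htm",
]


def resolve(page_url: str, href: str) -> str:
    """Resolve href against the URL of the page that contains it."""
    return urljoin(page_url, href)


if __name__ == "__main__":
    for page in PAGES:
        print(page, "->", resolve(page, LINK))
    # One level down, the link reaches /about.htm in the site root.
    # Two levels down, it reaches /folder1/about.htm instead.
    # On the home page there is no parent directory; RFC 3986 resolution
    # drops the extra ".." and still yields /about.htm, but a simpler
    # tool may not handle that case.
```

So the link only points at the root-level about.htm from pages one directory down (or, with a standards-following resolver, from the root itself).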

bobmark
>>If the spider sim doesn't like it Googlebot won't either

I don't agree with that, actually. I have used the sim spider quite a bit, and it doesn't read relative URLs very well. If you are very meticulous, you can read through the links the sim spider returns, look through your code, and see which ones are fine and which are broken.
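
For anyone who wants to do that check by hand, the following is a rough sketch of a link lister in Python. It is not the spider simulator's code and not how any search engine works; the LinkCollector class, the sample markup, and the /products/ page address are all invented for illustration. It pulls every href out of a page and shows what each one resolves to.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (raw href, resolved URL) pairs from <a> tags."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> works too.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append((value, urljoin(self.base_url, value)))


def collect_links(html: str, base_url: str):
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical markup in the style of the navigation bar posted above.
SAMPLE = (
    '<div><a href="../about.htm" onMouseOver="swap()">'
    '<img src="../images/navabout.gif" alt="About"></a>'
    '<A href="/company.htm">Company</A></div>'
)

if __name__ == "__main__":
    page = "http://www.mysite.com/products/index.htm"  # assumed page location
    for raw, resolved in collect_links(SAMPLE, page):
        print(raw, "->", resolved)
```

Comparing the resolved column against the pages that actually exist on the server shows which links are broken and which are simply being misread by a tool.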

Welcome to WebmasterWorld [webmasterworld.com], miuser.

My previous comments apply here as well.

Personal opinion, but I don't think the sim spider is on quite the same level as SE spiders. Obviously no disrespect, but SEs spend thousands, maybe millions, of man-hours with multiple PhDs on their spiders, so they should be pretty refined and intelligent enough to read both relative and absolute URLs.

EAHunt

2:11 am on Sep 26, 2002 (gmt 0)

10+ Year Member



Okay, you opened it up, thanks. I was afraid to ask.

My partner is a pain in the "a**" about absolute and relative, and screams and throws a fit if it says www.mysite.com/about.htm instead of about.htm.

So which is better for search engines: absolute or relative?

jatar_k

2:14 am on Sep 26, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



To be honest, I don't really think it matters in essence. Both are valid formats. I am sure there are a million and one theories based on ranking, best practice, whatever. There are certain situations where you need to use absolute, but even then there are two methods of absolute links.

Some absolute links include the full URL (scheme and domain), but a link starting with a / is also absolute, relative to the site root. For spidering purposes I don't think it matters as long as the links are valid.
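
As a quick illustration of those forms, here is a hedged Python sketch. The classify helper and its labels are made up for this example, and the page address is hypothetical. It shows a full URL, a root-relative path, and a document-relative path all resolving to the same address when the relative one is written correctly for its page.

```python
from urllib.parse import urljoin, urlparse

# Assumed location of the page that contains the links.
PAGE = "http://www.mysite.com/folder1/page.htm"


def classify(href: str) -> str:
    """Label a link by form. A rough sketch, not a complete URL classifier."""
    if urlparse(href).netloc:
        return "full URL"
    if href.startswith("/"):
        return "root-relative path"
    return "document-relative path"


HREFS = [
    "http://www.mysite.com/about.htm",
    "/about.htm",
    "../about.htm",
]

if __name__ == "__main__":
    for href in HREFS:
        print(f"{href:35} {classify(href):25} -> {urljoin(PAGE, href)}")
```

Whichever form is used, what counts for a crawler is the resolved address, which is the same in all three cases here.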

EAHunt

2:43 am on Sep 26, 2002 (gmt 0)

10+ Year Member



I think I threw everyone off with the relative/absolute/spider sim questions.

For about a month, all that I have gotten from Googlebot is a little nip and off she goes. So is it because I don't have enough links, or is it my code, and it can't follow my navigation?

I used Dreamweaver to build the site, and I used the library function for my navigation (change once, updates every page). Would this cause a problem?

jatar_k

2:45 am on Sep 26, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



If the links are all valid and are of a similar style to the one you posted, that isn't the issue.

bobmark

2:57 am on Sep 26, 2002 (gmt 0)

10+ Year Member



I agree with jatar that if the links work, it is probably not a problem. If the site is new to Google, then deep crawling often takes a while after the first Googlebot visit to your site. I can definitely remember a 4 to 6 week initial period with one site I did, and I was wondering the same as you: "what did I do wrong?" Usually from about now to the 10th or so is heavy Googlebot crawl time, so you may see a deep crawl any day.
If your site repeatedly gets visited by Googlebot for, say, 8-10 weeks but not deep crawled, then something else is wrong. Googlebot will crawl a site with zero links in; it just does not find such a site on its own, as there is nothing to lead the bot to it. But it found yours, so that isn't your problem.

sw8296

11:29 am on Sep 26, 2002 (gmt 0)

10+ Year Member



Give the spider simulator the index page...

[mysite.com...] (for example)

not just the domain..

[mysite.com...]

and it works fine.

thepcstore

1:32 pm on Sep 26, 2002 (gmt 0)

10+ Year Member



Hi.

Please check out this post, which I started yesterday and which may be quite helpful...

[webmasterworld.com]

Steve.