
Deprecated - Search Engine Submission Forum

Spidering Links

 4:39 pm on Jun 11, 2001 (gmt 0)

When I used the search engine simulator at searchengineworld.com, I noticed that it was spidering my pages' internal links as 'http://link.html' rather than 'http://www.mysite.com/link.html'. Do all the SEs work like this? Should I go back and put the full links in all my pages?



 5:08 pm on Jun 11, 2001 (gmt 0)


Not sure I understand your question. You don't want your site's links to its own pages to be structured 'http://link.html'; you want them to be structured simply as 'link.html'.


 5:35 pm on Jun 11, 2001 (gmt 0)

The links should be seen on an SE as 'http://www.mysite.com/link.html', but they are appearing as only 'link.html', because that's the way I've linked them within my site under the same domain...


 4:35 pm on Jan 9, 2002 (gmt 0)


I am a bit new to SEO, and was wondering the same thing. If a link is written "page1.html" as opposed to "http://www.mydomain.com/page1.html", are spiders still able to follow the link successfully?

Thanks in advance for any help.


 4:37 pm on Jan 9, 2002 (gmt 0)

Sure. Any relatively modern spider is going to understand relative links.
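
For illustration, here is roughly what "understanding a relative link" involves, sketched with Python's standard urllib.parse.urljoin. The page URL and file name are the placeholder names from the question above, not real addresses, and this is only a sketch of the general idea, not how any particular spider is written.

```python
from urllib.parse import urljoin

# Hypothetical page URL, built from the placeholder names used in this thread.
page_url = "http://www.mydomain.com/index.html"

# A relative href is resolved against the URL of the page it appears on.
resolved = urljoin(page_url, "page1.html")
print(resolved)

# An absolute href passes through unchanged, so both forms end up identical.
absolute = urljoin(page_url, "http://www.mydomain.com/page1.html")
print(absolute)
```

Both calls yield the same address, which is why the two ways of writing the link should look the same to a spider that resolves links this way.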


 7:19 pm on Jan 9, 2002 (gmt 0)

Thanks Brett,

and great site btw.


 1:42 pm on Jan 16, 2002 (gmt 0)

I saw this as well today. I was looking at a site with the SimSpider, and they have just 1 link from the index.html. The tag looks like this :

<a href="navigation.html"><big><strong>ENTER</strong></big></a>

All this lives in a table <td>, and I have picked up a dangling <p> tag inside the <td> that the link is in. Could this cause problems with SimSpider's ability to correctly parse the link?

If you hover over, or use the link, it's fine. SimSpider reports it as pointing to "http://navigation.html" though. I also note with interest that the site in question now has a PR of 0.

Any thoughts anyone?


 1:51 pm on Jan 16, 2002 (gmt 0)

I'm going out on a limb here to add or haze things up a degree or two.

In my experience, it is easier to create sites with relative links - and let the SEs figure it out.

Does that yield the best results? Not necessarily. I am with Brett in believing that the SEs' spiders, bots and crawlers can understand where the link goes without any issues.

However, looking at engines like Google - who index AND cache pages for their users - perhaps they prefer absolute links over relative links, as it makes the archiving of pages easier.

Again, just a theory in the most undeveloped sense.

~ Eric


 2:29 pm on Jan 16, 2002 (gmt 0)

Having had a lot more experience since my original post in June, I don't think it matters. Relative links are the same as absolute links in a spider's eyes. Spider technology is quite advanced...


 3:43 pm on Jan 16, 2002 (gmt 0)

True, I'm just curious as to why the SimSpider isn't reading a perfectly good (at least it LOOKS perfectly good) relative link on this other site correctly. It insists it is being pointed to "http://navigation.html".

I just find it odd that this site ALSO has a PR of 0, AND the effect of the fault would be to make virtually the entire site unreachable, if an SE spider behaves the same way as the SimSpider.

<a little later>

Ding, got it!

Spidering www.domain.co.uk gets the link wrong
Spidering www.domain.co.uk/ gets the link right

I bet SimSpider constructs relative paths from the entered URL (not unreasonably), by just tagging the filename onto the end of the base URL without checking for a terminal /. Also, entering .../index.html gets it right, so I bet it's smart enough to sub the filename correctly, cos it'll have a / there.

So, my fault for being sloppy <grin> Who'd have guessed that? ;) It may have been a total waste of 1/2 an hour, but it makes me happy, y'know?
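
One mechanism that would reproduce exactly this symptom is a resolver that keeps everything up to the last "/" in the entered URL and appends the filename. With no trailing slash, the last "/" is the one in "http://", which gives "http://navigation.html". The sketch below is a guess for illustration only: naive_resolve is a hypothetical function, not SimSpider's actual code, and it is compared against Python's standard urljoin.

```python
from urllib.parse import urljoin

def naive_resolve(base_url, href):
    """Hypothetical buggy resolver: cut the base after its last '/' and append href."""
    return base_url[: base_url.rfind("/") + 1] + href

bases = [
    "http://www.domain.co.uk",             # no trailing slash: last '/' is in 'http://'
    "http://www.domain.co.uk/",            # trailing slash: works
    "http://www.domain.co.uk/index.html",  # filename is replaced: works
]

for base in bases:
    print(base)
    print("  naive  :", naive_resolve(base, "navigation.html"))
    print("  urljoin:", urljoin(base, "navigation.html"))
```

Only the first base goes wrong in the naive version; urljoin treats a missing path as "/" and returns the same address for all three.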


 6:39 pm on Jan 22, 2002 (gmt 0)

Exactly. I'm using a stock module to extract links. It works on whatever it is fed. I looked at fixing it, but the exceptions were many (base hrefs...etc). I preferred to leave it stock.
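
To give a feel for the kind of exception mentioned here, the following is a minimal sketch of a link extractor that honors a <base href> tag, using only Python's standard html.parser and urllib.parse. LinkExtractor and extract_links are hypothetical names, this is not the stock module referred to above, and it simplifies real behavior: only the first <base> is used, and it only affects links that come after it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects absolute <a href> targets, honoring the first <base href>."""

    def __init__(self, page_url):
        super().__init__()
        self.base_url = page_url  # default base is the page's own URL
        self.base_seen = False
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if not href:
            return
        if tag == "base" and not self.base_seen:
            # The first <base href> replaces the page URL as the base.
            self.base_url = urljoin(self.base_url, href)
            self.base_seen = True
        elif tag == "a":
            self.hrefs.append(urljoin(self.base_url, href))

def extract_links(page_url, html):
    parser = LinkExtractor(page_url)
    parser.feed(html)
    return parser.hrefs

plain = '<a href="navigation.html"><big><strong>ENTER</strong></big></a>'
with_base = '<base href="http://www.domain.co.uk/pages/">' + plain

print(extract_links("http://www.domain.co.uk", plain))
print(extract_links("http://www.domain.co.uk", with_base))
```

The same link markup resolves to different addresses depending on whether a base tag is present, which is one reason a quick fix to the trailing-slash case alone would not cover everything.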

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved