Spidering Links

     
4:39 pm on Jun 11, 2001 (gmt 0)

10+ Year Member



When I used the search engine simulator at searchengineworld.com I realized that it was spidering my pages' internal links as 'http://link.html' rather than 'http://www.mysite.com/link.html'. Do all the SEs work like this? Should I go back and put the full links in all my pages?
5:08 pm on Jun 11, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Outtatheblue,

Not sure I understand your question. You don't want your site's links to its own pages to be structured 'http://link.html'; you want them to be structured simply 'link.html'.

5:35 pm on Jun 11, 2001 (gmt 0)

10+ Year Member



The links should be seen on an SE as 'http://www.mysite.com/link.html', but they are appearing as only 'link.html', because that's the way I've linked them within my site under the same domain...
4:35 pm on Jan 9, 2002 (gmt 0)

10+ Year Member



*bump*

I am a bit new to SEO, and was wondering the same thing. If a link is written "page1.html" as opposed to "http://www.mydomain.com/page1.html", are spiders still able to follow the link successfully?

Thanks in advance for any help.

4:37 pm on Jan 9, 2002 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Sure. Any relatively modern spider is going to understand relative links.
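
For anyone who wants to see the resolution step itself, here is a minimal sketch in Python using the standard library's urllib.parse.urljoin. The page URL and link name are the placeholder examples from earlier in this thread, not a real site, and nothing here claims to be how any particular search engine does it.

```python
from urllib.parse import urljoin

# Placeholder values borrowed from the examples in this thread.
page_url = "http://www.mysite.com/index.html"  # the page the spider fetched
href = "link.html"                             # the relative link found on it

# Resolve the relative link against the URL of the page it appears on.
absolute = urljoin(page_url, href)
print(absolute)  # prints http://www.mysite.com/link.html
```

So a relative link carries the same information as the absolute one, as long as the spider knows which page it found the link on.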
7:19 pm on Jan 9, 2002 (gmt 0)

10+ Year Member



Thanks Brett,

and great site btw.

1:42 pm on Jan 16, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I saw this as well today. I was looking at a site with the SimSpider, and they have just 1 link from the index.html. The tag looks like this:

<a href="navigation.html"><big><strong>ENTER</strong></big></a>

All this lives in a table <td>, and I have picked up a dangling <p> tag inside the <td> that the link is in. Could this cause problems with SimSpider's ability to correctly parse the link?

If you hover over, or use the link, it's fine. SimSpider reports it as pointing to "http://navigation.html" though. I also note with interest that the site in question now has a PR of 0.

Any thoughts anyone?

1:51 pm on Jan 16, 2002 (gmt 0)

10+ Year Member



I'm going out on a limb here to add or haze things up a degree or two.

In my experience, it is easier to create sites with relative links - and let the SEs figure it out.

Does that yield the best results? Not necessarily. I am with Brett in believing that the SEs' spiders, bots and crawlers can understand where the link goes without any issues.

However, looking at engines like Google - who index AND cache pages for their users - perhaps they prefer absolute links over relative links, as it makes the archiving of pages easier.

Again, just a theory in the most undeveloped sense.

Thanks,
~ Eric

2:29 pm on Jan 16, 2002 (gmt 0)

10+ Year Member



Having had a lot more experience since my original post in June, I don't think it matters. Relative links are the same as absolute links in a spider's eyes. Spider technology is quite advanced...
3:43 pm on Jan 16, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



True, I'm just curious as to why the SimSpider isn't reading a perfectly good (at least it LOOKS perfectly good) relative link on this other site correctly. It insists it is being pointed to "http://navigation.html".

I just find it odd that this site ALSO has a PR of 0, AND the effect of the fault would be to make virtually the entire site unreachable, if an SE spider behaves the same way as the SimSpider.

<a little later>

Ding, got it!

Spidering www.domain.co.uk gets the link wrong
Spidering www.domain.co.uk/ gets the link right

I bet SimSpider constructs relative paths from the entered URL (not unreasonably), by just tacking the linked filename onto the end of the base URL without checking for a terminal /. Also, entering .../index.html gets it right, so I bet it's smart enough to sub the filename correctly, cos it'll have a / there.

So, my fault for being sloppy <grin> Who'd have guessed that? ;) It may have been a total waste of 1/2 an hour, but it makes me happy, y'know?
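
The guess above can be sketched in Python. The naive_join function below is a hypothetical reconstruction that happens to reproduce all three observations (no trailing slash, trailing slash, and .../index.html). It is not SimSpider's actual code, which isn't shown anywhere in this thread. The standard library's urljoin is included for comparison, since it copes with the missing slash.

```python
from urllib.parse import urljoin


def naive_join(page_url: str, href: str) -> str:
    """Hypothetical naive resolver: keep everything up to and including
    the last "/" in the entered URL, then append the relative link."""
    return page_url[: page_url.rfind("/") + 1] + href


link = "navigation.html"

# Without a trailing slash, the last "/" is the one in "http://",
# so the host name gets thrown away along with the "filename".
print(naive_join("http://www.domain.co.uk", link))
# With a trailing slash or an explicit filename, the naive approach works.
print(naive_join("http://www.domain.co.uk/", link))
print(naive_join("http://www.domain.co.uk/index.html", link))

# urljoin treats a URL with no path as if the path were "/".
print(urljoin("http://www.domain.co.uk", link))
```

If a tool behaves like naive_join, always entering the URL with its trailing slash sidesteps the problem.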

6:39 pm on Jan 22, 2002 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Exactly. I'm using a stock module to extract links. It works on whatever it is fed. I looked at fixing it, but the exceptions were many (base hrefs...etc). I preferred to leave it stock.
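
As a rough illustration of the base href exception mentioned above, here is a small Python sketch that extracts links and resolves them against a <base href> when the page declares one. LinkExtractor and extract_links are made-up names for this example, and the example.com URLs are placeholders. This is not the stock module SimSpider uses, just one way the case could be handled with the standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Hypothetical link extractor that honours the first <base href>."""

    def __init__(self, page_url):
        super().__init__()
        self.base_url = page_url  # default base is the page's own URL
        self.base_seen = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        if tag == "base" and not self.base_seen:
            # Only the first <base href> on a page counts.
            self.base_url = urljoin(self.base_url, href)
            self.base_seen = True
        elif tag == "a":
            self.links.append(urljoin(self.base_url, href))


def extract_links(page_url, html):
    parser = LinkExtractor(page_url)
    parser.feed(html)
    return parser.links


# The link from earlier in the thread, with a dangling <p> in the cell.
cell = '<td><p><a href="navigation.html"><big><strong>ENTER</strong></big></a></td>'
print(extract_links("http://www.domain.co.uk", cell))

# The same link on a page that declares a base href (placeholder URL).
with_base = '<base href="http://www.example.com/dir/">' + cell
print(extract_links("http://www.domain.co.uk", with_base))
```

In this sketch the dangling <p> makes no difference to the extracted link, though other parsers may well behave differently.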
 
