
Deprecated - Search Engine Submission Forum

    
Spidering Links
Outtatheblue




msg:702907
 4:39 pm on Jun 11, 2001 (gmt 0)

When I used the search engine simulator at searchengineworld.com, I realized that it was spidering my pages' internal links as 'http://link.html' rather than 'http://www.mysite.com/link.html'. Do all the SEs work like this? Should I go back and put the full links in all my pages?

 

Hunter




msg:702908
 5:08 pm on Jun 11, 2001 (gmt 0)

Outtatheblue,

Not sure I understand your question. You don't want your site's links to its own pages to be structured 'http://link.html'; you want them to be structured simply 'link.html'.

Outtatheblue




msg:702909
 5:35 pm on Jun 11, 2001 (gmt 0)

The links should be seen by an SE as 'http://www.mysite.com/link.html', but they are appearing as only 'link.html', because that's the way I've linked them within my site under the same domain...

domder




msg:702910
 4:35 pm on Jan 9, 2002 (gmt 0)

*bump*

I am a bit new to SEO, and was wondering the same thing. If a link is written "page1.html" as opposed to "http://www.mydomain.com/page1.html", are spiders still able to follow the link successfully?

Thanks in advance for any help.

Brett_Tabke




msg:702911
 4:37 pm on Jan 9, 2002 (gmt 0)

Sure. Any relatively modern spider is going to understand relative links.

domder




msg:702912
 7:19 pm on Jan 9, 2002 (gmt 0)

Thanks Brett,

and great site btw.

TallTroll




msg:702913
 1:42 pm on Jan 16, 2002 (gmt 0)

I saw this as well today. I was looking at a site with the SimSpider, and they have just 1 link from the index.html. The tag looks like this:

<a href="navigation.html"><big><strong>ENTER</strong></big></a>

All this lives in a table <td>, and I have picked up a dangling <p> tag inside the <td> that the link is in. Could this cause problems with SimSpider's ability to correctly parse the link?

If you hover over, or use the link, it's fine. SimSpider reports it as pointing to "http://navigation.html" though. I also note with interest that the site in question now has a PR of 0.

Any thoughts anyone?

Eric_Lander




msg:702914
 1:51 pm on Jan 16, 2002 (gmt 0)

I'm going out on a limb here to add or haze things up a degree or two.

In my experience, it is easier to create sites with relative links and let the SEs figure it out.

Does that yield the best results? Not necessarily. I am with Brett in believing that the SEs' spiders, bots and crawlers can understand where the link goes without any issues.

However, looking at engines like Google, which index AND cache pages for their users, perhaps they prefer absolute links over relative links, as it makes the archiving of pages easier.

Again, just a theory in the most undeveloped sense.

Thanks,
~ Eric

Outtatheblue




msg:702915
 2:29 pm on Jan 16, 2002 (gmt 0)

Having had a lot more experience since my original post in June, I don't think it matters. Relative links are the same as absolute links in a spider's eyes. Spider technology is quite advanced...

TallTroll




msg:702916
 3:43 pm on Jan 16, 2002 (gmt 0)

True, I'm just curious as to why the SimSpider isn't reading a perfectly good (at least it LOOKS perfectly good) relative link on this other site correctly. It insists it is being pointed to "http://navigation.html"

I just find it odd that this site ALSO has a PR of 0, AND the effect of the fault would be to make virtually the entire site unreachable, if an SE spider behaves the same way as the SimSpider.

<a little later>

Ding, got it!

Spidering www.domain.co.uk gets the link wrong
Spidering www.domain.co.uk/ gets the link right

I bet SimSpider constructs relative paths from the entered URL (not unreasonably), by just tagging the filename called on the end of the base URL without checking for a terminal /. Also, entering .../index.html gets it right, so I bet it's smart enough to sub the filename correctly, cos it'll have a / there.
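
Here is a minimal sketch of one way that kind of bug can arise. It is a guess that happens to reproduce the behaviour described above, not SimSpider's actual code: it assumes the base is built by cutting the entered URL at its last "/" and appending the href. `naive_resolve` and `resolve` are hypothetical names; `urljoin` is the standard-library function from `urllib.parse`.

```python
from urllib.parse import urljoin


def naive_resolve(entered_url, href):
    """Assumed buggy approach: keep everything up to the last '/' and append href.

    With no path on the URL, the last '/' is the one in 'http://',
    so the host name gets thrown away.
    """
    cut = entered_url.rfind("/") + 1
    return entered_url[:cut] + href


def resolve(entered_url, href):
    """Standard resolution: urljoin treats an empty path as '/'."""
    return urljoin(entered_url, href)


if __name__ == "__main__":
    href = "navigation.html"
    for url in (
        "http://www.domain.co.uk",             # no terminal slash
        "http://www.domain.co.uk/",            # terminal slash
        "http://www.domain.co.uk/index.html",  # explicit filename
    ):
        print(url, "->", naive_resolve(url, href), "|", resolve(url, href))
```

Under that assumption, only the first form goes wrong ("http://navigation.html"); the other two resolve to the same address either way, which matches what the spidering tests showed.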

So, my fault for being sloppy <grin> Who'd have guessed that? ;) It may have been a total waste of 1/2 an hour, but it makes me happy, y'know?

Brett_Tabke




msg:702917
 6:39 pm on Jan 22, 2002 (gmt 0)

Exactly. I'm using a stock module to extract links. It works on whatever it is fed. I looked at fixing it, but the exceptions were many (base hrefs, etc.). I preferred to leave it stock.
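
For illustration only, and not the stock module mentioned above: a rough sketch of a link extractor that also handles the base href case, using Python's standard `html.parser` and `urllib.parse`. The names `LinkExtractor` and `extract_links` are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects <a href> targets as absolute URLs, honouring <base href>."""

    def __init__(self, page_url):
        super().__init__()
        self.base_url = page_url  # replaced if the page declares <base href>
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        if tag == "base":
            # A base href may itself be relative to the page URL.
            self.base_url = urljoin(self.base_url, href)
        elif tag == "a":
            self.links.append(urljoin(self.base_url, href))


def extract_links(page_url, html):
    parser = LinkExtractor(page_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # The dangling <p> is deliberate, mirroring the page discussed earlier.
    page = '<td><p><a href="navigation.html"><big><strong>ENTER</strong></big></a></td>'
    print(extract_links("http://www.domain.co.uk", page))
```

In this sketch the unclosed <p> makes no difference, since the parser only reacts to start tags, and the URL entered without a terminal / still resolves to the right address.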


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About
© Webmaster World 1996-2014 all rights reserved