It depends on whether you wanted the link for users or for search engines.
If it's for users, then it's still a reciprocal link.
If it's for SEs, there are easier ways of stopping an SE from spidering that page (but always peek inside the robots.txt just in case).
I have a handful of "partners" that block their links pages one way or another. I keep the links because I have them for my users, but I am sorry to lose out on whatever small SE benefit their page may have given.
Anybody else think creating "one way" reciprocal links by blocking is a bit rude?
I think it is very important to be honest with any partnerships we make. It's a good idea to post your policy on your site, if you decide on this method, just to keep from offending.
Welcome clueless, to Webmasterworld. Happy posting.
I was thinking a page that was blocked from Google spidering would not get PR. However, now that I think about it, the page would still show up as having PR. I have noticed that as soon as I upload a new page to my site, it gets "PR" from the main domain long before Google stops by.
So unless a link is made purely because of value to visitors, it is wise to make sure your reciprocal link partners are not blocking the search engines from spidering their outbound links.
Can you tell me...
How do you check if a website is blocking search engines from spidering their links pages? How do you check the robots.txt file?
>>Any body else think creating "one way" reciprocal links by blocking is a bit rude?
Only if the site explicitly stated in their agreement with you that your links should be spiderable by SEs, in which case it's not only rude but dishonest.
Don't assume that other sites realise why you wanted the link! Many wouldn't have a clue what PR is and probably just thought you were being neighbourly and social!
We rarely do reciprocal links, but when we do, we assume that people asking for links are looking to receive direct referrals from people reading our pages, unless they specifically state that they want the "link popularity benefit" or that the linked page needs to be spiderable, in which case we usually don't bother to go any further.
[edited by: chiyo at 1:59 pm (utc) on May 4, 2003]
Type their domain name followed by /robots.txt. That will tell you which directories they are requesting SEs to ignore.
Simple when you know how!
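If you want to script that check instead of eyeballing the file, Python's standard library has a robots.txt parser. A minimal sketch (the rules and URLs below are hypothetical examples, not any real partner's site):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents from a link partner's site
rules = [
    "User-agent: *",
    "Disallow: /links/",
]

rp = RobotFileParser()
rp.parse(rules)

# Is the partner's links page open to spiders?
print(rp.can_fetch("*", "http://www.example.com/links/page1.html"))  # False: blocked
print(rp.can_fetch("*", "http://www.example.com/index.html"))        # True: crawlable
```

Against a live site you would instead point the parser at the real file with `rp.set_url("http://www.example.com/robots.txt")` followed by `rp.read()`.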
Wondered if you can help...
If a site has 'Disallow: /root/' in the file, will that stop the WHOLE site being spidered?
|brotherhood of LAN|
'Disallow: /' would disallow crawling of the whole site.
Check out robotstxt.org [robotstxt.org]; it's only a few pages of light reading, all about robots.txt and how to use it.
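To make the difference concrete, here are the two rules side by side (illustrative robots.txt fragments; the /root/ directory name is just the example from above):

```
User-agent: *
# Asks all robots to skip only the /root/ directory:
Disallow: /root/
```

versus

```
User-agent: *
# Asks all robots to skip the entire site:
Disallow: /
```

So 'Disallow: /root/' only blocks spidering of pages under that one directory; the rest of the site stays crawlable.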
I think it's kind of rude to have robots.txt eliminate the value of a links page.
Also, all you'd need is for one of your link partners to get wise and email the others, and it's game over.