| This 43 message thread spans 2 pages: < < 43 ( 1  ) || |
How to handle outbound links in a directory?
We have a large directory for our distributors.
The outbound links go through a file called link.php, which counts the clicks. It takes a parameter called url which, when used in this syntax, counts the click and then redirects to an outside site.
This would count a click for example2.com on our script then redirect to example2.com
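For readers who want to see the count-then-redirect idea spelled out, here is a minimal sketch in Python rather than PHP. It is not the poster's link.php: the names `handle_link`, `click_counts` and `known_targets` are hypothetical, the counter is an in-memory stand-in for a real data store, and the check against the list of known targets is an added assumption so the script cannot be used to redirect to arbitrary sites.

```python
from collections import Counter
from urllib.parse import urlsplit

# In-memory stand-in for whatever store the real script uses (assumption).
click_counts = Counter()


def handle_link(url_param, known_targets):
    """Return (status, headers) for a request like link.php?url=<url_param>."""
    parts = urlsplit(url_param)
    # Only redirect to http(s) URLs that are actually listed in the directory.
    if parts.scheme not in ("http", "https") or url_param not in known_targets:
        return 404, {}
    click_counts[url_param] += 1               # count the click first
    return 302, {"Location": url_param}        # then send the visitor on
```

A web framework or CGI wrapper would turn the returned status and headers into the actual HTTP response.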
The problem is that we have over 100,000 outbound links on our site and it is virtually impossible to manually check all the links.
What prevention method can I take in order to make sure I'm not penalized for broken links or any site that is involved with methods that search engines frown upon? (I'm referring to links after the url=)
I would prefer a script that redirects in such a way that it doesn't count as an outbound link. Is this possible?
I'm seriously thinking about writing a script that puts the link in a text box, since I'm so sick of seeing 404 errors every time I use a validator. I don't want to be penalized if any of these links are bad.
Since it is a "redirect" and I don't actually have a link to anything directly, could I still be penalized for my redirected links? (every outside link on my site uses the link.php?url= format)
...I was thinking, what if I add a robots.txt file like this:
That should solve the whole problem, right?
I'm not sure whether blocking the robots has anything to do with counting an outbound link.
Looking for some opinions on this subject.
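The robots.txt file from the post above is not shown in the thread. As an illustration only, the sketch below assumes the redirect script lives at /link.php on a hypothetical www.example.com and uses Python's standard `urllib.robotparser` to check which URLs such a rule would tell a compliant crawler to skip. It says nothing about whether a search engine then treats the link as an outbound link, which is the open question here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: block every crawler from the redirect script.
ROBOTS_TXT = """\
User-agent: *
Disallow: /link.php
"""


def is_crawlable(robots_txt, url, agent="*"):
    """Return True if a crawler honouring robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

Calling `is_crawlable(ROBOTS_TXT, "http://www.example.com/link.php?url=http://example2.com")` reports the redirect URL as blocked, while ordinary directory pages stay crawlable.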
Interesting observation frox:
2) it says that "This isn't a negative vote for the site where the comment was posted"
Do you take this to mean the same thing as I do (that you will not be penalized for any site that you link to using the rel="nofollow" tag, regardless of whether it is in a bad neighborhood, etc.)?
I am surprised that you don't already spider sites yourself with 100,000+ sites. I have fewer than 5,000 sites and I have over half of them spidered, recording their IP addresses as I go (for future HTTP "HEAD" requests ... noting that if the IP address changes, the site may have a new admin, which requires a new review of the site content).
I look down myself on sites that try to hide their links from search engines. And, as an admin for a directory website, I don't plan to hide my links. But I have also had my share of links that go bad (mostly from new admins in my case ... the sites I link to normally make a profit, so most people are not going to be changing their content away from that topic).
I currently have the same problem with my website. I implemented a routine that checks the links before redirecting (on demand) and then does a 302 to the target.
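The spidering routine described above (a HEAD request plus a comparison against the recorded IP address) could look roughly like the sketch below. It is a guess at the approach, not the poster's code: the function names and the three result labels are hypothetical, and only IPv4 lookups via `socket.gethostbyname` are covered.

```python
import socket
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def head_status(url, timeout=10):
    """Send an HTTP HEAD request; return the status code, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status             # status after any redirects
    except urllib.error.HTTPError as error:
        return error.code                      # e.g. 404 or 500
    except (urllib.error.URLError, OSError):
        return None                            # DNS failure, timeout, refused


def resolve_ip(url):
    """Return the current IPv4 address of the URL's host, or None."""
    try:
        return socket.gethostbyname(urlsplit(url).hostname)
    except (OSError, TypeError):
        return None


def review_link(url, recorded_ip, head=head_status, resolve=resolve_ip):
    """Classify a link as 'ok', 'broken', or 'needs review' (IP changed)."""
    status = head(url)
    if status is None or status >= 400:
        return "broken"
    if recorded_ip is not None and resolve(url) != recorded_ip:
        return "needs review"                  # possibly a new admin
    return "ok"
```

The `head` and `resolve` parameters exist so the classification logic can be exercised without touching the network. Some servers answer HEAD differently from GET, so a "broken" result may deserve a second look before a listing is removed.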
A simple, clean 302.
I have to say that I'm a little p. off both with Google and MSN. MSN took the simple and clean 302s (yes, built with MICROSOFT technology, response.redirect) and cached the target pages as my own content. Very clever. Made me look like one of these hijackers. Thank you MSN. Google is just refusing to give any usable guidelines to professional coders. What could we use to count? ha?
Here is my solution that will both work for your visitors and for these "high skilled" search engines:
Just keep your redirection script but change it so that it shows the URL as a hyperlink. Place some additional text at the top which states something like:
"You're leaving our website now .. tralala...click this link to go .. tralala...while visiting this website you may do ... tralala"
While you are loading the page you can check the link and remove it for the next visitor, or even display a message that states something like: "This server may be temporarily down..." And do count the display of this redirection page as a click.
Completely stupid, completely unnecessary, just more traffic from Googlebot - but it should work. I will do it on my site tomorrow.
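The "leaving our website" page described above can be sketched as follows. This is an illustration of the idea rather than the poster's actual page: `leaving_page`, `page_views` and the `is_alive` flag are hypothetical names, and the wording of the notice is a placeholder.

```python
from collections import Counter
from html import escape

# In-memory stand-in for the real click store (assumption).
page_views = Counter()


def leaving_page(target_url, is_alive=True):
    """Build the HTML for an interstitial page that links to target_url."""
    page_views[target_url] += 1        # displaying the page counts as the click
    if not is_alive:
        return "<p>This server may be temporarily down.</p>"
    safe = escape(target_url, quote=True)   # keep the URL from breaking the markup
    return (
        "<p>You are leaving our website now.</p>\n"
        f'<p><a href="{safe}">{safe}</a></p>'
    )
```

The `is_alive` value would come from whatever link check runs while the page is being built.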
If you do a 301, just be aware that it is a permanent redirection. It means:
This content is no longer fresh, current .. whatever - go to the new address. Googlebot will do that - and I assume it will drop your page right after it has done so.
I did a 301 on some of my internal pages - and believe me ... I will NEVER EVER do it again...
itloc, not quite sure I know what you mean. I mean, I understand the idea but I'm not sure how the coding would work to mask the link correctly.
Can you show me an example snippet of code that you would use on your site?
I think itloc refers to making the redirect script generate a frameset with a message for the user in a top frame and the website in a bottom frame, as About.com does, for example.
As for the 301: it's when it is used on real pages that those pages drop. I haven't tried it in a redirect script yet, because I'm also a bit uncertain, having seen the effects on real pages.
Perhaps it will work, as the redirect script is not a real page. A redirect script is a "virtual page", i.e. a page that doesn't exist. Still, the SEs think there's a page on that URL, as by some strange logic a URL that returns something must be a page. Not true.
I feel really bad about the SE behaviour on this issue, as they leave us no option but to do the frames thing. That is not only an artificial inflation of our page numbers and extra server load for nothing - it's also very annoying to most users (as in "reason to use another directory instead").
So to be friends with the SEs we have to annoy our users. No way I'm going to do that. Users first, always.
Of course we can just link using straight text links, and I tend to do this instead, but: this makes things a lot easier for scrapers, and it removes the possibility of identifying the most popular links, which is a directory feature that a lot of users use and like a lot.
301s and 302s are standard tools that webmasters can and should use. They are transparent to users, so they don't bother any real people. I do see the SE attitude as an attempt to take those tools away from us, and I can't find a word strong enough to describe my feelings about this.
I will make another attempt at the verification and the counting. I will do it exactly the way Google is doing it. Otherwise my customers will get a bad referer (the redirection page) and I don't want that.
I will send you the page when I have done it...
Yes, I fully agree with you. There are a lot of guys out there who are using all kinds of redirections in a wrong and harmful way. The lack of proper guidelines leads to some very strange behaviour. After Allegra, around 10 webmasters joined me in discussing what exactly happened, why they had dropped and so on. My discoveries after several hours of intensive discussions:
1. Some have written an email to every single site that linked to them using a 302, asking to remove the link. Not hijackers, btw - simple, clean 302s.
2. Everybody seems to know how "Evil" a 302 is. Asking them: "Is a 301 better?" you will get the answer "What is a 301?"
One guy even tried to have his link removed from a website of the catholic church - if that is not paranoia - what else...
Could you please send me a page as well?
>>The problem, is that we have over 100,000 outbound links on our site and it is virtually impossible to manually check all the links.
What prevention method can I take in order to make sure I'm not penalized for broken links or any site that is involved with methods that search engines frown upon?
If your site is large, well established and has a decent PR, I would not worry too much about it. Linking to a couple of PR0 sites will not kill a stable, established site. It might have a small effect, but that should be minor. Brand new sites and low PR directories perhaps have more to worry about. But who is building and running the directory, you or Google?
Of course you do want to police link rot, but I would do that for your users' sake, not for Google's.
|The problem, is that we have over 100,000 outbound links on our site and it is virtually impossible to manually check all the links. |
I thought this was the purpose of running a "directory" as opposed to a spider-based "search engine".
|What prevention method can I take in order to make sure I'm not penalized for broken links or any site that is involved with methods that search engines frown upon? |
You may want to consider using a form submit disguised as a link and passing the info via the POST method.
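A form submit disguised as a link might be generated along these lines. The helper `post_link` and the inline styling are hypothetical, and the sketch only shows the markup; whether search engines treat such a form differently from an ordinary link is the poster's suggestion, not something verified here.

```python
from html import escape


def post_link(action, target_url, label):
    """Return an HTML form whose submit button is styled to look like a link."""
    return (
        f'<form method="post" action="{escape(action, quote=True)}" '
        'style="display:inline">'
        # The target travels in the POST body instead of the query string.
        f'<input type="hidden" name="url" value="{escape(target_url, quote=True)}">'
        '<button type="submit" style="background:none;border:none;padding:0;'
        'color:blue;text-decoration:underline;cursor:pointer">'
        f'{escape(label)}</button></form>'
    )
```

For example, `post_link("/link.php", "http://example2.com/", "Example 2")` yields a form that posts the url field to the counting script, which would then read it from the POST body rather than from the query string.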
mgeyman and Ept:
Stickied you the relevant page with my solution. I'm kind of proud of it and love it ;-)
Frontpage also has a link checker :)
Don't know if this will get nuked or not, but I use Xenu. It's an excellent piece of software that checks all links, including outbound ones, with a good little report at the end.