Forum Moderators: Robert Charlton & goodroi
Alexa surely knows that googlebot and msnbot cannot handle these redirects. The bots' algorithms were never meant to do this convoluted filtering, and they simply cannot cope.
I challenge "googleguy", "google", "msnsearch", or more learned search engine specialists to disprove this cast-iron fact.
Millions of websites across the globe whose owners have spent much money and time to create them should not be treated in this manner. In the name of common sense, why is Alexa allowed to preserve its pagerank and destroy sites with a directive that is frowned upon by all search engines? Should we all now use CGI, ASP, GO-PHP, or Perl-based scripts for linking? Deny robots access at .htaccess to the pages our outbound links sit on? Buy throwaway domains and use them for IP DELIVERY containing 302 directives to our own benefit until this problem is eradicated? Using these redirects must place a credibility issue on any site that pours them out in an automated process that can rival thousands of very fast typists.
Blatant 302 temporary redirect directives that preserve Alexa's own pagerank and the links that point to yours are as deadly as they come. They pour out 302 redirects in their thousands, maybe millions, who knows. The icing on the cake is a dynamic page generated with your website's link in it, condemning your site into total oblivion in search results.
God help websites that are listed in alexa.com. Even if yours is not, Alexa will pick up your pure HTML link that points to a site that is in Alexa. Alexa will then reward you with a DEADLY 302 DIRECTIVE, totally without your consent, and googlebot will demote your site accordingly, conforming to the protocol it has been fed: that your website is a temporary page for the address where the link in Alexa resides.
URL = [redirect.alexa.com...]
UAG = Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
AEN =
FMT = AUTO
REQ = GET
Sending request:
GET /redirect?www.yoursite.com%2F HTTP/1.0
Host: redirect.alexa.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
• Finding host IP address...
• Finding TCP protocol...
• Binding to local socket...
• Connecting to host...
• Sending request...
• Receiving response...
Total bytes received = 410
Elapsed time so far: 0 seconds
Header (Length = 199):
HTTP/1.1•302•Found(CR)
(LF)
Date:•Sun,•06•Mar•2005•01:35:51•GMT(CR)
(LF)
Server:•Apache(CR)
(LF)
Location:•http://www.yoursite.com/(CR)
(LF)
Content-Length:•211(CR)
(LF)
Connection:•close(CR)
(LF)
Content-Type:•text/html;•charset=iso-8859-1(CR)
(LF)
(CR)
(LF)
Content (Length = 211):
<!DOCTYPE•HTML•PUBLIC•"-//IETF//DTD•HTML•2.0//EN">(LF)
<html><head>(LF)
<title>302•Found</title>(LF)
</head><body>(LF)
<h1>Found</h1>(LF)
<p>The•document•has•moved•<a•href="http://www.yoursite.com/">here</a>.</p>(LF)
</body></html>(LF)
Done
Elapsed time so far: 0 seconds
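For anyone who wants to check a response like the trace above without a dedicated HTTP viewer, here is a minimal Python sketch that parses a raw response header block and pulls out the status code and the Location target. The header text and the www.yoursite.com address are copied from the trace as placeholders, not live endpoints:

```python
def parse_redirect(raw_header: str):
    """Parse an HTTP response header block; return (status, location).

    location is None when the response carries no Location header.
    """
    lines = raw_header.strip().splitlines()
    # Status line looks like: HTTP/1.1 302 Found
    status = int(lines[0].split()[1])
    location = None
    for line in lines[1:]:
        # partition on the FIRST colon only, so dates with colons survive
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            location = value.strip()
    return status, location

# The header block from the trace above:
header = """HTTP/1.1 302 Found
Date: Sun, 06 Mar 2005 01:35:51 GMT
Server: Apache
Location: http://www.yoursite.com/
Content-Length: 211
Connection: close
Content-Type: text/html; charset=iso-8859-1"""

status, location = parse_redirect(header)
print(status, location)  # 302 http://www.yoursite.com/
```

A 302 is a *temporary* redirect: it tells a crawler the redirecting URL is still the real address and the target is only a stopover, which is exactly the behaviour being complained about in this thread.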
It's not your fault. You sound concerned and I raise my hat to you.
This is a google problem.
Follow the main thread at [webmasterworld.com...]
And please contribute to the thread with a comment describing your anxiety.
As indicated by shri, a good reply. But it still remains Google's problem to fix.
So perhaps google has already taken care of this? Or am I missing something...
I have asked Alexa to remove my site and they promptly did so.
Japanese has the villagers all running amok with torches looking to burn witches.
Why would you do that?
Alexa is not malicious, not in the slightest.
Yahoo uses a 302 redirect, did ya know that?
Go tell them to promptly remove you as well....
Japanese has the villagers all running amok with torches looking to burn witches
Yes, I quite agree.
It is hilarious to see these people over-reacting to the issue. So many people believing their site has been hijacked when in actual fact it hasn't.
If they'd bother to study the subject and learn how 302 hijacking works, they'd find that it isn't too hard to spot the difference between a normal 302 redirect and a malicious (hijacking) 302 redirect.
Still. With all these people running around getting their backlinks pulled, it can only be good news for the rankings of the rest of us :-)
If you're referring to the fact that we redirect before the site leaves
Alexa.com, but still deliver the visitors to the site in question, it is
our right to use redirects to track where people go on our site and that
behavior will not be changed.
I have not now, nor have I EVER, seen any traffic from Alexa for any of my 25 clients, so no great loss even if I am wrong in worrying about this redirect.
PS. they haven't removed the links and it's been about a week now.
I challenge "googleguy" "google" "msnsearch" or more learned search engine specialists to disprove this cast iron fact.
I think you'll find that society generally abides by the "innocent until proven guilty" guide, and most people would expect the onus to be on you to prove your own allegations.
302 redirects in themselves are not capable of properly hijacking a site
They do change the site listing in Google and cause searchers to take an extra hop via the redirecting server before arriving at their destination.
However, in order to fully hijack a site, the redirecting site has to implement some extra code other than the 302 redirect.
I think Japanese has investigated this problem in depth and has made a good case for his point of view. You have said nothing to prove your case!
Japanese makes a case about some site hijackings, but Alexa has been around forever and has all of us in their web directory, so suddenly a call to arms with villagers and torches is just plain nuts.
I've asked before and have not been shown a single site "hijacked" by Alexa!
Show me proof and I'll agree with his premise on Alexa
- anyone EVER see a site hijacking by Alexa's scripts?
Considering Alexa has just about every site on the internet indexed, wouldn't it make sense that, if Alexa hijacking were a real issue, they would have hijacked EVERYONE that could be hijacked by now?
Did his homework....
That's where he found my site and many others to point a 302 link at. But he could probably just as well have used the Google results or DMOZ, so you can't really blame Alexa for showing sites that are being visited. Or is it something more sinister?
I quote from their website:
"Alexa is continually crawling all publicly available web sites to create a series of snapshots of the Web. We use the data we collect to create features and services:
Site Information: Traffic rankings, pictures of sites, links pointing to sites and more
Related Links: Sites that are similar to the one you are currently viewing
Alexa has been crawling the Web since early 1996, and we have continually increased the amount of information that we gather. We are currently gathering approximately 1.6 Terabytes (1600 gigabytes) of Web content per day. After each snapshot of the Web, which takes approximately two months to complete, Alexa has gathered 4.5 Billion pages from over 16 million sites."
I quote from their website:
"Alexa is continually crawling all publicly available web sites to create a series of snapshots of the Web. We use the data we collect to create features and services:
I can't stress enough how silly this conversation is. You obviously don't understand Alexa, grabbing random quotes off their site isn't going to help.
The archiver that is mentioned in that quote would prevent a graphical preview of your site, and that's about it. You can stop it with robots.txt if you want but it won't take your listing down.
Nor would you want to. They are not hijacking sites.
Simple as that.
If Alexa was really hijacking sites then there would be examples of it, it would be clear to see in Google. They aren't.
If anyone thinks that Alexa is hijacking their site then sticky me. I'm waiting for the evidence...
/me watches the tumbleweed getting blown across his stickymail inbox.
incrediBILL stated that Alexa does not crawl sites
I was talking in terms of building the SEARCH results, which plainly say "Powered by Google".
Something Alexa DOES do that I forgot about (it was late) is they feed the Internet Archive (archive.org) WayBack Machine which is why their little crawler is actually called "ia_archiver".
To prevent ia_archiver from visiting add this to your robots.txt file:
User-agent: ia_archiver
Disallow: /
Of course, then you'll never show up in the WayBack Machine, and that service has been useful more than once in copyright disputes to prove how long ago specific text was on my site.
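If you do add that rule, you can sanity-check it before deploying with the robot parser in Python's standard library. A small sketch, assuming the two-line robots.txt above; the example.com URL is a placeholder:

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules suggested above, fed in as lines of text
rules = """\
User-agent: ia_archiver
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# ia_archiver is blocked from every path...
print(rp.can_fetch("ia_archiver", "http://www.example.com/page.html"))  # False
# ...while crawlers not named in the file remain allowed
print(rp.can_fetch("Googlebot", "http://www.example.com/page.html"))    # True
```

Note that a compliant crawler only honours these rules voluntarily; robots.txt is a request, not an access control.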
I can admit when I make an oversight :)
incrediBILL stated that Alexa does not crawl sites; the quote was just to prove how wrong he is.
Actually, incrediBILL is right.
It's not Alexa that's crawling the site, it's The Internet Archive's Wayback machine.
The Wayback machine uses Alexa's database of sites (gathered from Alexa toolbar users), but Alexa and the Internet Archive are two separate organisations.
Take this as another scenario...
Q. If Google were to allow you to access their database of 8 billion URLs, and you were to write a program to crawl each of those URLs, who would be doing the crawling?
A. You of course!
By your logic, it seems that you would believe that Google would be doing the crawling!