Alexa surely knows that Googlebot and msnbot cannot handle these redirects. The bots' algos were never meant to do this convoluted filtering, and they simply cannot cope.
I challenge "googleguy", "google", "msnsearch", or more learned search engine specialists to disprove this cast-iron fact.
Millions of websites across the globe, whose owners have spent much money and time creating them, should not be treated in this manner. In the name of common sense, why is Alexa allowed to preserve its PageRank and destroy sites with a directive that is frowned upon by all search engines? Should we all now use CGI, ASP, GO-PHP, or Perl-based scripts for linking? Deny robots access via .htaccess to link pages pointing out, buy throwaway domains, and use them for IP delivery containing 302 directives to one's own benefit until this problem is eradicated? Using these redirects must raise a credibility issue for any site that pours them out in an automated process that can rival thousands of very fast typists.
Blatant use of 302 temporary redirect directives that preserve Alexa's own PageRank, along with the links that point to yours, is as deadly as it comes. They pour out 302 redirects in their thousands, maybe millions, who knows. The icing on the cake is a dynamically generated page with your website's link in it, condemning your site to total oblivion in search results.
God help websites that are listed in alexa.com. Even if you are not, Alexa will pick up your pure HTML link that points to a site that is in Alexa. Alexa will then reward you with a DEADLY 302 DIRECTIVE, totally without your consent, and Googlebot will demote your site accordingly, conforming to the protocol it has been fed: that your website is a temporary location for the page where the link in Alexa resides.
URL = [redirect.alexa.com...]
UAG = Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
FMT = AUTO
REQ = GET
GET /redirect?www.yoursite.com%2F HTTP/1.0
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
• Finding host IP address...
• Finding TCP protocol...
• Binding to local socket...
• Connecting to host...
• Sending request...
• Receiving response...
Total bytes received = 410
Elapsed time so far: 0 seconds
Header (Length = 199):
Content (Length = 211):
Elapsed time so far: 0 seconds
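For anyone who wants to repeat a check like the one above without a third-party tool, here is a minimal Python sketch. It follows the same steps as the trace (connect, send a bare HTTP/1.0 GET, read the raw response) and splits the reply into status code, headers, and body so you can see whether a 302 and a Location header come back. The function names (`build_request`, `split_response`, `fetch_raw`) are my own, not from any tool, and the `Host` header is an addition the trace does not show, since most servers need it to pick the right virtual host.

```python
import socket


def build_request(host: str, path: str, user_agent: str) -> bytes:
    """Build a bare HTTP/1.0 GET request similar to the one in the trace."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",  # assumption: not shown in the trace
        f"User-Agent: {user_agent}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw: bytes):
    """Split a raw HTTP response into (status_code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    header_lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like "HTTP/1.0 302 Found".
    status_code = int(header_lines[0].split(" ", 2)[1])
    headers = {}
    for line in header_lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


def fetch_raw(host: str, path: str, user_agent: str,
              port: int = 80, timeout: float = 10.0) -> bytes:
    """Send the request and return everything the server sends back.

    Redirects are NOT followed, which is the point: we want to see the
    redirect response itself.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path, user_agent))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # HTTP/1.0: server closes the connection when done
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

To use it, call `fetch_raw(host, path, user_agent)` with the host and path you want to test, pass the result to `split_response`, and look at the status code and `headers.get("location")`.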
It's not Alexa that's crawling the site, it's The Internet Archive's Wayback machine.
I thought that was the case too but the WayBack machine plainly states:
"the Archive has been receiving data donations from Alexa Internet and others"
Now if that's just the list of web sites to crawl, I suppose that constitutes a data donation.
The fact that the crawler is called "ia_archiver" kind of points back to the Internet Archive doing the crawling and not Alexa, but I'm not sure which one of them actually runs the crawler.
LOL, don't ask for other people's advice. You should learn about the issues for yourself and make your own decision.
This forum is absolutely crammed full of trolls that would happily advise you remove every 302 redirect that's pointing at your site. And they'll probably be laughing as your SERPS start to tumble.
Unless you are being hijacked (and I mean properly hijacked, not just having a messed up listing in Google), leave well clear of any silly ideas.
my basic premise is that the 3xx series of messages are part of the http protocol. see rfc 2616. nothing wrong with anyone using these. now wait before you declare me insane, there's more ...
a search engine is supposed to have the core mission of indexing pages and returning references to those pages in response to a search.
with me so far?
now, irrespective of X arbitrary number of redirects of any kind recognised by the crawler of the subject engine, it is the page not the link that should get the final credit.
i don't care if it's a 3xx or meta redirect. if the crawler resolves it then it knows where it ended up because it had to resolve the url to get there.
that's the general case applying to all search engines.
in the particular case of google, they like this ranking by link and anchor text thingie. that's fine, but it has nothing to do with resolving where the page actually lives. and the url of that page should get the credit for the content. that *is* what the user wants is it not?
now, if it's a 302, meaning temporarily moved, then fine, still give the final url the credit and try to crawl it again later to see if it still exists.
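To make the "credit the final url" idea concrete, here is a rough Python sketch. It is only an illustration of the argument, not how any search engine actually works. The `fetch` callable is a hypothetical stand-in for a crawler's HTTP layer: it takes a url and returns the status code plus the Location header (or None). The redirect codes listed are the ones from RFC 2616.

```python
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical crawler hook: fetch(url) -> (status_code, location_or_None)
Fetch = Callable[[str], Tuple[int, Optional[str]]]

# Redirect status codes defined in RFC 2616 that carry a Location header.
REDIRECT_CODES = {301, 302, 303, 307}


def resolve_final_url(start_url: str, fetch: Fetch,
                      max_hops: int = 10) -> Tuple[str, List[Tuple[str, int, str]]]:
    """Follow redirects and return (final_url, hops).

    Each hop is (from_url, status_code, to_url). The final_url is the page
    that would get the credit, whatever kind of redirect led to it.
    """
    url = start_url
    hops: List[Tuple[str, int, str]] = []
    seen = {url}
    for _ in range(max_hops):  # at most max_hops fetches
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return url, hops  # not a redirect: this is where the content lives
        target = urljoin(url, location)  # Location may be relative
        hops.append((url, status, target))
        if target in seen:
            raise ValueError(f"redirect loop at {target}")
        seen.add(target)
        url = target
    raise ValueError(f"gave up after {max_hops} fetches")


# Usage with a made-up two-page "web" (no network needed):
fake_web = {
    "http://redirector.example/go?site": (302, "http://www.example.org/"),
    "http://www.example.org/": (200, None),
}
final_url, hops = resolve_final_url(
    "http://redirector.example/go?site", lambda u: fake_web[u]
)
print(final_url)  # the url that should be indexed for the content
print([h for h in hops if h[1] == 302])  # temporary hops worth recrawling later
```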
if the phd wants to differentiate and make things complicated then fine, differentiate between 302's within the same domain and to other domains. or they could use the technology from crossref.org which apparently libraries and scholars use to uniquely identify thingies down to the citation level in scholarly works. but they would have to pay royalties on that.
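The same-domain versus other-domain split is cheap to compute. Here is a minimal Python sketch, assuming a plain hostname comparison is good enough. That is a simplification: `www.example.com` and `example.com` would count as different, and comparing registered domains properly would need a public suffix list. The function name `classify_redirect` and the labels it returns are made up for illustration.

```python
from urllib.parse import urljoin, urlsplit


def classify_redirect(source_url: str, location: str) -> str:
    """Return 'same-host' or 'cross-host' for a redirect.

    source_url is the url that answered with the redirect; location is the
    raw Location header value, which may be relative.
    """
    target = urljoin(source_url, location)
    # urlsplit().hostname is already lowercased, or None if there is no host.
    src_host = urlsplit(source_url).hostname or ""
    dst_host = urlsplit(target).hostname or ""
    return "same-host" if src_host == dst_host else "cross-host"


# A relative Location stays on the same host; an absolute one may not.
print(classify_redirect("http://www.example.com/old", "/new"))
print(classify_redirect("http://redirector.example/go?x", "http://www.example.org/"))
```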
it's broken. they don't know how to fix it. time to move on.
in the extreme case, just ignore them and design the site the way it should be designed. if your site has value, it will still be found by hook or by crook.
as i've said before, it ain't rocket science. they just want you to think it is. because otherwise you would have to conclude that it's broken.
one cat that i know of has about 10 real vendors in it. i've watched it move from about 1.8 million results to 8 million results in the past 6 months for a 3 word search. at one point the three words in quotes actually claimed to have 50% more results than the unquoted search of the same words. huh? that ain't the math i learned. this oddity was consistent for about 3 months pre-allegra.
They need to change the way they handle hijacking 302 redirects. They can leave all other 302 redirects as they are, interpreting the RFC correctly.
They need a way of detecting whether the 302 is likely to cause a hijack (most 302's aren't), and if it does, they should penalise just that redirect page.