Google SEO News and Discussion Forum

302 Redirects to/from Alexa?
japanese
msg:725810
2:30 am on Mar 6, 2005 (gmt 0)

I cannot believe what I am seeing alexa.com do on its popular-websites information pages. All a competitor has to do is place a link on his index page pointing to Alexa's page that contains deadly links to yours, and your site could be history, especially if your PageRank is 0 to 4.

Alexa surely knows that googlebot and msnbot cannot handle these redirects. The bots' algorithms were never meant to do this convoluted filtering, and they simply cannot cope.

I challenge "googleguy", "google", "msnsearch" or more learned search engine specialists to disprove this cast-iron fact.

Millions of websites across the globe whose owners have spent much money and time creating them should not be treated in this manner. In the name of common sense, why is Alexa allowed to preserve its PageRank and destroy sites with a directive that is frowned upon by all search engines? Should we all now use CGI, ASP, GO-PHP, or Perl-based scripts for linking? Deny robots access via .htaccess to pages with outgoing links, or buy throwaway domains and use them for IP delivery containing 302 directives to our own benefit until this problem is eradicated? Using these redirects must place a credibility issue on any site that pours them out in an automated process that can rival thousands of very fast typists.
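To illustrate the "script-based links with robots denied" idea, here is a rough sketch of an outbound-link redirect script. The file name, query parameter, paths and whitelist are placeholders of my own, not anything Alexa or the engines prescribe; the point is simply that the 302 lives on a path robots are told not to crawl.

#!/usr/bin/env python3
# out.py -- illustrative outbound-link redirect CGI script (placeholder names).
# Pair it with a robots.txt rule so no crawler ever fetches the 302, e.g.:
#     User-agent: *
#     Disallow: /cgi-bin/out.py
import os
import sys
from urllib.parse import parse_qs

# Whitelist of outbound targets so the script cannot be abused as an open redirect.
ALLOWED = {"http://www.example-partner.com/"}

query = parse_qs(os.environ.get("QUERY_STRING", ""))
target = query.get("url", [""])[0]

if target in ALLOWED:
    # Robots are blocked from this path, so search engines never index the redirect itself.
    sys.stdout.write("Status: 302 Found\r\nLocation: %s\r\n\r\n" % target)
else:
    sys.stdout.write("Status: 404 Not Found\r\nContent-Type: text/plain\r\n\r\nUnknown link\r\n")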

Blatant use of 302 temporary-redirect directives that preserve Alexa's own PageRank while pointing at your pages is as deadly as it comes. They pour out 302 redirects in their thousands, maybe millions, who knows. The icing on the cake is a dynamically generated page with your website's link in it, condemning your site to total oblivion in search results.

God help websites that are listed in alexa.com. Even if yours is not, Alexa will pick up your pure HTML link that points to a site that is in Alexa, and will then reward you with a DEADLY 302 DIRECTIVE, totally without your consent. Googlebot will demote your site accordingly, conforming to the protocol it has been fed: that your website is merely a temporary page for the Alexa URL where the link resides.

URL = [redirect.alexa.com...]
UAG = Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
AEN =
FMT = AUTO
REQ = GET
Sending request:
GET /redirect?www.yoursite.com%2F HTTP/1.0
Host: redirect.alexa.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

• Finding host IP address...
• Finding TCP protocol...
• Binding to local socket...
• Connecting to host...
• Sending request...
• Receiving response...

Total bytes received = 410
Elapsed time so far: 0 seconds
Header (Length = 199):
HTTP/1.1 302 Found
Date: Sun, 06 Mar 2005 01:35:51 GMT
Server: Apache
Location: http://www.yoursite.com/
Content-Length: 211
Connection: close
Content-Type: text/html; charset=iso-8859-1

Content (Length = 211):
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="http://www.yoursite.com/">here</a>.</p>
</body></html>

Done
Elapsed time so far: 0 seconds
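
For anyone who wants to reproduce the check above, here is a minimal sketch in Python. The host and path simply mirror the transcript (substitute your own URL); it issues the GET without following the redirect, so you can see the 302 status and Location header for yourself.

# Minimal sketch: request a redirect.alexa.com-style URL without following
# the redirect; the host/path mirror the transcript above.
import http.client

HOST = "redirect.alexa.com"
PATH = "/redirect?www.yoursite.com%2F"

conn = http.client.HTTPConnection(HOST, 80, timeout=10)
conn.request("GET", PATH, headers={
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
})
resp = conn.getresponse()  # http.client never auto-follows redirects

print("Status:  ", resp.status, resp.reason)        # expect: 302 Found
print("Location:", resp.getheader("Location"))      # expect: http://www.yoursite.com/
print("Body bytes:", len(resp.read()))
conn.close()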

 

incrediBILL
msg:725840
3:55 pm on Mar 18, 2005 (gmt 0)

"It's not Alexa that's crawling the site, it's The Internet Archive's Wayback machine."

I thought that was the case too, but the Wayback Machine plainly states:

"the Archive has been receiving data donations from Alexa Internet and others"

Now if that's just the list of web sites to crawl, I suppose that constitutes a data donation.

The fact that the crawler is called "ia_archiver" kind of points back to the Internet Archive doing the crawling and not Alexa, but I'm not sure which one of them actually runs the crawler.

T_Rex
msg:725841
6:20 pm on Mar 18, 2005 (gmt 0)

I am increasingly "hot to trot" to do some IP blocking. I see ia_archiver in my logs a lot, and it won't be hard to do some tracking and IP identification for archive.org. Right now I'm weighing the risks, and I don't understand the benefit of feeding this bandwidth to something that is starting to smell like a liability as this thread goes on. Can somebody please help me understand why feeding this bandwidth is of practical use?

mrMister
msg:725842
8:39 am on Mar 19, 2005 (gmt 0)

Why on earth is it a liability?

LOL, don't ask for other people's advice. You should learn about the issues for yourself and make your own decision.

This forum is absolutely crammed full of trolls who would happily advise you to remove every 302 redirect that's pointing at your site. And they'll probably be laughing as your SERPs start to tumble.

Unless you are being hijacked (and I mean properly hijacked, not just having a messed-up listing in Google), steer well clear of any silly ideas.

incrediBILL
msg:725843
9:09 am on Mar 19, 2005 (gmt 0)

Heck, I bet some of them wear foil helmets to avoid getting the ultimate hijack.

plumsauce
msg:725844
10:15 am on Mar 19, 2005 (gmt 0)

ok, i just put my tinfoil beanie on, and want to share some thoughts on this 302 thing. again.

my basic premise is that the 3xx series of messages are part of the http protocol. see rfc 2616. nothing wrong with anyone using these. now wait before you declare me insane, there's more ...

a search engine is supposed to have the core mission of indexing pages and returning references to those pages in response to a search.

with me so far?

now, irrespective of X arbitrary number of redirects of any kind recognised by the crawler of the subject engine, it is the page not the link that should get the final credit.

i don't care if it's a 3xx or meta redirect. if the crawler resolves it then it knows where it ended up because it had to resolve the url to get there.

that's the general case applying to all search engines.

in the particular case of google, they like this ranking by link and anchor text thingie. that's fine, but it has nothing to do with resolving where the page actually lives. and the url of that page should get the credit for the content. that *is* what the user wants, is it not?

now, if it's a 302, meaning temporarily moved, then fine, still give the final url the credit and try to crawl it again later to see if it still exists.

if the phd wants to differentiate and make things complicated then fine, differentiate between 302's within the same domain and to other domains. or they could use the technology from crossref.org which apparently libraries and scholars use to uniquely identify thingies down to the citation level in scholarly works. but they would have to pay royalties on that.
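
to make that concrete, here's a rough sketch of the kind of rule i mean (purely illustrative, my own pseudo-crawler logic, not anything any engine has published): credit the final url once the chain is resolved, and only mark the 302 as suspect when it crosses domains.

# illustrative only -- not any engine's actual algorithm.
from urllib.parse import urlparse

def credit_url(start_url, hops):
    # hops = list of (status, location) redirects the crawler followed
    final_url = hops[-1][1] if hops else start_url
    cross_domain_302 = any(
        status == 302 and urlparse(loc).netloc != urlparse(start_url).netloc
        for status, loc in hops
    )
    # the content is indexed under final_url either way; a cross-domain 302
    # just means "re-crawl later", never "treat start_url as the real home".
    return final_url, cross_domain_302

# alexa's redirect resolves to yoursite.com, so yoursite.com gets the credit
print(credit_url("http://redirect.alexa.com/redirect?www.yoursite.com",
                 [(302, "http://www.yoursite.com/")]))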

bottomline?

it's broken. they don't know how to fix it. time to move on.

in the extreme case, just ignore them and design the site the way it should be designed. if your site has value, it will still be found by hook or by crook.

as i've said before, it ain't rocket science. they just want you to think it is. because otherwise you would have to conclude that it's broken.

one category that i know of has about 10 real vendors in it. i've watched it move from about 1.8 million results to 8 million results in the past 6 months for a 3-word search. at one point the three words in quotes actually claimed 50% more results than the unquoted search of the same words. huh? that ain't the math i learned. this oddity was consistent for about 3 months pre-allegra.

++

mrMister
msg:725845
12:12 pm on Mar 19, 2005 (gmt 0)

Plumsauce, I think your solution as it stands would leave Google favouring doorway pages again.

They need to change the way they handle hijacking 302 redirects. They can leave all other 302 redirects as they are, interpreting the RFC correctly.

They need a way of detecting whether a 302 is likely to cause a hijack (most 302s aren't), and if it is, they should penalise just that redirecting page.
