Google Showing Outbound Re-Directed Pages In Results

         

c41lum

5:05 pm on Jul 19, 2010 (gmt 0)

10+ Year Member



Hi everyone,

I have noticed some strange behaviour in Google recently that has alarmed me, and I wanted to see whether anybody else has had the same issue and, more importantly, what the fix would be.

Google is showing our "outbound redirect pages" in its results. These pages have no content, just a title header and a script that redirects visitors out to an affiliate partner website that we write news about on a daily basis. What's baffling me is that these pages are blocked in our robots.txt file.

It looks like these pages are supplemental because there is no cached copy. I would also expect them to be supplemental because they offer no value to a user. What worries me is that we have recently been hit pretty hard in the SERPs, and I wonder whether these supplemental pages could be the problem.

Our page works like this: there is a link that says 'More Info'. When clicked, it goes to a page on the server called blue.asp?12342, and the script then redirects the user to the desired affiliate partner. What Google is doing is listing the blue.asp? pages in its results. When I click the link in the results, the page just links straight through to that affiliate.


When I run a "site:" search, Google shows 2194 results for these redirected pages. We have 72k pages indexed.

Am I barking up the wrong tree, or could these redirects be having a detrimental effect on our site quality and therefore our positions?

Fraggler

9:24 pm on Jul 19, 2010 (gmt 0)

10+ Year Member



Yep, I am experiencing the same thing, but my links are auto-generated through a WordPress plugin and the .htaccess points them to the right location. They appear with the site: command with the title and description of the end page.

This has been happening for a few weeks now, but I never investigated it any further.

tedster

9:33 pm on Jul 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Are these pages returning a 301 status in the HTTP header?

c41lum

9:50 pm on Jul 19, 2010 (gmt 0)

10+ Year Member



We are doing a 302 in the header.
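To confirm which status code a redirect script really sends, you can request the URL without following the redirect. Below is a minimal Python sketch using only the standard library. The function names and the example URL in the comment are made up for illustration, and the descriptions returned by describe_status() paraphrase the advice in this thread rather than any documented Google behaviour.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url):
    """Request url once, without following redirects.

    Returns a (status_code, location_header) tuple. http.client never
    follows redirects on its own, so a 301/302 is reported as-is.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe_status(status):
    """Rough label for the status codes discussed in this thread."""
    if status == 301:
        return "permanent redirect"
    if status in (302, 307):
        return "temporary redirect"
    if status == 200:
        return "page served directly, no HTTP redirect"
    if status in (404, 410):
        return "not found or gone"
    return "other"


# Example call (placeholder host, not a real site):
# status, location = fetch_status("http://www.example.com/blue.asp?12342")
# print(status, describe_status(status), location)
```

If the script prints 200 rather than 301 or 302, the redirect is probably being done in the page itself (meta refresh or JavaScript) rather than in the HTTP header.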

aristotle

9:51 pm on Jul 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you can put a noindex meta tag in the head section of these pages, that should prevent Google from indexing them.

tedster

10:00 pm on Jul 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



We are doing a 302 in the header.

And that is why the URL ranks.

c41lum

10:28 pm on Jul 19, 2010 (gmt 0)

10+ Year Member



Hi guys thanks for the help.

Do you think it's the 302 that's getting these pages ranked? I thought the 302 told G that the page had moved temporarily. I never thought they would rank the redirect page.

It's basically a redirect that goes outbound. Isn't doing a 301 to an outbound affiliate URL bad?

Sorry if I seem like I'm going round in circles.

c41lum

10:55 pm on Jul 19, 2010 (gmt 0)

10+ Year Member



Using the site: command, it's now jumped from 2194 to over 21000 redirect pages showing in G.

Looks like this might be a bigger problem than originally thought. I imagine to Google we look very spammy/dodgy having this many redirect pages.

mhansen

12:14 am on Jul 20, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I noticed the same thing on one of my MayDay affected sites.

- The directory is blocked in the robots.txt
- The links have been rel=nofollowed wherever they exist in content for over a year, to prevent junk in the SERPs. (against the latest Matt Cutts directive)
- The URLs send a 301 in the header

Using site:domain.tld does NOT show the URLs in the SERPs unless I hit the "show more results" link at the bottom of the SERP.

MH

c41lum

12:26 am on Jul 20, 2010 (gmt 0)

10+ Year Member



Hi MH, I'm not sure your problem is exactly the same. My supplemental pages do a 302 redirect out to our affiliate partners, and it's the redirects that are showing in the G results.

Your 301 supplemental pages should be fine and, I would have thought, will drop off in time.

tedster

6:10 am on Jul 20, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here's how Google handles a URL that 302 redirects: they index that URL itself - because a 302 is a "temporary" redirect. This is exactly the effect you're seeing. So change those redirects to use a 301 Permanent status, and those URLs should fall out over time.
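As a rough illustration of the difference, here is a minimal Python sketch of a redirect script that can send either status. It is not the poster's ASP code. The lookup table, the partner URL, and the function and class names are all hypothetical, and the only point is that the status code is a one-line choice on the server.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical lookup table: redirect id (the query string) -> affiliate URL.
AFFILIATE_TARGETS = {"12342": "https://partner.example/offer"}


def redirect_response(query, permanent=True):
    """Return (status, headers) for the redirect id given in the query string.

    permanent=True sends 301 (Moved Permanently); permanent=False sends 302.
    Unknown ids get a plain 404 with no Location header.
    """
    target = AFFILIATE_TARGETS.get(query)
    if target is None:
        return 404, {}
    return (301 if permanent else 302), {"Location": target}


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers requests such as /blue.asp?12342 with a redirect header."""

    def do_GET(self):
        # Everything after the first "?" is treated as the redirect id.
        query = self.path.partition("?")[2]
        status, headers = redirect_response(query, permanent=True)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()


# To try it locally, pass RedirectHandler to http.server.HTTPServer
# and call serve_forever() on the resulting server object.
```

Whether a 301 is appropriate depends on the target being stable. If the destination for a given id changes often, a permanent redirect may not describe it accurately.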

johnnie

7:18 am on Jul 20, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Don't rely on robots.txt if you want a URL removed. A blocked URL can still show as discovered (no description etc.); the bots just won't *spider* it.
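To see what a robots.txt rule does and does not do, you can test it with Python's standard urllib.robotparser. This is only a sketch: the robots.txt content and the URLs below are assumptions modelled on the blue.asp example, not the poster's real file.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content that blocks the redirect script for all bots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blue.asp
"""


def can_crawl(robots_txt, url, agent="Googlebot"):
    """Return True if robots_txt allows the given agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The redirect URL is off limits to a compliant crawler ...
blocked = can_crawl(ROBOTS_TXT, "http://www.example.com/blue.asp?12342")
# ... while ordinary pages are still crawlable.
allowed = can_crawl(ROBOTS_TXT, "http://www.example.com/news/today.html")
print(blocked, allowed)
```

A Disallow rule only stops the fetch. A crawler that obeys it never sees the page's status code or any meta tag on it, so the URL can still be listed from links alone.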

c41lum

2:14 pm on Jul 20, 2010 (gmt 0)

10+ Year Member



After a lot more research, it turns out we can find 252,000 of these "redirect pages" sitting in Google's index or supplemental index.

We don't actually need these pages. They offer no benefit to our users; we simply had them to handle the redirects out to our affiliate partners.

I think these pages may be the reason for our MayDay hit, because they are essentially blank pages with a redirect.

Here's my plan of action so far. Any help or advice would be massively appreciated.

1. 404 all 250,000 old redirect pages that G returns. THIS REALLY WORRIES ME BECAUSE I'VE READ G DOESN'T LIKE LOTS OF 404S.
2. Google Removal Request on all old redirects.
3. Change redirect URL.
4. Make new redirect pages with rel=nofollow.
5. Restrict new URL in robots.txt.
6. Block new redirect pages using "Parameter handling" in WMT.

Is there anything else I should do, or am I using a sledgehammer to crack a nut?

Side note: Google found these links through JavaScript using window.open().

mhansen

5:11 pm on Jul 20, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Here's my plan of action so far. Any help or advice would be massively appreciated.

1. 404 all 250,000 old redirect pages that G returns. THIS REALLY WORRIES ME BECAUSE I'VE READ G DOESN'T LIKE LOTS OF 404S.
2. Google Removal Request on all old redirects.
3. Change redirect URL.
4. Make new redirect pages with rel=nofollow.
5. Restrict new URL in robots.txt.
6. Block new redirect pages using "Parameter handling" in WMT.


I know I am preaching to the choir here, but this is all just ONE MORE REASON it would be nice if GooG was a bit more forthcoming about why sites are seeing OBVIOUS penalties.

Since my own MayDay debacle... I have spent more time chasing and trying to correct the "whatif's" (what if there was nothing even wrong with what you change, and you just make it worse?)

#1 - ouch

#4 - if you nofollow an onsite affiliate hoplink, isn't that just as bad? MCutts just went out of his way a few weeks ago to say "DO NOT rel=nofollow onsite links" for any reason.

tedster

5:33 pm on Jul 20, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



THIS REALLY WORRIES ME BECAUSE IV READ G DOESN'T LIKE LOTS OF 404'S.

I read that too and I think it's total mythology. If you take care of removing the internal links to those removed URLs, then what's the problem? I've never seen one.

Any character string that doesn't resolve on your server is a 404 - that means every website has an infinite number of 404s! The problem is broken internal links. And not broken external links, either, although they are certainly a waste of potential link juice and require different handling.

c41lum

8:46 pm on Jul 20, 2010 (gmt 0)

10+ Year Member



Thanks for the feedback.

Yeah I think as long as we aren't linking to the 404 we should be ok.....fingers crossed.

mhansen, where did you read that about rel=nofollow links? I can't find it. I did read some stuff about G trying to stop PR sculpting. Was that it?

I was under the assumption I would have to rel=nofollow the links, otherwise G would try to index the "blank affiliate redirects" again.

I would 301 them, but the destination of the URL changes depending on the cheapest affiliate price.

Side note: prior to the MayDay update we spotted a huge spike in crawled pages, and it looks like these supplemental pages were the ones they were crawling. Has anybody else seen a huge jump in supplemental or robots-blocked pages showing up in the results?

c41lum

9:55 pm on Jul 20, 2010 (gmt 0)

10+ Year Member



I don't know whether this is a coincidence or not, but I added 10 or so rel=nofollows on lots of my main pages... and my site has bombed.

In just under 5 hours some pages have gone from bottom of page 1 to top of page 3. All pages have had some kind of drop.

We do get crawled by G very often.

I can't believe these "nofollows" could have this big an effect.

Any ideas?

tedster

10:28 pm on Jul 20, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How about using robots.txt to disallow the URL pattern for all these pages?

c41lum

10:52 pm on Jul 20, 2010 (gmt 0)

10+ Year Member



These outbound redirect URLs have always been blocked in robots.txt.

I never thought there was a problem until I spotted these thousands of pages with no description when I was doing a search for a product we list.

tedster

12:39 am on Jul 21, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



These outbound redirect URLs have always been blocked in robots.txt.

Then they will be removed from the index when you request it directly. Don't know why those URLs would be included in the first place - Google shouldn't even have the title, they should be url-only.

c41lum

9:27 am on Jul 21, 2010 (gmt 0)

10+ Year Member



The URLs that have been included in Google's index are those found through links that DON'T have rel="nofollow" on them. Pages with a robots meta tag stating nofollow also AREN'T affected, so this is clearly the control.

It seems that G was blocked from seeing the 302 redirect and instead received a 200 response header, but as the file was blocked in robots.txt, G listed an 'Uncrawled URL Reference' using the text from any link pointing to that page. I can't stress enough how it's only links without rel="nofollow" that are affected... the other links on the site ran through the same script and process and are not indexed.

Some, but not all, of the discovered links were using JavaScript, so we are unsure whether rel="nofollow" will help fix the problem with these. With Matt Cutts stating that the use of rel="nofollow" can be an issue with G, we may be between a rock and a hard place.

Have you any ideas as to the correct approach to resolving this?

Can some cloaking be good? I.e. serve a full page with a noindex,nofollow meta tag to non-users (bots), but let normal users 302 to the correct page?

c41lum

9:14 pm on Jul 21, 2010 (gmt 0)

10+ Year Member



Just a quick update on what I have done.

I decided not to 404 the indexed affiliate URLs and start with new ones. We did this for various reasons.

Here is what we have done to try to clear these pages from the index:

1. Blocked the parameter in WMT.
2. Rel-nofollowed the links.
3. Used a landing page which we noindex, nofollow.
4. Opened up robots.txt to the page so it can be seen by G, then hopefully they can de-index it.
5. Serve a 200 header to bots and users alike.

Is there anything else we should be doing? Thanks for all your help. I'll keep you posted with any changes.
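For anyone trying the same approach, this is roughly what the landing page in steps 3 to 5 could look like. It is a minimal Python sketch, not our actual ASP page: the function name, link text and partner URL are invented for illustration. The page is meant to be served with a normal 200 status to everyone and has to be crawlable (step 4), otherwise the robots meta tag is never seen.

```python
from html import escape


def landing_page(target_url, title="More Info"):
    """Build an HTML landing page that asks robots not to index it.

    The same markup is returned for bots and users, so nothing is cloaked.
    """
    safe_url = escape(target_url, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f"<title>{escape(title)}</title>\n"
        '<meta name="robots" content="noindex, nofollow">\n'
        "</head><body>\n"
        f'<a href="{safe_url}" rel="nofollow">Continue to partner site</a>\n'
        "</body></html>\n"
    )


# Placeholder partner URL, for illustration only.
page = landing_page("https://partner.example/offer?a=1&b=2")
print(page)
```

The noindex tag only works on pages the crawler is allowed to fetch, which is why the robots.txt block has to come off for this to have any effect.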