Google SEO News and Discussion Forum

    
How can I remove php pages from google?
serenoo




msg:4491970
 3:46 pm on Sep 6, 2012 (gmt 0)

I have some PHP pages on my website that contain only:
<?php
header('Location: http://www.example.com/?id=154077');
?>

I am an affiliate of example.com, and 154077 is my affiliate ID.

When I put a text link on my website I use <a href="the php page" rel="nofollow">, and those PHP pages are disallowed in my robots.txt.

But if I search with the site: operator, they appear in the Google index.
I tried to remove them with Webmaster Tools. It works, but after a few months they appear again.
I cannot put <meta name="robots" content="noindex, nofollow"/> in the head because the page is only a redirect.
Is there a way to remove them from the index, or another technique I could use for affiliate links?

 

deadsea




msg:4492088
 10:03 pm on Sep 6, 2012 (gmt 0)

One option: if you remove them from robots.txt, then Googlebot will crawl them, see that they are redirects, and not index them.

levo




msg:4492116
 11:07 pm on Sep 6, 2012 (gmt 0)

header('Location: http://www.example.com/?id=154077');
header('X-Robots-Tag: noindex, nofollow', true);

serenoo




msg:4492244
 9:17 am on Sep 7, 2012 (gmt 0)

Thank you for your help, deadsea and levo. I prefer levo's solution.
Once I add header('X-Robots-Tag: noindex, nofollow', true); will Google work out by itself that it has to remove the page from the index, or do I have to go to Webmaster Tools and remove it manually?

levo




msg:4492255
 9:56 am on Sep 7, 2012 (gmt 0)

You just have to remove them from robots.txt. No manual action needed.

deadsea




msg:4492261
 10:12 am on Sep 7, 2012 (gmt 0)

If Googlebot can't crawl the URL, it can't see those headers.

Robert Charlton




msg:4492571
 1:35 am on Sep 8, 2012 (gmt 0)

remove them from robots.txt

The issues with X-Robots noindex are exactly parallel to the issues with the meta robots noindex currently being discussed in detail in this thread...

Pages are indexed even after blocking in robots.txt
http://www.webmasterworld.com/google/4490125.htm

Yes, you must remove the pages from robots.txt. If you do that and use X-Robots noindex, you do not need to use the WMT removal tool.

serenoo




msg:4492662
 2:16 pm on Sep 8, 2012 (gmt 0)

Is the order important?
I mean:
Can I write
header('X-Robots-Tag: noindex, nofollow', true);
header('Location: http://www.example.com/?id=154077');
too?

phranque




msg:4492843
 2:09 am on Sep 9, 2012 (gmt 0)

the header order is not significant.

serenoo




msg:4492958
 3:30 pm on Sep 9, 2012 (gmt 0)

I have 48 PHP files for affiliate links, so I would like to include one shared file at the beginning of each redirect that contains header('X-Robots-Tag: noindex, nofollow', true); that way, the next time I have to change something, I only need to update one file.

nofollow.php contains:
header('X-Robots-Tag: noindex, nofollow', true);

But when I try to modify redirect.php in this way:
<?php
include($_SERVER['DOCUMENT_ROOT'] . '/path/nofollow.php');
header('Location: http://www.example.com/?id=154077');
?>

It says:
header('X-Robots-Tag: noindex, nofollow', true);
Warning: Cannot modify header information - headers already sent by (output started at path/nofollow.php:2) in /home/path/public_html/redirect.php on line 4

Does that mean I cannot use include to add a header() call?

levo




msg:4492963
 3:46 pm on Sep 9, 2012 (gmt 0)

Make sure nofollow.php file doesn't have any space/newline/other characters outside of <?PHP ?> tags.
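
One way to check for such stray characters is sketched below, assuming PHP's standard tokenizer extension is enabled. It lists every chunk of a file that sits outside the PHP tags and would therefore be sent as output before header() can run. The function name inline_output_chunks is made up for illustration; token_get_all() and T_INLINE_HTML are part of PHP itself.

```php
<?php
// Hypothetical helper: return every piece of $source that lies outside
// the PHP tags. PHP sends such text straight to the browser, which
// triggers "headers already sent" on any later header() call.
function inline_output_chunks($source)
{
    $chunks = array();
    foreach (token_get_all($source) as $token) {
        // Tokens are arrays of (id, text, line) or single-character strings.
        if (is_array($token) && $token[0] === T_INLINE_HTML) {
            $chunks[] = $token[1];
        }
    }
    return $chunks;
}
```

Passing it the contents of nofollow.php (for example via file_get_contents) should give an empty array once the file is clean. A file that holds only the header() line with no opening <?php tag comes back as one big chunk, because PHP treats the whole file as plain text. Leaving out the closing ?> at the end of an include file is a common way to avoid trailing-newline output altogether.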

serenoo




msg:4492976
 4:43 pm on Sep 9, 2012 (gmt 0)

Thank you levo. Now it works.
Is there a way to check that the noindex, nofollow is really being sent?
On a static page I can view the source and check for
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">, but here there is no static page.

Or do I have to wait for Google to recrawl?

levo




msg:4493003
 6:18 pm on Sep 9, 2012 (gmt 0)

You can use 'fetch as Google' in Google Webmaster Tools.
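
The raw response headers can also be checked with a few lines of PHP. This is only a rough sketch: check_redirect_headers is a hypothetical name, and it assumes the header lines are already available as an array of strings, one header per line. It reports the status code, whether an X-Robots-Tag header contains noindex, and the Location target, looking at the first response only.

```php
<?php
// Hypothetical helper: summarise the header lines of the first HTTP
// response found in $lines (one header per array element).
function check_redirect_headers(array $lines)
{
    $result = array('status' => null, 'noindex' => false, 'location' => null);
    foreach ($lines as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            if ($result['status'] !== null) {
                break; // a second status line means a second response
            }
            $result['status'] = (int) $m[1];
        } elseif (stripos($line, 'X-Robots-Tag:') === 0) {
            $result['noindex'] = $result['noindex'] || stripos($line, 'noindex') !== false;
        } elseif (stripos($line, 'Location:') === 0) {
            $result['location'] = trim(substr($line, strlen('Location:')));
        }
    }
    return $result;
}

// Sample header lines in the shape a PHP redirect might produce.
$sample = array(
    'HTTP/1.1 302 Found',
    'X-Robots-Tag: noindex, nofollow',
    'Location: http://www.example.com/?id=154077',
);
print_r(check_redirect_headers($sample));
```

PHP's get_headers($url) returns an array in this shape. It follows redirects by default, so the array may also contain the headers of the target page, which is why the sketch stops at the second status line.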

serenoo




msg:4493039
 8:26 pm on Sep 9, 2012 (gmt 0)

HTTP/1.1 302 Found
Content-Encoding: gzip
Vary: Accept-Encoding
Date: Sun, 09 Sep 2012 20:23:13 GMT
Server: LiteSpeed
Connection: close
X-Powered-By: PHP/5.2.17
X-Robots-Tag: noindex, nofollow
Location: [...........................]
Content-Type: text/html
Content-Length: 21

correct, right?

phranque




msg:4493045
 8:41 pm on Sep 9, 2012 (gmt 0)

if you provide a 302 status code to googlebot then it will index the redirected content at the original url.
you need to provide a 301 response code. when you send a Location: header without specifying the status code, PHP defaults to a 302.
therefore, your response must start with the 301 status code.

deadsea




msg:4493100
 9:58 pm on Sep 9, 2012 (gmt 0)

To do a 301 redirect change the location line to:

header('Location: http://www.example.com/?id=154077',TRUE,301);
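
With many redirect files sharing the same headers, the 301 and the X-Robots-Tag could live in one shared include. The following is a minimal sketch only; the function names affiliate_redirect_headers and send_affiliate_redirect are invented for illustration, and the file has no closing ?> so it cannot emit stray output.

```php
<?php
// Hypothetical shared include for the redirect files.

// Build the header lines for one affiliate redirect.
function affiliate_redirect_headers($target)
{
    return array(
        'X-Robots-Tag: noindex, nofollow',
        'Location: ' . $target,
    );
}

// Send the headers with a 301 status and stop the script.
function send_affiliate_redirect($target)
{
    foreach (affiliate_redirect_headers($target) as $line) {
        // Third argument forces the response code to 301 instead of
        // the 302 that a bare Location: header would produce.
        header($line, true, 301);
    }
    exit;
}
```

Each redirect file would then include this file and call send_affiliate_redirect('http://www.example.com/?id=154077'); so a later change to the headers only touches one place.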

serenoo




msg:4493164
 7:56 am on Sep 10, 2012 (gmt 0)

Thank you for your help. This is the new one:

HTTP/1.1 301 Moved Permanently
Content-Encoding: gzip
Vary: Accept-Encoding
Date: Mon, 10 Sep 2012 07:52:34 GMT
Server: LiteSpeed
Connection: close
X-Powered-By: PHP/5.2.17
X-Robots-Tag: noindex, nofollow
Location: http://www.example.com/?id=154077
Content-Type: text/html
Content-Length: 21

Is that ok?

phranque




msg:4493198
 11:10 am on Sep 10, 2012 (gmt 0)

that looks good but it depends on what you ultimately want to happen.

i've never actually tried sending a X-Robots-Tag: noindex header with a 301 response, but i would assume that it will remove the requested url from the index and then not do a follow up request for the url in the Location: header.

if instead what you want to do is replace the requested url (which had been previously indexed) with the url and content from the Location: header then you should remove the X-Robots-Tag header and let the 301 show the search engine where to go.

serenoo




msg:4493253
 1:48 pm on Sep 10, 2012 (gmt 0)

My goal is to remove the redirect.php files from the index.
When I run a site: search for my website, it returns 48 PHP redirect files that I use as affiliate links to earn money. They should not appear in the index, because they only redirect to money links.

I do not know why Google started showing these redirect.php files a few days ago (they had been absent for years), because they were in robots.txt and I add rel="nofollow" to the <a> tag whenever I link to them.

Can someone tell me if I made the right move?
I have to edit 48 files, and it would be a waste of time if this is wrong, so it would be a great help if you could confirm.

phranque




msg:4493273
 2:19 pm on Sep 10, 2012 (gmt 0)

in that case i would go with what you have here (301 & X-Robots-Tag)

serenoo




msg:4498537
 7:39 am on Sep 22, 2012 (gmt 0)

My PHP pages disappeared from the index only 12 days after I applied your advice.

g1smd




msg:4498572
 12:49 pm on Sep 22, 2012 (gmt 0)

It can take several months. Twelve days is good.

phranque




msg:4498994
 10:12 pm on Sep 23, 2012 (gmt 0)

my rule of thumb is <2 weeks = fast & >2 months = slow but a lot depends on the number of pages affected vs indexed and the crawl rate.
a natural decay over time is not unusual.
