Seeking PHP Help For 410 Gone Status

         

austtr

1:37 am on Nov 18, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Seeking help from someone familiar with PHP code.

Having uploaded a new Joomla CMS site to the root, I found that Google had crawled all the new content as it appeared in the build folder and indexed it. Instant duplicate content... well done, Google.

I've deleted all the duplication, but I want to clean the Google index by applying a 410 Gone status to all the URLs that include the build folder name (joomla30) anywhere in the URL. I've tried every published suggestion for using a rewrite in the .htaccess but nothing works, so I'm hoping an entry in the error.php file might work.

Can someone code the second line in the following:

<?php
if (the URL contains "joomla30" (without quotes) anywhere in the string) {
header("HTTP/1.0 410 Gone");
} ?>

lucy24

1:47 am on Nov 18, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Why would you need to do it in php when a simple
RewriteRule joomla30 - [G]
will do it?

The rule has to be located after the statement "RewriteEngine on" but before any Joomla-specific RewriteRules. I suspect the problem was that the rule is located in the wrong place.

austtr

6:04 am on Nov 18, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I wish it were that simple; I'd have saved three days of fruitless research trying to solve what should be a really, really simple task.

Short Story: Your suggestion seems to be ignored. Enter the URL of a missing page from that folder and there is a 404 response, not a 410. The Screaming Frog status checker also shows 404. It would seem from both of those results that the rewrite instruction to a 410 is not recognized... hence the flow on to a 404, because the URL is for a page that was not found.

Long Story: I first had problems with rewrite rules in the .htaccess when trying to implement a custom 404. No matter what I tried, it would trigger the default template 404 Not Found page but never a 404 header response. I eventually found a reference, buried deep in the bowels of the Joomla documentation, that said something about how the SEF and mod_rewrite configuration could cause issues with header status which "might be problematical with some search engines like Google".

The suggested remedy was to add the following script to the template/error.php to force a 404 header and (optionally) to manually edit the template/css/error.css to improve the visuals.

<?php
if ($this->error->getCode() == '404') {
header("HTTP/1.0 404 Not Found");
} ?>

That worked, even though it seems a dog's breakfast way of doing things, but it still leaves the issue of the 410. As we have seen, a solution via the .htaccess seems to have the same incorrect outcome as the 404 did, so, as asked in the OP, maybe the solution lies in adding a custom PHP script to the error.php to force a 410 header.

austtr

6:47 am on Nov 18, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK.... hold the phone.

I tried a test .htaccess with just the following lines:

Options +FollowSymlinks
Options -Indexes
RewriteEngine On
RewriteRule joomla30 - [G]

... and it worked just fine with a 410 screen message and a 410 header status. It looks like there is a conflict happening when all the default Joomla statements and comments get included in the .htaccess even when they are placed below the rewrite rule.

I'd still like to see a php code option if there is one.

coopster

2:45 pm on Dec 16, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



A little late, but here you go austtr -- the PHP version. I don't know if you need case-insensitive or not but I added the "i" to show you could do it that way:
<?php
if (preg_match('/joomla30/i', $_SERVER['REQUEST_URI'])) {
header("{$_SERVER['SERVER_PROTOCOL']} 410 Gone");
exit;
}
?>

robzilla

9:40 pm on Dec 16, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Note that strpos (or stripos for case insensitivity) is lighter than preg_match, and we're not really flexing regex muscle here.

<?php
if (strpos($_SERVER['REQUEST_URI'],'joomla30') !== false) {
header("HTTP/1.0 410 Gone");
die();
}
?>

Or just switch out the 404 with a 410 in your previous PHP solution?

<?php 
if ($this->error->getCode() == '404') {
header("HTTP/1.0 410 Gone");
} ?>
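
One caveat with the straight swap: it would report every missing page as gone, not only the ones under the build folder. A minimal sketch that combines the two ideas follows. It assumes the check runs inside the template's error.php, and the helper name statusForMissingPage is made up for illustration, not part of Joomla.

```php
<?php
// Hypothetical helper (not a Joomla function): choose the status code to
// send from error.php, given the requested URI and the error code that
// Joomla reported.
function statusForMissingPage(string $requestUri, int $errorCode): int
{
    // Only missing pages under the old build folder are permanently gone.
    // stripos() makes the match case-insensitive.
    if ($errorCode === 404 && stripos($requestUri, 'joomla30') !== false) {
        return 410;
    }

    // Anything else keeps the code it already had.
    return $errorCode;
}

// Assumed usage inside error.php (left commented out so the sketch runs
// on its own):
// http_response_code(
//     statusForMissingPage($_SERVER['REQUEST_URI'], (int) $this->error->getCode())
// );
```

http_response_code() needs PHP 5.4 or later; on older versions the header() calls shown above do the same job.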

lucy24

10:25 pm on Dec 16, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Or just switch out the 404 with a 410 in your previous PHP solution?

But doesn't that mean the server has to physically check for the file? I think the idea was to return a 410 automatically, without having to check.

robzilla

10:35 pm on Dec 16, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Certainly, and letting the web server handle it is the better way, but austtr requested a PHP option regardless.