Google changing my .php URLs to .php plus a slash

     

karkadan

4:47 pm on Jul 9, 2012 (gmt 0)

5+ Year Member



Hello,

2 months ago, I noticed a small drop in my traffic. I didn't look into it because traffic sometimes comes and goes.

2 weeks ago, I noticed another drop, making a total loss of 30%. I began checking, and I noticed one page had gone missing.

It was strange.

So, I Googled the following
"sitename + widget"

And I found that Google was listing the following URL
www.sitename.com/widget.php/

Please notice that it has a / at the end. My URL does not. And the page has been live for over 6 years.

So, I found it strange.

Yesterday, I got an additional fall in rankings. Another 10%.

Then I noticed another page had the same issue, and other pages too.

Why the fall in rank? Because widget.php/ shows a broken page, which users don't like, so they bounce, and over time that causes me to disappear from the rankings.

Why is this happening? I don't get it!

netmeg

5:48 pm on Jul 9, 2012 (gmt 0)

WebmasterWorld Senior Member netmeg is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Do the URLs with a slash after them resolve, or do they 404?

I would take a look at your site structure; it sounds like you may have an issue with redirects or something.

deadsea

6:51 pm on Jul 9, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sounds very similar to this thread: [webmasterworld.com...]

Given two threads on the topic, maybe Google has a bug? Connect with the original poster there and see if they resolved the issue.

karkadan

10:36 pm on Jul 9, 2012 (gmt 0)

5+ Year Member



Looks similar.

The page does not 404. It simply doesn't show any images.

rango

10:53 pm on Jul 9, 2012 (gmt 0)



I'd 301 redirect that bad page and add a canonical tag on there to ensure that Google knows which one matters. They should disappear from the index before too long.

karkadan

12:41 am on Jul 10, 2012 (gmt 0)

5+ Year Member



I have thousands of .php files, and Google is starting to do this to many of them.

rango

12:53 am on Jul 10, 2012 (gmt 0)



You can write one simple rewrite rule to redirect them to a version without the slash.

Better to do it through Apache (or whatever web server you're using) than in php anyway. Doing it in php means unnecessary load and a slower redirect. And for a redirect like this there really is no need to do it in php.

serpsup

1:03 am on Jul 10, 2012 (gmt 0)



I just started noticing this on my site too so I am adding rel canonical to all sections of my site.

@rango mentions 301 redirect as well as canonical but isn't that redundant? Shouldn't one or the other suffice?

Thanks,
Tom

rango

1:11 am on Jul 10, 2012 (gmt 0)



The 301 is just for this known situation. The canonical should help protect against future problems.

For the user who lands on one of those bad pages, the 301 is definitely nicer. They aren't going to be looking at your canonicals ;) And hey, the more good signals to Google the better.

That other thread also mentioned canonical tags not helping with this specific problem, mind you.

lucy24

7:44 am on Jul 10, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



The page does not 404. It simply doesn't show any images.

:: peering into crystal ball ::

The images use relative links, which worked fine as long as you had
/filename.php

side by side with
/images/imagename.php

but now that the user-agent thinks you're in
/filename.php/

it goes looking for
/filename.php/images/imagename.php
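
To make the resolution described above concrete, here is a minimal sketch using Python's standard urllib.parse.urljoin, which applies the usual relative-reference rules. The host www.example.com and the file name images/file.png are placeholders chosen for illustration, not details of the original poster's site.

```python
from urllib.parse import urljoin

# Hypothetical URLs, for illustration only.
GOOD_PAGE = "http://www.example.com/filename.php"
SLASH_PAGE = "http://www.example.com/filename.php/"
RELATIVE_IMAGE = "images/file.png"  # no leading slash: relative to the page's "directory"


def resolve(page_url, link):
    """Resolve a link against the URL of the page that contains it."""
    return urljoin(page_url, link)


print(resolve(GOOD_PAGE, RELATIVE_IMAGE))
# http://www.example.com/images/file.png
print(resolve(SLASH_PAGE, RELATIVE_IMAGE))
# http://www.example.com/filename.php/images/file.png
```

With the trailing slash, "filename.php" is treated as a directory, so the image request goes to a path that does not exist.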

g1smd

9:17 am on Jul 10, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Replace the links to images in the form
href="images/file.png"
with
href="/images/file.png"
(add a leading slash and the full path to the file) to fix the problem.

By the way, the thread title is misleading. Google didn't "change" your URLs. When Google accessed those incorrect URLs (from a link they found somewhere on the web, either on your site or on some other) a design error on your site meant they returned 200 OK and were therefore indexed.
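
A small sketch of why the leading slash helps, again with Python's urllib.parse.urljoin and placeholder URLs (www.example.com and /images/file.png are assumptions for illustration). A link that starts with / is resolved from the site root, so it no longer depends on whether the page URL ends in a slash.

```python
from urllib.parse import urljoin

# Hypothetical values, for illustration only.
ROOT_RELATIVE_IMAGE = "/images/file.png"  # leading slash: resolved from the site root
PAGES = (
    "http://www.example.com/filename.php",
    "http://www.example.com/filename.php/",
)


def resolve_from_root(page_url):
    """Resolve the root-relative image link against a given page URL."""
    return urljoin(page_url, ROOT_RELATIVE_IMAGE)


for page in PAGES:
    print(page, "->", resolve_from_root(page))
```

Both page URLs yield http://www.example.com/images/file.png. This only makes the images load; the URL with the trailing slash still answers 200 OK, so the redirect discussed elsewhere in this thread is still needed.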

karkadan

2:25 pm on Jul 10, 2012 (gmt 0)

5+ Year Member



lucy24, I read somewhere that using fixed links to resources is bad.

Can someone confirm?

karkadan

2:27 pm on Jul 10, 2012 (gmt 0)

5+ Year Member



g1smd, sorry it was misleading ("the title"). (My God, I'm starting to write like Yoda.)

English is not my main language, and I found it difficult to explain the situation in one line. I was going to add more info to the title, but there is a limit on the number of characters you can use.

lucy24

5:34 pm on Jul 10, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



I read somewhere that using fixed links to resources is bad.

Depends whom you ask. Right here in this very thread, you'll see g1 telling you to use site-absolute links for everything-- that is, the kind with / at the beginning. Conversely I'm all for keeping things in packages-- page plus associated files-- so the relative links stay the same even if the package is moved.

If your site is designed from scratch with all the images in one place, then absolute links will always work.

There are some situations where you must use absolute links, notably in error documents.

Oh, and Google may not be literally changing your URL. But it has a solid history of inventing URLs out of its fevered imagination-- or out of willful misreading of anchor text, which amounts to the same thing. And depending on extension, the extra bits may calmly attach themselves to the URL. And then you're stuck with two.

karkadan

8:07 pm on Jul 10, 2012 (gmt 0)

5+ Year Member



I'm using

<IfModule mod_rewrite.c>
RewriteBase /
RewriteRule (.*)\.php/ $1.php [R,L]
</IfModule>


I hope I'm not screwing my fate.

deadsea

8:17 pm on Jul 10, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Make that [R=301,L] to make it a permanent redirect. Other than that, I think it should work for you.

karkadan

9:07 pm on Jul 10, 2012 (gmt 0)

5+ Year Member



Thanks

g1smd

9:26 pm on Jul 10, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Add the protocol and domain name to the rule target. Never start the rule target with a backreference. It's a huge security risk.

Replace
(.*)\.php/
with a more efficient pattern.

Something like
RewriteRule ^([^/]+/)*([^/.]+)\.php/ http://www.example.com/$1.php [R=301,L]

will do.

This parses left to right in one go and does not invoke tons of "back off and retry" trial matching.

lucy24

9:45 pm on Jul 10, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Edit: Proving once again that I type much slower than g1...

Do you think there is the least possibility that your Apache installation does not include mod_rewrite? There's a thought to make the blood run cold. The "IfModule" envelopes are for boilerplate htaccess that comes with CMS packages whose designer has no idea where they will be used. Once you're on an individual site, you either have a given module or you don't. If it exists but the AllowOverride settings don't let you use it, change hosts :)

.* should be expressed as
^[^.]+
so the server doesn't have to backtrack after capturing the entire request and then learning that it was supposed to leave room for .php at the end. (Apache works only in one dimension. It can't see what's coming up ahead.) Opening anchor so it can't cheat by ignoring any earlier full stops-- not that there should ever be any in mid-URL. Unless, ahem, your name is apache dot org

+ rather than * because if you get a request for www.example.com/.php/ then the slash is the least of your problems.

Does it also attach / to the names of index pages? If so, you need to do some ruthless redirecting, because "index.php" should never occur at all:

RewriteRule ^(([^./]+/)*)index\.php/? http://www.example.com/$1 [R=301,L]
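
As a rough check of what the index rule above captures, here is a sketch that runs the same pattern through Python's re module, whose syntax matches mod_rewrite's for this pattern. It assumes the rule lives in a per-directory (.htaccess) context, where the path is matched without its leading slash. The function name and sample paths are made up for illustration.

```python
import re

# Same pattern as the index rule above.
INDEX_RULE = re.compile(r"^(([^./]+/)*)index\.php/?")


def index_redirect_target(path):
    """Return the URL the rule would redirect to, or None if it does not match.

    `path` is the request path without its leading slash (assumption:
    .htaccess context).
    """
    match = INDEX_RULE.match(path)
    if match is None:
        return None
    return "http://www.example.com/" + match.group(1)  # $1 = directory part only


for sample in ("index.php", "dir/index.php/", "dir/sub/index.php", "widget.php/"):
    print(sample, "->", index_redirect_target(sample))
```

The first three samples redirect to the bare directory URL (for example, dir/index.php/ goes to http://www.example.com/dir/), while widget.php/ is left for the other rule to handle.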

thms

5:14 am on Jul 12, 2012 (gmt 0)

5+ Year Member



I just started to see this problem on my site too... certainly some kind of bug on Google's side.

In my case, the .php/ and the .php versions both show up fine (same content), but Google is indexing only the .php/ version (for a few pages).

The problem is that on some of those pages I have internal links pointing to relative URLs. For example, www.example.com/somepage.php/ has a relative link to contact.php, so the user ends up going to www.example.com/somepage.php/contact.php, which shows the same content as www.example.com/somepage.php because my server then treats /contact.php as some kind of query string.

karkadan

5:20 am on Jul 12, 2012 (gmt 0)

5+ Year Member



UPDATE:

One page that was "fixed" disappeared from Google's database. Maybe temporarily, maybe it will be "punished".

The previous page to disappear lost a lot of ranking.

I've seen the same behaviour on the site of a new client that had just been hacked: the attacker was using 301 redirects... the client lost ranking for 2-3 months.

karkadan

5:21 am on Jul 12, 2012 (gmt 0)

5+ Year Member



a more IMPORTANT UPDATE

I just checked Webmaster Tools and it states that I am having a DNS problem. It doesn't say how or why. It is strange. Nobody has been moving anything.

It began on July 2. Weird.

karkadan

5:51 am on Jul 12, 2012 (gmt 0)

5+ Year Member



ns2 ping is failing. :S

thms

11:19 am on Jul 12, 2012 (gmt 0)

5+ Year Member



Something like
RewriteRule ^([^/]+/)*([^/.]+)\.php/ http://www.example.com/$1.php [R=301,L]
will do.


Actually, it should be:

RewriteRule ^([^/]+/)*([^/.]+)\.php/ http://www.example.com/$2.php [R=301,L]

shouldn't it?

g1smd

6:55 pm on Jul 12, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



No. There's an outer layer of parentheses missing. Add that and change to $1.

RewriteRule ^(([^/]+/)*([^/.]+))\.php/ http://www.example.com/$1.php [R=301,L]
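
To see why the outer parentheses matter, here is a sketch that runs both patterns through Python's re module, which numbers capture groups the same way. In the earlier rule, $1 holds only the last directory segment matched by the repeated group (or nothing at all) and $2 holds only the file name, so neither rebuilds the full path. The helper name and sample paths are hypothetical, and the path is assumed to arrive without its leading slash, as in .htaccess.

```python
import re

EARLIER_RULE = re.compile(r"^([^/]+/)*([^/.]+)\.php/")      # no outer group
CORRECTED_RULE = re.compile(r"^(([^/]+/)*([^/.]+))\.php/")  # outer group is $1


def redirect_target(rule, path, group):
    """Build the redirect URL from one backreference, or None if no match."""
    match = rule.match(path)
    if match is None:
        return None
    captured = match.group(group) or ""  # an unset group expands to nothing
    return "http://www.example.com/" + captured + ".php"


print(redirect_target(EARLIER_RULE, "widget.php/", 1))
# http://www.example.com/.php
print(redirect_target(EARLIER_RULE, "dir/sub/page.php/", 2))
# http://www.example.com/page.php
print(redirect_target(CORRECTED_RULE, "dir/sub/page.php/", 1))
# http://www.example.com/dir/sub/page.php
```

Only the corrected rule keeps both the directories and the file name in the redirect target.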

lucy24

7:21 pm on Jul 12, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Or, in the alternative,

RewriteRule ^(([^/]+/)*([^/.]+)\.php)/ http://www.example.com/$1 [R=301,L]

;)

g1smd

7:36 pm on Jul 12, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I prefer the former as it more clearly highlights that the redirect is to a URL with a filename that includes an extension.

atlrus

5:15 pm on Jul 30, 2012 (gmt 0)

10+ Year Member



I had to resurrect this thread, since I recently landed in the same boat.

But unlike the OP, my page is not php, but simple .html which now has the trailing slash behind it. Unfortunately, it's also my most linked-to page and even in the GWMT it shows as "mypage.html/", as well as in the search results. Needless to say, the page ranks nowhere :(

I have no idea what to do. I don't really want to use a redirect: since Google clearly thinks that "mypage.html" and "mypage.html/" are the same page, I'm not sure how any form of redirect would affect this.

I don't really know the reasoning behind Google choosing the page with the slash as the one to display, when the majority of the links pointing to this page are without the slash. I am assuming some scraper messed up and linked to us with a slash.

Does anyone have anything new to add on this problem?

g1smd

9:32 pm on Jul 30, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



This points to extremely lax URL rewriting or some such configuration error. On a normal site, a request for a URL with both an extension and a trailing slash would simply return 404 Not Found.

A redirect will tell Google to no longer index the URL with trailing slash and to replace it with the URL without the trailing slash.
 
