
Google SEO News and Discussion Forum

    
Google changing my .php URLs to .php plus a slash
karkadan




msg:4473853
 4:47 pm on Jul 9, 2012 (gmt 0)

Hello,

Two months ago, I noticed a small drop in my traffic. I didn't look into it because traffic sometimes comes and goes.

2 weeks ago, I noticed another drop, making a total loss of 30%. I began checking, and I noticed one page had gone missing.

It was strange.

So, I Googled the following
"sitename + widget"

And I found that Google was listing the following URL
www.sitename.com/widget.php/

Please notice that it has a / at the end. My URL does not. And the page has been live for over 6 years.

So, I found it strange.

Yesterday, I got an additional fall in rankings. Another 10%.

Then I noticed another page had the same issue, and other pages too.

Why the fall in rank? Because widget.php/ shows a broken page, which users don't like, so they bounce, and over time that causes me to disappear from the rankings.

Why is this happening? I don't get it!

 

netmeg




msg:4473862
 5:48 pm on Jul 9, 2012 (gmt 0)

Do the URLs with a slash after them resolve, or do they 404?

I would take a look at your site structure; it sounds like you may have an issue with redirects or something.

deadsea




msg:4473881
 6:51 pm on Jul 9, 2012 (gmt 0)

Sounds very similar to this thread: [webmasterworld.com...]

Given two threads on the topic, maybe Google has a bug? Connect with the original poster there and see if they resolved the issue.

karkadan




msg:4473930
 10:36 pm on Jul 9, 2012 (gmt 0)

Looks similar.

The page does not 404. It simply doesn't show any images.

rango




msg:4473936
 10:53 pm on Jul 9, 2012 (gmt 0)

I'd 301 redirect that bad page and add a canonical tag on there to ensure that Google knows which one matters. They should disappear from the index before too long.

karkadan




msg:4473959
 12:41 am on Jul 10, 2012 (gmt 0)

I have thousands of .php files, and Google is starting to affect many of them.

rango




msg:4473962
 12:53 am on Jul 10, 2012 (gmt 0)

You can write one simple rewrite rule to redirect them to a version without the slash.

Better to do it through Apache (or whatever web server you're using) than in php anyway. Doing it in php means unnecessary load and a slower redirect. And for a redirect like this there really is no need to do it in php.

serpsup




msg:4473964
 1:03 am on Jul 10, 2012 (gmt 0)

I just started noticing this on my site too so I am adding rel canonical to all sections of my site.

@rango mentions 301 redirect as well as canonical but isn't that redundant? Shouldn't one or the other suffice?

Thanks,
Tom

rango




msg:4473965
 1:11 am on Jul 10, 2012 (gmt 0)

The 301 is just for this known situation. The canonical should help protect against future problems.

For the user who lands on one of those bad pages, the 301 is definitely nicer. They aren't going to be looking at your canonicals ;) And hey, the more good signals to Google the better.

That other thread also mentioned canonical tags not helping with this specific problem, mind you.

lucy24




msg:4474011
 7:44 am on Jul 10, 2012 (gmt 0)

The page does not 404. It simply doesn't show any images.

:: peering into crystal ball ::

The images use relative links, which worked fine as long as you had
/filename.php

side by side with
/images/imagename.png

but now that the user-agent thinks you're in
/filename.php/

it goes looking for
/filename.php/images/imagename.png

g1smd




msg:4474039
 9:17 am on Jul 10, 2012 (gmt 0)

Replace image references of the form
src="images/file.png" with src="/images/file.png" (add a leading slash and the full path to the file) to fix the problem.

By the way, the thread title is misleading. Google didn't "change" your URLs. When Google accessed those incorrect URLs (from a link they found somewhere on the web, either on your site or on some other) a design error on your site meant they returned 200 OK and were therefore indexed.
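
For what it's worth, Apache generally accepts trailing path information on script-handled files such as .php by default, which is one way a URL like /widget.php/ can come back 200 OK. A minimal sketch of switching that off, assuming Apache 2.x, PHP running through a handler that honours the setting, and a host that permits AllowOverride FileInfo (none of which is confirmed for the OP's server):

```apache
# Assumption: placed in the site's root .htaccess (or a <Directory> block)
# Off = only literal, existing paths are accepted,
# so a request for /widget.php/ is answered with 404 Not Found
AcceptPathInfo Off
```

This only keeps such URLs from being served as pages. For slash URLs that are already indexed, the 301 advice earlier in the thread still applies.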

karkadan




msg:4474144
 2:25 pm on Jul 10, 2012 (gmt 0)

lucy24, I read somewhere that using fixed links to resources is bad.

Can someone confirm?

karkadan




msg:4474145
 2:27 pm on Jul 10, 2012 (gmt 0)

g1smd sorry it was misleading ("the title"). (my God, I'm starting to write as Yoda).

English is not my main language, and I found it difficult to explain the situation in one line. I was going to add more info to the title, but there is a max of characters you can add.

lucy24




msg:4474292
 5:34 pm on Jul 10, 2012 (gmt 0)

I read somewhere that using fixed links to resources is bad.

Depends whom you ask. Right here in this very thread, you'll see g1 telling you to use site-absolute links for everything-- that is, the kind with / at the beginning. Conversely I'm all for keeping things in packages-- page plus associated files-- so the relative links stay the same even if the package is moved.

If your site is designed from scratch with all the images in one place, then absolute links will always work.

There are some situations where you must use absolute links, notably in error documents.

Oh, and google may not be literally changing your URL. But it has a solid history of inventing URLs out of its fevered imagination-- or out of willful misreading of anchor text, which amounts to the same thing. And depending on extension, the extra bits may calmly attach themselves to the URL. And then you're stuck with two.

karkadan




msg:4474345
 8:07 pm on Jul 10, 2012 (gmt 0)

I'm using

<IfModule mod_rewrite.c>
RewriteBase /
RewriteRule (.*)\.php/ $1.php [R,L]
</IfModule>


I hope I'm not screwing my fate.

deadsea




msg:4474347
 8:17 pm on Jul 10, 2012 (gmt 0)

Make that [R=301,L] to make it a permanent redirect. Other than that, I think it should work for you.
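
With that change, the block might look roughly like this. The RewriteEngine On line is an assumption: it was not in the posted snippet and is only needed if the engine is not already switched on elsewhere in the .htaccess. The pattern is left exactly as posted.

```apache
<IfModule mod_rewrite.c>
# Assumption: the rewrite engine is not already enabled elsewhere
RewriteEngine On
RewriteBase /
# R=301 makes the redirect permanent; a bare [R] sends a temporary 302
RewriteRule (.*)\.php/ $1.php [R=301,L]
</IfModule>
```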

karkadan




msg:4474352
 9:07 pm on Jul 10, 2012 (gmt 0)

Thanks

g1smd




msg:4474356
 9:26 pm on Jul 10, 2012 (gmt 0)

Add the protocol and domain name to the rule target. Never start the rule target with a backreference. It's a huge security risk.

Replace
(.*)\.php/ with a more efficient pattern.

Something like
RewriteRule ^([^/]+/)*([^/.]+)\.php/ http://www.example.com/$1.php [R=301,L]
will do.

This parses left to right in one go and does not invoke tons of "back off and retry" trial matching.

lucy24




msg:4474363
 9:45 pm on Jul 10, 2012 (gmt 0)

Edit: Proving once again that I type much slower than g1...

Do you think there is the least possibility that your Apache installation does not include mod_rewrite? There's a thought to make the blood run cold. The "IfModule" envelopes are for boilerplate htaccess that comes with CMS packages whose designer has no idea where they will be used. Once you're on an individual site, you either have a given module or you don't. If it exists but the AllowOverride settings don't let you use it, change hosts :)

.* should be expressed as
^[^.]+
so the server doesn't have to backtrack after capturing the entire request and then learning that it was supposed to leave room for .php at the end. (Apache works only in one dimension. It can't see what's coming up ahead.) Opening anchor so it can't cheat by ignoring any earlier full stops-- not that there should ever be any in mid-URL. Unless, ahem, your name is apache dot org

+ rather than * because if you get a request for www.example.com/.php/ then the slash is the least of your problems.
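
Putting those two points together, the redirect rule might end up something like the sketch below. It assumes a root-level .htaccess and uses www.example.com as a placeholder host; note it will not match if a directory name contains a full stop.

```apache
RewriteEngine On
# ^[^.]+ reads from the start up to the first full stop and stops there,
# so nothing has to be captured and then handed back
# + (not *) requires at least one character before .php
RewriteRule ^([^.]+)\.php/ http://www.example.com/$1.php [R=301,L]
```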

Does it also attach / to the names of index pages? If so, you need to do some ruthless redirecting, because "index.php" should never occur at all:

RewriteRule ^(([^./]+/)*)index\.php/? http://www.example.com/$1 [R=301,L]

thms




msg:4474871
 5:14 am on Jul 12, 2012 (gmt 0)

I just started to see this problem on my site too... certainly some kind of bug on Google's side.

In my case, the .php/ and the .php versions both show up fine (same content), but Google is indexing only the .php/ version (for a few pages).

The problem is that some of those pages have internal links pointing to relative URLs. For example, www.example.com/somepage.php/ has a relative link to contact.php, so the user ends up at www.example.com/somepage.php/contact.php, which shows the same content as www.example.com/somepage.php because my server then treats /contact.php as some kind of query string.

karkadan




msg:4474872
 5:20 am on Jul 12, 2012 (gmt 0)

UPDATE:

One page that was "fixed" disappeared from Google's database. Maybe temporarily, maybe it will be "punished".

The previous page to disappear lost a lot of ranking.

I've seen the same behaviour on the site of a new client that had just been hacked. The thief was using a 301 to redirect, and the client lost ranking for 2-3 months.

karkadan




msg:4474874
 5:21 am on Jul 12, 2012 (gmt 0)

A more IMPORTANT UPDATE

I just checked Webmaster Tools and it says that I am having a DNS problem. It doesn't say how or why. It is strange. Nobody has been moving anything.

It began on July 2. Weird.

karkadan




msg:4474878
 5:51 am on Jul 12, 2012 (gmt 0)

ns2 ping is failing. :S

thms




msg:4474945
 11:19 am on Jul 12, 2012 (gmt 0)

Something like
RewriteRule ^([^/]+/)*([^/.]+)\.php/ http://www.example.com/$1.php [R=301,L]
will do.


Actually, it should be:

RewriteRule ^([^/]+/)*([^/.]+)\.php/ http://www.example.com/$2.php [R=301,L]

shouldn't it?

g1smd




msg:4475090
 6:55 pm on Jul 12, 2012 (gmt 0)

No. There's an outer layer of parentheses missing. Add that and change to $1.

RewriteRule ^(([^/]+/)*([^/.]+))\.php/ http://www.example.com/$1.php [R=301,L]
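
As a rough annotation of why the extra parentheses matter (same rule, with www.example.com still a placeholder), the groups are numbered by their opening parenthesis, left to right:

```apache
# $1 = outer group: the whole path up to, but not including, .php
# $2 = the repeated folder group (holds only the last folder matched)
# $3 = the bare filename without its extension
# Without the outer group, $1 is just the last folder and $2 just the filename,
# so neither gives the full path on its own
RewriteRule ^(([^/]+/)*([^/.]+))\.php/ http://www.example.com/$1.php [R=301,L]
```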

lucy24




msg:4475105
 7:21 pm on Jul 12, 2012 (gmt 0)

Or, in the alternative,

RewriteRule ^(([^/]+/)*([^/.]+)\.php)/ http://www.example.com/$1 [R=301,L]

;)

g1smd




msg:4475110
 7:36 pm on Jul 12, 2012 (gmt 0)

I prefer the former as it more clearly highlights that the redirect is to a URL with a filename that includes an extension.

atlrus




msg:4480211
 5:15 pm on Jul 30, 2012 (gmt 0)

I had to resurrect this thread, since I recently landed in the same boat.

But unlike the OP, my page is not php, but simple .html which now has the trailing slash behind it. Unfortunately, it's also my most linked-to page and even in the GWMT it shows as "mypage.html/", as well as in the search results. Needless to say, the page ranks nowhere :(

I have no idea what to do. I don't really want to use a redirect, since Google clearly thinks that "mypage.html" and "mypage.html/" are the same page, and I'm not sure how any form of redirect would affect this.

I don't really know the reasoning behind Google choosing the page with the slash to be the one that displays, when the majority of the links pointing to this page are all without the slash. I am assuming some scraper messed up and linked to us with a slash.

Does anyone have anything new to add on this problem?

g1smd




msg:4480320
 9:32 pm on Jul 30, 2012 (gmt 0)

This points to extremely lax URL rewriting or some such configuration error. On a normal site, a request for a URL with both an extension and a trailing slash would simply return 404 Not Found.

A redirect will tell Google to no longer index the URL with trailing slash and to replace it with the URL without the trailing slash.
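
For the .html case, such a redirect could be sketched along the same lines as the .php rules earlier in the thread. This assumes Apache with mod_rewrite available in the root .htaccess, and www.example.com is a placeholder for the real host:

```apache
RewriteEngine On
# Any request ending in .html plus a trailing slash (and anything after it)
# is sent permanently to the same path without the slash
RewriteRule ^(([^/]+/)*[^/.]+)\.html/ http://www.example.com/$1.html [R=301,L]
```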

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved