WordPress blog resolving with URL extension

     
10:44 pm on Mar 21, 2013 (gmt 0)

Preferred Member

5+ Year Member

joined:Aug 30, 2007
posts: 555
votes: 3


Can anyone advise what is causing the following, and how to stop it from happening?

I was checking for duplicate content on Google and noticed that one particular page on my blog has several variations in Google's organic SERPs.

Basically the results show EXACTLY the same WordPress blog post URL BUT with the following type of extension AFTER the usual URL...

/?p=bxprvxvf

and I notice that if I type in various different letters the URL still resolves... all well and good, BUT these URLs are in Google's search results.

So... why is it happening, and how do I stop it?
12:42 am on Mar 22, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 31, 2002
posts:7575
votes: 0


That looks like Google indexed the page(s) before you set permalinks and has kept them because they still resolve. If the pages resolve to the new permalink structure then put a redirect plugin in place and redirect the old URLs where you want with a 301.
2:02 pm on Mar 22, 2013 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10542
votes: 8


what lorax said.
in addition, make sure you aren't internally linking to these types of non-canonical urls.
8:24 pm on Mar 22, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12696
votes: 244


There's a third possibility. This happens with or without query strings. Periodically you'll see the googlebot asking for something like

:: shuffling papers ::

/paintings/tundra/lenkoljgkuhz.html

The search engine is testing whether your site returns a 404 when the page genuinely doesn't exist. In the case of PHP pages, the fix almost always lies in the PHP itself. If the parameter itself has meaning, the script has to return a 404 when it points to a nonexistent page-- like a post whose number puts it somewhere in the summer of 2016.

If the parameter has no meaning-- or if you don't use it at all, but someone else does-- tell search engines to ignore it.
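
A minimal sketch of the "return a 404 when the parameter points nowhere" case. It is written in Python purely to show the decision; on a WordPress site the equivalent check would live in the PHP. The names `EXISTING_POST_IDS` and `status_for` are hypothetical, and the sketch assumes `p` is supposed to carry a numeric post ID.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical set of post IDs that really exist on the blog.
EXISTING_POST_IDS = {101, 123, 150}


def status_for(url: str) -> int:
    """Return the HTTP status the script should send, judged by the ?p= value."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("p")
    if values is None:
        # No p parameter at all: nothing to validate in this sketch.
        return 200
    value = values[0]
    # A meaningful p must be a plain number that matches a real post.
    if value.isascii() and value.isdigit() and int(value) in EXISTING_POST_IDS:
        return 200
    # Anything else (random letters, a post number that doesn't exist) is a 404.
    return 404


print(status_for("http://greenwidgetxyz.com/blog/?p=123"))       # 200
print(status_for("http://greenwidgetxyz.com/blog/?p=bxprvxvf"))  # 404
```

The point is only that a made-up value such as `bxprvxvf` should produce a real 404 rather than a normal page.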
9:36 pm on Mar 22, 2013 (gmt 0)

Preferred Member

5+ Year Member

joined:Aug 30, 2007
posts: 555
votes: 3


I am double checking when permalinks were added...

what I have also noted is that if I ADD "?p=xyzetc" the URL stays the same as without...
9:39 pm on Mar 22, 2013 (gmt 0)

Preferred Member

5+ Year Member

joined:Aug 30, 2007
posts: 555
votes: 3


It seems that Categories and Pages allow the added "?p=xyzetc", and the previous URL and content remain... but if I add it to a blog post, it redirects to the actual blog URL... not sure how to resolve that...
10:10 pm on Mar 22, 2013 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10542
votes: 8


if you can't figure out how to fix this with a WP config change and if all parameter strings on category and page urls are non-canonical, you could do this using mod_rewrite.
(assuming you can recognize category/page url patterns and assuming you're hosted on apache)
12:32 am on Mar 23, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12696
votes: 244


if I ADD "?p=xyzetc" the url stays the same as without...

I don't think this means what you intended it to mean. Can you clarify?
12:53 am on Mar 23, 2013 (gmt 0)

Preferred Member

5+ Year Member

joined:Aug 30, 2007
posts: 555
votes: 3


It's often difficult to explain without being able to post the URL...

but..

greenwidgetxyz.com/blog/index.php/page/2/

will display the content for that page... all is fine.

If I add the extra letters to the URL, the content is still displayed, and it doesn't matter what letters I add... I have only tried 3-4 different strings, but it doesn't seem to matter which ones.

ie
greenwidgetxyz.com/blog/index.php/page/2/?p=xyzetc

the content is still displayed as if nothing has happened to the URL...

HOWEVER... if I do this to a blog post rather than a page, it redirects to the correct URL and removes the added character string.

ie.. greenwidgetxyz.com/blog/index.php/123/test/?p=xyzetc

and ONLY this will display.. in the url..

greenwidgetxyz.com/blog/index.php/123/test/

None of the above affects what displays in the actual content... I am just concerned that these extra URLs are getting indexed and potentially causing duplicate content issues...
3:59 am on Mar 23, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12696
votes: 244


Got it. If WordPress receives a request for something it has been told to look out for, it processes the request and issues redirects to keep the URL looking pretty. But if it receives a request for something it doesn't expect-- like, say, a URL bearing parameters that originated in a search engine's fevered imagination-- then it doesn't know what to do. It processes the familiar part and ignores the rest.

So then you're back with the two-part solution. One part is to tell the googlebot to ignore certain parameters. There will be a list on the "parameters" page of WMT (Google Webmaster Tools). The other part is to forcibly redirect any request that contains a query string. Your RewriteRule will need a preceding condition looking at THE_REQUEST so you're only redirecting external queries, not internal ones.

But you will have to be very careful in your htaccess file. It already contains things WordPress put there to make the whole thing work. Make sure your additions are in the right place, where they won't conflict with anything that is already there.

And as long as you're in there, you can probably do some cleaning up. Your average CMS htaccess file is an unholy mess ;)
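
The redirect half of that advice comes down to the logic below. It is sketched in Python only to show the decision; on Apache it would be a RewriteCond on THE_REQUEST followed by a RewriteRule in the .htaccess file. The function name `redirect_target` and the `PAGED_PATH` pattern are assumptions for illustration, modelled on the /page/2/ example earlier in the thread.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical pattern for paginated listing URLs such as /blog/index.php/page/2/
PAGED_PATH = re.compile(r"^/blog/index\.php/page/\d+/$")


def redirect_target(url):
    """Return the URL to 301 to, or None if the request should be served as-is."""
    parts = urlsplit(url)
    if parts.query and PAGED_PATH.match(parts.path):
        # Drop the query string (and any fragment); keep scheme, host and path.
        return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return None


print(redirect_target("http://greenwidgetxyz.com/blog/index.php/page/2/?p=xyzetc"))
print(redirect_target("http://greenwidgetxyz.com/blog/index.php/page/2/"))
```

Restricting the redirect to known URL patterns, rather than stripping every query string site-wide, is a deliberate choice here: WordPress still needs query strings in some places, such as searches using `?s=`.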