Forum Moderators: phranque

Added new variable to URL

now there are duplicate pages and url problems

         

andrewshim

2:23 am on Jun 10, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've been having this problem for a long time now. More than a year ago, I added a variable to the URL string of my busiest page to change it from:

http://www.example.com?variable1=value1

to

http://www.example.com?variable1=value1&variable2=value2

The only change was adding a variable but I didn't realize the problems that it would cause. I now have 2 URLs pointing to the same page (approximately 500 pages) in Google's index and all of them are supplemental. Some external sites still link to my site with old URLs because the old URL still resolves.

I originally hacked a few lines of code to refresh the page to the new URL if the second (new) variable was not found. I then read that this was a favorite spammer technique and not recommended, so I removed the code.

Would a 301 redirect resolve this problem? I really don't know how to do it or what to do next. Could someone explain?

jdMorgan

2:39 am on Jun 10, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What is the default name and value of the second variable if it is missing from the query string? Are they always the same, or dependent in some way upon the name and/or value of the first?

There are two issues: First, a meta-refresh isn't a good way to fix the duplicate-content problem, and second, you might consider making your URLs search engine friendly if ranking is very important to you. Folks go on and on about dynamic URLs being OK, but overall, I see the best results with URLs that look like they point to static pages. And just one of the reasons to prefer them is that they don't open you up to the "infinite variable space problem" that dynamic URLs are subject to -- another form of the problem you're seeing now.

First things first, though. With an answer to my question above, we can probably fix the duplicate-content problem, allowing you more time to consider using friendly URLs... :)

Jim

andrewshim

5:41 am on Jun 10, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Jim... thanks in advance for taking the time..

Quote (jdMorgan): What is the default name and value of the second variable if it is missing from the query string? Are they always the same, or dependent in some way upon the name and/or value of the first?

I seem to have made an error in my example.

A sample of the original URL is:

http://www.example.com/idea.php?ideaID=1

I added the second variable to make it:

http://www.example.com/idea.php?ideaID=1&ideaTitle=title

There is no default value. If there is no ideaID and ideaTitle, the page is blank (which is another problem I was ignorant of).

I added the second variable in an attempt to force the mediapartners (adsense) bot to pick up on the topic of the page (which it wasn't doing). Wasn't a smart move and done in ignorance of the consequences, but I've now dug myself into a hole that I just gotta live with.

Quote (jdMorgan): There are two issues: First, a meta-refresh isn't a good way to fix the duplicate-content problem, and second, you might consider making your URLs search engine friendly if ranking is very important to you.

I've removed the lines of code that caused the page to refresh. It was up only for a couple of hours before I discovered it wasn't a good way of solving the problem.

jdMorgan

3:15 pm on Jun 10, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, if you find yourself in a hole, the first thing to do is to stop digging. The second thing to do is to stop worrying about being in the hole, and concentrate solely on getting out.

We need to know what title value you would use if the visitor entered the site using an "old" URL with the title name/value pair missing.

How did you resolve this same question when you used a meta-refresh?

It is quite possible that the problem can be fixed, but we need to know the details to fix it.

Specifically, when you meta-refreshed from http://www.example.com/idea.php?ideaID=1 to http://www.example.com/idea.php?ideaID=1&ideaTitle=title, what did you insert as the values for "ideaTitle=title"?

The fix is to implement a 301 Moved Permanently redirect using the same old-to-new URL "map" that you used for the meta-refresh. The 301 will cause search engines to remove the old URLs from their index, and replace them with the new URLs. But we need to know precisely how to build the new URLs based on the old, for all possible old URLs that need to be redirected.

The answer is either a fixed mapping, which can be hard-coded into .htaccess or httpd.conf, or a database lookup. The first approach is simple, while the second may be quite complex. But either way, it can be done, given the necessary information.

The least-desirable solution, but the one that is simplest in concept, is to simply create 500 redirect directives, one for each old URL. I would strongly suggest ordering this code from most-popular page to least-popular for the sake of efficiency, but it can indeed be done this way.
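As a rough illustration of the "one directive per old URL" approach, the sketch below generates a block of mod_rewrite rules from a list of pages, most popular first. It is only a sketch under assumptions: the page list, titles, and hit counts are made up, the rules are meant for an .htaccess file in the site root, and the titles are assumed to contain only letters, digits, and hyphens. Python is used here purely for illustration; it is not what the site itself runs.

```python
def build_rules(pages):
    """Build .htaccess text from (idea_id, title, hits) tuples.

    Rules are ordered from most-visited to least-visited page, so the
    most common requests match early.
    """
    lines = ["RewriteEngine On"]
    ordered = sorted(pages, key=lambda page: page[2], reverse=True)
    for idea_id, title, _hits in ordered:
        # Match only the old URL, whose query string is exactly "ideaID=N".
        lines.append(f"RewriteCond %{{QUERY_STRING}} ^ideaID={idea_id}$")
        # Redirect it permanently to the new two-variable URL.
        lines.append(
            f"RewriteRule ^idea\\.php$ "
            f"http://www.example.com/idea.php?ideaID={idea_id}&ideaTitle={title} "
            f"[R=301,L]"
        )
    return "\n".join(lines)


# Hypothetical sample data: (ideaID, ideaTitle, visits)
SAMPLE_PAGES = [(2, "Dog-walking", 10), (1, "Lemonade-stand", 50)]
rules_text = build_rules(SAMPLE_PAGES)
```

Because each condition matches only a query string that is exactly `ideaID=N`, the new URLs (which carry the extra `ideaTitle` pair) would not be redirected again.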

Jim

andrewshim

6:52 pm on Jun 10, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've stopped messing with the URL (i.e., I've stopped making my hole deeper).

The ideaTitle variable is actually "not needed". At first, the URL would simply be:

http://www.example.com/idea.php?ideaID=1

which was quite confusing to visitors and to me when I wanted to identify my pages, so I added the ideaTitle variable to become

http://www.example.com/idea.php?ideaID=1&ideaTitle=Lemonade-stand

I created a static page listing ALL the ideas. Then when I used the meta refresh, I would check to see if the ideaID variable was present.
If it wasn't (meaning the visitor simply entered example.com/idea.php), I would refresh to this static page.

If the ideaID variable is present but there is no ideaTitle variable, then I would have to read the record from the database, get the ideaTitle from the record, create a url string, then meta refresh to this URL string.
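The checks described above could be sketched roughly as follows. This is only an illustration in Python, not the site's actual code: the dictionary stands in for the database lookup, and the static list page URL (`ideas.html`) is a made-up name.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.example.com/idea.php"
STATIC_LIST_URL = "http://www.example.com/ideas.html"  # hypothetical page name

# Hypothetical stand-in for the database: ideaID -> ideaTitle
IDEA_TITLES = {"1": "Lemonade-stand"}


def redirect_target(params):
    """Return the URL a request should be sent on to, or None to serve it as-is.

    `params` is a dict of the query-string variables of the incoming request.
    """
    idea_id = params.get("ideaID")
    if not idea_id:
        # No ideaID at all: send the visitor to the static list of ideas.
        return STATIC_LIST_URL
    if params.get("ideaTitle"):
        # Both variables present: this is already the new URL.
        return None
    title = IDEA_TITLES.get(idea_id)
    if title is None:
        # Unknown ideaID: nothing sensible to redirect to.
        return None
    # ideaID present, ideaTitle missing: build the full new URL.
    return BASE_URL + "?" + urlencode({"ideaID": idea_id, "ideaTitle": title})
```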

I know. I can imagine from a trained webmaster's point of view how I have broken almost every rule in the book. Even as I type this I feel really stooopid... ;-(

btw... would it be more effective if I sticky you my website URL?

jdMorgan

10:14 pm on Jun 10, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



We don't deal in specific URLs here. We don't even deal in any specific site's problems. The purpose of this thread (from WebmasterWorld's point of view) is to help you *and* everyone reading this thread later who has the same problem. Therefore (and for many other reasons, some of which greatly benefit you), we don't allow personal URLs. And since I'm a volunteer here, and another word for "personal service" is "consulting," I really don't have enough time to handle personal-service stickymails. This is all by way of explanation, so please take it only as such... :)

Quote (andrewshim): If the ideaID variable is present but there is no ideaTitle variable, then I would have to read the record from the database, get the ideaTitle from the record, create a url string, then meta refresh to this URL string.

If the lookup of the 'correct' URL (including ideaTitle) requires database access, then it's too complicated for a simple mod_alias directive or mod_rewrite rule to fix. Mod_rewrite's RewriteMap directive could be used, but that's over-complicating things. It actually sounds like you've worked through almost everything needed to fix this problem, but ended up with the wrong final *mechanism* to inform the client. In other words, you can use your "look up the correct URL" code exactly as it was, but instead of doing a meta-refresh, simply return a 301-Moved Permanently status response, along with the correct URL.

Essentially, you'll be returning a "page" with only the new URL on it, instead of returning the "real page" and including a meta-refresh, so it's actually quite simple to do. When the client sees the 301 response code, it knows that it needs to get the new URL from the response (it is carried in the Location header), and then ask your server for that URL.
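To make the mechanism concrete, here is a minimal sketch of a handler that answers an old-style request with a 301 status and a Location header instead of a page body. It is written as a small Python WSGI application purely for illustration; the site in this thread runs on a different scripting language, and the title dictionary is a hypothetical stand-in for the real database lookup.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical stand-in for the database: ideaID -> ideaTitle
IDEA_TITLES = {"1": "Lemonade-stand"}


def app(environ, start_response):
    """WSGI app: 301 for old URLs missing ideaTitle, normal page otherwise."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    idea_id = query.get("ideaID", [""])[0]
    title = IDEA_TITLES.get(idea_id)

    if title and "ideaTitle" not in query:
        new_url = "http://www.example.com/idea.php?" + urlencode(
            {"ideaID": idea_id, "ideaTitle": title}
        )
        # The status line and the Location header are the whole message;
        # no meta-refresh and no page content are sent.
        start_response(
            "301 Moved Permanently",
            [("Location", new_url), ("Content-Length", "0")],
        )
        return [b""]

    # Otherwise serve the page (placeholder body for this sketch).
    body = b"<html><body>idea page</body></html>"
    start_response(
        "200 OK",
        [("Content-Type", "text/html"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

The important point is that the redirect is sent in the response headers, before any page content, which is what distinguishes it from a meta-refresh.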

Now how you generate the 301-Moved Permanently response varies based on what scripting language you use, but it's quite simple in PERL or PHP. And you can check your results using the free "Live HTTP Headers" extension for Firefox and Mozilla browsers. I'd recommend a search of the forum here at WebmasterWorld devoted to whichever of those two languages you're using for further information. The relevant search terms will be "redirect," "response," "301," "location," etc.
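For anyone who prefers a script to a browser extension, a check along the same lines might look like the sketch below. It requests a URL without following redirects and reports the status code and Location header. This is an illustration in Python using only the standard library; the function name is made up for this example.

```python
import http.client
from urllib.parse import urlsplit


def check_redirect(url):
    """Request `url` without following redirects.

    Returns a (status_code, location_header) tuple; the location is None
    when the response carries no Location header.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Example call (not executed here):
# status, location = check_redirect("http://www.example.com/idea.php?ideaID=1")
```

A correctly working redirect should report status 301 together with the full new URL; a 200 on the old URL, or a 302, would suggest the redirect is not being sent as intended.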

Jim

andrewshim

2:01 am on Jun 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



hi jim...

I understand completely. My purpose in stickying you my URL was not personal consultation. I just didn't think I was describing the situation correctly and thought that if you saw the actual URL you would understand better, instead of me wasting precious time.

In any case, I know you guys have more than enough on your plates and you still do a great job keeping things running smooth in the forums. You guys are more appreciated than you think.

As far as doing the 301 redirect with my present code goes, I tried it, but can't seem to get it to work. I think it has to do with how I'm coding the redirect. Will try it again. Can I post my code here if I need help?

jdMorgan

4:02 am on Jun 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Please post to the WebmasterWorld PERL or PHP forum based on which server-side scripting language you're using, and link to this thread for "background." As with this forum, the PERL or PHP forum moderators will appreciate it if you use example.com as your domain name in the code... :)

As I said, you've already done the hard part looking up the correct URL, so getting the rest working should be pretty simple.

Jim

andrewshim

5:40 am on Jun 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



jim...

Quote (jdMorgan): It actually sounds like you've worked through almost everything needed to fix this problem, but ended up with the wrong final *mechanism* to inform the client. In other words, you can use your "look up the correct URL" code exactly as it was, but instead of doing a meta-refresh, simply return a 301-Moved Permanently status response, along with the correct URL.

wooooweee! you pointed me in the right direction. I've got it to work. Thanks Jim.