I've posted this [webmasterworld.com] in the AdSense folder to get an idea of how the MediaBot treats a particular situation, and I hope I am not remiss for reposting a similar note to get feedback on how the general GoogleBot handles the same situation.
In a nutshell, I want to know how Google handles database-driven pages when two or more have identical content but differing second variables, e.g.:
somesite.com/photos.php?var1=123&var2=147
and
somesite.com/photos.php?var1=123&var2=623
Var2 contains user navigation info, such as which gallery is being browsed, in case you're curious.
I'm concerned that two problems may occur from this situation:
1) Google may see identical content on multiple pages with nearly-but-not-entirely matching URLs, and then penalize the site for duplicate content.
2) Various sites may all use different var2 values when linking to particular photo pages, thus diffusing those pages' PR.
Am I correct in fearing these situations? If so, I'll do my best to have the photo gallery author implement the var2 in a cookie.
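Whichever way var2 ends up being carried, the idea is that both URLs above should collapse to one address. Here is a minimal sketch of that in Python (not the gallery's own code, which I haven't seen). The function name canonical_url and the NAVIGATION_PARAMS set are made up for illustration, and it assumes var2 never changes what the page shows:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these parameters only carry navigation state, never content.
NAVIGATION_PARAMS = {"var2"}


def canonical_url(url, ignored=NAVIGATION_PARAMS):
    """Return url with navigation-only query parameters removed."""
    parts = urlsplit(url)
    # Keep every parameter except the ones flagged as navigation-only.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    # Any fragment is dropped as well, since it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


a = canonical_url("http://somesite.com/photos.php?var1=123&var2=147")
b = canonical_url("http://somesite.com/photos.php?var1=123&var2=623")
```

Both a and b come out as the same address, so links and internal navigation could all point at that one URL while the gallery info travels some other way.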
I would say that in the long term Google would drop the "duplicates"; however, in the short term it may very well index the page many times.
I am aware of a page indexed about 5 times with sml00B.asp, sml00B.asp?Tx=1, sml00B.asp?Pf=1, and many other variants. The content is identical. The Pf=1 variant is the Printer Friendly version for example. I don't expect it to last. The number has slowly reduced from about 8 or 9.
Or was that one just an exception, and would Google probably index A LOT more pages of the site if I got rid of the punctuation and spaces?
What do you say?
Thanks a lot for the input
In dynamic URLs, try to limit the query string to fewer than 3 variables.
Try to avoid using ID as part of the query, and avoid having any long number (more than about 6 digits I guess) in the query that might look like a session ID even if it isn't one.
The thing I see is that Google doesn't give the page any rank, even though all of these pages show up on page 1 in a lot of the SERPs they target. I know they can do better, so I'm implementing a mod_rewrite to change the URL to
mysite.com/widget/1111
mysite.com/widget/1112
So now the SEs will think that widget and 1111 are directories, and then they will score them :)
I can make them as dynamic as I want and still make them look like a directory :)
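For anyone who wants to see what such a rewrite amounts to, here is a rough Python sketch of the pattern matching involved. It is not actual mod_rewrite configuration. It assumes the real script is widget.php with an id parameter, and the names WIDGET_PATH and rewrite are invented for the example:

```python
import re

# Assumption: directory-style paths look like /widget/<digits>.
WIDGET_PATH = re.compile(r"^/widget/(\d+)/?$")


def rewrite(path):
    """Map a directory-style path back to the dynamic URL behind it."""
    match = WIDGET_PATH.match(path)
    if match is None:
        # Not a widget path: pass the request through unchanged.
        return path
    return f"/widget.php?id={match.group(1)}"


internal = rewrite("/widget/1111")
untouched = rewrite("/about.html")
```

The visitor and the spider only ever see /widget/1111. The script still receives its usual id variable.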
In fact, I am running into a problem similar to that of ThatAdamGuy. On each product page I have a link to the same page with "&currency=CAD" or "&currency=USD", whatever the case may be. Problem is, that creates 3 pages for Google to index, and the GoogleBot is indexing nearly all of them.
I'm exploring various possibilities, including using javascript to place a cookie client-side that indicates either the currency in my case, or a navigation feature for ThatAdamGuy. Without a link, GoogleBot can't get confused, but this won't work for people without javascript enabled.
If I try to redirect from a page with a currency parameter to the very same page, the only thing that will change is the price- and if the GoogleBot goes back, say, to the homepage, the prices will have mysteriously changed from when they started crawling (and change every time they click on a currency change link!).
Another possibility would be to use mod_rewrite interpreting:
somesite.com/CAD/showProduct?productId=125
somesite.com/USD/showProduct?productId=125
as:
somesite.com/showProduct?productId=125&currency=CAD
somesite.com/showProduct?productId=125&currency=USD
Of course, I'm not sure I like what that would do to page rank, what with having to redirect people based on IP/locale from the very front page, something which would also confuse Google- although there are more pages which should offset the loss of PR. But then, they are mostly dupes... :(
GoogleGuy- If you're reading this, any insight would be appreciated.
So old SERP links using widget.php?id=2222 will still work. Then the spiders come in, reindex the site, discover the new links, and start ranking widget/2222. For a while you might have twice as many results :)
All a mod_rewrite rule does is map one variable to another. Take this query right here:
somesite.com/USD/showProduct?productId=125
I would set up mod_rewrite mappings to translate the incoming page request:
product/=showproduct.php
category/=productId
scented_widgets=125
so then I can change my linking to
somesite.com/USD/product/category/scented_widgets
mod rewrite changes it back to this when the link is clicked
somesite.com/USD/showProduct?productId=125
so the php engine can output the correct query
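The translation described above can be sketched in Python as a set of lookup tables. This is only an illustration of the idea, not mod_rewrite syntax. The table names and the translate function are hypothetical, and in practice the name-to-ID table would presumably come from the database:

```python
# Assumed mappings, taken from the example above.
SCRIPTS = {"product": "showProduct"}   # path segment -> script name
PARAMS = {"category": "productId"}     # path segment -> query variable
VALUES = {"scented_widgets": "125"}    # friendly name -> database ID


def translate(path):
    """Turn /USD/product/category/scented_widgets into the real query URL."""
    segments = path.strip("/").split("/")
    if len(segments) != 4:
        return None  # not a path this sketch understands
    currency, script, param, value = segments
    if script not in SCRIPTS or param not in PARAMS or value not in VALUES:
        return None  # unknown segment: the caller should answer 404
    return f"/{currency}/{SCRIPTS[script]}?{PARAMS[param]}={VALUES[value]}"


target = translate("/USD/product/category/scented_widgets")
```

Returning None for unknown segments keeps made-up paths from silently resolving to some default page.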
I can have 50 variables in a query, optimize every last one of them, and turn each into a directory instead of a query variable. So imagine implementing that in a shopping cart. I can optimize every item to rank and so on.
The only downside is the server load. It can get heavy with a lot of queries.
A good example of optimization is nextag.com
they have a mod rewrite going big time and they rank for items in their site down to the very names of the items they sell.
[google.com...]
that is good SEO.
sticky me if you have questions :)
Any ideas of how to work around that?
Sure you can put dashes in the filenames. You're using a script to run the site, so you can do anything you like. You'll need to alter the script so that the information from the database is printed without dashes if it is just a description on screen, but the dashes are added if it is a filename.
seomike- I'm sorry, I don't think I was clear.
I would just as soon avoid having two URLs like this:
somesite.com/USD/showProduct/125
somesite.com/CAD/showProduct/125
The only difference would be the price, and I am afraid that Google would penalize it for being duplicated content. I would much prefer having only the one URL:
somesite.com/showProduct/125
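One way to get there, sketched in Python under some loud assumptions: the currency prefix is always a three-letter code from a known list, and the currency itself would then be remembered in a cookie rather than in the URL. The names CURRENCIES and split_currency are invented for the example, and it leaves open how prices should display to a visitor who has no cookie yet:

```python
import re

# Assumption: the only currencies the shop supports.
CURRENCIES = {"USD", "CAD"}

# A leading three-letter uppercase segment followed by the rest of the path.
CURRENCY_PREFIX = re.compile(r"^/([A-Z]{3})(/.*)$")


def split_currency(path):
    """Return (single_url_path, currency), with currency None if absent."""
    match = CURRENCY_PREFIX.match(path)
    if match and match.group(1) in CURRENCIES:
        # The caller could redirect to the first value and store the
        # second one in a cookie.
        return match.group(2), match.group(1)
    return path, None


cad = split_currency("/CAD/showProduct/125")
plain = split_currency("/showProduct/125")
```

Here cad and plain both point at /showProduct/125, so only that one address would be left for the spider to index.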
READ $word FROM database.
$description = $word.
$thepagename = REPLACE (" " WITH "-") USING $word
Then you use one of these two variables ($description or $thepagename) in place of each occurrence of $word in the script.
One of the regular PHP or ASP gurus here could probably write a few lines of real code in as many minutes.
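In the meantime, here is the same logic as a few lines of Python rather than PHP or ASP, purely as a sketch. The function name names_for is made up, and the word is passed in directly instead of being read from a database:

```python
def names_for(word):
    """Return (description, page_name) for a word read from the database."""
    # The on-screen description keeps its spaces.
    description = word
    # The filename version swaps each space for a dash.
    page_name = word.strip().replace(" ", "-")
    return description, page_name


description, page_name = names_for("blue scented widget")
```

Use description wherever the text is printed on screen and page_name wherever a filename or link is built.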