Forum Moderators: coopster

Can you get rid of the ?id=?

webwit

9:01 pm on Mar 28, 2004 (gmt 0)

10+ Year Member



Does anyone know how to get rid of the ?id= in a URL?

I built a PHP shopping cart that works like so:

[My...] Domain Name.com/products.php?id=108

I added the following .htaccess file so I can give the pages an .html extension:
AddType application/x-httpd-php .html .php .htm
XBitHack full

Is there a way to pass the variable so that the URLs and links look static instead of dynamic?
Example:
[My...] Domain Name.com/products/108.html

Is there some way to do this using mod_rewrite, or some other method?

I have a few hundred products, with frequent inventory changes, so I don't really want to make them static by hand.

digitalv

9:50 pm on Mar 28, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



First question ... why would you want to do this? I'm using viewProduct.asp?id=X in my cart (which I wrote from scratch in ASP) and every single product I have has been indexed by Google. It's great for searches.

There is a way to accomplish what you're asking but I can't think of any advantage to it. Anyway, here's what you do ... and again I'm an ASP/IIS guy, but there is probably a way to do it in PHP/Apache.

The first thing I did was make a custom 404 page called 404.asp in my root directory. THAT page was basically a copy of "viewProduct.asp" with one difference: instead of looking for the product ID in the query string, it got the product ID from the URL itself. So for a URL like domain.com/products/100.html, the custom 404 page would pull out the "100" and use that as the product ID. I don't know how this would be spidered, but I don't think the search engines would be able to tell it was a 404 page returning data, since the request isn't actually redirected; the server just calls the page internally. As far as anyone can tell they're really looking at /products/100.html. You could also use If/Then statements within your custom 404 page to check that the URL is one that would contain a product ID, so it can show the proper "page not found" message if it WASN'T a product page.
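A rough PHP equivalent of the ID-extraction step described above, as a sketch only. The function name product_id_from_path and the /products/<id>.html layout are assumptions for illustration and are not taken from the ASP code mentioned in the post.

```php
<?php
// Hypothetical helper: pull a numeric product ID out of a path such as
// /products/100.html. Returns null when the path is not a product URL,
// so the caller can fall back to a normal "page not found" message.
function product_id_from_path(string $path): ?int
{
    // Drop any query string before matching.
    $path = (string) parse_url($path, PHP_URL_PATH);

    if (preg_match('#^/products/(\d+)\.html$#', $path, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

// Possible use inside a custom 404 handler (assumes the server passes the
// originally requested URL in REQUEST_URI):
// $id = product_id_from_path($_SERVER['REQUEST_URI'] ?? '');
```

Returning null rather than 0 keeps "not a product URL" distinct from any real ID.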

So that's one way, but you would have to see if it produced the results you wanted. The other way is fairly easy - download a program that can spider an entire website and save the results to your local drive (Blackwidow does this I think) - then spider all of your product pages. You'll now have a page for each product on your hard drive. Then just do a batch rename to make them xx.html and upload to your products folder.

I don't want to really solicit my own services here, but if you need some help with this I could write a program to do it for you. I could probably have it done in a day or two. <Mods if you have a problem with my offer please just delete this paragraph but leave the rest of my answer for the benefit of the poster>.

charlier

9:57 pm on Mar 28, 2004 (gmt 0)

10+ Year Member



If you want to have the pages spidered you do need to send the 200 OK header to the browser/spider; otherwise they will see a 404 code coming back and assume the page is not there.

I send
Header("HTTP/1.1 200 OK");
Header("Status: 200 OK");

and it seems to work fine. I have several thousand email archive pages in Google using this method.
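Since the same handler also has to serve genuine "page not found" responses, it may help to choose the status line in one place. This is a minimal sketch: the function name and the boolean flag are hypothetical, while the two strings are the standard HTTP/1.1 status lines.

```php
<?php
// Hypothetical helper for a custom 404 handler: report 200 only when the
// requested URL really maps to a product, and keep 404 for everything else.
function status_line_for(bool $productFound): string
{
    return $productFound ? 'HTTP/1.1 200 OK' : 'HTTP/1.1 404 Not Found';
}

// Usage inside the handler, before any output is sent:
// header(status_line_for($productFound));
```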

digitalv

10:05 pm on Mar 28, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Good to know charlier, I hadn't thought about that :)

jamesa

11:00 pm on Mar 28, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



webwit, the best way is definitely mod_rewrite. (And you don't need to add the .html extension to the AddType declaration.) Check out:

- An Introduction to Redirecting URLs on an Apache Server [webmasterworld.com]
- The Apache documentation [httpd.apache.org]

Your mod_rewrite would probably look something like this (I wrote this in a hurry so double-check):

RewriteEngine On
RewriteRule ^products/([0-9]+)\.html$ /products.php?id=$1 [L]
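A rewrite rule only handles incoming requests, so the cart's own links also have to be written in the new form. Here is a minimal sketch of a link helper, assuming the /products/<id>.html layout from the original question (product_url is a hypothetical name):

```php
<?php
// Hypothetical helper: build the static-looking URL for a product ID,
// so templates never hard-code products.php?id=... themselves.
function product_url(int $id): string
{
    return '/products/' . $id . '.html';
}

// Example link in a template:
// echo '<a href="' . product_url(108) . '">View product</a>';
```

Keeping the URL format in one function means a later change to the scheme only touches this helper and the rewrite rule.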

tomda

4:59 am on Mar 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you want to have the pages spidered you do need to send the 200 OK header to the browser/spider, otherwise they will see a 404 code coming back and assume the page is not there.
I send
Header("HTTP/1.1 200 OK");
Header("Status: 200 OK");

Could someone explain very briefly what this is used for?
Thanks

charlier

6:12 am on Mar 29, 2004 (gmt 0)

10+ Year Member



When the server can't find the page, it automatically sends a 404 (page not found) status code to the browser. You can see this in the access log. You can override this initial code by using the PHP header() function to send your own headers. If you do this and it works, you will see a status 200 (page found) in your log file. There are also several sites that will connect to your page and tell you what header information it is returning. Do a search on here for 'header test' and you should be able to find a thread that mentions some of them. Also, if you are using Apache and can control the server configuration, you can use mod_log_sql and set it up to record all the headers sent and received. You can find more info at [outoforder.cc...]

Cheers

ergophobe

7:29 pm on Mar 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




There is a way to accomplish what you're asking but I can't think of any advantage to it.

It depends on the situation. If you're putting up zillions of products, that's pretty much true. However, in other cases the main advantages are

1. Human readable URIs that people can remember.

2. Logical URIs that frequent customers can guess, as on the MS site where you can ask for www.microsoft.com/powerpoint and you get right there.

3. Portability. If your URIs don't use file names that depend on a given technology (PHP and parameters passed by GET), you can more easily change systems.

4. Security. This is a very weak reason. As the PHP manual says, security through obscurity is one of the weakest forms. Still, it reveals that much less about your implementation.

5. Human-speakable URIs make it easier to give the URI over the phone to a client who may want to consult the website.

For more info, Google on "Cool URIs Don't Change"

Tom

Marcia

7:44 pm on Mar 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>viewProduct.asp?id=X

What's interesting is that I've seen sites that couldn't get properly crawled that had URLs like that, such as

/shopping/viewProduct.asp?id=X

or index.asp?id=X

ergophobe

7:52 pm on Mar 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Marcia,

Do you mean in the last few years? It used to be a problem, but most crawlers (certainly Google) have no problem crawling at least the first part of the GET parameters.

Tom

jatar_k

8:26 pm on Mar 29, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Though Google has also said that you shouldn't use GET variables called id, as this may be construed as a session ID and could cause problems with spidering.

webwit has the right idea. I much prefer the look of the rewritten URLs to the GET style.

webwit

1:02 pm on Mar 30, 2004 (gmt 0)

10+ Year Member



Thank you. That is a lot of good advice.

All of my pages are in the index with good PageRank, but I have noticed my positions slipping badly over the past few months on my sites that have ?id=. I have also noticed that all the pages now in the top 10 for my category are static, not dynamic.

This article on ranking makes a lot of sense in this situation.
[webmasterworld.com...]
This list contains some of the potential factors that Google can use for ranking Web pages.
META Description
URL
- Domain
- Word Separators
- ? and &
- id=

Has anyone here made this change from dynamic to static appearance and seen an improvement in their rankings?

Someone also mentioned that PHP sends out a header to the search engines that reads something like "built with PHP". Is this true? If so, is there any way to stop that?