Dynamic URLs - big hassle!

         

Slidigaijs

12:11 am on Apr 1, 2003 (gmt 0)



I looked over the entire forum, but did not find any real solution for getting Google to spider DYNAMIC URLs. And it is the most important issue at this time.

Well, I spent the last few weeks trying to figure out the RULEs a spider might follow to analyze the LINK & QUERY.

So I created a few additional pages on a small, often-visited (every 1 or 2 days) site. But I am still confused about why GOOGLE does not go to some links/pages even though they have basically the same QUERY structure/syntax.
For example
----------------
page.php?var1=seasonmodule&var2=2
page.php?env1=article:
----------------
were crawled, but a bunch of other pages/links from the same main page were not (page.php?env=news_article_new).

So the main question is how deep the CRAWLER will go and what exactly makes a difference to that.

Plus, there is one more GOOD question. I need to modify the QUERY so that it will be visited and crawled by GOOGLE. You may apply any changes to the webserver.

Here is the page:
index.php?env=news_article-:l-1-1-:bb-4-1-1-1-1:n-2-1-637:s-3:m-1-:img-5-1-1-1-1

I would really appreciate any ideas!

Thank you in advance & it is truly a GREAT forum!

Dolemite

12:19 am on Apr 1, 2003 (gmt 0)

10+ Year Member



That's a nasty URL. Google has been doing much better with dynamic URLs lately, but there are still some things that throw it off. I'm not sure how well spiders like those colons, for one thing...

In any case, mod_rewrite should do the trick. Do a search either here or on Google and you'll find plenty of good examples of how to use it.

BigDave

12:25 am on Apr 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How are you linking to them? Are they simple <a href="URL"> links, or are they forms? JavaScript links or menus?

Rhadamanthus

12:40 am on Apr 1, 2003 (gmt 0)

10+ Year Member



Google crawled all of my dynamic URLs without me having to do anything special, but my understanding is that they beefed up their understanding of dynamic URLs right before I launched my site.

Either way, I have totally replaced them in all of my links with "static" URLs that actually map (thanks to Apache's mod_rewrite) back to my dynamic URLs. So www.mysite.com/articles/1.html actually maps to www.mysite.com/showarticle.html?id=1, but to the outside world it looks just like a static URL. I like this approach a lot because it makes my URLs cleaner, while still giving me full dynamic control over them.

Your URLs look a lot more complicated than mine, but a similar trick should still work. There's a similar (and free) plugin for IIS that does the same thing as mod_rewrite, if you use that instead. The best part is that it doesn't even break any old links to your pages, because the old dynamic page is still there.
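
To make that mapping concrete, here is a small Python sketch that imitates what the rewrite step does. It is not mod_rewrite itself: the function name `rewrite_static_to_dynamic` and the exact pattern are assumptions made up for illustration, using the example URLs from this post.

```python
import re

# Hypothetical pattern for "/articles/<digits>.html" (an assumption,
# not the actual rule used on the site described above).
STATIC_ARTICLE = re.compile(r"^/articles/([0-9]+)\.html$")


def rewrite_static_to_dynamic(path: str) -> str:
    """Return the dynamic path for a static-looking article path.

    Paths that do not match the pattern are returned unchanged.
    """
    match = STATIC_ARTICLE.match(path)
    if match is None:
        return path
    # The captured digits become the value of the "id" query parameter.
    return f"/showarticle.html?id={match.group(1)}"


if __name__ == "__main__":
    print(rewrite_static_to_dynamic("/articles/1.html"))
```

The visitor (and the spider) only ever sees the static-looking path; the server quietly serves the dynamic page behind it.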

Rhadamanthus

2:15 am on Apr 1, 2003 (gmt 0)

10+ Year Member



indigojo stickeyed me with this earlier.

indigojo wrote:


What's the name of the product you talk about? I've looked at heaps for ASP sites and they all seem to play around with the URLs

I thought I'd post my response here, too, so that everybody could benefit.

It took me a bit of looking to scrounge it up the first time, too.

I had to go look and find it again (which is why I didn't just post it before), but it's called "ISAPI_Rewrite". There are two versions - Lite and Full. Lite is free, and most likely has everything you need. If you need proxying or if you run multiple virtual servers, you'll need the Full version, which is $69 for one server or $535 for unlimited servers. You can find them all at [isapirewrite.com...] .

ISAPI_Rewrite uses a very similar (if not the same) rewriting syntax as mod_rewrite for Apache. It's very, very powerful, but it can be kind of a pain to set up. I will be using the free version myself in the near future, once I finish converting my site from Apache/PHP to IIS/ASP.NET. I've already got it installed on my test server, and it works just fine for what I'm doing. I'd include the rewrite rule that I wrote for my mapping, but I don't have my test server booted at the moment. I'll try and remember to post it later tonight.

Good luck with your site!

Rhadamanthus

2:47 am on Apr 1, 2003 (gmt 0)

10+ Year Member



The following is the rule that I use to accomplish my directory -> dynamic page mapping:

RewriteRule ^/articles/([0-9]*)\.aspx$ /showarticle.aspx\?id=$1

The "^" at the beginning anchors the match to the start of the path, so that "/articles/" must be at the beginning of the path (immediately following "http://www.mysite.net"). The "[0-9]" says to look for a digit, and the "*" after it makes that a sequence of zero or more digits (use "+" instead if you want to require at least one). The parentheses around it capture that sequence so it can be reused later. The "\." means to look for a literal "." next, since "." alone is a special character in their rule set. Then, it has to end in ".aspx" (the "$" says that this must be the end of the URL). I have another, earlier rule mapping all ".html" files to ".aspx", so that the actual URL the user sees in a browser window ends in .html instead of .aspx.

The next section tells ISAPI_Rewrite what to map the URL to. It starts the URL with "http://www.mysite.net/showarticle.aspx?id=" (? is also a special character, hence the backslash before it). The "$1" tells it to insert the first captured sequence (in my case, the ([0-9]*) ) at this spot, so the number gets appended to the end. If I had a second captured sequence, I could insert it with "$2", etc.

So, a link in my html looks like this: "http://www.mysite.net/articles/12345.html". This maps to "http://www.mysite.net/articles/12345.aspx", which is then mapped to "http://www.mysite.net/showarticle.aspx?id=12345".
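
If you want to experiment with the pattern before touching the server, the two-step chain can be imitated in a few lines of Python. This is only a sketch, not ISAPI_Rewrite: the first rule is an assumed stand-in for the ".html" to ".aspx" rule (which was not posted), and `RULES` and `apply_rules` are hypothetical names. Python writes the captured group as `\1` where the rewrite rule uses `$1`.

```python
import re

# Each entry is (pattern, replacement), applied in order.
RULES = [
    # Assumed stand-in for the unshown ".html -> .aspx" rule.
    (re.compile(r"^(/.*)\.html$"), r"\1.aspx"),
    # Mirrors the posted rule: digits after /articles/ become the id.
    (re.compile(r"^/articles/([0-9]*)\.aspx$"), r"/showarticle.aspx?id=\1"),
]


def apply_rules(path: str) -> str:
    """Run the path through every rule in turn and return the result."""
    for pattern, replacement in RULES:
        path = pattern.sub(replacement, path)
    return path


if __name__ == "__main__":
    print(apply_rules("/articles/12345.html"))
    # "[0-9]*" also accepts an empty run of digits:
    print(apply_rules("/articles/.html"))
```

The second call shows that "[0-9]*" happily matches no digits at all and produces an empty id, which is a reason to prefer "[0-9]+" if an article number is always required.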

It took me some trial and error to get it all exactly right, and your URL looks considerably more complicated, so you're going to need to do a bit more, but my rule might be a good starting point for you.
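
As a very rough starting point for the longer URL in the first post, the same idea can be tried out in Python first. Everything here is an assumption for illustration: the "/page/" prefix, the ".html" suffix, and the name `to_dynamic` are made up, and the sketch simply carries the whole "env" value as one path segment.

```python
import re

# Assumed scheme: /page/<env value>.html  ->  /index.php?env=<env value>
# The env value may contain anything except "/" and "?".
ENV_PAGE = re.compile(r"^/page/([^/?]+)\.html$")


def to_dynamic(path: str) -> str:
    """Map the static-looking path back to index.php, or leave it alone."""
    m = ENV_PAGE.match(path)
    return f"/index.php?env={m.group(1)}" if m else path


if __name__ == "__main__":
    env = "news_article-:l-1-1-:bb-4-1-1-1-1:n-2-1-637:s-3:m-1-:img-5-1-1-1-1"
    print(to_dynamic(f"/page/{env}.html"))
```

This only gets rid of the query string; the colons are still in the path, so a real rule would probably want a shorter scheme that rebuilds the env value on the server side.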

Good luck!