Spiderable URL?


UKSEOconsultant

4:00 pm on Nov 15, 2004 (gmt 0)

10+ Year Member



Hi guys!

Just a quick question: will the following URL be easily spiderable by the SE's - true or false?

www.example.com/directory1/cms/page.asp?121

I believe that the SE's would have no problems spidering this URL!

Many thanks guys.

[edited by: Xoc at 8:23 pm (utc) on Nov. 20, 2004]
[edit reason] changed to example.com [/edit]

webboy1

4:06 pm on Nov 16, 2004 (gmt 0)

10+ Year Member



From what I hear, they will definitely spider that URL. It's only when you get to ridiculous numbers of variables after the ? that the problems start.

We have several websites with very similar URLs .... all have been listed in all SE's we have submitted to - including Google.

Google in fact states in its help pages that it can read '.asp?......'; however, it does advise not to go over the top, i.e. filename.asp?something=1&something=1&something=1&something=1 etc.

Hope this helps

Lorel

11:24 pm on Nov 18, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi UKSEO,


Just a quick question: will the following URL be easily spiderable by the SE's - true or false?

www.example.com/directory1/cms/page.asp?121

I believe that the SE's would have no problems spidering this URL!

They can spider it but it will more than likely not produce any PR. I personally stay away from such directories when looking for backlinks to build PR.

[edited by: Xoc at 8:24 pm (utc) on Nov. 20, 2004]

UKSEOconsultant

8:38 am on Nov 19, 2004 (gmt 0)

10+ Year Member



Thanks Lorel and Webboy1! Appreciate your responses.

The URL not being able to build PR - that is a problem. How can the pages rank with a PR of 0? Is there any way to build PR to these pages? I'm guessing that if we get backlinks to the pages then this can be achieved.

Cheers, Lee

Lorel

12:05 pm on Nov 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi Lee,


The URL not being able to build PR - that is a problem. How can the pages rank with a PR of 0? Is there any way to build PR to these pages? I'm guessing that if we get backlinks to the pages then this can be achieved.

The problem with directories not gaining PR is that the pages are dynamic - produced from a database rather than static - and thus search engines cannot index the contents. Some search engines can now catalog what is in databases, but whether or not they transfer PR I'm not sure. You need to check your software to see if there is some way to produce static pages for the search engines. I manage a classified site run from a database, and it has the option to produce static pages for the search engines.
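For illustration, the "produce static pages from the database" option usually amounts to a script like the following sketch (in Python rather than ASP, purely to keep it short; the table and column names are made up):

```python
import os
import sqlite3

def build_static_pages(db_path, out_dir):
    """Write one static .html file per database row."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute("SELECT id, title, body FROM pages").fetchall()
    conn.close()
    for page_id, title, body in rows:
        html = (f"<html><head><title>{title}</title></head>"
                f"<body>{body}</body></html>")
        # Spiders then crawl /1.html instead of page.asp?id=1
        with open(os.path.join(out_dir, f"{page_id}.html"), "w") as f:
            f.write(html)
    return len(rows)
```

Run on a schedule (or after each content change), this gives the spiders plain static URLs while the database stays the single source of truth.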

mattglet

3:14 pm on Nov 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



...and thus search engines cannot index the contents. Some search engines can now catalog what is in databases ...

That comment isn't entirely true. Dynamic pages are just as easily spiderable as static pages (provided your querystring isn't insane). All that matters is the HTML output by your server-side code. If you are curious what the spider will see when it tries to index the page, just view the source of your page in the browser. If everything looks good there, you should be fine.

macrost

4:37 pm on Nov 19, 2004 (gmt 0)

10+ Year Member



mattglet,
I have noticed one thing about PR and how it passes between a static-looking URL and a dynamic URL.

One site that I have, most of the URLs are static and pass PR wonderfully, but the ones with query strings don't.

Take a look, you'll see.

mattglet

6:12 pm on Nov 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I wasn't arguing the PR part, just the search engines not being able to index dynamic URLs.

I've seen a couple of threads that pertain to the passing PR though... that's very interesting.

macrost

6:26 pm on Nov 19, 2004 (gmt 0)

10+ Year Member



mattglet,
True - if I have to have any query strings, I try to keep it to a max of 3.

nick7272

3:18 pm on Nov 23, 2004 (gmt 0)

10+ Year Member



Have any of you tried the IIS url rewriters? I haven't yet, but am considering using one of them. Any feedback?

UKSEOconsultant

3:46 pm on Nov 23, 2004 (gmt 0)

10+ Year Member



I tried a mod_rewrite (Apache server) rule a couple of years ago and it proved very effective! Although there would be no need to use one if your URLs have fewer than 3 parameters.

Cheers guys!
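For reference, a minimal Apache rule of the kind described above might look like this (the path and parameter names are purely illustrative):

```
# Hypothetical .htaccess sketch: serve a clean path like /products/121
# from the real dynamic script /cms/page.asp?id=121
RewriteEngine On
RewriteRule ^products/([0-9]+)$ /cms/page.asp?id=$1 [L]
```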

mattglet

5:25 pm on Nov 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've got a few sites built using a custom 404 page that serve rewritten URLs. Easily the most useful thing I could ever imagine.

Buying your own ISAPI filter is fine and all, but if you're on a shared host, installing third-party software isn't always an option.

pageoneresults

5:50 pm on Nov 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You need to check your software to see if there is some way to produce static pages for the search engines.

Be careful in this scenario. While you may think Google cannot get through various query strings, think again. I've seen Googlebot chomp through multiple variables and index the URI (yuck!).

If you have an option to produce static pages from any software, then you need to make sure that your dynamic pages are invisible or you stand a chance of duplicate content issues. I've reviewed many sites where this type of situation exists and they are suffering from the indexing of multiple URIs all leading to the same content.

If you are producing static pages from dynamic pages, then your dynamic pages should be a directory by themselves and disallowed via the robots.txt file. I would even go one step further and place a robots meta tag on each of the main dynamic template pages...

<meta name="robots" content="none">
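As a sketch of that layout - assuming the dynamic templates live in a directory called /cms/ (the name is illustrative) - the robots.txt entry would be:

```
# Hypothetical robots.txt: block the dynamic-template directory so only
# the static copies get indexed
User-agent: *
Disallow: /cms/
```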

ISAPI filters are the only way to go in a Windows environment. We have been using ISAPI_Rewrite for years; the product is flawless and allows infinite rewrite routines. If your host is reluctant to install the global .ini file for ISAPI_Rewrite, they should first review the product, which I think will ease their reluctance.

What's nice about ISAPI_Rewrite is once the global .ini is installed at the server root, all you need to do is drop a .ini file at the root of each web and configure from there. It does not affect anything but the web it resides in. All sites that we manage on Windows now have a root .ini which contains this...

[ISAPI_Rewrite]

RewriteCond Host: ^example\.com
RewriteRule (.*) http\://www\.example\.com$1 [I,RP]

The above is a simple 301 for permanently redirecting non www requests to www.

pageoneresults

5:51 pm on Nov 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've got a few sites built using a custom 404 page that serve rewritten URLs. Easily the most useful thing I could ever imagine.

Another one I'd be real careful of. I'm sure yours are correct but I've seen many that are not. Their custom 404 pages were returning a 200 status instead of 404.

Always, always, verify that your server headers are returning the proper HTTP Status Codes.

Although there would be no need to use one if your URLs have fewer than 3 parameters.

If you are going to rewrite URIs which is strongly advised in today's environment, you should strip all variables from the string. Trim that puppy down as far as you can so it becomes user friendly.

mattglet

7:06 pm on Nov 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Another one I'd be real careful of. I'm sure yours are correct but I've seen many that are not. Their custom 404 pages were returning a 200 status instead of 404.

That is a VERY important point. You absolutely need to run through numerous spider simulators/header checkers before deploying to production. Make sure all your pages are returning 200 where they need to, 404 where they need to, and that the 301s from your previous files are working correctly (if you are changing the structure of an existing site).

Become friends with SearchEngineWorld's Header Checker [searchengineworld.com].
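A header check can also be scripted. The sketch below (in Python, purely for illustration - the sites in this thread are ASP) spins up a throwaway local server that behaves correctly - 200 for a known page, a real 404 otherwise - and verifies status codes the way a header checker would:

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/known.html":
            # A page that exists must answer 200
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(b"<html>ok</html>")
        else:
            # A missing page must answer 404 - not a "soft 404"
            # error page served with a 200 status
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # silence per-request logging

def check_status(url):
    """Return the HTTP status code for url, like a header checker."""
    try:
        return urllib.request.urlopen(url).getcode()
    except urllib.error.HTTPError as e:
        return e.code

server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"
```

Point `check_status` at your own URLs and confirm every path returns the code you expect before going to production.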

beauzero

4:56 pm on Dec 3, 2004 (gmt 0)

10+ Year Member



It's easy in a .NET environment.
1. Just redirect the 404 to a page that breaks down the URI, such as www.mysite.com/[productid]. Works really well.
2. If you want to go the best route, write a custom ISAPI plugin (much easier in .NET vs. ASP). Search for .NET "http application" in G and you will get plenty of examples. Then just build your own extension such as 0000000001.prid
This will run extremely fast.

Good luck. I have done #1 and verified it. It's quicker and easier. We are short on resources, long on projects.
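To sketch option #1 (in Python rather than ASP.NET, just to keep it short): the server routes every unknown URL to one handler, which parses the requested path and decides what to serve. Both path formats below are hypothetical.

```python
import re

def resolve_request(path):
    """Map a 'clean' requested path back onto the real dynamic request."""
    # Style #1: the product id is the whole path, e.g. /12345
    m = re.fullmatch(r"/(\d+)", path)
    if m:
        return {"script": "product.asp", "id": int(m.group(1))}
    # Style #2: a custom extension, e.g. /0000000001.prid
    m = re.fullmatch(r"/(\d+)\.prid", path)
    if m:
        return {"script": "product.asp", "id": int(m.group(1))}
    return None  # genuinely unknown -> serve a real 404 status
```

The key point from earlier in the thread still applies: when the handler returns None, the response must carry a genuine 404 status, not a 200.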