Dynamic URL problem

Dynamic URLs don't get crawled

         

MonkeyReview

2:59 pm on Nov 5, 2002 (gmt 0)

10+ Year Member



I have been dealing with this problem for about a year now, and it's more or less forcing me to consider closing my site. Here are the details: I run a review site where we review hardware, games, movies, and software, and at least 90% of the site is dynamic content. I have built the site in what you could call "chunks". I am no whiz at PHP, and this has made things that much more difficult.

Okay, so I have dynamic links on my main page, one that goes to a news script and one that goes to my reviews. Both URLs have the same format:

News:
[wwwfoobar.com...]

Reviews:
[wwwfoobar.com...]

The problem is that Google can crawl the news links without trouble, but it cannot crawl the reviews. The reviews are broken down into categories, and the category pages get crawled with ease; the bot just never goes any further. For one category, though, it did crawl the reviews. The funny thing is that Google sees the URL as this:

[wwwfoobar.com...]

Personally I don't care how it does it as long as it does. My site is content driven, with over 200 reviews, and Google only sees maybe 30 of them. Does anyone have any ideas what could be causing this, or a way to get around it? I have tried various fixes, such as changing the URL to look like this:

[wwwfoobar.com...]

Google picked this up as
[wwwfoobar.com...]

This was fine and I thought it might be a fix, but then Google dropped the pages a day later. The day after that they were back in Google, but now they have disappeared again. I am open to just about anything and will even pay someone if they can help with this.

Brett_Tabke

3:45 pm on Nov 5, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Reduce the number of parameters in the cgi string.

Double-check that the page is visible to, and indexable by, a crawler.

Check your robots.txt (if you have one), and check any bot-related meta tags you may have.

Validate the page - maybe the bot isn't seeing your links because of some minor HTML error.

Make more links to your dynamic content. Each individual page should have at least two inbound links to it.
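
For the robots.txt item in the checklist above, here is a minimal Python sketch of how one might test whether a given URL is blocked for a crawler. It uses the standard library's urllib.robotparser. The robots.txt contents and the news URL are made up for illustration; nothing in the thread says what the site's actual robots.txt contains.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that would hide every review from all bots.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /reviews/
"""

REVIEW_URL = "http://www.foobar.com/reviews/index.php?action=fullreview&id=200"
# Hypothetical news URL, assumed to follow the same format as the reviews.
NEWS_URL = "http://www.foobar.com/news/index.php?action=full&id=99"

print(is_allowed(SAMPLE_ROBOTS, "Googlebot", REVIEW_URL))  # False
print(is_allowed(SAMPLE_ROBOTS, "Googlebot", NEWS_URL))    # True
```

To check a live site, paste the real robots.txt text into the function in place of the sample.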

ad6565

4:25 pm on Nov 5, 2002 (gmt 0)

10+ Year Member



If you're worried it's your URLs causing the problem, you may want to try using RewriteRule in your .htaccess to change the appearance of them. You'll need to check with your host to see if mod_rewrite is allowed.

If so, do some research into RewriteRule - it should go something like this...

Options +FollowSymLinks
RewriteEngine On
RewriteRule ^/news/index([a-z]*)-([0-9]*)\.html$ /news/index.php?action=$1&ID=$2

- or you may need to use news/ instead of /news/ in the pattern, since in a .htaccess file the leading slash is normally stripped before the rule is matched

If mod_rewrite is enabled and the above code, or something similar, goes into your .htaccess (like I said, research it and get it right before you do it), you should be able to access the news stories (let's say story 99) by using www.domain.com/news/indexfull-99.html
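
To see what that rule would do to a request, here is a rough Python sketch that applies the same regular expression outside of Apache. It only imitates the pattern match and substitution, not mod_rewrite itself, and it assumes the .htaccess sits in the document root so the path arrives without a leading slash. The function name rewrite is made up for this example.

```python
import re

# Same pattern as the RewriteRule, in the per-directory form (no leading slash).
RULE = re.compile(r"^news/index([a-z]*)-([0-9]*)\.html$")


def rewrite(path: str) -> str:
    """Map a static-looking path to the dynamic URL, or return it unchanged."""
    match = RULE.match(path)
    if match is None:
        return path  # the rule does not apply to this request
    action, story_id = match.groups()
    return f"/news/index.php?action={action}&ID={story_id}"


print(rewrite("news/indexfull-99.html"))   # /news/index.php?action=full&ID=99
print(rewrite("/news/indexfull-99.html"))  # unchanged: the leading slash prevents a match
```

Note that the letters after "index" become the action, so indexfull-99.html maps to action=full. If the script expects a different action name, the link text has to change to match.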

ruserious

4:33 pm on Nov 5, 2002 (gmt 0)

10+ Year Member



His name is his URL. ;)

It looks like you are running your links through some function that encodes your URL. Look at the source code of the page that is output; it reads like this:

index.php?action=fullreview&id=199

or

index.php?action=fullreview&id=162

It seems like you're running your URLs through some function like htmlentities(), urlencode() or rawurlencode() ... which you should NOT do. When viewing your source they should look like plain, normal URLs.
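
As an illustration of why encoding the whole link breaks it, here is a small Python sketch (Python rather than PHP, purely so it can be run anywhere). It assumes the links are built from an action and an id as in the examples above; the helper name build_review_link is made up.

```python
from urllib.parse import quote, urlencode


def build_review_link(review_id: int) -> str:
    """Encode only the parameter values; leave ?, = and & as separators."""
    query = urlencode({"action": "fullreview", "id": review_id})
    return "index.php?" + query


good = build_review_link(199)
# Running the finished URL through an encoder also escapes the separators.
bad = quote(good)

print(good)  # index.php?action=fullreview&id=199
print(bad)   # index.php%3Faction%3Dfullreview%26id%3D199
```

In the second form the ? has become %3F, so there is no query string left and the whole thing reads as one long file name.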

btw: You should replace those .jpg headers in the right column with .gifs; that would look much cleaner.

MonkeyReview

4:47 pm on Nov 5, 2002 (gmt 0)

10+ Year Member


Okay, mod_rewrite is enabled and we have played with it a little, but nothing seems to work within the site once the URL changes. The & you are seeing right now is just something I was trying, to see if Google would then follow the URL, and in fact it did for about 4 of the reviews when I coded the URL to be

http://www.foobar.com/reviews/index.php?action=fullreview&id=200

Google picked this up and listed the URL as

http://www.foobar.com/reviews/index.php&action=fullreview&id=200

The files were dropped from Google the day after though :(

How could I check that the page is visible and indexable by a crawler?
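
One rough way to approach that question for a single page is to parse its HTML the way a simple bot would and list the links and robots meta tag it finds. The Python sketch below does this with the standard library's html.parser. The sample page is invented for illustration, and a real crawler applies many more rules than this.

```python
from html.parser import HTMLParser


class CrawlerView(HTMLParser):
    """Collect link targets and the robots meta tag from a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.robots_meta = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a" and attr.get("href"):
            self.links.append(attr["href"])
        elif tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.robots_meta = attr.get("content")


def crawler_view(html_text):
    """Return (links, robots_meta) as found in html_text."""
    view = CrawlerView()
    view.feed(html_text)
    return view.links, view.robots_meta


# Hypothetical page with one normal link and one percent-encoded link.
SAMPLE_PAGE = """
<html><head><meta name="robots" content="index,follow"></head>
<body>
<a href="index.php?action=fullreview&amp;id=200">Review 200</a>
<a href="index.php%3Faction=fullreview&amp;id=192">Review 192</a>
</body></html>
"""

links, robots = crawler_view(SAMPLE_PAGE)
# Links where the ? was encoded have no query string and deserve a closer look.
suspicious = [link for link in links if "%3f" in link.lower()]

print(robots)      # index,follow
print(suspicious)  # ['index.php%3Faction=fullreview&id=192']
```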

Brett_Tabke

4:51 pm on Nov 5, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Has it been 60 days since they were spidered? It can easily take that long to get pages into Google these days.

MonkeyReview

5:07 pm on Nov 5, 2002 (gmt 0)

10+ Year Member


Hi there, nope, but they were visible and then dropped. I just changed all the code back to the way it should be. The whole problem has been going on for about a year though, so I know Google is having problems. The way it displays the URLs it can crawl is another indication it's having issues: it sees the ? as %3F

E.g.

http://www.foobar.com/reviews/index.php%3Faction=fullreview&id=192

Brett_Tabke

5:46 pm on Nov 5, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



It's not uncommon to see them one day and dropped the next if it is the first time Google has spidered them (called the everflux effect - do a site search).