Why won't google index my dynamic pages?

         

ichthyous

4:36 pm on Jun 14, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi there, I have an online database of images (I am a photographer) and have been trying to get the SEs to index the database for 2 years now. I added direct links to some of my HTML pages and Google did index the dynamic pages I linked to, but the sheer volume of links makes it impossible for me to do it all by hand. I generated a site map (50 pages total) with links to every dynamic page in the database, uploaded it, and linked to the sitemap from my HTML pages. Google has spidered the sitemap pages, but has not spidered the pages they link to. The database is compiled C++ and the dynamic URLs can be very long at times, although Google didn't seem to have a problem indexing the dynamic pages when I added links by hand and used link text on my HTML pages. I suspect that the problem is that the sitemap pages are simply too big. There are 50 pages total, and they weigh 50-100K each. Each sitemap page contains about 200 links. I am wondering if some sort of Google filter is being applied to the site maps since they are nothing but pages full of links. Any advice would be appreciated!

yowza

6:02 am on Jun 15, 2004 (gmt 0)

10+ Year Member



It could be a filter. I have heard that you should keep links pages under 100 links per page. What is your PR? I think that a higher PR could help with deeper crawling.

somerset

6:40 am on Jun 15, 2004 (gmt 0)

10+ Year Member



Give yourself a better chance and apply mod_rewrite (if hosted on Unix) to turn those long query-string URLs into standard-looking URLs.

ichthyous

1:30 pm on Jun 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Can you elaborate a bit more on how to do that? I am on a Unix server and have access to both my .htaccess and httpd.conf files, but I don't know how to write the code.

Mr_Diggz

2:43 pm on Jun 15, 2004 (gmt 0)

10+ Year Member



Here's a starter for rewriting urls:
[webmasterworld.com...]

ichthyous

3:03 pm on Jun 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I read this earlier today, but I don't really follow how mod_rewrite changes the URLs. I am not a programmer. Also, will I have to generate the site maps again once the URLs have changed? Could mod_rewrite affect how the database functions? Thanks.

Mr_Diggz

5:30 pm on Jun 15, 2004 (gmt 0)

10+ Year Member




When rewriting the urls, you write rules that the server will follow anytime there is a request for a page.

Say you have:
[yoursite.com...]

The server could take the url, break it into variables, and send them to another page like:
[yoursite.com...] (which is the page that generates the dynamic content).

Depending on how it's used, it could slow down your server, so be careful. It might be a good idea to check with your hosting company first.

As for your sitemap, I would personally update it to use the new static-looking URLs.
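
To make the idea concrete, here is a minimal Python sketch of what such a rule does. It only simulates the mapping; on Apache the same job is done by a mod_rewrite `RewriteRule`, which pairs a regular-expression pattern with a substitution. The `/gallery/...` path layout, the `/cgi-bin/view` script name, and the `cat` and `id` parameters are all hypothetical, since the real URLs are truncated in this thread.

```python
import re

# Hypothetical rule (names are assumptions, not taken from the thread):
#   /gallery/<category>/<image_id>  ->  /cgi-bin/view?cat=<category>&id=<image_id>
RULE = re.compile(r"^/gallery/([A-Za-z0-9_-]+)/(\d+)$")


def rewrite(path: str) -> str:
    """Return the internal dynamic URL for a static-looking path.

    Paths that do not match the rule are returned unchanged.
    """
    match = RULE.match(path)
    if match is None:
        return path  # no rule applies; the request is served as-is
    category, image_id = match.groups()
    return f"/cgi-bin/view?cat={category}&id={image_id}"


print(rewrite("/gallery/landscapes/42"))
print(rewrite("/about.html"))
```

The visitor and the spider only ever see the short path; the script that generates the page still receives the same variables it always did, which is why the database itself should not need to change.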

Dpeper

5:45 pm on Jun 15, 2004 (gmt 0)

10+ Year Member



Quick question: if I make a site map of

[widget.com...]

links like that, will Google index the pages?

kaled

5:57 pm on Jun 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Given the size of your site, I would carry out a small-scale experiment first. If that works then carry it through. There are many possible reasons why pages are not indexed - the nature of the urls on your site may not be the cause of the problem.

Kaled.

ichthyous

6:15 pm on Jun 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Mr_Diggz, I understand the premise, but your example is actually the reverse of what I need done, right? Here is a sample of an actual dynamic URL from my site:

[widgets.com...]

This URL is about as long as they get. So if I understand correctly, you are saying that mod_rewrite would take this URL and turn it into a static one that Google would have an easier time digesting? To be honest, I'm not sure that is really the problem. As I mentioned, when I placed the links on my regular HTML pages by hand, added link text, etc., Google indexed them immediately. The problem seems to be with the sitemaps. I used Xenu and it grabbed every last dynamic page I have (8,500 total). Unfortunately, a lot of the links in the sitemap point to repetitive content, i.e. an image with the same name placed in three different categories. I am wondering if Google is simply seeing these 50 sitemap pages, which consist of nothing but links that often have repetitive titles, as an attempt at spamming. My original attempt to add links by hand got only those exact pages indexed; Google will not follow any of the on-page links to index the rest of the database itself. Yahoo also doesn't seem to want to index it. I have no problem getting my other static HTML pages indexed very quickly.
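
One way to test the sitemap theory on a small scale is to regenerate the sitemap pages with the repeated images removed and fewer links per page. The Python sketch below is only an illustration under stated assumptions: the `image` query parameter is hypothetical (the real URL is truncated above), and the default of 100 links per page is the rule of thumb mentioned earlier in this thread, not a documented limit.

```python
from urllib.parse import urlsplit, parse_qs


def dedupe_and_paginate(urls, key_param="image", per_page=100):
    """Keep the first URL seen for each image name, then split into pages.

    key_param is a hypothetical query parameter holding the image name.
    URLs that lack it are de-duplicated on the full URL instead.
    """
    seen = set()
    unique = []
    for url in urls:
        params = parse_qs(urlsplit(url).query)
        key = params.get(key_param, [url])[0]
        if key not in seen:
            seen.add(key)
            unique.append(url)
    # Slice the de-duplicated list into chunks of at most per_page links.
    return [unique[i:i + per_page] for i in range(0, len(unique), per_page)]


links = [
    "http://example.com/db?cat=a&image=sunset",
    "http://example.com/db?cat=b&image=sunset",  # same image, other category
    "http://example.com/db?cat=a&image=dawn",
]
for page_number, page in enumerate(dedupe_and_paginate(links, per_page=2), 1):
    print(page_number, page)
```

Each inner list would become one sitemap page, so the same image listed under three categories appears only once.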

Mr_Diggz

7:37 pm on Jun 15, 2004 (gmt 0)

10+ Year Member



Although it may at first seem to be the reverse, it's actually not. With the link you gave, you could write a rule that could convert the url to:
[widgets.com...]

You would have to change all the links on your site to use the new static format of the URL. Each time a page is requested at the new URL, the server converts it internally back to the original dynamic URL you gave.

You might also want to make the URLs friendlier, since those extra directories would help a lot more if they were keywords rather than long strings of numbers and letters. It could get pretty complicated since you have such a long query string.

I would put my money on the very long urls, but that's just me.
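
For updating the links and the sitemap, the conversion has to run in the other direction: from each existing dynamic URL to its new static-looking form. Here is a rough Python sketch of that step. The `/gallery` prefix and the example parameters are hypothetical, and the rewrite rule on the server would need to expect the values in the same order.

```python
from urllib.parse import urlsplit, parse_qsl, quote


def to_static_path(dynamic_url: str, prefix: str = "/gallery") -> str:
    """Turn the query-string values of a dynamic URL into path segments.

    Values are kept in the order they appear in the query string and are
    percent-encoded so that each one stays a single path segment.
    The prefix is a hypothetical directory name.
    """
    query = urlsplit(dynamic_url).query
    values = [quote(value, safe="") for _, value in parse_qsl(query)]
    return "/".join([prefix, *values])


print(to_static_path("http://example.com/cgi-bin/view?cat=landscapes&id=42"))
```

Running every link from the existing sitemap through a function like this would produce the list of new URLs to publish.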

Mr_Diggz

7:41 pm on Jun 15, 2004 (gmt 0)

10+ Year Member




Quick question: if I make a site map of
[widget.com...]

links like that, will Google index the pages?

With a short querystring like that, I wouldn't worry too much. I've never had any problems with getting pages like that indexed. That's just my opinion though.