Google News Archive Forum

    
Mod-Rewrite or PHP Code causing PR0?
Site added in Oct 2003, and has shown nothing but PR0.
boyscout




msg:93634
 8:45 am on Jan 28, 2004 (gmt 0)

Hi Guys,

I've been more of a reader here than a poster, but I must say, this forum is a great place for learning, and I've learnt a lot, so I'd just like to start with a thanks to everyone here!

OK, my company and I launched a site last year, in Oct 2003: a site index that contains a bunch of resources (over 5,000). Using .htaccess mod_rewrite rules, we've managed to make all the URLs static-looking.

However, after more than 3 months there are no backlinks in Google and very minimal SE traffic (I'm talking 10-12 visitors in total from Google). The home page and one sublevel show PR0, while deeper pages show a grey bar.

The site is linked to from PR7 and PR6 sites and has been since its launch. It has over 600 backlinks on AlltheWeb and is listed in 3 different categories in DMOZ.

The server is dedicated, and we recently launched a one-page site that got a PR6 within a week (launched mid-December).

The site has AdSense, and the AdSense spider has no problem crawling. All the ads are relevant to the page/category they are on. I've also seen crawler4, crawler5, etc. making over 300 visits in a month.

Something's obviously wrong, but I can't for the life of me figure out what. No spam, no over-optimising at all. As a matter of fact, the entire optimization has been 'natural'.

I'm guessing that it's got something to do with the code or mod_rewrite, because another site, launched about a month ago on the same backend, seems to be in the same state: it's listed in the Google index, but PR0. Two of our other sites, launched on the same server but using a different backend, have been PR'd fine, although they really show up nowhere in the SERPs.

I've been racking my brains and I'm going insane... Google support replies with the same canned responses...

Any ideas anyone? TIA.

 

ska_demon




msg:93635
 4:12 pm on Jan 28, 2004 (gmt 0)

Hmm, I've noticed a few sites that use mod_rewrite have dropped out of the SERPs or have PR0 since the last update.
Ska_Demon

boyscout




msg:93636
 10:44 am on Jan 29, 2004 (gmt 0)

Ska_Demon, can you PM me any sample sites?

It would be good to compare, and see if something is actually up with the mod_rewrite rules...

progex




msg:93637
 12:40 pm on Jan 29, 2004 (gmt 0)

I believe that Sitepoint.com uses Mod_Rewrite on its URLs, but still maintains the PR8 it had before. I don't know about its SERPS though.

pavlin




msg:93638
 1:35 pm on Jan 29, 2004 (gmt 0)

Hm
I started using mod_rewrite on one of my sites two months ago, and since then my PR has increased from 5 to 6.
I do not think mod_rewrite has anything to do with the PR.

Birdman




msg:93639
 1:49 pm on Jan 29, 2004 (gmt 0)

I think a peek at the mod_rewrite code will help. Also, have you checked to see what headers are being returned for various pages?

Mod_rewrite could very well be the problem, if done incorrectly.

boyscout




msg:93640
 3:10 am on Feb 12, 2004 (gmt 0)

Hey guys, sorry for the late reply, I have been away. Here is the mod_rewrite code:


RewriteEngine On
RewriteBase /

#
# Rewrite rules
#
RewriteCond %{REQUEST_URI} !^/[^/]+/[^/]+/[^/]+$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.+[^/])$ $1/

# subcategory
RewriteCond $3 (^$)|(^index\.html$)
RewriteRule ^([^/]+)/([^/]+)/([^/]*)$ /subcategory.php?Cat=$1&Subcat=$2 [L]

#category
RewriteCond $2 (^$)|(^index\.html$)
RewriteRule ^([^/]+)/([^/]*)$ /category.php?Cat=$1 [L]

# subcategory paged view (must be ABOVE the script rule)
RewriteRule ^([^/]+)/([^/]+)/page([0-9]+)([a-z])([ad])\.html$ /subcategory.php?Cat=$1&Subcat=$2&Page=$3&SortBy=$4&Order=$5 [L]

# script
RewriteRule ^([^/]+)/([^/]+)/(.+)\.html$ /listing.php?Cat=$1&Subcat=$2&Listing=$3 [L]

#RewriteCond %{REQUEST_FILENAME} !-f
#RewriteCond %{REQUEST_FILENAME} !-d
#RewriteRule ^(.+[^/])$ $1/ [N]

php_flag session.use_only_cookies on
php_flag session.use_trans_sid off

Any more ideas?

whoisgregg




msg:93641
 7:56 pm on Feb 13, 2004 (gmt 0)

Maybe the domain was previously banned?

[webmasterworld.com...]

enotalone




msg:93642
 5:34 pm on Feb 14, 2004 (gmt 0)

Boyscout, I have been using mod_rewrite for more than 1.5 years now and all pages rank very well with Google. My experience has been that Google does not like complex mod_rewrite URLs, however. I tried to convert my shopping pages featuring Amazon products to use mod_rewrite too, and since Amazon uses complex arguments in its URLs, my static URLs turned out to be complex too. Google never really indexed many of those pages, despite the fact that I waited 4-5 months.

My directory and article pages, however, do use mod_rewrite with simple static URLs and rank wonderfully with Google, often being number 1 or on the first page for dozens of very competitive words/phrases in my industry.

JeremyL




msg:93643
 11:45 pm on Feb 14, 2004 (gmt 0)

Googlebot just pulls a webpage. There is really no way I know of for it to be able to tell that you are using mod_rewrite.

boyscout




msg:93644
 4:56 am on Feb 16, 2004 (gmt 0)

Hmm, I don't think it was previously banned; there's nothing there to ban it for.

I agree that Google shouldn't care about mod_rewrite and should just fetch each URL as a page, but the only thing I can see in common between the two sites that are PR0, and have been so for over 4 months, is the fact that they both use the same backend CMS with mod_rewrite.

We've previously launched new sites on the same server (i.e. same IP) and they've done just fine in terms of PR and SERPs.

This is driving me nuts!

jdMorgan




msg:93645
 6:21 am on Feb 16, 2004 (gmt 0)

boyscout,

I would be a bit concerned about the last bit of code where it's looking for file-not-found and directory-not-found (!-f & !-d), and then appending a trailing slash and rewriting anyway. I see it's commented-out now, but it may previously have done some damage.

If your site is designed such that it is impossible or almost impossible for a robot to get a 404-Not Found, they will be leery of indexing very deeply in your site. You'll find a lot of threads here asking, "Why is Ask/Google/Ink/whoever requesting this funny-looking URL from my server? - It does not exist and there's no such link!" The answer is that the server is being tested for its 404 response.

Any site where it's impossible to get a 404 is considered as a potential trap by spiders - They are "afraid" they won't be able to get out again, and so don't go deep. For the same reason, depth is limited on sites with complex query strings or "very deep" static URLs (which they can surmise to be query-string aliases). Because it is potentially-impossible to get to the "end" of such sites, an arbitrary link crawling count must be set. Therefore, page rank can suffer somewhat if you depend on deeper pages "feeding back" PR to other pages.

I may have misinterpreted your code, but I suggest reviewing that last part specifically.

None of your rules invoke external redirects, so your mod_rewrite code is invisible to search engines. Others in this thread may wish to check their servers for unexpected responses (such as unexpected publicly-visible 301 or 302 redirects) using the Server Headers checker [webmasterworld.com]. Best results will usually be had if your site always returns an appropriate server response code [w3.org], and you don't use any "tricks" such as using 404's to create script calls.

Jim
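
To illustrate the point about external redirects, here is a minimal sketch, assuming an .htaccess file in the document root. The /category.php target comes from the rules posted earlier in the thread; old-page.html and new-page.html are made-up names used only for contrast.

```apache
RewriteEngine On

# Internal rewrite (no R flag): the client asks for /widgets/ and gets the
# output of /category.php under that same URL. No Location header is sent,
# so a crawler cannot tell that a rewrite took place.
RewriteRule ^([^/]+)/$ /category.php?Cat=$1 [L]

# External redirect (R flag): the server answers with a 301 status and a
# Location header pointing at /new-page.html. This is visible to any client
# that inspects the response headers. Both file names are hypothetical.
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
```

Only the second kind of rule would show up in a server headers check. The rules posted in this thread are all of the first kind.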

boyscout




msg:93646
 7:52 am on Feb 16, 2004 (gmt 0)

Jim,

Thanks a lot for that! Although yes, it was commented out, the CMS was built to send its own error message for any not-found document without sending a 404 header, so I'm assuming this did confuse / 'scare' the bots. It's fixed now...

This also seems to be the case, as allinurl:domain.com returns only around 300 or so results, whereas the site itself should have about 5,000 results.

So I guess all I can really do now is sit tight and wait, right?

Thanks again!

somerset




msg:93647
 8:04 am on Feb 16, 2004 (gmt 0)

Bear in mind that although the page URLs are now rewritten, the old query-string URLs still exist (i.e. it may be possible to get the same content displayed by using the pre-mod_rewrite URLs). Can you get to those pages using a query-string address?

So, if Google was aware of the old addresses and now the new ones, you may have a whole set of duplicate pages - IF - somewhere in your navigation system you have any links at all to the old URL system (before applying mod_rewrite).

It may be worth rechecking your navigation system to ensure there are no links pointing to querystring urls.
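
If query-string URLs do turn out to be reachable, one common way to collapse the duplicates is to redirect direct requests for the script URL to its static form. The following is only a sketch under assumptions: it reuses the /category.php?Cat=... target from the rules posted earlier, assumes the category value needs no re-encoding, and would have to sit above the internal rewrite rules in the same .htaccess file.

```apache
RewriteEngine On

# THE_REQUEST holds the request line exactly as the client sent it, e.g.
#   GET /category.php?Cat=widgets HTTP/1.1
# Internal rewrites do not change it, so this condition matches only when a
# client asked for the script URL directly, which avoids a redirect loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /category\.php\?Cat=([^&\ ]+)\ HTTP/
# %1 is the category captured by the RewriteCond above. The trailing "?"
# drops the original query string from the redirect target.
RewriteRule ^category\.php$ /%1/? [R=301,L]
```

With this in place, a direct request for /category.php?Cat=widgets would get a 301 to /widgets/, which the existing category rule then rewrites internally. Similar rules would be needed for the subcategory and listing scripts.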

alexswalker




msg:93648
 8:31 am on Feb 16, 2004 (gmt 0)

Jim and Somerset - your comments may have really helped me out. I couldn't understand why Googlebot was spidering this site so little.

I used the server header checker tool and noticed that a 404 error was not returned when I typed in a file that did not exist. Instead the browser is redirected to a custom error page using the feature in Plesk 5 called "Custom Apache Error Docs" - this returns a 200 status code.

I have now unchecked this box in Plesk and proper 404 messages are shown - I hope that helps things!

Alex
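
For anyone who still wants a custom error page alongside a real 404 status, here is a minimal sketch in plain Apache terms. It makes no claim about what the Plesk option does internally, and the /errors/notfound.html path and the www.example.com host are hypothetical.

```apache
# Local URL-path: Apache serves the document itself and the response keeps
# its 404 status, so crawlers still see "Not Found".
ErrorDocument 404 /errors/notfound.html

# Full URL (shown commented out, best avoided for 404s): Apache sends the
# client a redirect to that URL instead, so the original 404 status never
# reaches the client.
# ErrorDocument 404 http://www.example.com/errors/notfound.html
```

Requesting a made-up file name through a server headers checker afterwards should confirm which status code is actually returned.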

boyscout




msg:93649
 9:19 am on Feb 16, 2004 (gmt 0)

Thanks somerset, but we never launched the site without the mod_rewrite rules. It's funny: we designed and developed the site around Google, and by doing so, managed to screw ourselves out of Google...

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved