It ranks highly for many keywords, with and without the country name, and has many PR5 subpages (PR6 root page).
It was recently pointed out to me that the site should do much better, though, and I started to investigate. I found that an alternate domain (that was simply pointing to the same IP) had more pages listed on Google (260000 compared to 230000 for the main domain) and was PR0, obviously penalised. Furthermore, many internal pages could be accessed through a variety of paths, giving them several different URLs.
I have now gone through a big resorting process, unifying the diverse urls, i.e. all internal links to a page use the same url, while incoming links still work as usual.
Furthermore, the site uses a full 3-part frameset throughout, which can't make things better. Incoming links to internal pages are fixed via a javascript command that rebuilds the frame around the page, for those click-throughs from Google.
I'm not really too concerned about the ranking, as we already have more traffic than we need (we will need to upgrade hardware soon, AGAIN), but I would really love to see how well the site could and should do.
So I am wondering if penalisations are all or nothing, just set to PR0 and be done with it, or if there are many degrees of partial penalisation for a variety of more or less severe violations.
For example Google counts around 240 incoming links from outside sources. I've seen sites where pages within the domain counted as incoming links to the homepage. Since all my pages link back to the root url that should make my inbound links over 200000. What can I do to get these inbound links counted?
This site has been established for 4 years now and receives a large amount of traffic, around 40% of it through Google.
Thanks everybody for getting through my lengthy post.
PS: This website obviously went through a long process of evolution, and many things are planned but not easy to implement, such as moving to a frameless layout.
S. N.
killroy, that right there gets me dizzy! I have the hardest time figuring things out with pages that do that, it's so hard to see what's happening. :)
It doesn't sound like a 100% sure thing that there's a penalty; it may just be possible that there's a duplicate content situation and that some are being discounted. Not getting credit for the same pages twice isn't necessarily the same as being penalized. It does sound, however, like you're losing a lot of benefit from those inbound links.
If that second domain is just "pointing" to the IP, is it turning up a 301, 302 or 200? It's not my area of expertise, but it sounds like there may be somewhat of a solution for you using mod_rewrite.
You're right about the second domain: it was returning a 200, and I've now used mod_rewrite to set up a proper 301. I wasn't aware of this situation posing a problem previously, and I must have dozens of domains showing up 200 that I don't even remember. I used to just set up all the domains I wasn't using for anything on that server...
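For readers who want to see the decision that such a redirect rule makes, here is a minimal sketch in JavaScript. It is not the mod_rewrite configuration used on the site (the thread does not show it), and the host name `www.example.com` and the function name `canonicalRedirect` are assumptions made up for illustration. The idea is simply that any host other than the main one answers with a 301 to the same path on the main domain, instead of a 200.

```javascript
// Hypothetical canonical host; the thread never names the real domains.
const CANONICAL_HOST = "www.example.com";

// Decide how to answer a request that arrived under any host name.
// Returns a 301 description for alternate hosts, or null when the
// request is already on the canonical host and can be served normally.
function canonicalRedirect(host, pathAndQuery) {
  // Host names are case-insensitive; also drop an explicit port if present.
  const normalizedHost = host.toLowerCase().replace(/:\d+$/, "");
  if (normalizedHost === CANONICAL_HOST) {
    return null; // no redirect needed, serve the page (200)
  }
  return {
    status: 301, // permanent redirect
    location: "http://" + CANONICAL_HOST + pathAndQuery,
  };
}
```

Keeping the path and query string intact means every old address on the alternate domain lands on its exact counterpart on the main domain, rather than on the homepage.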
To clarify my frames situation:
frameset
- frame A
- frame B
- frame C
Now frame C is the interesting one: it is the one that Google spiders and where it finds the tons of pages. So all links from Google results are frame C results.
At the top of each frame C page is the following javascript:
if(top.location==document.location)top.location=SCRIPT
where SCRIPT is something like: framebuilder.ext?page=frameCaddress
which then creates a frameset with the three frames properly in place.
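Spelled out a little more fully, the check above might look like the sketch below. The script name `framebuilder.ext` is the placeholder from the post, not a real file name, and the function names `frameBuilderUrl` and `restoreFrameset` are hypothetical. Two things differ from the one-liner and are assumptions of this sketch: the page address is URL-encoded before it goes into the query string, and `location.replace` is used instead of a plain assignment so the bare page does not stay in the browser history.

```javascript
// Placeholder script name taken from the post; the real one is not given.
const FRAME_BUILDER = "framebuilder.ext";

// Build the address that wraps a content page in the full frameset.
// The page address is encoded so its own "?" and "&" survive the trip.
function frameBuilderUrl(contentPageUrl) {
  return FRAME_BUILDER + "?page=" + encodeURIComponent(contentPageUrl);
}

// Meant to run inside a frame C page. `win` is that page's window object.
// If the page was loaded on its own (it is its own top window), swap it
// for the frameset and return true; otherwise do nothing and return false.
function restoreFrameset(win) {
  if (win.top === win.self) {
    win.top.location.replace(frameBuilderUrl(win.location.href));
    return true;
  }
  return false;
}
```

In a content page this would be called as `restoreFrameset(window)` near the top of the document. Visitors without JavaScript, and spiders, simply see the bare content page.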
In fact arriving at the site from google you don't even notice anything is amiss it goes so quickly, you end up getting the info you clicked to from google but in the proper frameset. AND the page rank is where I want it, with the indexed content page.
In fact Google spiders this site very much as if it wasn't framed at all, which suits me just fine.
I think you are spot on here Marcia. Also I think too many people think in terms of being penalised, rather than just not 'getting plus points'. I believe google rarely penalises sites, but instead ignores code which is either obvious spam or is not helpful to them. The effect is a lower ranking, rather than penalisation, which is a subtle difference.
Anyway,
"javascript command that rebuilds the frame around the page" IMHO very messy and probably the biggest problem. Many people now surf with JavaScript turned off, and the spider could easily trip up on this arrangement.
"So I am wondering if penalisations are all or nothing"
Maybe they are, but you really have to do some very blatant spamming to get one, resulting in a PR penalty. Instead, the majority of 'penalties' are in fact just missed opportunities, resulting in your ranking suffering... e.g.
If you overuse H1 or Bold, the benefits are reduced, or if they are very overused, Google probably starts to ignore them altogether. Therefore your targeted keywords will start to lose the benefit of being placed within these tags.... No penalty, just lost opportunity.
I've put out a call for one of our resident mod_rewrite gurus; meantime you might want to research setting up more spider-friendly navigation, if that's your inclination, though that would be a task if there are that many inbound links. If the site works well for you otherwise, I'd take a very conservative approach to making any major changes.
Here's a good basic tutorial, written up by DaveAtIFG, our Technology Moderator, that's very clear and easy to understand:
An Introduction to Redirecting URLs on an Apache Server [webmasterworld.com].
It does sound like no matter what, the pointing and redirection situation needs to be taken care of - that might just do the trick for you, for what you're needing at this time.
MHes, it's entirely plausible, and once the URL situation is cleared up, it'll be easier to see what's happening. There may be some additional optimization needed for the site, but there would be no way to tell until Google's presented with the correct picture of where everything is located, which will probably take going through an update or two.
It's never a bad idea to dig in to see if there can be some improvements, but it's probably the wisest course to get the page locations squared away first. If that's what the problem is, that's the approach I'd take - conservative, first things first. The PR will improve once the URL, link and navigation issues are straightened out, so it'll be easier to get an accurate picture of rankings at that time. There could be dramatic improvement from that alone.
I'm well aware that the site needs a complete rework (a replacement is in the making), but that is expensive and time consuming. Remember the site has around 700000 indexed pages. Furthermore it is a directory, so there aren't actually any specific keywords I am targeting. I just noticed that we rank top for most of the business and category names in the directory, often covering most of the first 10-20 results.
Regarding the JS, let me try to clear it up. The center page is what is interesting. This page links to other pages, business listings to categories and vice versa, and Google spiders that. In fact you could get around most content (probably all of it) without any frames around it. The JS just recreates the frames for human visitors so they get the logo and a JS-compressed category overview on the side. So Google spiders the content pages and gives them PR. People click through to them from Google results and the JS gives them our "branding", basically.
If you want I can sticky you the url, it'll make it clear.
Ultimately the frames will come off, but I have to be very CAREFUL (conservative) not to destroy what I've got already. Maintaining all the old addresses will be hard, and doing a mod_rewrite for 70000+ urls will be even harder... I'll have to create a complex database just for that purpose!
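As a rough illustration of what such a database boils down to, here is a sketch of a lookup from old addresses to new ones. Everything in it is hypothetical: the sample paths, the `legacyToNew` table and the `resolveLegacyUrl` function are invented for the example, and a real table would be loaded from wherever the mappings are stored rather than written inline.

```javascript
// Hypothetical sample rows; in practice there would be one row per
// legacy address, loaded from the database mentioned in the post.
const legacyToNew = new Map([
  ["/old/cat12/biz345.htm", "/business/345"],
  ["/old/cat12/index.htm", "/category/12"],
]);

// Look up a requested path. A known legacy address gets a permanent
// redirect to its new home; anything else returns null so the caller
// can handle it as a normal request.
function resolveLegacyUrl(path) {
  const target = legacyToNew.get(path);
  if (target === undefined) {
    return null;
  }
  return { status: 301, location: target };
}
```

A table lookup like this keeps the redirect logic to a single rule no matter how many addresses are mapped, which is usually easier to maintain than one rewrite rule per URL.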
Also the website is still hosted on a Netscape Enterprise Server. All other sites I run off my server are on Apache. Just this (VERY LEGACY) site hasn't been moved yet, simply because it's so big and cumbersome. It's over 1.5GB of files, most of it old and bad and crap... but it's a LOT of rubbish to clean up without breaking the whole thing.
It started out as a weekend job and's grown all out of proportion ;)
I WILL fix it... eventually.