With the use of LSI, Google may ignore overused keywords. As an example, let's say your main keywords are k1 k2 k3 k4. If you overused k2 and k3 (high keyword density), your site will do well for k1 k2 k4, k1 k3 k4, k1 k2, k1 k3, k3 k4, etc., but not for k1 k2 k3, k2 k3 k4, k2 k3, etc.
This is not a permanent penalty. It is similar to ignoring words like “a” and “the” in a search because they're so common. To put it another way, Google's LSI technology ignores the words which are very common in a document to improve its search speed. If you have this type of penalty, simply lower the keyword density for the ignored words across the whole site, not just on the index page.
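A minimal sketch of the pattern claimed above, assuming a query is only affected when it contains every overused keyword. The function name `is_suppressed` and the subset rule are illustrative assumptions drawn from the k1..k4 example, not a description of how Google actually works.

```python
def is_suppressed(query, overused):
    """Return True if the query contains every overused keyword.

    Hypothetical rule inferred from the k1..k4 example in the post.
    """
    terms = set(query.lower().split())
    # An empty overused set should never suppress anything.
    return bool(overused) and set(overused) <= terms


overused = {"k2", "k3"}
for q in ["k1 k2 k4", "k1 k3", "k3 k4", "k1 k2 k3", "k2 k3 k4", "k2 k3"]:
    status = "suppressed" if is_suppressed(q, overused) else "unaffected"
    print(q, "->", status)
```

Under this assumption, queries holding only one of the two overused words stay unaffected, which matches the lists given in the post.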
Sagara Kelaniya
To put it another way, Google's LSI technology ignores the words which are very common in a document to improve its search speed
That's not my understanding of Latent Semantic Indexing (I take it that's what you were referring to with 'LSI'?). I read it to be an overall impression of language as a whole:
e.g. if the search is 'breakfast', a site that talks about 'coffee', 'eggs', 'bacon' and 'cereal' will be classed as more relevant than a site about 'cereal' alone.
I feel that the situation you are expressing leans more toward Stemming. If a page states the same keyword over and over with no variations of that word then it is considered 'overoptimized' or 'un-natural' and therefore does not receive a higher ranking.
True
~~~That's not my understanding of Latent Semantic Indexing~~~
False: removing conjunctions, function words and ubiquitous words (words that are present everywhere in a document) is part of Latent Semantic Indexing (LSI).
~~~If a page states the same keyword over and over with no variations of that word then it is considered 'overoptimized' or 'un-natural' and therefore does not receive a higher ranking.~~~
If a document repeats the same words over and over again, even with variations (in all their inflected forms), those words will be excluded when indexing the document. Unlike standard stop lists (lists of words to ignore), LSI stop lists differ from document to document, depending very much on the nature of the document itself. These stop lists improve accuracy and lessen the amount of computing power needed to run the indexing algorithm.
One of my sites had this kind of penalty. It was penalized for two of its main keywords, let's say red widgets. Before the penalty it was in the top 10 (mostly top 5) for all of its main and sub keywords. After the penalty it was nowhere in the SERPs for red widgets, free red widgets, wholesale red widgets, etc. But it was still doing well for free widgets, wholesale widgets, black widgets, etc.
Site details: 200+ pages, PR5, keyword density 20% - 26% (max)
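To make the idea of a per-document stop list concrete, here is a rough sketch that flags any word whose share of a document exceeds a threshold. The function name `document_stop_list` and the 10% cutoff are assumptions chosen for illustration, and the sketch does no stemming, so inflected forms are counted separately.

```python
import re
from collections import Counter


def document_stop_list(text, max_share=0.10):
    """Return the words whose share of the document exceeds max_share.

    max_share is a hypothetical threshold, not a known search engine value.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return set()
    counts = Counter(words)
    total = len(words)
    return {word for word, count in counts.items() if count / total > max_share}


page = "red widgets red widgets red widgets cheap blue green fast"
print(sorted(document_stop_list(page)))
```

Because the list is derived from each document's own word counts, two pages on different topics would end up with different stop lists, which is the behaviour the post describes.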
What I did:
Lowered the keyword density of some of the high-ranked pages, including the index, to 1.25% - 2.75% (max)
Result - G cached the pages but nothing happened
After waiting a few weeks I again lowered the KWD of more than 70% of the pages to 2.75% (max) and the inbound links' KWD from 90% - 100% to 50%
Result - G quickly cached the pages and re-indexed my site in no time. I was quickly listed in the top ten for all of my penalized keywords.
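For anyone who wants to check figures like these, keyword density for a single word is usually taken as occurrences divided by total words. The sketch below assumes that definition and a simple regex tokenizer; `keyword_density` is an illustrative helper, not the formula used by any particular KWD tool.

```python
import re


def keyword_density(text, keyword):
    """Return the percentage of words in text that equal keyword."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return 100.0 * hits / len(words)


sample = "red widgets and blue widgets for sale today"
print(keyword_density(sample, "widgets"))  # 2 of 8 words -> 25.0
```

Different tools count titles, meta tags and navigation differently, so their numbers will not necessarily agree with this one.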
Now, what about your own theories?
I have a page title something like this:
Full Blown Widget Thingy Knocking.
I have been penalised for the complete keyword description, even when in conjunction with Town/City.
Pages surface when you search on Widget Thingy Knocking. Widget Knocking is the popular search phrase and I don't appear here.
Is this similar to your experience?
What do you recommend?
Regards
Midhurst
I have found that the more competitive the keyword is in Google, the more important a role backlinks, and more specifically ANCHOR TEXT, play in determining the results.
Do a search in Google for allinanchor:www.yourdomain.com and find your position. Then do the same search for any of the top ten competitors, for that search.
I am fairly certain you will find that is where you need to focus your attention. Not on on-page factors.
On-page factors play a role once many other factors are tallied.
Note: pre-Florida, you could place very well simply with on-page optimization. These days, that will only work with keywords with not much competition.
Caryl
If you have more than 20% keyword density you probably have to say goodbye to your Google SERPs. Keep it under 6% (You may keep it under 3% if you are an average SEO). When lowering keyword density, try to insert words semantically close to the keywords you are deleting, instead of just removing them. As an example, if your main keyword is web cam, try to use words like camera, live, video, etc. instead of repeating web cam over and over again. Search ~keyword -keyword to find similar words in Google. Use Brett’s KWD tool [searchengineworld.com] to find the overall KWD of your site and Gorank for specific keywords.
Sorry, that's Quatsch (= cheese). I have pages with a density of up to 30% sitting on #1's since ever. There are more factors than keyword density.
>You may keep it under 3% if you are an average SEO
What? Huh?
~~~I have found that, the more competitive the keyword is in Google, the more important role Back links and more specifically, ANCHOR TEXT, determine the results.~~~
This statement used to be true. Keyword-rich inbound links are very important, but they are not king anymore, not after Florida.
~~~Pre-Florida, you could place very well simply with on-page optimization. These days, that will only work with keywords with not much competition.~~~
Pre-Florida it was (Basic SEO skills + High PR inbound links with keyword text) to achieve high SERPs in Google.
Post-Frieda it is (excellent SEO skills + keyword inbound links from high ranking sites + semantics of the page/site)
Here are the numbers for the top 10 results for <snip>
First#= Total Pages in main site
Second# = Total backlinks to page in results
Third# = page's position in "allinanchor:" search<snip>
I have done countless searches for highly competitive keywords and found the correlation too great to ignore.
Caryl
[edited by: Brett_Tabke at 2:36 pm (utc) on April 8, 2004]
[edit reason] We do NOT do specific searches on WebmasterWorld - please reread the charter and tos [/edit]
I strongly disagree, but yes, there are a lot more important factors when optimizing for Google.
~~~I agree, I've had pages at the top for months with 20-30% keyword density~~~
<snip>
[edited by: Marcia at 9:55 am (utc) on April 8, 2004]
[edit reason] No "site reviews" per TOS, including by stickymail. [/edit]
new_BEE, it doesn't matter if you disagree, as long as you don't say I'm not telling the truth. But you should at least flag your theories as "your opinion", not imply that they are facts.
>Pre-Florida it was (Basic SEO skills + High PR inbound links with keyword text)
>Post-Frieda it is (excellent SEO skills + keyword inbound links from high ranking sites + semantics of the page/site)
Even my mum can bring a page to #1. Before and after Friday / Frieda / Florida. What was the query?
>at a key word density of 20 - 30% aren't your pages spam central
djtaverner, you asked tigger, but since I mentioned the same, I can tell you: my pages are far from spam central and read very, very well.
Btw ...
... this page has a KWD of 20% for its main keyword: [searchengineworld.com...]
... this one has a KWD of 38.46%: [apple.com...]
... this one has a KWD of 100% (!): [google.com...]
... this one KWD 22.68%: [google.com...]
... this one KWD 17.86%: [google.com...]
... KWD 22.22%: [dmoz.org...]
... to be continued.
... you think they read spammy?
So, it is more likely IMO that the reason you are not doing well, assuming you do well in Yahoo or MSN search, is because you are not listed in an 'expert' directory such as DMOZ or Yahoo and therefore you are not stroking the nutty Google 'Hilltop' algo.
In regards to tweaking for Google recently... Playing around with on-page factors while Google is 'going nuts' messing and tweaking around on their end, plus any directory listing you may get or lose, makes it extremely questionable to assume anything about your placement in regards to Google and your tweak.
1=Rank, 2=site name, 3=keyword, 4=density
1 www.altavista.com/ - search engine - 0%
2 www.lycos.com/ - search engine – 0%
3 www.searchenginewatch.com/ - search engine – 6.54%
4 www.dogpile.com/ - search engine – 0%
5 www.excite.com/ - search engine – 0%
6 www.google.com/ - search engine – 0%
7…10,,,,
Google top 10 Results for the keyword Google:
1=Rank, 2=site name, 3=keyword, 4=density
1. www.google.com – google - 12.90%
2. www.google.com/addurl.html - google - 4.24%
3. www.news.google.com/ - google - 0.41%
4. ….,,,10
<snip>
[edited by: Marcia at 10:00 am (utc) on April 8, 2004]
[edit reason] No identifiable industry search terms, please. [/edit]
<snip>
~~~Even my mum can bring a page to #1. Before and after Friday / Frieda / Florida. What was the query?~~~
I’m talking about general keywords with 200+ monthly searches (Overture)
[edited by: Marcia at 10:07 am (utc) on April 8, 2004]
[edit reason] See previous notation re looking at sites. [/edit]
Well, it sure does look that way, doesn't it! It started with Florida, with the more highly competitive money keywords, and started to spread down the keyword chain to less and less competitive phrases.
But we do have to keep in mind that none of us is the WFA (World's Foremost Authority) and that our personal observations, particularly about our own sites or sites that we manage, are not the same as comprehensively researched data.
Regarding keyword density, I believe we can assume that there's a difference between the density of, let's say, a 3-word phrase, a 2-word phrase contained within it, and the density of the individual words contained within the phrase. There's also a difference in the weighting factor between the percentage represented by KWD and the number of occurrences relative to the total number of words on a given page, with and without the global navigation and with and without the page title and meta tags.
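The difference between phrase density and single-word density can be seen with a small sketch. It assumes one common convention, where phrase density is occurrences times phrase length divided by total words; the helper names `tokenize`, `phrase_count` and `phrase_density` are made up for illustration, and the sample text reuses the widget phrases from earlier in the thread.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_count(words, phrase):
    """Count exact, in-order occurrences of phrase in a token list."""
    target = tokenize(phrase)
    n = len(target)
    if n == 0:
        return 0
    return sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)


def phrase_density(text, phrase):
    """Percentage of the text's words taken up by the phrase (assumed convention)."""
    words = tokenize(text)
    if not words:
        return 0.0
    n = len(tokenize(phrase))
    return 100.0 * phrase_count(words, phrase) * n / len(words)


page = "free red widgets and wholesale red widgets plus black widgets"
for phrase in ["free red widgets", "red widgets", "widgets"]:
    print(phrase, "->", phrase_density(page, phrase))
```

On this ten-word sample the 3-word phrase, the 2-word phrase inside it and the single word each get a different count, so a single "KWD" number hides quite a lot. Running the same functions with and without navigation, title and meta text would show the other difference mentioned above.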
There are over 100 factors taken into consideration with scoring, but if anyone has a high interest level in and enjoys studying about keyword density, it's sure worth the time and effort to check into "inverse document frequency." We're not there yet, but it seems to offer some concepts helpful for avoiding over-optimization.
I think we'd be hard put to explain or describe exactly what does or doesn't constitute over-optimization, or to come up with substantive proof of penalties strictly based on how lucrative certain keyword phrases are - in spite of the fact that all indications seem to point to it.
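For those who want a starting point, here is a small sketch of the textbook inverse document frequency, log(N / df), where N is the number of documents and df is how many of them contain the term. The function name and the tiny corpus are illustrative assumptions, and nothing here says how any search engine actually weights terms.

```python
import math


def inverse_document_frequency(term, documents):
    """Return log(N / df) for term over a list of document strings.

    Returns 0.0 when no document contains the term (a choice made for this sketch).
    """
    term = term.lower()
    containing = sum(1 for doc in documents if term in doc.lower().split())
    if containing == 0:
        return 0.0
    return math.log(len(documents) / containing)


corpus = [
    "red widgets for sale",
    "blue widgets for sale",
    "green gadgets for sale",
    "wholesale widgets for sale",
]
for term in ["sale", "widgets", "gadgets"]:
    print(term, "->", round(inverse_document_frequency(term, corpus), 3))
```

A word that appears in every document scores zero and a rare word scores highest, which is why a term repeated everywhere carries little weight however dense it is on one page.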
And isn't there a difference between getting hit with "penalties" and getting caught in "filters"?