| 2:09 pm on Dec 13, 2003 (gmt 0)|
I have repeated examples of some of my most competitive sites being filtered out for searches containing 3 words I have used in optimizing title tags.
Regardless of word order or proximity they are nowhere to be found unless I add 2 more words to the search, then they appear as they did before Florida.
I tend to agree with Daniel Brandt's analysis in The Great Google Filter Fiasco (sticky me and I'll send you the url) and think there is a deliberate attempt on Google's part to discourage optimization in commercial areas. I'll leave the other webmasters to speculate on why. The hit list was a real eye opener for me and supports the 'dictionary' theory.
If it IS a technical problem then there is hope that pre-Florida results may return. With all due respect, I hope so... but I just don't believe Google is so incompetent technically.
I think they are throwing out the baby with the bath water, as somebody once so eloquently said, in their attempt to "beat the spammers". If Google really wants to improve their SERPs then they could start by eliminating basic spamming, and not the few webmasters who diligently spend hours on designing AND optimizing good websites.
| 2:10 pm on Dec 13, 2003 (gmt 0)|
|Regarding the <title> tag. I also figured this could be the key to the 'penalty', and made it far less descriptive on one of my low income sites: |
I'm worrying a bit about this now. The reason is this: I have a site that I built as a bit of an experiment. It has static pages created by exporting XML from a database through an XSLT file. When I started working on the database solution to do this I was a bit new to XSLT, and I took the easy way out by incorporating a global head section into the final page. This means that I have a site in which every page has exactly the same title and description. Each page has exactly the same navigation bar and footer as well.
When I search for my target two word term the site/page comes in at #20 in SERPs and for the plural two word term it appears at #3. The index page only has a PageRank of 3.
For allinanchor: the plural search is still at #3, and the same search in the singular is at #20.
The term is widget club and the plural widget clubs.
I'm sorry to reopen this sore, but I should also say that my widget clubs site is still there after Florida.
There are so many variables that this could be just a coincidence, but on the face of it it does look like commercial terms like "insurance" are being treated in some way differently from non-commercial terms like "club" and "clubs".
My main site offers widget insurance and is just 2 numbers away in the IP range from the club site. Previous to Florida it was in SERPs at #3 - #1 for my main commercial search terms, and this secondary clubs site was somewhere in the top 100. Now the secondary site does not occur in SERPs at all for widget insurance, and for widget clubs my commercial site was around the 240 mark.
| 2:11 pm on Dec 13, 2003 (gmt 0)|
|As I see it, if an over-SEO filter catches one site in its net, then shouldn't all its duplicates go away as well? |
If offsite factors (such as anchor text on inbound links) are part of the filter, copying someone's page and pointing loads of SEO links to it from a bad domain would kill the original as well as the duplicate.
Right now, this would probably work. If so, Google are to be congratulated for handing a dirty great baseball bat to scumbag webmasters - NOT.
| 2:31 pm on Dec 13, 2003 (gmt 0)|
I tried to start a thread about the same time as this one, and failed...
For my site, kw1 kw2 kw3 produces no listing at all.
kw1 kw2 kw3 -waffle used to get me first place.
It doesn't work any more.
What DOES still work and get me where I'd expect to be is
kw1 kw2 kw3 +a
| 2:38 pm on Dec 13, 2003 (gmt 0)|
I posted something alluding to this - but snipped it because I changed my mind and sometimes I go off topic ;)
It struck me that to beat Google's usurping of the -nonsense anti-filter, you could try adding a common English term that is likely to appear in all sites.
'a' is a pretty common term - and it appears to work.
(I was using 'and' and 'with', but realised the results were inconsistent and <snipped> - I hadn't thought of 'a' - nice work!)
edit: Confirmed: I'm from nowhere to No 1 and No 2(?) with '+a'
| 2:53 pm on Dec 13, 2003 (gmt 0)|
kw1 kw2 kw3 +a doesn't work for me, I need to add a 2nd nonsense word like +is or something.
This was the same when we were using the -nonsense, so I suspect we will see this ironed out by Google as well.
Why is it I get the feeling Google doesn't want us to see the old Serps anymore?
Anyway, in my case the 3 kw phrase (and the order of the words does not matter) only appears when 2 more words are added into the search query. This all suggests percentages to me. When 3 words have been identified as "spam" by Google, then exact searches, or searches with some percentage (maybe 75% or more) containing the filthy low-down dirty evil scumbag scam spam words (as an English teacher I excel in adjectives), get filtered out.
| 2:58 pm on Dec 13, 2003 (gmt 0)|
|Why is it I get the feeling Google doesn't want us to see the old Serps anymore? |
A real mystery! But just to add to the confusion, in my case:
KW1 KW2 KW3 - gets me nowhere, even if I add UK to the end.
KW3 UK puts me No 1 in the World.
Worth noting by the way that the corrolary of
is 'nonsense filter'
feel free to correct my spelling of coro***y - I'm an ex-teacher, but physics I'm afraid, and we can't rite ;)
| 3:08 pm on Dec 13, 2003 (gmt 0)|
|KW1 KW2 KW3 - gets me nowhere, even if I add UK to the end. KW3 UK puts me No 1 in the World. |
try adding UK and another nonsense word (not to suggest anything about the United Kingdom).
I have a similar situation where KW1 KW2 KW3 gets me nowhere too, but KW1 KW3, or KW2 KW3 gets me back on top.
Perhaps the "nonsense" filter is actually a "non cents" filter, and the big G is getting back at all of us evil webmasters who stubbornly refused to buy AdWords and counted on optimization techniques.
| 4:50 pm on Dec 13, 2003 (gmt 0)|
Human editing may be a good idea, superscript. I personally think Google is mainly trying to eradicate affiliate-driven sites from the SERPs, seeing as they use recip links more than most, at a guess. The thing is, they miss the major offenders such as Kelkoo and Dealtime. Personally I'd just let it all in and try to manage it as best you can, rather than penalising for certain things such as recip links, because this affects not only affiliate sites but all sites to some degree.
I have seen Lycos UK product page results rank really well in the old -in datacenter from a couple of days ago. I know the exact scripting and datasets it uses because I use them. Thing is, I got penalised. Looks like the best idea is to create a portal / information site, try to get non-recip links, and just add a shopping section.
| 6:44 pm on Dec 13, 2003 (gmt 0)|
|What upsets me most is that people are now going to make (at least) two sites instead of just one |
Spica, please forgive me if I upset you but I need to put food on the table so...
This is EXACTLY what will happen.
Quite truthfully I have already considered the repercussions of developing a mirror site but making enough changes so it would not be considered "mirror" in Google's eyes, now ain't that a shame?
| 7:01 pm on Dec 13, 2003 (gmt 0)|
I agree Bobby. When webmasters get more and more desperate you may see some really bad spamming. I know plenty of ways of filling Google with spam and making a good quick buck, but up until now I have had too much respect for Google to do it. This may change if they start banning recip linking etc. and making it really hard for webmasters to compete with the big boys in an ethical way.
| 7:24 pm on Dec 13, 2003 (gmt 0)|
MikeD and Spica I appreciate your support.
I'm just frustrated like many others that my bread and butter site (and I mean that in the literal sense, at least in the mornings!) that I spent countless hours on in order to make it attractive to users as well as spiders has gotten the boot.
Because of the respect I do have for other webmasters and users in general I have not nor am I really going to resort to "mirroring" my site, but rather spend more hours and create another solid website to reach out to people from a different angle.
At this point the battle lines have been drawn, and I have already registered a domain in another name, placed it on another server with a totally different I.P. and plan on developing it in unison with the first, though very independent.
I am leaning towards simply developing themed networks and expanding to similarly related markets.
| 9:32 pm on Dec 13, 2003 (gmt 0)|
Previously all of these worked for me.
+the blue widget
-fufufu blue widget
blue widget -site:www.google.com
blue widget -site:www.**nichecomptitor**.com
Now only this one works:
+the blue widget
Allinanchor: brings me exactly the results I used to know and love. I don't see how they could switch this off other than by removing them as a service.
| 6:14 am on Dec 14, 2003 (gmt 0)|
I am finding that:
for keywords (not the real words)
the most important two pages are in the top ten, but the next ten most important pages in the world on that topic are nowhere to be found, and using
important technology -utewywreyt
those 3-12 ranked pages are now nowhere to be found either.
Last week they were all in the top 20.
important technology -nonsense
also does not bring them back.
On the other hand, I am now finding that keywords
drops the authority sites for that phrase to #5 and #6 but
nontechnical nonmoneyterm -ewquiryew
puts them at #2 and #3 right behind the most notorious superspammer in the world.
| 6:29 am on Dec 14, 2003 (gmt 0)|
returns exactly the results the top technical people in "important technology" expect to see in the top 20.
does not return the results that most people who have any reasonable knowledge of quality in "nontechnical nonmoneyterm" expect to see.
| 7:20 am on Dec 14, 2003 (gmt 0)|
|If you look at the source of the ones returned when you "repeat the search with the omitted results included." you will see that they have identical <title>text and body text. This example and others I've happened upon myself lead me to believe that text in the <title> tag is the most important from the point of view of duplicate filtering. |
I have worked with some of the Stanford code that Google was based on, and it performs a 64 bit cyclic redundancy check (crc64) to generate unique fingerprints for pages.
Some of the Stanford research papers [jamesthornton.com] on detecting duplicate content talk about generating several fingerprints for multiple sections of each page, and if enough of the fingerprints match fingerprints of another page, then the page is identified as a duplicate.
I don't know if Google is willing to expend all the resources required to generate and check multiple fingerprints for each page, but it is clear from claus' post that Google can detect exact dupes (i.e. a fingerprint of the entire page matches the fingerprint of other pages).
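As a rough illustration of the section-fingerprint idea described above, here is a minimal sketch. It is only an assumption of how such a check might look: a simple 64-bit FNV-1a hash stands in for the crc64 mentioned (Python's standard library has no crc64), and the section size and match threshold are invented parameters, not anything known about Google's implementation.

```python
# Sketch of duplicate detection via per-section fingerprints.
# FNV-1a here is a stand-in for the crc64 fingerprint mentioned in the post;
# section size (50 words) and the 0.8 threshold are invented for illustration.

def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a hash, standing in for a crc64 fingerprint."""
    h = 0xCBF29CE484222325
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
    return h

def fingerprints(text: str, section_words: int = 50) -> set[int]:
    """Fingerprint fixed-size word sections of a page."""
    words = text.split()
    return {
        fnv1a_64(" ".join(words[i:i + section_words]).encode())
        for i in range(0, len(words), section_words)
    }

def looks_duplicate(page_a: str, page_b: str, threshold: float = 0.8) -> bool:
    """Flag a pair as duplicates if enough section fingerprints match."""
    fa, fb = fingerprints(page_a), fingerprints(page_b)
    if not fa or not fb:
        return False
    return len(fa & fb) / min(len(fa), len(fb)) >= threshold
```

An exact dupe matches on every section fingerprint, which is the cheap case the post says Google clearly handles; the per-section variant catches near-duplicates at the cost of storing several fingerprints per page.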
| 11:00 am on Dec 14, 2003 (gmt 0)|
here is another way of getting pseudo pre-florida serps / "waffle"
example1: "query" +www OR +www
example2: "query" +com OR +net OR +org OR +www
you can place whatever you would expect most/all pages to have in their URL. Looks like the OR modifier still shows vulnerability to revealing what most would expect as the "normal serps".
| 3:06 pm on Dec 14, 2003 (gmt 0)|
|Some of the Stanford research papers on detecting duplicate content talk about generating several fingerprints for multiple sections of each page, and if enough of the fingerprints match fingerprints of another page, then the page is identified as a duplicate. |
Fingerprinting is a useful analogy. In the UK, I think the fingerprint test used in criminal law looks for 16 matches and can then state with very high probability that there is an exact match with the accused person. The fingerprint analyst looks for certain morphological features and marks these.
From what you are saying, Google could be using something similar, in that it looks for exact matches in some parts of the document; if it finds an exact match in enough of these, then there is a high probability of duplication.
This sounds like the principle of Bayesian probability spam filtering, which boils down to "if it looks like spam then it is spam"; this is the technology used in email spam filters. The difference here is the focus on removing duplicates.
I still wonder, though, whether for search terms where duplicate filtering is applied what happens next is a step change in the algo for that search. If you remove the allinanchor element of the calculation and up-weight the PR element, you would get the results that I am now seeing, assuming that dupe filtering has also taken place first.
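For what the Bayesian "if it looks like spam, it is spam" principle means in practice, here is a toy sketch of the token-probability combination used by email spam filters. The per-token probabilities are invented here; a real filter learns them from a labelled corpus, and nothing about this is claimed to match what Google does.

```python
# Toy naive-Bayes spam scoring, as used in email filters.
# The per-token spam probabilities below are invented for illustration;
# real filters estimate them from training data.
from math import prod

# P(spam | token), hypothetical values
SPAMMINESS = {"cheap": 0.95, "viagra": 0.99, "click": 0.85,
              "research": 0.10, "meeting": 0.05}

def spam_probability(tokens: list[str], default: float = 0.4) -> float:
    """Combine per-token spam probabilities with the naive Bayes formula."""
    ps = [SPAMMINESS.get(t.lower(), default) for t in tokens]
    spam = prod(ps)                      # likelihood under "spam"
    ham = prod(1 - p for p in ps)        # likelihood under "not spam"
    return spam / (spam + ham)

print(spam_probability(["cheap", "viagra", "click"]))   # close to 1.0
print(spam_probability(["research", "meeting"]))        # close to 0.0
```

The point of the analogy: no single signal condemns a message (or, speculatively, a page); it is the combined weight of many weak signals crossing a threshold.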
| 4:14 pm on Dec 14, 2003 (gmt 0)|
Many members of WW are probably aware of some duplicate domains. I am certainly aware of one set that still beats me in SERPS. If the removal of duplicate content were the key to explaining Florida results, I would expect whole domains to have been detected and removed.
Whilst duplicate content detection may have changed with Florida and may explain some SERP changes, I doubt that it is the dominant factor.
| 7:26 pm on Dec 14, 2003 (gmt 0)|
Marissa Mayer, the Director of Consumer Web Products at Google, said: "If you dropped in rankings, go back and look at who you linked to and who's linking to you. If any of these people are using spam techniques, they're the reason your site no longer appears on Google."
Never believe anything of this nature UNLESS it is a press release. The above is a statement, not a press release.
| 7:59 pm on Dec 14, 2003 (gmt 0)|
I'll be perfectly honest - I don't believe it. I hope the post wasn't made mischievously.
If true, it would suggest it might be possible to damage a competitor by linking to them or setting up links in their name.
This has always been denied by G - so your statement needs watertight justification.
But you're welcome to sticky me, or post whatever you can within the usual rules.
But on the face of it - you appear to be a troll!
| 8:14 pm on Dec 14, 2003 (gmt 0)|
Thank you for the nice comments (troll). This is a statement from Marissa Mayer, the Director of Consumer Web Products at Google, not from me. I'm simply saying I don't believe it either unless it's a press release!
| 8:21 pm on Dec 14, 2003 (gmt 0)|
|Many members of WW are probably aware of some duplicate domains. I am certainly aware of one set that still beats me in SERPS. If the removal of duplicate content were the key to explaining Florida results, I would expect whole domains to have been detected and removed. Whilst duplicate content detection may have changed with Florida and may explain some SERP changes, I doubt that it is the dominant factor. |
Before you rule out duplicates on the basis of what you have said, I would go and look at those pages a bit more carefully. It may be that they are not duplicates for a keyword term that is filtered, or that they have messed up some of the main tags in their HTML. If there is a duplicate filter in use which is contributing to the Florida effect (i.e. one that was not working prior to 19th November), then I'm sure it is only applied to certain search terms and not to others.
The issue of duplicate filtering has been around since way before Florida though.
A viable alternative explanation of what appears to be a filter may be simply that allinanchor (maybe plus other key factors) is not used in the algo for search terms that are "filtered". This does make logical sense, since if you do the search allinanchor:search term, 80+% of the sites that have been dropped reappear.
If Google is applying some of the technology it bought with Applied Semantics, then the dropping of allinanchor would make a lot of sense. If your search is turned into a semantic broad match (let's say you searched for car), then semantics would turn this into cars, autos, automobile, automobiles, etc. That is going to bring back a lot more matches than if the literal search 'car' were the only term searched for. When two words are searched for, the broad search list will be much larger because there will be permutations on the two words. Now how do you go about ranking that mess before you print the list to the browser? If you ranked each of the different terms matched, and did it in the same way as a simple search, the level of complexity would be multiplied several times. This could be happening, as we know the set presented is now a maximum of 1000, reducing processing overhead.
Alternatively, what if you dropped much of the complexity by doing a simple frequency count of any and all matched words in the semantics list, for text inside the <body> tags, weighted by PageRank (or maybe the other way round)?
What you end up with is relevance based on general similarity to what was requested, ranked by PageRank, which is exactly what I'm seeing in searches.
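The shortcut speculated here can be sketched in a few lines. Everything in it is an assumption for illustration (the synonym table, the page bodies, and the PageRank values are all invented); it only shows how "frequency of any semantically matched term, weighted by PR" would rank pages.

```python
# Sketch of the speculated ranking shortcut: expand the query into a
# semantic broad-match term list, then score pages by a simple frequency
# count of any matched term, weighted by PageRank.
# Synonym table, pages, and PR values are invented for illustration.

SYNONYMS = {"car": ["car", "cars", "auto", "autos", "automobile", "automobiles"]}

def broad_match_terms(query: str) -> list[str]:
    """Turn a literal query into its semantic broad-match list."""
    return SYNONYMS.get(query, [query])

def score(body_text: str, pagerank: float, terms: list[str]) -> float:
    """Frequency count of all matched terms in the body, weighted by PR."""
    words = body_text.lower().split()
    freq = sum(words.count(t) for t in terms)
    return freq * pagerank

pages = {
    "exact-match.example": ("car car car repair", 3.0),       # literal hits
    "broad-match.example": ("automobiles autos cars for sale", 5.0),
}
terms = broad_match_terms("car")
ranking = sorted(pages, key=lambda p: score(*pages[p], terms), reverse=True)
print(ranking)
```

Note the effect: the higher-PR page wins even though it never contains the literal query word, which is exactly the "generally similar, ranked by PageRank" behaviour the post describes seeing.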
By the way if you go and look at the thread on Adsense ads being crawled and contributing to PageRank [webmasterworld.com...] perhaps you will see why Florida happened and what the motivation is behind the recent changes.
| 8:33 pm on Dec 14, 2003 (gmt 0)|
>>By the way if you go and look at the thread on Adsense ads being crawled and contributing to PageRank
| 9:25 pm on Dec 14, 2003 (gmt 0)|
I assume Google has the good sense not to count AdSense links for PR. If they were, this would be Google selling PR which can lead to higher rankings in the regular SERPs. I doubt the FTC would look on that favorably.
| 9:46 pm on Dec 14, 2003 (gmt 0)|
|I assume Google has the good sense not to count AdSense links for PR. If they were, this would be Google selling PR which can lead to higher rankings in the regular SERPs. I doubt the FTC would look on that favorably. |
My Dad always said "never assume anything son!"
| 10:58 pm on Dec 14, 2003 (gmt 0)|
"If true, it would suggest it might be possible to damage a competitor by linking to them or setting up links in their name."
The quote doesn't say that at all. It only says that you should watch who you link to, because if they are spammy, that could hurt you.
Unless some spammer breaks into your site and creates links to his domain from yours, that quote says nothing about others being able to hurt you without your consent. In fact, it doesn't say anything we didn't already know to be a fact, except that the level of "danger" in linking to spammy sites has increased, perhaps dramatically.
| 11:10 pm on Dec 14, 2003 (gmt 0)|
I presume what Marissa Mayer was saying is a warning about becoming part of a link farm, which can be easy for a naive webmaster to do.
| 10:39 am on Dec 15, 2003 (gmt 0)|
Are we both reading the same post? It says
|go back and look at who you linked to and who’s linking to you. |
| 11:09 am on Dec 15, 2003 (gmt 0)|
Read the whole statement, not just one phrase.
Don't take a perfectly clear statement out of context. She said webmasters need to be careful who they link to and who links to them. Complete sentence/thought. It's a clear statement about linking to and from.
Some folks seem to want to make the most improbable and farfetched interpretation of a portion of a straightforward couple of sentences.
Be careful about linking to people who spam. That's the deal. In particular be careful about linking to people who spam who link to you.
Notice this word... "and"
Notice she did not say... "or"
[edited by: steveb at 11:13 am (utc) on Dec. 15, 2003]