This 97 message thread spans 4 pages.
Comment on cloaking from a SE
As product manager of the largest Scandinavian search engine I would like to add my comment to the recent discussions regarding cloaking. I have been working with the issue from both sides of the table – as moderator at search engine forums, SEO consultant and now as product manager of a large search engine.
The real problem with cloaking is that there is good and bad cloaking. I prefer to call the right way “personalized delivery” and the wrong way “cloaking”.
If you use personalized delivery to protect your code, and to serve relevant information to each search engine independently, then I have no problem with it. In fact, we welcome it. The more relevant, targeted and correctly built web pages we can crawl, the better – our users will get better results.
We will do anything we can to fight bad cloaking – people trying to manipulate our index. We will not tolerate that. We will do anything we need to protect the quality of our product – our life.
Detecting cloaking or personalized delivery is not difficult but working out what is right and what is wrong is indeed not an easy task.
So how will we do this? I am not sure yet and if I knew I wouldn’t tell exactly how but I have a few ideas ... ;)
One of the things I believe we will need to do very soon is to detect who is cloaking and who is not. That’s step one. We could then make a note on every one of those sites and put them on a special list for manual, automated or semi automated analysis to find out what is good and what is bad cloaking.
I don’t think we would be able to check every page all the time so the analysis would only be based on small samples. So bad cloaking would still be able to pass through but at least we would catch some of the bad guys.
This way it would not be so risk-free to cloak anymore – at least not in our search engines.
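The sample-and-review idea described above could be sketched roughly as follows. This is only an illustration, not how any search engine actually does it: the function names, the 0.5 threshold, and the idea of comparing word sequences are all assumptions, and the two copies of each page (one fetched as the crawler, one fetched as an ordinary browser) are assumed to have been collected already.

```python
import difflib
import random


def similarity(crawler_text: str, browser_text: str) -> float:
    """Word-level similarity between the two copies, from 0.0 to 1.0."""
    matcher = difflib.SequenceMatcher(None, crawler_text.split(), browser_text.split())
    return matcher.ratio()


def flag_for_review(pages: dict, sample_size: int, threshold: float = 0.5, seed=None) -> list:
    """Check a random sample of pages and return the URLs that look cloaked.

    pages maps url -> (text served to the crawler, text served to a browser).
    Only a sample is checked, so some cloaked pages will slip through.
    """
    rng = random.Random(seed)
    urls = sorted(pages)
    sample = rng.sample(urls, min(sample_size, len(urls)))
    return [url for url in sample if similarity(*pages[url]) < threshold]
```

A flagged URL would only mean that the two copies differ a lot. Deciding whether that is good or bad cloaking is still the manual or semi-automated step.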
Basically, if you use personalized delivery the right way then you will have no problems with us but if you cloak the wrong way then we will do whatever we can to stop you - ban all your sites or give them an extremely low ranking factor.
Only target your pages to relevant keywords. Do not use personalized delivery unless you really know what you are doing.
I don’t expect to run into any serious problems with major SEO companies. They all want to run an honest business and only target relevant traffic for their clients. They can’t afford to do it the wrong way. If they do we will get mad at them and their clients will end up with low quality visits. So I don’t believe they will.
In fact, I will not only let “the good guys” use personalized delivery but I will go into a direct dialogue with the most professional ones of them. They know a lot about search engines that we can benefit from and I will help them do what they do even better. Helping the “good guys” is my way of trying to stop the bad ones.
I will speak at the IMS2000 seminar in Stockholm on the 26th of October and hope to bring some new exciting tools and statistics with me that I think you will all find useful in your hunt for more relevant traffic :-)
I asked "If being an open and honest SEO/site owner is currently a disadvantage, what are you proposing to do about it?"
Let me put some context around that:
Let's define IP cloaking as using knowledge of search engines' IPs to deliver one set of pages to search engines and another set to humans. And let's define IP delivery as delivery of content based on user IP, with no account taken of search engines. Fair?
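To make the two definitions concrete, here is a minimal sketch. Everything in it is hypothetical: the spider address range is a block reserved for documentation, and the page names are placeholders. The only point is that IP cloaking special-cases known search engine addresses, while IP delivery looks at the visitor's address without caring whether it belongs to a spider.

```python
import ipaddress

# Hypothetical spider range (192.0.2.0/24 is reserved for documentation).
SPIDER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_known_spider_ip(ip: str) -> bool:
    """True if the address falls inside a listed search engine range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in SPIDER_NETWORKS)


def ip_cloaking(ip: str) -> str:
    """IP cloaking: one page for known spider IPs, another for everyone else."""
    return "spider_page" if is_known_spider_ip(ip) else "human_page"


def ip_delivery(ip: str, pages_by_network: dict) -> str:
    """IP delivery: content chosen from the visitor's IP alone, spiders included."""
    addr = ipaddress.ip_address(ip)
    for network, page in pages_by_network.items():
        if addr in ipaddress.ip_network(network):
            return page
    return "default_page"
```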
There are only two reasons for IP cloaking, to do bad or to prevent bad being done to you. If all search engines allowed IP cloaking, those two reasons would still exist. If no search engines allowed cloaking, those who do it solely to prevent bad being done to them may end up damaging their own reputation and their clients' reputations. Why not help them out? Instead of saying you welcome cloaking, why not say you welcome agent-based delivery? (This is just pre-empting widespread XML takeup). And why not give the Web the tools to police itself against those who use cloaking for bad. A simple example: provide a form like Babelfish for allowing those who suspect infringement of their copyright to appear to be you, using a random one of your spidering IP addresses, allowing them to retrieve and check a page they suspect is being cloaked and infringing their copyright.
Expecting the entire Web to cloak to protect itself goes so much against the ethos of the Web. The search industry, if it is to have a reputation, needs standards. Yet in all this debate there hasn't been one standard proposed. So here are some suggestions to chew on:
1) SEs move from opt-in-by-default to opt-out-by-default. Then every page that is in their index has been agreed by the owner. That prevents all problems of accidental spam, accidental cloaking, accidental indexing, unwelcome spiders, etc.
2) Agent-based delivery is permitted.
3) IP cloaking is banned
4) SEs provide tools to allow Web to regulate itself by masquerading as SEs to check for copyright infringement, page-jacking and bait & switch.
5) SEs maintain dialogue with SEOs and the Web to ensure good practices prevail.
In other words, Mikkel, maybe sanctioning cloaking is just the easy way out for search engines. ;)
"There are only two reasons for IP cloaking, to do bad or to prevent bad being done to you"
I disagree. How about the following site that I am involved with, which I assume is in a similar situation to many other large corporate sites?
The main bulk of the site is framed dynamic content, which is created on the fly from constantly updated data. This content is full of information but cannot be parsed by the search engines.
The only static page is the splashpage where users have a choice of language. For image reasons, this page is completely composed of graphics. This is out of my hands, a preference of 'them upstairs'.
Would cloaking this splashpage, and serving the search engine a textual version describing in words what the images convey, be 'doing or preventing bad'?
If the content is simply translated into a medium that the search engines can read, what is wrong with that?
Simply giving the spider something with which it can interpret the theme of your site surely is a valid reason for IP cloaking?
Use agent-based delivery. To me, IP cloaking is something ONLY search engines can see.
I'm sorry, but I fail to see the ethical difference.
If I am aiming to deliver different content to the search engines as opposed to human surfers, what difference does the method make?
Surely the most accurate method, IP cloaking, is preferred to a user-agent system where a surfer may stumble across a text based page that reflects badly on our corporate image?
Well the reason we deliver different content to them is because they read the data in different ways. Much like browser delivery.
I have no problem if everyone personalized their delivery depending on who visits. One of the search engines (can't remember who) said that they would present different search results depending on the user. The example used was "Mustang" that could deliver results about the car or about horses. This is the same thing.
At the end of the day cloaking has little to do with ranking. But it helps me to protect those things that do matter. Be sure that the clients that pay SEOs to get their pages on the first page are relevant. Otherwise they would not pay us what they do.
Did you read this?: [adweek.com...]
If you fail to see the difference, use agent-based delivery.;) Believe me, in terms of reputation and standards, there is a big difference. It's the difference between making it easy to distinguish right from wrong, and making it very difficult.
It's not impossible even now, though. People may always stumble across your cloaked pages. For example, there is a highly competitive search term on a major search engine at the moment, where yesterday I found exactly the same cloaked page three times in the top ten, sold to three different clients, with three different URLs and three different Web sites! Not only that, the cloaked page was originally for body nutrition products, and has simply had the keywords replaced with a new set, at exactly the right keyword density, but all the rest of the page is still about nutrition. That's what cloaking does for you. Makes you lazy. Three positions in the top 10, clients are loosely related to the keywords but not exactly top 10 global players, and the SEO cashes in at least four times over on the same script whilst doing continuous harm to their clients' brands. Their clients are probably happy being spoon fed top ten reports and, as yet, totally unaware of the reputational damage being caused. And the web is a sorrier place for the experience.
If the SEO had used a User-Agent delivery methodology, they might have put some effort into making the SE version of the page at least readable and usable by humans, so that it preserved the brands of their clients if it was found. Or, at least for pride's sake, because they knew it was possible to see - whereas this SEO presumably thinks their pages are impossible to see.
Cloaking can be used to hide bad practices. You said you can't see the ethical difference. Maybe you're one of the good guys who has been tempted by the dark side. But the good guys shouldn't be forced to cloak. The big difference between IP cloaking and agent based delivery is this: With agent based delivery, we're living in an open society. With IP cloaking, we're living in a closed society.
An IP can tell you very little about a user. A user-agent can tell you lots. Personalised results will work far better with user agents than they ever could with IP delivery. How is someone's IP ever going to tell you they are into horses rather than cars? IMO, personalised delivery will depend more on user registration than user agent or IP, anyway, but the user agent gives a better anonymous starting point. Where in the world a user is, the language they speak, the platform they are on, the capabilities of that platform - these are all in the user agent. I'm not sure Mikkel is even correct when he says that SEs check his IP number to see he is in Denmark. I think it's more likely they do a DNS lookup on his IP, spot the .dk extension on his ISP's name, and assume he is in Denmark. Or they might even look at his user agent!
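The DNS guess described above (resolve the IP to a host name, then read the country off the ending) might look something like this. It is a sketch under assumptions: both function names are made up for illustration, the guess treats any two-letter ending as a country code, and a real ISP host name need not say anything reliable about where the user is.

```python
import socket
from typing import Optional


def country_hint_from_hostname(hostname: str) -> Optional[str]:
    """Return the two-letter ending of a host name as a country guess, if any."""
    labels = hostname.rstrip(".").lower().split(".")
    tld = labels[-1]
    if len(labels) > 1 and len(tld) == 2 and tld.isalpha():
        return tld
    return None


def country_hint_from_ip(ip: str) -> Optional[str]:
    """Reverse-resolve the IP, then apply the same guess. None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:
        return None
    return country_hint_from_hostname(hostname)
```

A host name such as dialup42.example-isp.dk (invented here) would give "dk", while a .com host name gives no hint at all.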
>>I think it is spam - which I would define as anything done purely to artificially manipulate a ranking.
>My definition includes the word purely, which is designed to make it very tight.
Years ago I created my first "Dave's Personal Home Page." Somebody subsequently told me I could add meta tags and the SEs might rank the page better and I'd get more traffic, so I did that. And I did it "purely to artificially manipulate a ranking." The visible portion of the page remained unchanged. "Purely" is subjective and, in this context, implies determining a webmaster's motivation. Your definition leads to pages and pages of rules as to what is and is not acceptable, similar to the situation we presently have. The competitive nature of business and SEO will always tempt some people to exploit the "loopholes" in the rules and, IMHO, offers no real solution.
The issue is providing relevant content. If the SEs can provide useful, spam free results to their users, and SEOs can get relevant pages ranked highly, everybody wins. To me, this "nuts and bolts" discussion of what's acceptable and what isn't for SEO pros doesn't directly address the issue, providing relevant content, which requires that we carefully and thoroughly define spam.
I remain convinced that, with an adequate definition, it won't matter what "tactics" a webmaster or SEO pro uses when he builds or submits a page. If he complies with the definition, he gets listed or listed well, if not, he lives with the consequences.
Admittedly, there are issues surrounding how an SE can verify that a cloaked page is relevant. A simple, short term solution may be a button on every page an SE presents to users named "Report Spam Here!", with some type of incentive offered to users for reporting it (find the most spam in November and win a $100 gift certificate), until a longer term solution can be established. As we all know, there is presently no formal reporting system and users must go searching simply to find an email address to send their reports to, which discourages most reporting. (Mikkel, are you listening ;) )
Yes, I am listening :)
I think one of the problems is the amount of time it takes to manually handle all the spam reports any SE would get that way. I am not sure we (or any other SE) would have the resources to do it but with a good automated or semi-automated system it may work out ... It's noticed :)
Yes, Dave, but you were an amateur then. You're a pro now, and we're creating an industry here, aren't we? The mistakes that could be made five years ago simply would not be acceptable now.
The page delivered to the search engine should be well structured, allowing the SE to identify key terms, themes, etc. This is the reason that titles, keyword tags, headings, etc, are important. It's to do with how professionals structure and deliver content.
I think spam has been quite well defined by search engines and it doesn't make any difference whether a page is cloaked or not. Spam is still spam, whether it's in the tin or on the plate. IMO, the only extra spam on a cloaked page is reciprocal links. That's the bit of jelly that stays in the tin when the rest of the spam is extracted.
Agent based delivery is another matter. I would like to think that the page delivered to the search engine could also be retrieved by a browser equipped with a text-to-speech converter and used by the blind. In this scenario, every link on the page matters.
I've thought about abuse reporting quite a lot.
1) Most SE users won't know what to do with a "Report Spam" button. It's got to be handled by the industry.
2) My preferred option is along the lines of AV Babelfish, allowing a professional to compare what the SE saw with what they see, and providing a handy "report spam" button with a pre-filled form. This type of interface could be made available to registered SEOs and other industry professionals to keep the numbers of reports down. The most competitive keywords would quickly be cleaned up this way.
Mikkel, the report of three different pages using the same cloaked script that I gave in my previous-but-one post ... would you consider that acceptable use of cloaking?
What is the hang up on IP delivery? That is the only way to guarantee that my code is left alone by the thieves. A user agent would be too easy to spoof.
I still think that it is up to everyone who lives off the search engines to report the spam as it is. We are the first to see it when someone is spamming. When it is reported, the SE sees what was presented to them and what the viewer saw. If it is not relevant for the site it is spam.
Alan - Amateurs will always be with us, progressing thru the learning curve, and their pages will EVOLVE to a presentable structure, so our definition of spam must INCLUDE them and their pages no matter where they are in the evolutionary process, to be useful.
Mikkel's initial visit to this board started with a "let's define spam" discussion. Like everything Internet, these older spam definitions are evolving.
I agree with many of your arguments about IP versus User-Agent delivery. It would be nice if everyone did it this way or that way and would make the SE's job much easier, but human nature being what it is, it's unrealistic to expect this and difficult to enforce.
The text and links on my cloaked pages are, for the most part, identical to my uncloaked pages; the tables and images are removed, some heading tags are added and metas may be revised. As a result the discussion about reciprocal links on a cloaked page makes no sense to me. I use reciprocal linkage openly on uncloaked pages as a technique to achieve good rankings and these are mirrored in my cloaked pages.
>I would like to think that the page delivered to the search engine could also be retrieved by a browser equipped with a text-to-speech converter and used by the blind.
Why? A blind searcher finds an SE listing and visits that page. If it's relevant to his search and he finds the information he sought, who cares what the SE saw!
Can we work on a definition of spam yet? :)
If your "Report Spam" button offers an incentive, the incentive must include the "rules of the contest" which is where spam gets defined to reporters. Their report is no good if it doesn't meet that definition and may keep reporting volume at a reasonable level.
> Why? A blind searcher finds an SE listing
> and visits that page. If it's relavent to
> his search and he finds the information
> he sought, who cares what the SE saw!
The cloaked pages I referred to earlier were pure text. They would have been perfect for a text-to-speech converter, if
1) they made any sense at all - they were gibberish and off-topic
2) they were readable by a browser - in fact they were IP cloaked and required specialist techniques to view. A browser would be delivered a frameset page containing material unreadable by a text-to-speech converter.
The idea is that search engines and blind people have very similar requirements from a page. Using IP cloaking and only allowing search engines to see a page, the blind are denied an opportunity to view (i.e. hear) large parts of the Web that are, in theory, available. They would have this opportunity with agent-based delivery.
Let's not get too hung up on this blind thing, though. It's somewhat off-topic (although, to some readers, maybe very relevant).
Great conversation here... :)
Alan, you say we're creating an industry here? Many of us have been doing this since early on, and IMO we created the industry long ago.. Four years on the web, is how long in "brick and mortar" time? :)
I don't like the phrase, if you are IP cloaking, you are doing one of two things (doing bad, or preventing someone from doing bad to you)...
We target relevant keywords for our clients, from specifically designed pages for each of the major SE's.. Been doing it this way all along, and have never had a domain banned to this point....
I don't need any authoritative body deciding for me, how I'm going to best represent my clients... :) I don't need an association of SEO professionals, nor a governing body overseeing spidering/indexing for the major SE's, nor standards, nor any of the other proposed garbage that floats around the web about giving everybody an "even playing field"..
Guys like Paul Bruemmer who made a name for themselves when it was easy, are now whining about needing standards, and certifications, and special allowances for certified SEO's considering ranking in the major SE's.. What does this tell me?
Bruemmer can no longer remain competitive in the SEO field, therefore he can no longer support a growing client portfolio, therefore he wants an artificial handicap, so he can get back in the top 10..
It's not my fault he didn't keep up with the technology required to succeed today. As I mentioned above, are you asking me to throw away thousands of hours of R&D, and the system we've developed, refined, and tuned over the past years?
Ain't happening, without me kicking and screaming all the way to certification bureaucracy institution for SEO wanna-be's... :)
I make a great living doing what I'm doing, I don't have to work for anyone else.. This is what America is about, free enterprise...
It's not for you or anyone else to decide on a set of standards across the globe for all the SE's to abide by, and enforce..
I want my cake, and I plan on eating the whole thing. I've worked hard for it. Directorization has eaten into my business, so I've offset it with technology infrastructure. We can talk about what gives IP delivery a bad name, or what is morally correct, but leave me out of any standards or certification.. :)
We're creating an industry by having constructive dialogue between SEs and SEOs. That's pretty new.
I am not trying to define standards. I do think the SEs will define standards from those they are prepared to maintain a constructive dialogue with.
Everything you do now, everything you have ever done, could have been achieved with far less effort, far less worrying about what the latest spider IPs are, if you had used agent-based delivery. Until now SEs have treated agent-based delivery as spam. The forthcoming "semantic Web" means they cannot persist with that attitude.
Agent-based delivery means you wouldn't have to throw all of your tools away. They would all still work. You could even continue to check IPs so you detected someone impersonating a search engine, I think.
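A rough sketch of that combination, under stated assumptions: the agent token and the address range below are invented for illustration, and both function names are hypothetical. The page is chosen from the user agent alone, so anyone can see the search engine version by sending that agent, while the IP is only used on the side to notice visitors who claim to be a spider but do not come from a listed spider address.

```python
import ipaddress

# Hypothetical values for illustration only.
SPIDER_AGENT_TOKENS = ("examplebot",)
SPIDER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def page_for(user_agent: str) -> str:
    """Agent-based delivery: the page depends on the user agent and nothing else."""
    agent = user_agent.lower()
    if any(token in agent for token in SPIDER_AGENT_TOKENS):
        return "text_page"
    return "graphical_page"


def classify_request(user_agent: str, ip: str) -> str:
    """Label a visit for the logs; this does not change which page is served."""
    if page_for(user_agent) != "text_page":
        return "browser"
    addr = ipaddress.ip_address(ip)
    if any(addr in network for network in SPIDER_NETWORKS):
        return "spider"
    return "impersonator"
```

An "impersonator" still receives the text page. That is the open part of the arrangement; the label only tells the site owner that someone was looking.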
You say you don't like the phrase, if you are IP cloaking, you are doing one of two things (doing bad, or preventing someone from doing bad to you)... given agent-based delivery as an alternative, what other reasons are there for using IP cloaking?
Alan, agent cloaking is not an option. I use IP cloaking because I know who I am designing for and I do not want anyone else to see that. That is the only way to protect my RD from hungry newcomers who do not want to take the time.
What I would like is for the SEs to set up a better and more competent spam complaint service. Make it easy to report suspected spam. If the SEs know what they are looking for, they can verify the spam easily.
When I find e.g. 4,000,000 pages from the same domain in AltaVista I report that and the spammer gets taken out.
We are not talking about the average word stemmer here, but the high tech spammers that do anything to get on the top.
Professional SEO companies do not spam. They know that the quality of the visitors is the key to keeping the customer.
As I said, we will not optimize for words not mentioned on the customer's web site. To do anything else is spam in my book. Otherwise I think that RedZone defined what is acceptable very well.
> That is the only way to protect my RD from hungry
> newcommers who do not want to take the time.
Please define the R&D you are trying to protect...
Our RD - The thousands of hours we spent on analyzing the SE algorithms. We do that so we can present the information in a way that lets every spider index our content properly.
BTW Our RD is much different from what you saw Daron Babin perform with his reverse engineering. That is his baby. We save our own listings and draw conclusions from them.
As an SEO that knowledge is my edge. That is what I have to offer my clients. If uncloaked, my competitors could just copycat me. I was in a "war" back in '97, when my competitors did just that. Thanks to personalized delivery that does not happen anymore.
I am fine with those of my competitors who are doing the same thing. Then it is just down to who understands the current algorithm best. As it should be.
Many of the people complaining about cloaking are doing so because they do not have the knowledge to do it, lack the time or do not have the money to hire someone to do it for them. Why would they want to see my code in the first place?
The SE can see it and verify that it is relevant to the site in question.
I do not think that page jacking is a big problem these days. Anyone who finds their site hijacked will report it to the SE. And it is a criminal offence to steal someone's site. I have not seen it or heard about it for a long time. It is not a solid strategy.
The real spammers are all the submission programs that are sold on the web. Get the same results for free with services such as Jimworld. Brett also provides a lot of great tools for free on search engine world, that less serious competitors are selling as high end services.
>I do not think that page jacking is a big problem these days.
Whew. We must not play in the same world. Every quality content site under our care has been the victim of page jacking - almost daily. Top keywords are lethal for jacking these days.
With some good rankings under competitive keywords right now, they are coming at us like vultures around top words on Alta. I'm now not only cloaking, but also obfuscating (poisoning) a little the page that the user sees. Just enough so that they won't even realize it.
Agent delivery will never be an option, as it's too easy to spoof using any off the shelf, ActiveX http control package, and a little Visual Basic...
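The same point holds with any HTTP library, not just ActiveX controls. As a small illustration using Python's standard library, a request can be made to claim any user agent it likes. The function name and the agent string are made up for this sketch, and the request is only built here, never sent.

```python
import urllib.request


def build_spoofed_request(url: str, claimed_agent: str) -> urllib.request.Request:
    """Build (but do not send) a request that claims an arbitrary user agent."""
    return urllib.request.Request(url, headers={"User-Agent": claimed_agent})


# Usage: the server would see whatever agent string the client chose to send.
request = build_spoofed_request("http://www.example.com/", "ExampleBot/1.0")
```

Nothing in the request proves the claim, which is why a site that wants to keep its spider pages private cannot rely on the user agent alone.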
The thousands of hours of R&D come not only from algorithm research, sampling thousands of pages, and performing detailed regression analysis, but also from the design and implementation of our system itself. System design, coding, and implementation don't come cheap for a quality system. The top SEO firms are many levels above "cheap" Perl cloaking scripts. Our relational database has the capability to handle multiple web servers, hot connected to an SQL data server, handling thousands of http requests (SE click throughs) daily. It then collates that data into a meaningful report system for our client base.
I would estimate the top 20 of SEO firms are billing high six figures, low seven figures in annual billing. This is a serious business with great financial returns for those that have the infrastructure and client portfolio.
You mention that, had we had a co-operative global community including the SE's, the effort would have been much less.
That's like saying if we had honor system coke machines on every corner, that everyone would voluntarily pay their quarter for a coke... That's absurd.. The greedy, will always be greedy, the lazy will always take the short cut. That's human nature....
Don't get me wrong though. I'm open to the SE's establishing standards, and actually defining an exact "spam" line.. But up to this point, I've experienced a lot of "gray" area, and treaded very lightly. Take AV for example. Nowhere on their site, do they mention what schedule they will crawl links from your root URL, nor that they do. They also don't bother to tell someone that if they submit all 500 URL's for their web site, at once, there is a good probability that AV will:
ignore the URL's and not index them
index them, but penalize the URL's, and bury them in the index.
Or, ban the domain for submitting too many URL's.
All I want is for the SE's to strictly define, exactly what is ok, and what is not. No "gray" area, just the facts...
That would be the optimum final goal in a collaborated effort between SEO's and the SE's... :)
> Our RD - The thousand of hours we spent on analyzing
> the SE algorithms.
I can't speak for search engines, but just thinking logically, I can't believe any SE would consider this as a justification for cloaking. And we're here to determine how SEs and SEOs can work together, aren't we?
You complain about theft, but some SEs might accuse you of stealing their algorithm by your actions. How would you respond to that?
Let's put it another way - if SEs published the pages you gave them for all to see, would any theft of your R&D have taken place?
I believe Mikkel wanted a definition of spam, and I've submitted mine. I really can't see the difference between reverse engineering an SE algo and setting up a page to exactly meet that algo, and keyword repetition, invisible text, tiny text, and other commonly accepted spam. It's all designed purely to get a ranking boost, it's an utter waste of human endeavour, it's getting the whole industry a bad name. SEOs are obeying anti-spam rules, such as the one against invisible text, on pages where the whole page wasn't designed to be read by a human anyway! Doesn't that show the futility of the whole thing? The whole page is invisible text!
Carrying on like this can only end in paid placement or another predominantly off-page method, so that it's the SEs that benefit from their algo and their marketing efforts, rather than the SEOs. Anybody that continues to advocate these practices is in the industry to make a fast buck while it's still possible, because the industry won't be around in five years unless it changes now. Those advocates have no place in working out how the search industry should move forward, they should just concentrate their efforts on making those bucks.
The main research I can think of is keyword research, but we all agree that the keywords have to be on the uncloaked page.;) Development involves writing well structured copy for a search engine to read, and building a reputation for the site.
Henki, have you ever had a bank, insurance company or pharmaceutical company as a client? These guys have legal requirements to publish certain information on every page. Do you publish that information on a cloaked page, even if it means you lose out on a top ranking to someone who doesn't publish the information? How certain are you that your pages will never be accidentally uncloaked? You can't be certain, because you've given those pages to others (the search engines, for example), without getting their agreement that they won't make it generally available. If those pages end up being viewed by Web users, who bears the reputational cost?
Here's another example: a user types into a search engine "Tell me all the potential dangers of living next door to a nuclear power station". Would you have any problem capturing the entire range of queries that might stem from that for the nuclear industry, such that an alternative point of view did not appear in the top 30 results? Maybe your personal ethics would prevent you from doing that, but some SEOs would do it. SEs need a reputation for relevance and impartiality. How do SEOs help them to achieve that?
Openness and accountability is the way forward ... it's the way the Web got started in the first place, after all.
There's been little of the promised input from search engines here, and I think I've breathed enough of this hostile atmosphere. I've started to repeat myself. I'm retiring to a daily lurk and I'll be interested to see whether any SEs actually take part...
Alan, it seems to me that your intent here was to promote agent-based delivery over IP delivery.. now I (and most of the more experienced people here) know that IP delivery is the way to go.
"It's all designed purely to get a ranking boost". I don't think you've been paying attention.
I mean that is what SEO is all about. And I (and most here) totally dislike cloaking spam.. i.e. irrelevant keywords being targeted.
But I don't spam when I cloak.. all I do is to ensure that when a user searches on my clients' targeted keyword.. that the user will find my clients' site in the top rank(s).
Neither the SE nor the user (and definitely not my client) is harmed as the user and the SE are satisfied because s/he got relevant results from the keyword used.
Needless to say my clients are extremely happy as they are tops for their relevant keywords.
So Alan (sometimes I cloak, other times I don't need to).. where's the harm???
When thinking of spam/cloaking/SEO, I can't help but think of Altavista, which is a great engine. Their problem, though, is that for 3 term phrases, if their technology doesn't default to considering it a complete phrase, it will query the database for the first 2 words, and also the 3rd, separately.
This, to me, is where seo comes in: delivering relevant matches to those keywords/phrases, that otherwise the engine might be incapable of delivering.
Is this an acceptable use of seo? :) Now, if only AV would actually let me in, and let me stay there...
From an exchange with Alan yesterday:
">I would like to think that the page delivered to the search engine could also be retrieved by a browser equipped with a text-to-speech converter and used by the blind.
Why? A blind searcher finds an SE listing and visits that page. If it's relevant to his search and he finds the information he sought, who cares what the SE saw!"
I reflected on this quite a bit last night and Bates reinforced it this morning as in:
"Neither the SE nor the user (and definitely not my client) is harmed as the user and the SE are satisfied because s/he got relevant results from the keyword used."
I'll make one last attempt to turn the discussion from "nuts and bolts" to addressing the underlying issue, providing relevant content. If we can establish a way to do that, consistently, in spite of what is submitted or how it's submitted, we've advanced the technology.
As a result of foregoing discussions, my thinking has evolved to a possibly workable system like this:
I. Assume an SE adopts redzone's definition of spam
A. The content is targeted to the optimized keyword phrase
B. One listing per SE per keyword phrase
Meet these criteria and it's not spam
II. Assume an SE adds a "Report Spam Here" button on every search results page
A. The button leads to a submit form
B. The form defines what spam is or isn't and cautions that reported pages that the SE doesn't consider spam will be ignored
C. An incentive could be offered to encourage reporting
III. A spammer gets one warning by email and is purged from the index
A. If he cleans up his act, he's reindexed upon next submit
1. The submit page should include submission guidelines, i.e. "How often and how many pages may be submitted"
B. If he spams again he's banned
C. If he subsequently wants to get unbanned, he emails and pays the admin costs
That's all we need! It won't matter what is submitted or how! Cloaked? Who cares! The users are policing things for the SE, the SE needs to implement a few changes, and voilà!
Admittedly, I don't know how conveniently an SE can purge a spam listing. And there will likely be an initial flood of reports to address until the index is purged... Mikkel and/or others with SE expertise will need to comment on the feasibility of implementing such a system.
This approach will work for all pages regardless of their sophistication, doesn't care if or how pages are cloaked, and provides non-spammed results (the relevancy still depends on the quality of the SE's algo) to the searchers, which is what we all want.
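To make the outline a bit more concrete, here is a rough Python sketch of the warn/purge/ban cycle. It is only one way the steps might fit together: the `owner` field, the `Index` class, and the choice to purge all of a spammer's listings at once are assumptions made for illustration, not anything an SE has confirmed.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    owner: str       # hypothetical: whoever submitted the page
    url: str
    phrase: str      # the optimized keyword phrase
    on_topic: bool   # I.A: content is targeted to that phrase


class Index:
    def __init__(self):
        self.listings = []
        self.warned = set()   # owners who have had their one warning
        self.banned = set()

    def submit(self, listing):
        """III.A / III.B: anyone not banned may (re)submit."""
        if listing.owner in self.banned:
            return False
        self.listings.append(listing)
        return True

    def is_spam(self, listing):
        """Section I: off-topic, or more than one listing per phrase."""
        same = [l for l in self.listings
                if l.owner == listing.owner and l.phrase == listing.phrase]
        return (not listing.on_topic) or len(same) > 1

    def report(self, listing):
        """Section II: handle a click on the 'Report Spam Here' button."""
        if not self.is_spam(listing):
            return "ignored"              # II.B: non-spam reports are ignored
        # III: purge the spammer's listings from the index.
        self.listings = [l for l in self.listings if l.owner != listing.owner]
        if listing.owner in self.warned:
            self.banned.add(listing.owner)    # III.B: second offence
            return "banned"
        self.warned.add(listing.owner)        # III: the one warning
        return "warned"

    def unban(self, owner, fee_paid):
        """III.C: unbanning only after the admin costs are paid."""
        if fee_paid:
            self.banned.discard(owner)
        return owner not in self.banned
```

The email warning and the reporting incentive (II.C) are left out; they would hang off `report` without changing the flow.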
Dave, you summed it all up. Excellent. That is a realistic approach that can be implemented and would reduce spam.
Go for it Mikkel!
Dave, I was writing my long post and then you posted before me. I think it basically sounds right. We may need to add a few more details but all in all I can personally agree to it.
Alan, I have been here all the time, and as you know I represent an SE, the largest one in Scandinavia, and I actually think I have posted a lot (some may even say too much <g>).
As I stated before, I raised this question here, in other forums and at IMS2000 in Stockholm to get as many different views on the subject as possible so we don’t make the wrong decisions. It won’t end here, though. We will have to continue the discussion behind closed doors within my team, and maybe even invite selected experts to help us make the best choices.
I believe that when we have made our new policy on cloaking and definitions of spam we should put it on our site. I want webmasters to know what we consider spam. How can we expect not to get spammed unless we at least tell webmasters what not to do?
I hope we will get less spam that way so we can use our resources on more positive things :)
I am still not sure how we will end up handling cloaking, but I think that this discussion has covered many of the views and issues involved. This is very valuable information – it will help us (and hopefully many of the other SEs lurking here :)) make better decisions.
However, I personally have no problem with SEOs targeting their pages to our engine if they work with high ethics – like most of the people here – and don’t spam. If they spam, well, then we need to fight that – but we have always done that. We will just need to get better, and one way could be better reporting systems.
I would also like to repeat myself from the IMS2000 conference:
People using cloaking should be professionals. If you want me to accept cloaking then you need to act with high ethics – like I believe most of you here do. Anyone using cloaking will be considered a professional. If a poor amateur webmaster makes a mistake and ends up spamming I don’t think we should be so hard on him, but if he cloaks he is a professional and should know how to avoid spamming, so I will get more mad if he does. Sounds OK?
If I trust you then I will get mad if you try to cheat me?
I think that if selected SEO companies prove to be working with very high ethical standards I don’t see why we should not take advantage of the work they do and work together with them to get all their good content-based and targeted pages into our index as fast as possible. Our users will get more targeted results that way (hopefully), so why should we not? I don’t see anything wrong in working closer together in the future with the SEO companies that prove to be worthy of the trust.
A personal note on the legal issue...
In Denmark – where I live – I am pretty sure that we won’t be able to prosecute anyone for reverse analysis on our algo. On the other hand I know that if we cache a user's page it could be illegal. I know the copyright laws of Denmark very well after running a music publishing company for 4 years. Caching a page like that is what we define as a digital copy and that is not legal here. Many people here actually believe that what Google is doing could be illegal here – but there has never been such a case so we don’t know for sure ...
There have been a *lot* of words in this thread, but I think you have just pointed out the way to move forward in one little sentence.
If I trust you then I will get mad if you try to cheat me?
Dave .. Mikkel .. I really like what you two have just said!
Just a quick comment from a novice.
The problem with cloaking could be likened to that of bribery. Whenever an official accused of bribery goes to court, their defense is always the same: "Yes, I took the money, but it didn't influence my decision."
The court's response is also always the same: "We don't care whether or not the money influenced your decision, we only care if you did/did not take the money, because it's assumed that sooner or later the money WILL influence your decision."
I just have a feeling that the SEs will sooner or later take the same general position on cloaking: "We don't care if you cloaked for all the right reasons, we only care if you did/did not cloak, because it's assumed that sooner or later the temptation will be too great for many SEOs to cloak/spam."
Just a thought, humble as it were...
I don't agree - and I am running a large SE :)
Cloaking has NOTHING to do with bribery, I must say. It would be if it was the SE taking money to rank pages better without telling anyone (I believe AV once tried that, remember ...?).
... And once again I must emphasize that cloaking is NOT what gives you good ranking - good content, high link popularity and well-written HTML are still what will do that for you - cloaking just helps you target and protect your work.