Forum Moderators: open
Q: What do you see in the future for your relationship with the Open Directory?
A: The thing that strikes me about the Open Directory Project (ODP, also called DMOZ sometimes) is what a great job they’ve done on a volunteer basis. I think I also read somewhere that they’ve just recently instituted a system for reporting abuse by editors, so that’s another way that they’re improving as well. Ah, here’s the story (I often enjoy reading Pandia’s take on things):
[pandia.com...]
It’s impossible to say what the future holds in this industry, but in my mind, it’s a good sign that the ODP is taking steps that will continue to improve its quality.
Also, once we start allowing responses, it is important that everyone sticks to the questions he addressed. There is no way we can let this thread turn into a free-for-all.
A: I think shaadi asked this question. So shaadi, I think if you check the site that you reported for hidden text, you’ll find that it’s in the penalty box. Let me take this chance to talk a bit about the spam report system. From a webmaster point of view, it doesn’t take much time to file a report if you feel something unfair is going on. I would definitely read through our guidelines first to make sure it’s something that we would agree is spam:
[google.com...]
Reporting a site that you feel is spamming certainly won’t hurt. Now let’s talk about what sort of actions Google might take. There are some blatant things that we may take immediate action on. For example, if off-topic porn shows up for a search on someone’s name, that’s often worth doing something about on a short time-frame. I noticed that you also did some other reports of things like duplicate content and sites that may be mirrors. Those are the sort of things that we probably wouldn’t take manual action on; we would instead look at using that data to write better algorithms. Our ultimate goal is to improve quality using only automated algorithms. Those algorithms may take longer to get right, but the nice thing is that when they’re done, they can often shut down an entire type of spam. So it doesn’t hurt to do a spam report: it gives us feedback about how to improve our search, and many spam reports end up as data that we use when testing our new algorithms.
[edited by: GoogleGuy at 5:24 am (utc) on June 12, 2003]
A: I don’t think it’s a fundamental algorithmic change. I don’t recall hearing about any changes that would bring about long-term behavior like this. I’m pretty sure that it’s more of a transient issue, and I wouldn’t be concerned about this.
A: Let me answer a more interesting question: have you ever taken action on a relative/friend’s site that had spam? And the answer to that is yes. :) Our hidden text detection recently found hidden text on the page of someone I knew from college. That page got the same treatment as any other page. When the white-on-white text was removed, the page came back just fine and everyone was happy. The take-home message is that the spam guidelines apply uniformly.
A: WebmasterWorld is definitely a good source of feedback. For example, the recent “extreme geolocation” thread was a good prompt to go back and make sure everything worked better. Several months ago, there was also a test crawl that was chopping the last character from urls, and a few people noticed that pretty quickly. It’s also helpful when there are issues that affect Google but aren’t because of Google: we saw a buggy caching server in the UK that ended up showing garbled graphics to users on that UK ISP, even though Google was serving the correct images. At the same time, I’ve also seen threads take on a life of their own, even after plenty of people post to say “there’s no merit to this theory.” And I’ve seen conspiracy theories and hot-headed threads continue long past the point where every polite person has stopped posting. :) So I would say that we treat it as a good source of feedback, but I might take any given thread with a large grain of salt. Just like in the real world, I’m more likely to take serious suggestions from people who have earned my trust with level-headed, solid points in the past.
A: I would love to see more reports of spam methods rather than just individual instances. If a picture is worth a thousand words, then a description of a spam method is worth a thousand reports of individual spam pages. Since we’re most interested in working on scalable algorithms, it helps to have descriptions of methods rather than pointing out a specific instance of spam.
A: Absolutely. I think two important challenges for the future are discovering user intent and uncovering webmaster intent. User queries are often short, and it can be really hard to determine what the user is looking for. In the same way, a webmaster might not think of what words a user would type, or they might have to work under constraints that they can’t change (e.g. the title for every page might have to be the same). Lots of web designers don’t think about how search engines will see a site (lots of session IDs, or framesets, or dynamic urls, etc.). I think one of search engines' big jobs will be indexing a site intelligently even if the site wasn’t designed with search engines in mind. Our bots do a pretty good job, but it would always be nice to do more so that people (users and web designers) don’t have to think as much about search engines and how they work.
A: There are things that need a manual review before they’re lifted. If a webmaster is pretty sure that they did something wrong, they can mail to webmaster at google.com with the subject line “reinclusion request.” It helps to describe what you think happened, and what you changed on the site to make sure that everything is in good shape now.
A: I think that there will always be a need for consultants that help site owners make their site more useful for surfers and search engines. That might include advice on site architecture, explaining what sort of pages would be crawlable by search engines, or giving help on the copy on a site. Lately, I also see more SEOs broadening their offerings by managing PPC for clients as well.
I think the thing that *doesn’t* work well is when an SEO gives bad advice, or does things well outside quality guidelines from search engines, or takes advantage of their clients’ lack of knowledge. That happens less than it used to, but sadly it still happens pretty often. For example, earlier today our hidden text algorithms detected a spam network run by a really bad SEO. There was hidden text stuffed at the bottom of the page, and hidden links to old-style doorway pages. There was proof that the SEO did rank checking on Google; basically this company was breaking almost every guideline you can imagine. The really nasty part is that, unknown to their clients, the SEO also inserted 7-8 hidden links back to the SEO, so roughly half the PageRank that each customer had earned was getting routed to the SEO! Because the SEO put hidden text, hidden links, and doorway pages on their clients’ pages, the clients may have temporary trouble now. That SEO has basically shot their credibility with both Google and all of their customers. That’s the stuff that we really hate to see.
A: I don’t think that there are tours in general. Although once I was looking out a window and I saw something neat. A car pulled up pretty fast and stopped right outside the Googleplex. Three people piled out and gathered in front of the sign for our building. One person snapped a picture of the other two guys standing in front of the “Google” sign. Then they hopped back in the car and drove off. The whole thing took less than two minutes. :)
A: Google does consider cloaking to be outside our guidelines. Truthfully, the use of cloaking seems to be in decline. I’ve seen several SEOs serve up pages that do JavaScript redirects or other types of redirects, but it’s getting to be pretty rare to see actual textbook cases of cloaking.
A: If you’re interested, you might try doing a search for “googlebar.” It’s a really nice bar for Mozilla that gives you a lot of the features that the Google Toolbar has. Ah, here it is:
[googlebar.mozdev.org...]
Pretty neat stuff.
A: The main difference is that when we find a site after crawling the web, we know that there’s at least some credible person on the web who is “voting” for your site by linking to it. So we have a PageRank value for that page, even if the PageRank is very low. With a submitted page, we really don’t have any external verification that anyone but the submitter thinks the site is good. So it never hurts to submit your site, but I would also take the time to see if you can find some related sites or a part of the Open Directory to link to your new site.
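For readers curious how “voting” by linking works, here’s a minimal sketch of the PageRank idea behind that answer. This is a toy power-iteration implementation, not Google’s actual code; the example graph, the damping factor, and the iteration count are all made up for illustration:

```python
# Toy link graph: each page maps to the pages it links to.
# Page "D" links out but has no inbound links, like a freshly
# submitted site that nothing else on the web "votes" for yet.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank by simple power iteration."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start with uniform rank
    for _ in range(iterations):
        # every page gets a small baseline share...
        new = {p: (1 - damping) / n for p in pages}
        # ...plus a share of the rank of every page that links to it
        for page, outlinks in links.items():
            if outlinks:
                share = rank[page] / len(outlinks)
                for target in outlinks:
                    new[target] += damping * share
        rank = new
    return rank

ranks = pagerank(links)
# "C" has the most inbound links, so it ends up with the highest rank,
# while unlinked "D" keeps only the small baseline share.
```

The point of the sketch is the one GoogleGuy makes: a page with even one low-value inbound link starts with *some* rank, while a page that nothing links to has only the baseline, which is why getting linked from a related site or a directory matters more than just submitting.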
Q: Where do you get your hair cut?
A: Okay, gather around. A little closer. Just a little more... good. Here goes: Pretty much wherever it’s cheapest. :) Now I wonder if any other Googlers will read that and tease me. ;)
Let's open up the thread. If people want to start new topics, that's cool too, but feel free to post here as well.
Regarding your post about whether penalties last forever - you mentioned that if a webmaster is pretty sure they did something wrong, they could email Google with confirmation of what they thought they did wrong and the remedies they have put in place.
What if the webmaster has received a penalty but does not know what they did wrong - by emailing Google, will Google on occasion confirm to the webmaster why the penalty applies? (This can also go with the question regarding webmasters who do not design a website around search engines - they could have designed a page for functionality reasons that caused a penalty, yet the webmaster may have no idea why. I think we do occasionally see posts from webmasters who truly have no idea why they have a penalty.)
Anyway thanks for the great posts :)
Sometimes we run out of time or space in our crawl and can't get to every single web site, but I think the people who see that happening can often wait for another crawl/index cycle and hopefully we'll pick them up then.
WebGuerrilla:
Also, once we start allowing responses, it is important that everyone sticks to the questions he addressed. There is no way we can let this thread turn into a free-for-all.
Thanks.
[edited by: GoogleGuy at 6:20 am (utc) on June 12, 2003]
A: I don’t think it’s a fundamental algorithmic change. I don’t recall hearing about any changes that would bring about long-term behavior like this. I’m pretty sure that it’s more of a transient issue, and I wouldn’t be concerned about this.
---------------------------------------------
GoogleGuy, in your opinion, would the transient effect be eliminated by the one more traditional update that you said, 1000 posts or so ago, would take place sometime in the future? (presumably this month)
Thanks for answering the questions!:)
once we start allowing responses, it is important that everyone sticks to the questions he addressed. There is no way we can let this thread turn into a free-for-all.
If there are topics not covered by these responses, let's just wait on those and stay with what we have now.
[edited by: Marcia at 6:34 am (utc) on June 12, 2003]