Awesome info Bingdude. I've so far only been artificially manipulating the bounce rate on google. From what you are telling me, Bing is even more susceptible to this :insertevilgrin:
I suppose you wouldn't mind sharing what factors you compare to dismiss manipulative searches? I mean, how well do you trace proxies?
I've been working real hard at extracting statistics of natural bounce rate per niche, to use as a baseline from which to distort bounce rates for my competitors and lower my own bounce rate, but does bing even use that (baseline bounce rate per niche)? I'm pretty sure Google does... I also know Google doesn't use bounce rate on all niches, so it won't help my dictionary spam sites. But does bing have that too?
I normally only fool with google, as they are so easy to manipulate, but now that Bing is gaining some market share, I think I'll be expanding my experiments.
Bing leans rather too heavily on keywords in URLs, so much so that I tend to call it the strongest signal. Reminds me of the Google of 2003.
It should be interesting to see how Bing deals with finding new ccTLD websites. This is one area in which Google is vulnerable.
I had switched to a couple of Bing APIs (surprisingly very powerful!) some time ago when Google sneakily -- with a deceptively 'Googly' cute face -- announced it would pull the plug on theirs (only Google's playful hubris can get away with 'awesome' and 'deprecated' in one breath.)
Please promise Bing won't do that! :)
And please always keep API code backward-compatible for extended periods when/if updating!
My pages are well written, yet I see only perhaps a hundredth of a percent of my visitors originate from Bing, and usually for some very odd miscellaneous term.
I've also noticed Bing has difficulty associating words. I often put words in quotes on Google, though Google tends not to require that too often.
I noticed a serious drop in Bing visitors after Microsoft employees started attacking me on the IE blog because they did not like my criticisms of Windows 7 on non-Microsoft sites. When someone takes a well-intended criticism and fixes a product, they make their product better; when someone takes a well-intended criticism and insults the critic, they lose a customer.
Coincidentally, I have also seen very few visitors originate from Yahoo since the whole Yahoo debacle ended with Yahoo using Microsoft algorithms, which is a shame, since Yahoo is now nothing more than a glorified Excite/Hotbot.
|Except that in order to SEE the code letters I need to enable GOOGLE in NoScript! |
dstiles, thanks for posting this.
That somewhat clears up for me Bingbot's apparent identity crisis or acute suffering from split personality disorder in believing that it is Googlebot.
In what way is it split, Staffa?
It's about a new site. The robots.txt file is empty except that Googlebot is disallowed from everything.
Every bot that visits reads the file (or not, as may be the case, but that's my purpose) then happily crawls the content of the site.
Bingbot visits 7-10 times a day, reads the robots file and walks away. In nearly two weeks of passing by it has not yet taken any page from the site.
An identity crisis ;o)
|Bingbot visits 7-10 times a day, reads the robots file and walks away. In nearly two weeks of passing by it has not yet taken any page from the site. |
Sounds to me like a bot that honors the rules! :)
Tangor, I may not have been very clear. Only Googlebot is mentioned in the robots.txt file and disallowed all access to the site but everything else has full access.
Should be a short robots.txt... post it. You've made me curious!
Here it is and it has one blank line at the bottom :
# Example -- http://www.example.com
# Robot Exclusion File -- robots.txt
# Last Updated: 01/08/2011
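The listing above appears to be cut off after its comment lines. As a minimal sketch, assuming the only rules are the ones described in the thread (a Googlebot group that disallows everything; the two rule lines below are an assumption, not the poster's actual file), Python's standard `urllib.robotparser` shows how a standards-following parser would read such a file. The helper name `allowed` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# ASSUMED contents: the comment lines are from the post, the two rule
# lines are reconstructed from the poster's description only.
ASSUMED_ROBOTS_TXT = """\
# Example -- http://www.example.com
# Robot Exclusion File -- robots.txt
# Last Updated: 01/08/2011
User-agent: Googlebot
Disallow: /

"""


def allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/page.html"
    for agent in ("Googlebot", "bingbot", "SomeOtherBot"):
        print(agent, allowed(ASSUMED_ROBOTS_TXT, agent, page))
```

Under this parser Googlebot is refused and every other user agent, Bingbot included, is allowed, so a file of this shape gives Bingbot no reason to stay away. That says nothing about how Bingbot's own parser behaves; it only shows what the rules mean on paper.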
Thank you for the valuable insights into how Bing interprets certain actions and signals.
Quick observation on my part:
Sometimes when I'm looking for something on Bing, I tend to click on three or four URLs to compare the information and find what is most relevant and useful for my purpose.
Does this mean that I'm sending incorrect information to Bing saying that they're all useful when out of the three I found only one to be useful and close the other two?
I open all my links on new tabs and rarely ever do a back button and re-click action.
Food for thought?
# Example -- http://www.example.com
# Robot Exclusion File -- robots.txt
# Last Updated: 01/08/2011
See if that doesn't make a difference...
Good idea Tangor, I had already wondered whether Bingbot needed a special road sign or something.
I've put your suggestion in the file, however, it's late here and I'll let you know the result tomorrow, thanks.
Theoretically there should be no mysteries in robots.txt... THEORETICALLY... but I've found that if you want results, you ask for or define the results desired.
In my robots.txt I have the (less than a half-dozen) bots allowed listed first and a deny to all others last. Has worked for the last 8 years. Above (per your specification) you have a denial first and all others allowed last. Please, most seriously, let us know how this works!
According to Bing's robots.txt instruction, it shouldn't make a difference. There's no special instruction requiring placement of the deny at the beginning or bottom of the file.
Agreed. I only have real world results upon which to base my determination. YMMV!
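For anyone who wants to check the "order shouldn't matter" claim on paper, here is a rough sketch using Python's standard `urllib.robotparser`. Both files are hypothetical, modelled on the two layouts discussed above (deny first and allow-all last, then the reverse), and the helper name `verdicts` is made up for illustration. It shows how one standards-following parser treats group order, not how Bingbot itself does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout 1: specific denial first, catch-all allow last.
DENY_FIRST = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical layout 2: the same two groups in the opposite order.
ALLOW_FIRST = """\
User-agent: *
Allow: /

User-agent: Googlebot
Disallow: /
"""


def verdicts(robots_txt, agents, url="http://www.example.com/page.html"):
    """Map each user agent to whether it may fetch url under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    agents = ["Googlebot", "bingbot"]
    print(verdicts(DENY_FIRST, agents))
    print(verdicts(ALLOW_FIRST, agents))
```

Both layouts give the same answer here because the parser matches a named group before falling back to the `*` group, wherever each sits in the file. Real crawlers can still differ, which is where the real-world results come in.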
The first problem I find with Bing is the UI. You really need to work hard to pick out the results from the ads and stupid suggestions. This is a major distraction for users.
The second is that it is slow to index. Boy, it takes months to index new, fresh and original content and data, images and the whole shebang. This data could very much help the user, but by the time Bing has indexed it, it's probably too late or useless.
I tried to SEO for Bing, but spammers always outranked me for competitive terms. It happens with G, but they get wiped very quickly.
I think the core difference between Google and Bing is that the first was built to solve the ongoing problem of search and was then monetized, big time, while looking after the user experience. Bing was built to make the money G makes, and so far I haven't seen it stand out or come up with anything new that could convert a good number of users.
Saying that a search engine can pick a quality document based on content is a lot of crap. Even thin content is defined by who you talk to. Some simple images on the web could be considered thin content, but they are valuable to many users. Again, spammers can abuse those.
The problem of search continues. For now, G owns the lion's share.
Tangor, since I put the allow rule in, at the bottom as per your example ;o), Bingbot has visited 5 times, read the robots.txt file and moved on. NO page was taken.
I have now switched the allow rule to the top of the file and will keep you posted on how it goes from there.
Another 5 visits since my previous post, this time with the allow rule at the start of robots.txt but still the same result. Bingbot moves on without looking at a single page.
I have now removed the allow rule in order not to confuse the other bots on account of one illiterate one. It's open doors for everyone but Gbot so no rule is necessary.
Tangor, does that clear anything up for you, in what you are trying to find out ?
I don't mean as in finding a solution to make Bbot crawl my site, because "Frankly, my dear, I don't give ....." whether or not it does.
I find it confusing in that Bing aggressively indexes all sites UNLESS DISALLOWED or .htaccessed away. That Bingbot is NOT indexing your site is more suggestive there's something else at work, and might be something to look at!
Tangor, I don't doubt for a moment that you either have or know of sites which are "crawled aggressively" by Bingbot but just like flashdash states on the previous page, I only see the bot come by once in a blue moon and maybe fetch a page once in a yellow moon on any of my sites. Yet none have anything specific to discourage its visits.
The site we are talking about here, well you have seen the robots.txt and the pages are as vanilla as they come, though Bbot doesn't know that yet ;o)
Another site for instance has some 3600 pages. In the year that it has been online google has crawled the whole site many times over. However, I would be surprised if Bing, in the same time span, has managed to fetch as much as 10% of the total pages only once.
As for something else to look into, I don't run anyone's ads, nor anyone's analytics, nor anyone's whatever, therefore nothing to vex any bot. So there is little to look into. Nevertheless, I appreciate your suggestion.
As you have noted, and I have agreed, it does seem unusual that Bingbot hasn't indexed your site. I've yet to see Bing not index... and all too many times we've had folks complain/rant about Bing's frequency, so, again, I'm perplexed!
Is this a shared host or dedicated? And, I'm afraid we've turned away from the original post... so if mods want to cut out the last few messages for a new thread "Why is Bing NOT indexing?" that's okay with me.
Staffa, do you have a Bing webmaster tools account? If not try creating one and set the crawl rate.
Also check whether you have something in your .htaccess file that blocks their bots. More importantly, make sure that what you see as Bingbot (MSNBot) really comes from their range of IPs.
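One common way to check that a visitor claiming to be Bingbot is genuine is a reverse DNS lookup followed by a forward confirmation. The sketch below assumes genuine crawler hosts end in `.search.msn.com`, the form of hostname that shows up in this thread (msnbot-207-46-204-159.search.msn.com). The function name `is_genuine_bingbot` and its injectable lookup parameters are hypothetical, added so the logic can be tried without touching the network.

```python
import socket

# Assumed suffix, based on hostnames such as
# msnbot-207-46-204-159.search.msn.com seen in server logs.
BING_SUFFIX = ".search.msn.com"


def is_genuine_bingbot(ip,
                       reverse_lookup=socket.gethostbyaddr,
                       forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve ip, check the host suffix, then forward-confirm the IP."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False  # no reverse record, treat as unverified
    if not hostname.lower().endswith(BING_SUFFIX):
        return False  # resolves to someone else's domain
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False  # hostname does not resolve back
    # The hostname must map back to the IP that made the request.
    return ip in addresses


if __name__ == "__main__":
    # Performs live DNS lookups, so the result depends on the network.
    print(is_genuine_bingbot("207.46.204.159"))
```

The forward lookup matters because anyone who controls the reverse DNS for their own IP range can make it return a name that merely looks right. A user agent string alone proves nothing.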
Since the launch of the site, at the beginning of this month, Bbot has visited exactly 109 times, all with genuine IP numbers and proper UAs. One might consider that aggressive "visiting" but no crawling since no page has been read.
Indyank, at that rate of visiting Bbot knows exactly where the site is and anyone would be quite happy with this type of crawling (that is, if it were to read more than just robots.txt). Therefore, I see no need to open an account with them but thank you for the suggestion.
Tangor, the site is on shared hosting. I'm very happy with this host and I don't intend to move (Win server). How and where it is hosted should not be a factor to influence Bing, they don't pay the hosting bill.
Thank you all for your input, it's very much appreciated.
I'll leave it till the end of this month and if there is no change I'll add four more lines to robots.txt just for Bingbot, then it WILL have something to read and I can forget about it.
A visit to the home page from msnbot-207-46-204-159.search.msn.com
Fetched the page and css, no image.
A precursor for Bingbot or an accidental visit ?
robots : no
User Agent :
I have replaced all the spaces in the UA by tildes (~), note the number of double spaces.
|Bing likes quality, original content. Skip syndicated content and articles as a way forward. |
A search for How to catch brook trout [bing.com] results in the same copied article in several places, plus results from websites that are not authoritative for this specialized information. The quality of the results is further ruined by featuring an eHow article at the top, which begins with this misleading bit of information:
|Brook trout live in the mountain waters of the Rocky Mountains west. |
While technically correct, the information is misleading because brook trout are a species native to the northeastern part of North America (mainly though not exclusively New England and Canada). Any brookies that live as far west as the article notes were artificially introduced.
The rest of the article is also misleading because it assumes that the higher up to the mouth of the creek you go the colder the water, which isn't always true.
While Bing cannot understand fishing techniques, it can more accurately deliver results by taking into account the authority and relevance of the incoming links to the page. Google doesn't do much better with this search either, but the focus of this discussion is Bing and I expect better from Bing if I'm going to switch from Google. Simply being different is not enough. Bing must be better.
Here is another search where the tires fall off of Bing:
best way to catch large stripers in Massachusetts? [bing.com]
That search is geographically specific but Bing ignores that portion of the query and can't resist placing a non-relevant eHow article that is overly general in the top three of the results. Google scores better. Striper fishing in Massachusetts is a very specific query. A result for striper fishing in general is not good enough.
Getting back to the quality of the results, Bing returns an article on ArticleBase that appears to be rewritten from the eHow article. It features many of the same tips, in the exact same order, only rewritten with poorer English.
|Skip syndicated content and articles as a way forward... |
Bing has a way to go in weaning itself from poor quality content farm articles as well as syndicated content and content that was meant to be syndicated.
In defence of Bing, these results seem perfectly valid to me as someone who knows nothing about fishing in North America. Clearly there are always going to be experts around in any field that know much more than Bing about their subject. But then if they already know the answers they won't be searching for this info will they?
BeeDeeDubbleU, the point is that Bing is relying on syndicated content and content farms, contrary to his assertion that they are not.
Whether the info is useful or correct is secondary. However, if one wishes to scratch deeper, content farms and syndicated content are non-authoritative sources of information. Which is why bingdude himself said, "Bing likes quality, original content. Skip syndicated content and articles as a way forward."
I am simply pointing out that bing is falling short of that goal. BDW, don't take my word for it. Try bing out for yourself and see if they're relying on syndicated content. Start with some searches you can relate to:
where to see the queen of england [bing.com] shows two wikipedia entries, one from ehow, another from wikianswers and another from associated content.
I did a search for "what are the best pubs in edinburgh scotland [bing.com]?" and bing did not return syndicated content. As someone outside of Edinburgh I can't vouch for the quality of the content, but at least it's not syndicated or factory farm content.
Nevertheless, the point is that bing is showing a significant amount of syndicated and factory content on many of the random searches I'm looking at, contrary to bingdude's stated goal. While that's not scientific, I think it's notable and for bing I hope it's useful feedback.
I'm not on the bing team. But I make quite good coin from a lot of sites, which by my own admission, should be deindexed; sites you would complain about, possibly even report.
The problem here really boils down to:
Targeted Niche sites
Google used to LOVE targeted niche sites. I could create a blog called brooktroutfishing.net or something, and get it ranking in no time. All I needed was a couple of articles of low quality unique content ($10 domain + $20 for 2000 words of unique content), a few links and blam! AdSense money would roll in. I'd maybe add a 'news' section with auto-posted, scraped content, so it would seem 'fresh'.
Now Google has wised up to this. Now, targeted niche sites are a lot less effective (not that they don't work). Large, 'authority' sites are now preferred.
So what do I have to do now to game the 'authority' preference? Simple. I buy an expiring domain. I then make it on a generic topic, like fishing. And I have sections for each type of fishing:
And blam! I play into the 'large authority site' preference Google now has.
The sad fact of the matter is: SERPs will always have low quality results as long as it is profitable to game the algorithms.
The only thing SEs can do is constantly 'tweak' what works, so that those who don't stay up to date (that is 99.9999% of all SEOs) fail. But they will never be able to fully get all the SERPs to be 'highly relevant'.
And things will only get worse. Right now, most people who have money weren't born with computers. But wait until my generation grows old. We will ALL spend our money online. And there will be more reason than ever for people to want to game the SERPs.