
Forum Moderators: mack


Bing - the worst experience ever as a webmaster


     
8:39 am on May 8, 2016 (gmt 0)

New User

joined:May 8, 2016
posts:5
votes: 0


Hi,

Has anyone had any joy getting Bing Webmaster to actually work? We have an ongoing problem with our ecommerce website. Last year we noticed that customer purchase codes (which are not published anywhere on our public site) were appearing in the Bing index. We scanned the whole site and found that there was not a single link to such a code. Yet Bing was indexing them. We later worked out that Microsoft was scanning customers' links in customer email messages, where the purchase code was placed - very underhand indeed (Microsoft use URL discovery in tools like Outlook and Skype, without your knowledge, that feeds the URLs back to Bing). That's another story.

We are now trying to get the purchase codes and links out of the Bing index. Bing has decided to create random URL parameters and pass these to pages that do not have, and have never had, this URL parameter. So they pass purchase codes to our home page, to our contact us page, and to other totally unrelated pages as well!

We used Bing's Remove URL parameter and Remove URL from index tools, all to no avail - it visits throughout the day and still passes the parameter and purchase codes. We set up some code to detect Bing and the purchase code and issue a META NOINDEX and a 404. Still no good, the codes and URLs are in the index months later.

What a nightmare Bing is. It is really the worst search engine ever invented and its support is horrific. The support staff just blame the webmasters and the coders, and never give a straight answer.

Has anyone actually ever successfully removed content/URLs from the Bing index using their "webmaster tools"?
3:40 pm on May 8, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:7505
votes: 504


We later worked out that Microsoft was scanning customers links in customer email messages, where the purchase code was placed


Have you actually confirmed that with Bing/MS?
3:56 pm on May 8, 2016 (gmt 0)

New User

joined:May 8, 2016
posts:5
votes: 0


@tangor, yes we did. After a monster three month ticket, repeated escalations and complaints, and full disclosure of our web logs and code portions, we got Microsoft to finally admit that web spidering was not the reason why the codes were leaked and that Microsoft have other "resource discovery" mechanisms which work _outside_ of web spidering. Microsoft argued with us for months, blaming us at every opportunity, and yet we could (and did) disprove every accusation. They were totally found out and had no technical argument left to use. They repeated the same mistruths over and over again. After all the times Microsoft have accused Google of unethical behaviour, here they are doing exactly the same, if not worse. We emailed the CEO many times too, to no avail. Google have not indexed these links.

In our case the purchase codes were in a hyperlink emailed to the customer (a private email). We have since modified all the emails. It was the only place a purchase code was shown, and so the links had to have been intercepted when the user clicked on them, sent to Microsoft and given to Bing Bot, which then added them to its index and is, as I speak, sending hundreds of them to our website every hour of every day for no reason whatsoever! We considered legal action, but how can you fight a denial monster like Microsoft? They are a truly evil company with a total hate of webmasters who have genuine issues. It's a privacy nightmare for us because those codes can be used for fraudulent purposes, phishing etc. Yes, we put the code in a link, in an email, but we did not expect Bing to ever index them. Never, ever, put any such information in a link is my advice.

Meanwhile we are left to add the URLs to the URL removal tool manually, and there could be thousands of them. The URL parameter tool is broken. Despite adding the parameter, Bing still does not ignore it, and sends hundreds of requests with the parameter. Don't ask me why Bing Bot is sending purchase codes to our home page - it's totally random behaviour. Bing Bot is not fit for purpose and should be totally rewritten.
4:20 pm on May 8, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Feb 12, 2006
posts:2596
votes: 68


i'm pretty sure you can put query strings in the robots.txt file -- without any directories or filenames. you can just put a wildcard before them. once Bing sees that then hopefully they won't index any more of the URLs.

...it won't help you with the existing ones, though
4:23 pm on May 8, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:7505
votes: 504


We emailed the CEO many times too, to no avail.

Have you tried SNAIL MAIL, CERTIFIED, RETURN RECEIPT REQUESTED? That is the legal standard for most business correspondence. (Yes, I know email is recognized in courts these days but does not yet meet the level of certified mail)

In our case the purchase codes were in a hyperlink emailed to the customer (a private email).


Why? That's something I'd never do. Send the code, yes, but not in a LINK! There is no such thing as "private mail". Too many systems to pass through and unless it is encrypted/secure it can be (and is) scanned for various reasons upon entry into any system.

Google has been scanning links in emails for years. Perhaps they are more sophisticated in that regard than Bing.

I'm not aware of any reliable bulk or fast method of removing links from Bing. Wish I could help in that regard.
6:10 pm on May 8, 2016 (gmt 0)

New User

joined:May 8, 2016
posts:5
votes: 0


@tangor, Microsoft eventually conceded that they obtained these details elsewhere and not from our website, and that was the only point we wanted to make really. It wore us out arguing with them - they were giving nonsensical replies on purpose in this respect. We did not send a formal letter; we emailed the CEO (who does have a dedicated address). You say never put information in a link, but this is a link in an EMAIL, not a link on a public, 'spiderable', website. We know (normal) email is not secure, but we were not really trying to secure or encrypt the data itself, and I understand that links are passed to virus scanners etc, but to index it? Google may also be scanning links, but it did not add them to its index. Also, the page the link pointed to was META NOINDEX,NOFOLLOW - again ignored by BingBot totally. It took the link and didn't care where it went or what robots directive was provided. This should never have been a search engine issue; it was primarily an email issue which has bizarrely become a search engine issue. Easy in hindsight to say not to put an order code in a link, but do you know what, I've just looked in my inbox and I have emails from Amazon with order IDs and a customer ID in a hyperlink - it's a much more common issue than you might think, it's just annoying these have been indexed.

@londrum, thanks, I tried what you said but when I tested in Google WT the pattern was still allowing the content. I used the pattern

/*?*

Is that correct for the root folder (index.html) with any URL parameter? Does Bing WT have a robots.txt tester?
6:14 pm on May 8, 2016 (gmt 0)

Senior Member from NL 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:1426
votes: 178


The SmartScreen Filter in their web browsers is another mechanism by which they could see the URL.

Bing is messy, it's true. I just noticed they're happily ignoring my noindex meta tags. Spidering 101.

Rather than fight the beast, which seems rather pointless unfortunately, see if you can find a way to minimize the damage, if there is any real (or potential) damage. Do you have rel=canonical tags in place, by the way? Perhaps they do know how to handle those.
6:17 pm on May 8, 2016 (gmt 0)

Senior Member from NL 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:1426
votes: 178


/*?*

Is that correct for the root folder (index.html) with any URL parameter? Does Bing WT have a robots.txt tester?

In regular expressions (though I'm not sure that's strictly what they use), the question mark indicates the previous character (set) is optional, so you may need to escape it with a backslash. And /*?* would also include any and all files and subfolders, not just the homepage.
6:35 pm on May 8, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:7505
votes: 504


You say never put information in a link, but this is a link in an EMAIL,

To clarify. I don't put LINKS other than to a log in page where the code can be inserted by the email recipient and never both in the same email (I use two). Not perfect, but way better than giving it all away in a link anyone can use.

All email is "read" by many third parties, or at least scanned for malware/viruses/spam, and a LINK will always show up. Just me. YMMV
7:24 pm on May 8, 2016 (gmt 0)

New User

joined:May 8, 2016
posts:5
votes: 0


@robzilla, we do have canonical tags in place. The /*?* is fine if it covers all folders since it affects lots of folders. Alas, we could not get it to validate in GWMT so we have added a global test for the code and the BingBot UA, and then served a 404 and a META NOINDEX,NOFOLLOW (and the Bing variants just for luck). Hopefully, one day, it will stop indexing the content. The URL parameter tool in BingBot is totally broken for us - it doesn't do what it says it should do. BingBot is NFFP.

@tangor, lesson learnt. Trust no one :( No problem with links being scanned for malware, but scanned and passed to Bing for indexing (and making up fake URL parameters and passing them to pages across the whole site) is a step /way/ too far.
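
For anyone wanting to try the same thing, the check described above (BingBot UA plus the purchase code parameter, then a 404 with META NOINDEX,NOFOLLOW) might look roughly like the sketch below. It is only an illustration, not our actual code: the parameter name `code` and the function name `respond` are made up, and matching on the substring "bingbot" in the User-Agent is an assumption about how the crawler identifies itself.

```python
from urllib.parse import parse_qs

# Hypothetical name of the query parameter that carries the purchase code.
PURCHASE_PARAM = "code"

# Minimal 404 body carrying the robots meta tag.
NOINDEX_BODY = (
    '<html><head><meta name="robots" content="noindex,nofollow">'
    "</head><body>Not found</body></html>"
)


def respond(user_agent, query_string):
    """Return (status, body) for a request.

    If the request looks like it comes from BingBot and carries the
    purchase-code parameter, answer 404 with a noindex page. Otherwise
    return ("200 OK", None) so the normal page handler can take over.
    """
    params = parse_qs(query_string, keep_blank_values=True)
    is_bing = "bingbot" in user_agent.lower()  # assumed UA substring
    if is_bing and PURCHASE_PARAM in params:
        return "404 Not Found", NOINDEX_BODY
    return "200 OK", None
```

The function would be called from whatever global request hook the site already has, before the page itself is rendered. Whether Bing then drops the URLs is a separate question, as this thread shows.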
8:29 pm on May 8, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:7505
votes: 504


@optimusprime: Sometimes those links are found on spam lists like Spamhaus etc. That is (or could be) one of the other sources. I am NOT defending Bing, so don't take it that way. Just saying that the web is many parts and all of them are messy gooey mixed together.
8:36 pm on May 8, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13678
votes: 440


In regular expressions (though I'm not sure that's strictly what they use)

It isn't, if by "they" you mean robots.txt syntax. In addition to the question mark issue, the asterisk * itself in Regular Expressions means "zero or more of the preceding character"; it's not a wild card. Happily, this need not concern you. ("Great! Only 499 to go!" -- paraphrase of J D Morgan).

The peril with robots.txt is that the standard has never been updated. So if you say anything whatsoever beyond the bare-bones "Disallow: /directoryname", even a law-abiding spider can ignore (i.e. fail to understand) the directive and there's not a thing you can do about it.
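
To make the difference from Regular Expressions concrete, here is a rough sketch of how the wildcard extension is commonly described for the big search engines: * stands for any run of characters, a trailing $ anchors the end of the URL, and everything else, the question mark included, is literal. This is an assumption about their matching rules, not code taken from any crawler, and the helper names are made up for illustration.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed rules: '*' matches any sequence of characters, a trailing
    '$' anchors the match at the end of the URL, and every other
    character (including '?') is taken literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then rejoin the pieces with '.*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(pattern, url_path):
    """True if url_path (path plus query string) matches the pattern."""
    return robots_pattern_to_regex(pattern).match(url_path) is not None
```

Under these rules is_blocked("/*?*", "/?code=ABC123") and is_blocked("/*?*", "/contact.html?code=ABC123") are both true, while is_blocked("/*?*", "/contact.html") is false. So the pattern would cover any URL with a query string, in any folder, provided the spider supports the extension at all.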
6:07 pm on May 10, 2016 (gmt 0)

New User

joined:May 8, 2016
posts:5
votes: 0


Bing are now writing a special script to get these links out of their index. Some progress at last.
 
