1. What have you tried so far?
2 and possibly more important: Where's the percent-encoding coming from? Normally it shouldn't be necessary to do anything about query strings. Have you got some %26 in there too or is it always a single query?
I think that Google "auto-replaces" the ? and = in my links.
When I test in Google Webmaster Tools with "Fetch as Google", I paste in the URL with ? and =, but after a few seconds Google returns the result "Page not found", and I see that it has converted the original URL with ? and = into a URL with %3F and %3D.
It's not google, it's the internet as a whole. All "special" characters are percent-encoded in transit.
3. Do the percent encodings show up in referers or in the primary request?
4. Do the pages, with normally formatted query strings, exist? Not via "fetch as googlebot" but if you paste them in directly.
The original URLs look like this one:
The URL appears on this page (when you click the zoom icon on top of the image):
but Google translates that original URL into a URL with %3F and %3D.
Can I somehow set up .htaccess to "auto-replace" all incoming encoded URLs so they are correct, with ? and =?
Backtrack here, because now I see the problem. It's interpreting your query string as part of the URL. Or, in the alternative, your site is coded so the part beginning in ? isn't getting interpreted as a query.
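To make the distinction concrete, here is a minimal Python sketch of how a URL parser splits the two forms. The URLs are hypothetical stand-ins (the real ones aren't shown in the thread), and Python's standard parser is used only as an illustration; a web server follows the same general rule that only a literal ? starts a query string.

```python
from urllib.parse import urlsplit

# Hypothetical URLs; the real ones were not given in the thread.
plain = "http://example.com/gallery/photo.html?full=1"
escaped = "http://example.com/gallery/photo.html%3Ffull%3D1"

for url in (plain, escaped):
    parts = urlsplit(url)
    # Only a literal "?" separates the path from the query string.
    print(f"path={parts.path!r} query={parts.query!r}")

# path='/gallery/photo.html' query='full=1'
# path='/gallery/photo.html%3Ffull%3D1' query=''
```

In the escaped form there is no query at all: the whole thing is one long path, which is why the server answers with a 404.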
What's the "real" file format behind the extensionless URL?
A moderator will come along presently to change your domain name to example dot com. But the underlying problem will still be visible. Meanwhile I have been to the site and confirmed that the unescaped version works, the escaped version doesn't,* and the queryless version has a link to the troublemaking version.
I assume you have lots and lots of these and the same problem occurs everywhere. Is it always just one question mark and just one equals sign? If so, the fix is trivial. But I want to get at the underlying issue.
Yes, it is always one ? and one = at the end of the URL.
The parameter at the end of the URL means: if ?full=1 is sent at the end of the URL, show the page with the full-size image.
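Since it is always one ? and one = at the end, the repair itself is a simple string transformation. The sketch below shows it in Python purely as an illustration; `restore_query` and the sample path are hypothetical, and this is not an .htaccess rule. On a live site the same idea would have to be expressed as a redirect that fits in with whatever rewriting the site already does.

```python
import re

# One encoded "?" (%3F), a name, one encoded "=" (%3D), and a value,
# anchored at the very end of the URL. Hex digits may be either case.
_ENCODED_QUERY = re.compile(r"%3[Ff]([^%?=&/]+)%3[Dd]([^%?=&/]*)$")


def restore_query(url: str) -> str:
    """Turn a trailing %3Fname%3Dvalue back into ?name=value."""
    if "?" in url:
        # A real query string is already present; leave the URL alone.
        return url
    return _ENCODED_QUERY.sub(r"?\1=\2", url)


print(restore_query("/gallery/photo.html%3Ffull%3D1"))
# /gallery/photo.html?full=1
```

URLs that don't end in the encoded pattern come back unchanged, so normal requests would not be touched.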
Well, "parameter" is the key word, because I get the impression it isn't being read as a parameter (Query String in htaccess-speak).
What is your "real" page?
is not the name of an actual file on your actual server. The extension .html would come at the very end of any "real" filename. All the intervening / are directories and I really doubt you have a directory called something.html.
:: insert boilerplate about directory paths and the part of the URL up through "example.com" ::
So something is already being rewritten. You can't simply add another RewriteRule without knowing what the existing rules are and what they do. Otherwise it would be a simple matter of
and I can tell you right now that neither of those will work.
I'm not sure what the OP meant by "upcoming links" but I'm going to assume incoming.
Scraper bots are the main cause of these types of Googlebot queries, imo. They malform the link back to your site (if they link at all), and CMSs like WordPress flub the 404 or 301. I see Googlebot requests for URLs ending in the above more than I'd like, in WordPress especially but in all CMSs.
rel=canonical tags tell Google which version you want indexed; it's a start.
I found a way to "solve" it:
I changed the way parameters are sent via links by using WordPress "endpoints", so now I add the word "full-size" to the end of the permalink, and the ? and = are gone from the URL.
Thanks, guys, for your time!
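A short Python sketch of why the endpoint-style URL sidesteps the problem: when a whole URL is percent-encoded as if it were a plain path, ? and = get escaped, but a path made only of letters, hyphens, and slashes has nothing to escape. Both paths below are hypothetical examples, not the site's real permalinks.

```python
from urllib.parse import quote

# Hypothetical paths for illustration only.
query_style = "/gallery/photo.html?full=1"
endpoint_style = "/gallery/photo/full-size/"

# quote() treats its input as a path, so "?" and "=" are escaped.
print(quote(query_style))     # /gallery/photo.html%3Ffull%3D1
print(quote(endpoint_style))  # /gallery/photo/full-size/
```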