Forum Moderators: Robert Charlton & goodroi

Google Thinks My Site Is a Copy and Assigns Wrong Canonical URLs

Google is incorrectly assigning canonical URLs to my site

         

Pirilin

12:31 am on Mar 10, 2025 (gmt 0)

Top Contributors Of The Month



I’m facing a strange issue with my website example.com.br: Google is assigning completely incorrect canonical URLs, making it look like my site is a duplicate of random domains. These URLs don’t exist, aren’t accessible, and all follow the pattern ?p=123493. Here are some examples that have appeared:
• example.it/?p=123493
• www.example.cl/?p=123493
• www.example.org/?p=123493
• example.com/?p=123493
• example.com/?p=123493
• www.example.com/?p=123493

The weirdest part? Real-time tests in Google Search Console show everything as correct, with no errors. The Google Rich Results Test also reports no issues.

What I’ve checked and fixed so far:

Clean and validated code (W3C, no errors or warnings)
Correctly configured rel="canonical" on all pages
Properly set redirects in .htaccess
No other unexpected canonical URLs in the code
Pages follow SEO best practices and correct structure

Even so, Google keeps recognizing completely wrong canonical URLs, making my site look like a copy of unrelated domains.

I’ve double-checked everything and still can’t find a solution. Has anyone encountered this? What could be causing it? Any help would be greatly appreciated!

not2easy

12:20 pm on Mar 10, 2025 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Hi Pirilin and welcome to WebmasterWorld [webmasterworld.com]

Have you examined your site's logs? It is possible for other sites to show your content if you are not preventing it. Since you don't mention your environment, I can't suggest the best method for that, but if you see evidence that your pages are being served remotely, you can prevent it. On Apache servers you can add a line in .htaccess for it. Are the pages static html or database driven like WordPress? Is the server Apache or IIS?

Pirilin

1:47 pm on Mar 10, 2025 (gmt 0)

Top Contributors Of The Month



Hello, thank you for your reply.

My environment is Apache.
Static HTML, without WordPress and without a database.
I use shared hosting from HostGator.
I don't know whether I can get the Apache logs on a shared server. Can I get local logs?

Here is my .htaccess:

# Activate the Rewrite Engine
RewriteEngine On
RewriteBase /

# Security headers (must come before everything)
Header always set X-Frame-Options "DENY"
Header always set X-Content-Type-Options "nosniff"
Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
Header always set Content-Security-Policy "default-src 'self'; \
***I hid it here.
object-src 'none'; \
base-uri 'self'; \
form-action 'self'; \
frame-ancestors 'none';"


# Define custom MIME types for specific files
AddType image/webp .webp
AddType application/manifest+json .webmanifest

# Remove trailing slash from non-directory URLs
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} (.+)/$
RewriteRule ^ %1 [R=301,L]

# Prevent rewriting if file or directory exists
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]

# Directories without slash → with slash
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_URI} !/$
RewriteRule ^(.+)$ %{REQUEST_URI}/ [R=301,L]

# Remove the .php extension from URLs
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteCond %{REQUEST_URI} !\.php$
RewriteRule ^(.+)$ $1.php [L]

# Cache Policies for Static Resources
<IfModule mod_expires.c>
ExpiresActive On

# Images: 1 month
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/webp "access plus 1 month"
ExpiresByType image/svg+xml "access plus 1 month"
ExpiresByType image/x-icon "access plus 1 month"

# Videos: 1 month
ExpiresByType video/mp4 "access plus 1 month"
ExpiresByType video/webm "access plus 1 month"

# CSS: 1 month
ExpiresByType text/css "access plus 1 month"

# Scripts: 1 week
ExpiresByType application/javascript "access plus 1 week"
ExpiresByType application/x-javascript "access plus 1 week"

# Fonts: 1 month
ExpiresByType font/woff "access plus 1 month"
ExpiresByType font/woff2 "access plus 1 month"
ExpiresByType font/ttf "access plus 1 month"
ExpiresByType font/otf "access plus 1 month"
</IfModule>

# Custom error pages
ErrorDocument 400 /errors/error.php
ErrorDocument 401 /errors/error.php
ErrorDocument 403 /errors/error.php
ErrorDocument 404 /errors/error.php
ErrorDocument 500 /errors/error.php
ErrorDocument 502 /errors/error.php
ErrorDocument 503 /errors/error.php
ErrorDocument 504 /errors/error.php

# Enable Brotli compression
<IfModule mod_brotli.c>
AddOutputFilterByType BROTLI_COMPRESS text/plain
AddOutputFilterByType BROTLI_COMPRESS text/html
AddOutputFilterByType BROTLI_COMPRESS text/xml
AddOutputFilterByType BROTLI_COMPRESS text/css
AddOutputFilterByType BROTLI_COMPRESS application/xml
AddOutputFilterByType BROTLI_COMPRESS application/xhtml+xml
AddOutputFilterByType BROTLI_COMPRESS application/rss+xml
AddOutputFilterByType BROTLI_COMPRESS application/javascript
AddOutputFilterByType BROTLI_COMPRESS application/x-javascript
AddOutputFilterByType BROTLI_COMPRESS application/json
AddOutputFilterByType BROTLI_COMPRESS font/ttf
AddOutputFilterByType BROTLI_COMPRESS font/otf
AddOutputFilterByType BROTLI_COMPRESS font/woff
AddOutputFilterByType BROTLI_COMPRESS font/woff2
</IfModule>

# Enable Gzip compression
<IfModule mod_deflate.c>
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript
AddOutputFilterByType DEFLATE application/json
AddOutputFilterByType DEFLATE font/ttf
AddOutputFilterByType DEFLATE font/otf
AddOutputFilterByType DEFLATE font/woff
AddOutputFilterByType DEFLATE font/woff2
</IfModule>

not2easy

2:49 pm on Mar 10, 2025 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Most shared hosting comes with a control panel, and you should find your access logs there. We don't discuss hosting companies; they are fairly uniform in what they offer. If you own the domain, you should have a control panel included, where you set up email and other optional settings.

It is a good idea to create custom pages to serve for various errors, particularly the 404, 403 errors because humans may see those error pages and you can offer help rather than a dead-end error page.

In .htaccess, the rewrite rules generally come at the end, so that they are read last, after the cache and MIME settings. I don't see a rule that sets with or without 'www' and https, to ensure your content is only shown on one URL and in your preferred format. .htaccess questions should be discussed in the Apache forum here: [webmasterworld.com...] I can move this discussion there if you want .htaccess advice.
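
For reference, a rule of the kind described above (one hostname, HTTPS only) might look like the following. This is a minimal sketch, not something taken from the posted file: it assumes the preferred form is https://example.com without 'www', and that Apache itself sees whether the request arrived over HTTPS. Behind a proxy or CDN that talks plain HTTP to the origin, %{HTTPS} may read "off" for every request and the rule could loop, so test it before relying on it.

```apache
RewriteEngine On

# Redirect any request that is not HTTPS, or not on the preferred host,
# to the single preferred form. "example.com" is a placeholder.
RewriteCond %{HTTPS} !=on [OR]
RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
RewriteRule ^ https://example.com%{REQUEST_URI} [R=301,L]
```

Placing it above the other rewrite rules means every later redirect already happens on the preferred host.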

lucy24

4:42 pm on Mar 10, 2025 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<tangent>
# Remove the .php extension from URLs
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteCond %{REQUEST_URI} !\.php$
RewriteRule ^(.+)$ $1.php [L]
Uhm, this ruleset doesn’t remove php from URLs. It adds .php to all requests that don’t already have it. Removing the .php extension from requests that have it would be a different rule. (And all those -f and -d tests are horribly inefficient. It's generally possible to write rules that bypass this step.)

I hope all those /errors.php rules were renamed for posting; it is not helpful for humans to see the same text in response to, say, a 403 and a 404.

But that’s a matter for a different subforum.
</tangent>
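
As an illustration of the "different rule" mentioned in the tangent, an external redirect that strips .php from the visible URL could be sketched as below. It is an assumption about what the poster intends, not a rule from this thread. It tests %{THE_REQUEST} (the original request line), which internal rewrites do not change, so the rule that quietly adds .php back does not trigger it again.

```apache
RewriteEngine On

# If the browser literally asked for /something.php, send it to /something.
# THE_REQUEST looks like "GET /about.php?x=1 HTTP/1.1". Only GET and HEAD
# are redirected, so form posts to .php scripts are left alone.
RewriteCond %{THE_REQUEST} ^(?:GET|HEAD)\s/+(.+?)\.php[\s?] [NC]
RewriteRule ^ /%1 [R=301,L]
```

It would sit above the internal rewrite that maps /something to something.php; the query string is carried over by default.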

Apache access logs may be independent of the Control Panel (whether generic CPanel or specific to the host). You just need to find out where they are stored--generally not in the same location as the site files--and possibly set up a password. Your host may also provide “analog stats”, but they’re not awfully useful.

Pirilin

1:11 am on Mar 11, 2025 (gmt 0)

Top Contributors Of The Month



I'd still prefer not to move to another subforum.

I posted the .htaccess because you mentioned that, on Apache servers, it was possible to add a line to prevent my pages from being served remotely.

I have also made the changes you suggested.

Regarding the rewrite rule: it internally rewrites to the PHP file but doesn’t show .php in the address bar. Is there a more efficient way to do this, by the way?

And about the .htaccess line you mentioned, what would it be?

Could this be causing Google to block my indexing and flag my site as a copy? But if that were the case, wouldn’t Bing do the same? I have no issues with Bing.

Answering your question about the error pages, all of them generate the respective error message; it is not a generic message.

not2easy

2:28 am on Mar 11, 2025 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



You have a line that should prevent remote usage, sorry I did not address that point. The
Header always set X-Frame-Options "DENY" 
is intended for that, though I had not seen it in that format before. I've used this for the same purpose:
Header set X-Frame-Options "deny" 

A look at your logs could tell you whether that is something to be concerned about. There is a very long discussion from a few years ago (2021) that has much more information about the use of framing your content: [webmasterworld.com...] and another (older) thread that links to tools you can use to verify your Security Headers are doing what you expect: [webmasterworld.com...]

lucy24

5:23 am on Mar 11, 2025 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I had not seen it in that format before

Quick run to Apache docs reveals that "always" is an optional condition. (Never used it myself either.) The options are "onsuccess" or "always", where the former is the default if you don’t specify:
The difference between the two lists is that the headers contained in the latter [i.e. "always"] are added to the response even on error, and persisted across internal redirects (for example, ErrorDocument handlers).
This is probably worth mulling over, though not necessarily in the case of the current thread. I mean, who cares if they frame your 403 page ;)
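
To make the distinction concrete, here are the two forms side by side, as a sketch based only on the documentation passage quoted above. Normally just one of them would be active for a given header, so the first is left commented out.

```apache
# Default condition ("onsuccess", used when no condition is given):
# the header is not guaranteed to appear on error responses.
#Header set X-Frame-Options "DENY"

# "always": the header is also added on error responses and persists
# across internal redirects such as ErrorDocument handlers.
Header always set X-Frame-Options "DENY"
```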

Pirilin

11:57 pm on Mar 11, 2025 (gmt 0)

Top Contributors Of The Month



Hello, I requested indexing again from Google today.

It keeps saying that my page is a copy, and it keeps showing a canonical that is not mine; if I try to access that URL, I get a 404 error.

However, I got the logs from cPanel, and I would appreciate help analyzing them, because I didn't find anything unusual.

PS: I changed my IP to 10.10.10.10 and the domain to example.com

66.249.68.35 - - [11/Mar/2025:20:28:31 -0300] "GET /robots.txt HTTP/2.0" 200 252 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:32 -0300] "GET / HTTP/2.0" 200 24684 "-" "Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)" example.com 10.10.10.10
66.249.68.33 - - [11/Mar/2025:20:28:32 -0300] "GET / HTTP/2.0" 200 24684 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0;)" example.com 10.10.10.10
66.249.68.35 - - [11/Mar/2025:20:28:33 -0300] "GET /assets/css/lp.css?v=2.0.9 HTTP/2.0" 200 13835 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:33 -0300] "GET /assets/css/carousel.css?v=2.0.9 HTTP/2.0" 200 700 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.35 - - [11/Mar/2025:20:28:33 -0300] "GET /assets/css/carousel.css?v=2.0.9 HTTP/2.0" 200 700 "https://example.com/" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:33 -0300] "GET /assets/css/cookie-banner.css?v=2.0.9 HTTP/2.0" 200 762 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:33 -0300] "GET /assets/css/main.css?v=2.0.9 HTTP/2.0" 200 3588 "https://example.com/" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:33 -0300] "GET /assets/images/logo-960x276.webp HTTP/2.0" 200 10174 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:33 -0300] "GET /assets/css/main.css?v=2.0.9 HTTP/2.0" 200 3588 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.35 - - [11/Mar/2025:20:28:33 -0300] "GET /assets/css/lp.css?v=2.0.9 HTTP/2.0" 200 13835 "https://example.com/" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:33 -0300] "GET /assets/images/logo-960x276.webp HTTP/2.0" 200 10174 "https://example.com/" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:33 -0300] "GET /assets/css/cookie-banner.css?v=2.0.9 HTTP/2.0" 200 762 "https://example.com/" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.35 - - [11/Mar/2025:20:28:33 -0300] "GET /assets/js/cookie-banner.js?v=2.0.5 HTTP/2.0" 200 301 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:34 -0300] "GET /assets/js/cookie-banner.js?v=2.0.5 HTTP/2.0" 200 301 "https://example.com/" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.35 - - [11/Mar/2025:20:28:34 -0300] "GET /assets/videos/video03.mp4 HTTP/2.0" 200 494420 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.35 - - [11/Mar/2025:20:28:34 -0300] "GET /assets/videos/video02.mp4 HTTP/2.0" 200 453277 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.33 - - [11/Mar/2025:20:28:34 -0300] "GET /assets/videos/video02.mp4 HTTP/2.0" 200 453277 "https://example.com/" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.35 - - [11/Mar/2025:20:28:34 -0300] "GET /assets/videos/video01.mp4 HTTP/2.0" 200 456813 "https://example.com/" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:34 -0300] "GET /assets/videos/video01.mp4 HTTP/2.0" 200 456813 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:35 -0300] "GET /assets/videos/video03.mp4 HTTP/2.0" 200 494420 "https://example.com/" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.33 - - [11/Mar/2025:20:28:37 -0300] "GET /assets/images/foto01.webp HTTP/2.0" 200 31902 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:38 -0300] "GET /assets/images/foto02.webp HTTP/2.0" 200 33994 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:38 -0300] "GET /assets/images/foto03.webp HTTP/2.0" 200 55010 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:40 -0300] "GET /assets/images/foto02.webp HTTP/2.0" 200 33994 "https://example.com/" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.34 - - [11/Mar/2025:20:28:40 -0300] "GET /assets/images/foto03.webp HTTP/2.0" 200 55010 "https://example.com/" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)" example.com 10.10.10.10
66.249.68.35 - - [11/Mar/2025:20:29:08 -0300] "GET / HTTP/2.0" 200 24684 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.6943.141 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" example.com 10.10.10.10

NickMNS

2:50 am on Mar 12, 2025 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The canonical link may not be yours, but does that link show a copy of "your" content? Your log sample above shows that the files in question are videos and photos, in which case hot-linking is likely an issue. Hot-linking is when another website points an <img> tag at an image hosted on your website, e.g.:
<img src="https://www.example.com/assets/images/foto03.webp" />
I don't think the htaccess files and headers will prevent hot-linking.

Finally, a question that needs to be asked; it's not meant to accuse you of anything. Are those images and videos yours? Do you own the copyrights, or were they taken from a third party under the assumption that it is OK?

Pirilin

3:05 am on Mar 12, 2025 (gmt 0)

Top Contributors Of The Month



The links that the canonical points to are not valid; it directs to an existing page but with a 404 error.

Regarding the images, I searched for them in an image bank with a free license, but I will check all the images.

Could it be an image that is causing this problem?

NickMNS

3:29 am on Mar 12, 2025 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



it directs to an existing page but with a 404 error.

A 404 error means not found, so that page does not exist. Do the websites exist? They must, if they are returning a 404. Did you try a site: search to see what Google has indexed for those domains?

Pirilin

10:58 am on Mar 12, 2025 (gmt 0)

Top Contributors Of The Month



The site exists, but the page within the site does not exist.

Remember that all the random canonicals point to
/?p=123493

• example.com/?p=123493

I will remove all images from the site and do a new indexing test.

    lucy24

    5:53 pm on Mar 12, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    I don't think the htaccess files and headers will prevent hot-linking.
    Blocking hotlinks is trivial. When there is a request for an image, check the referer. If it is neither blank (not just crawlers but some human browsers) nor a short list of authorized referers (your own site and the standard search engines), you can either 403 the request outright, or--less work for the server--rewrite to your chosen NO HOTLINKS image.
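
The check described above could be written in .htaccess roughly as follows. This is a sketch under assumptions: example.com stands in for your own host, the search-engine list and the file extensions are only illustrative, and the search-engine pattern is deliberately loose.

```apache
RewriteEngine On

# Place this above any rule that stops processing for existing files.

# Let requests with a blank referer through (crawlers, some browsers).
RewriteCond %{HTTP_REFERER} !^$
# Let your own pages through.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC]
# Let a short list of search engines through.
RewriteCond %{HTTP_REFERER} !^https?://([a-z0-9-]+\.)*(google|bing|duckduckgo)\.[a-z.]+(/|$) [NC]
# Anything else asking for media gets a 403.
RewriteRule \.(?:jpe?g|png|gif|webp|svg|mp4|webm)$ - [F,NC]
```

To serve a NO HOTLINKS image instead of a 403, the last line would become a rewrite to that image, with the image itself excluded from the pattern so it is not caught by its own rule.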

    Pirilin

    11:38 pm on Mar 12, 2025 (gmt 0)

    Top Contributors Of The Month



    Hello lucy24, I blocked hotlinks through Cloudflare.
    Thanks for the tip, I had this feature disabled.

    NickMNS:
    Regarding the issue of copyright on images, from what I've researched, this wouldn't cause confusion in my canonical, I would just receive a DMCA notification in my Google Search Console.
    So I believe that's not the problem.

    Now I'm back to square one: I have no idea why my site is being flagged as a copy and given random canonicals.

    What makes me a little upset is that Google doesn't have a contact to open a ticket and check this.

    Does anyone have any other ideas of what I could check?

    Brett_Tabke

    10:30 pm on Sep 28, 2025 (gmt 0)

    WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month Best Post Of The Month



    What does running Screaming Frog (free version) against it show?

    Sounds like you have some sort of wildcard record in the DNS. Go to a DNS validator and see what it says.

    Kendo

    11:14 pm on Sep 28, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Google reports can have hissy-fits.

    For example, I get reports about https://example.com/page.php not being canonical when there is a rel="canonical" link tag in the page pointing to https://example.com/page.php

    These reports come in spates, sometimes reporting that it was indexed in 2022 but last indexed yesterday. Nothing was changed over that time.

    thecoalman

    12:41 pm on Oct 3, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    regarding the images, I searched for them in an image bank with a free license, but I will check all the images.


    The images in question are available to anyone?

    mcneely

    5:47 pm on Oct 3, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    example.com/?p=123493


    A previous incident with a client involved the ?p=123493 type of URL on his site. We determined that his platform had been accessed (either a weak password or brute force, not sure) and some pages were injected with code for a drive-by download or a redirect to a malicious site (the affected pages all routed to a similar ?p=123493 URL). Google even listed the urls as indexed but no actual page could be tracked down ... One image was found in his images folder that didn't belong there, with a name similar to 123493.png.

    Long story short, we deleted the infected pages and loaded new ones, removed the image, installed a backup database, and changed to a much stronger username and password for access to his site and his database.

    Only new visitors could see the redirect to the ?p=123493 type pages. You would have had to delete all of your cache and cookies in order to be redirected to the malicious, official-looking page that the perpetrators of the scheme had intended for you to see.

    The site exists, but the page within the site does not exist.


    The logs will tell you the URL is ?p=123493, but that's not the actual URL. The ?p=123493 URL is more of a hijack to pages that appear to be, but are not, in your domain .. I think of it as more of a piggy-back or a mirror of sorts ... using your domain to present pages that exist elsewhere via an injected script placed somewhere within your domain space.

    It's important to remember that whether it's your platform, database, or even your actual cPanel/hosting account, the guys that break in to do this stuff are quiet. They don't make noise. They go in, inject or install, without making a sound. The last thing they want is to disrupt what you might have going on. They aren't going to destroy anything or give you any clue that something's wrong ... the last thing they want is for you to go in and fix stuff.

    I would highly suggest that you change the usernames and passwords for anything you log in to ... next, clear out your img/media directories and install clean backups of each, including your WWW/public_html directory, if you have them. Try to identify the IP ranges/blocks that these guys might be operating from and block them in your .htaccess.

    [edited by: mcneely at 6:23 pm (utc) on Oct 3, 2025]
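
For the last suggestion above, blocking address ranges in .htaccess on Apache 2.4 could look like the sketch below. The addresses shown are reserved documentation ranges used purely as placeholders; real ranges would have to come from your own logs, and the host must allow authorization directives in .htaccess. If the site sits behind a CDN or proxy, Apache may only see the proxy's addresses unless the real client IP is restored first.

```apache
<RequireAll>
    # Allow everyone by default...
    Require all granted
    # ...except the listed ranges and addresses (placeholder values).
    Require not ip 192.0.2.0/24
    Require not ip 198.51.100.17
</RequireAll>
```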

    thecoalman

    6:09 pm on Oct 3, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Google even listed the urls as indexed but no actual page could be tracked down ...


    Wouldn't that show up in Search Console as a redirect?

    In any event, one quick way to check for that is to use a user-agent switcher in the browser to present Google's user agent. Clear the cache for your site and see if it redirects. It's not 100% reliable, because a malicious script might check for a valid Google IP or whatever. The request might also get blocked by some security software and services, because it's not coming from a valid Google IP.

    mcneely

    6:29 pm on Oct 3, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    no actual page could be tracked down


    Couldn't discover the IP or hosting service the page was on.

    I think the intended targets were visitors whose browsers were pretty lax as far as security was concerned.