How it happens is quite simple: when others link to you, they may arbitrarily leave off or add the "www". They may also arbitrarily leave off or add "index.html". You have no control over what others type in when they manually add links to you on their websites.
The "www" is an unfortunate artifact, and there is really no reason for its existence today. While you are fixing your canonical problem, please follow the modern canon: drop the "www".
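To make the idea concrete, here is a rough Python sketch of what "pick one form and stick to it" means. The function name canonical_url is made up for illustration, and it assumes your default document is called "index.html". In practice you would normally do this with a permanent (301) redirect in your web server configuration rather than in application code, so treat this as a way to see the rule, not as the way to implement it.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Collapse the common variants of a link into one canonical form.

    Hypothetical helper: drops a leading "www." and a trailing
    "index.html" (assumed to be the site's default document).
    Any user:password part of the URL is discarded.
    """
    parts = urlsplit(url)

    # Hostnames are case-insensitive, so lowercase before comparing.
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # Keep an explicit port if the link had one.
    netloc = host if parts.port is None else f"{host}:{parts.port}"

    # An empty path and "/" refer to the same page.
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]

    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    for link in (
        "http://www.example.com/index.html",
        "http://example.com",
        "http://WWW.Example.com/blog/index.html?x=1",
    ):
        print(link, "->", canonical_url(link))
```

All the variants that other people might type end up as a single address, which is exactly what the redirect on your server should achieve.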
Names like www, ftp, and smtp identify SERVERS, not SERVICES. In most cases, small websites have them all resolve to the same address. There actually is a mechanism in DNS for identifying services, but guess what? It has almost zero support and is almost never used.
At one time, when computing power was much more limited and the Internet was much smaller, it was commonplace to run FTP on one physical server, SMTP on another, WWW on another, DNS on another, and so on. At the same time, the Internet was so small that very few sites actually needed more than ONE server to handle a particular service. In those earlier days, the few who needed more had www2, www3, etc.
Now we have Google with 50,000 servers behind "www.google.com". They use every trick in the book to get you routed to the nearest, most available server.
The notion of "here is my FTP server, here is my WWW server" is now anachronistic. How about "here is my company"? You are "example.com", period. The technology is there to sort it all out.