Cyrillic Letters Rejected by Certain Fields

#1647 (In Topic #375)
Cyrillic letters are rejected by certain fields which are designed to reject special symbols.

So far I had problems with the "URL Moniker" field and the "Codename" field when I edit pages.

Letters like Р, М, Ч cause the fields to turn red and the form cannot be submitted.

#1649
Hi,

Yes, these particular fields need to be ASCII; that's an intentional limitation. But they're not visible content fields, so this should be okay. These fields basically form part of the URL, which by convention is just ASCII characters (most web software assumes that).

#1652
Yes, but URLs with Cyrillic letters in them are not only possible but also good for SEO if you are from a Cyrillic-using region, like me.

I managed to trick Compo into accepting them and my site works just fine. All major browsers navigate through the Cyrillic URLs just fine, and google.bg indexes my site better because of the more descriptive, keyword-rich page names.

Maybe you would like to consider lifting this limitation, or at least providing an Add-On, which removes it for those who desire to work with Cyrillic.

#1653
Hi Alexander,

(I transliterated your name :D )

You may know more than me about this in some aspects. My main concern is that URL encoding, which is one of the original web standards, is not going to play nicely.

For example, if you put your Cyrillic name into here it gets mangled:
Online urlencode() function - Online PHP functions

That's why we're normally using transliteration instead.
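
To make the mangling concrete, here is a minimal sketch in Python (chosen only for illustration; the tool linked above is PHP's urlencode()). It assumes UTF-8 as the byte encoding and percent-encodes the three letters from the first post using the standard library.

```python
from urllib.parse import quote, unquote

# The three letters reported in the first post.
letters = "РМЧ"

# quote() percent-encodes each UTF-8 byte of every non-ASCII character,
# so one Cyrillic letter becomes two %XX escapes.
encoded = quote(letters)
print(encoded)  # %D0%A0%D0%9C%D0%A7

# Decoding restores the original text, so the encoding is ugly but lossless.
decoded = unquote(encoded)
print(decoded == letters)  # True
```

Each letter grows from one character to six, which is why a Cyrillic name becomes hard to read once it is encoded.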

It may be okay for the main URL path, outside the GET parameters (?foo=bar&something=false kind of stuff.)

It may be that even for GET parameters some people do reduced URL encoding, only converting certain core symbols (?, &, = for example).
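
As a rough illustration of what "reduced" encoding could look like, here is a Python sketch. The helper name reduced_encode and the exact set of escaped characters are my own assumptions, not something any particular software is known to use.

```python
from urllib.parse import quote, unquote

def reduced_encode(value: str) -> str:
    """Hypothetical helper: escape only characters that would break a query string."""
    # '%' is handled first so the escapes added afterwards are not escaped again.
    for ch in "%?&=#+ ":
        value = value.replace(ch, "%{:02X}".format(ord(ch)))
    return value

original = "име=Р&Д"

# Full encoding: every non-ASCII byte and every reserved symbol is escaped.
full = quote(original, safe="")

# Reduced encoding: the Cyrillic stays readable, only '=' and '&' are escaped.
reduced = reduced_encode(original)
print(reduced)  # име%3DР%26Д

# Both forms decode back to the same value.
print(unquote(full) == unquote(reduced) == original)  # True
```

Whether a given server or proxy accepts the reduced form is a separate question that this sketch does not answer.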

Do you have any further thoughts on it?

#1654
Maybe some web browsers show the URLs in the address bar without any unnecessary % encoding, as a way to make them appear nicer to the user, but behind the scenes they may still use it?

#1655

Chris Graham said

Maybe some web browsers show the URLs in the address bar without any unnecessary % encoding, as a way to make them appear nicer to the user, but behind the scenes they may still use it?

Yeah, I think browsers are going out of their way to make sure non-Latin characters look good in address bars. I type in UTF-8 and the browser makes the request using %-encoding but still displays clean UTF-8:
[screenshot attachment]
However, when spaces are typed it turns those into %-encoding immediately:
[screenshot attachment]
The browser must be treating non-Latin characters as a special case to make things nicer for you :) .

So, maybe we can allow all non-Latin characters through, possibly only if a config option is enabled to allow that.
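
A minimal sketch of such an option-controlled check, written in Python for illustration. The function name is_valid_moniker, the allow_non_latin flag, and the exact set of permitted characters are all assumptions on my part and do not reflect the software's real validation code.

```python
import re

# Strict rule: ASCII letters, digits, underscore and hyphen only.
ASCII_MONIKER = re.compile(r"[A-Za-z0-9_-]+")

# Relaxed rule: in Python 3, \w on str patterns also matches non-Latin letters.
UNICODE_MONIKER = re.compile(r"[\w-]+")

def is_valid_moniker(value: str, allow_non_latin: bool = False) -> bool:
    """Hypothetical validator; allow_non_latin stands in for the config option."""
    pattern = UNICODE_MONIKER if allow_non_latin else ASCII_MONIKER
    return pattern.fullmatch(value) is not None

print(is_valid_moniker("about-us"))                      # True
print(is_valid_moniker("за-нас"))                        # False
print(is_valid_moniker("за-нас", allow_non_latin=True))  # True
print(is_valid_moniker("за нас", allow_non_latin=True))  # False
```

Spaces and URL symbols such as ? and & are still rejected in both modes, so only the letter restriction is lifted.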

#1660
Hi again,

Further testing showed me that even %-encoded URLs, when clicked, will show non-Latin characters nicely in the address bar. Then if I copy and paste the URL out of the address bar, e.g. to a text file, the %-encoding shows. So, I learnt something here that I was not aware of :) . The browser is doing a very good job of making sure things look nice while sticking to the standards.
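
The two views of the same URL can be sketched in Python as follows. The helper names wire_form and display_form and the example.com address are made up for illustration, and real browsers apply more rules than this (for instance for host names and query strings).

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit

def wire_form(url: str) -> str:
    """Hypothetical helper: percent-encode the path, as sent in the request."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=quote(parts.path)))

def display_form(url: str) -> str:
    """Hypothetical helper: decode the path, as shown in the address bar."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=unquote(parts.path)))

typed = "https://example.com/новини"

sent = wire_form(typed)
print(sent)  # https://example.com/%D0%BD%D0%BE%D0%B2%D0%B8%D0%BD%D0%B8

# Decoding the wire form gives back exactly what was typed.
print(display_form(sent) == typed)  # True
```

Note that wire_form expects an unencoded path; calling it on an already encoded URL would escape the % signs a second time.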

It's a trade-off between having the URLs look nice in the address bar vs having them look nice in code. If we allow Cyrillic it will look nice in the address bar but very ugly in code. If we transliterate it will not look as nice in the address bar but it will look okay in code.
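
For comparison, here is a minimal transliteration sketch in Python. The mapping roughly follows the common Bulgarian romanization and the function name transliterate_moniker is hypothetical; treat it as an illustration of the idea, not as the table or algorithm the software actually uses.

```python
import re

# Illustrative Bulgarian Cyrillic to Latin mapping (lowercase only).
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l",
    "м": "m", "н": "n", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "у": "u", "ф": "f", "х": "h", "ц": "ts", "ч": "ch",
    "ш": "sh", "щ": "sht", "ъ": "a", "ь": "y", "ю": "yu", "я": "ya",
}

def transliterate_moniker(title: str) -> str:
    """Hypothetical helper: turn a page title into an ASCII-only URL moniker."""
    lowered = title.lower()
    # Characters without a mapping are kept and handled by the cleanup below.
    latin = "".join(TRANSLIT.get(ch, ch) for ch in lowered)
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", latin).strip("-")

print(transliterate_moniker("Новини за нас"))  # novini-za-nas
print(transliterate_moniker("Александър"))     # aleksandar
```

The result is plain ASCII, so it reads the same in the address bar, in code, and when pasted into a text file.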

Having a good Bulgarian experience for you is important to us. I've made a new release for you to test, with many changes, so please let me know how it works for you. You'll want to disable "Moniker transliteration" from Admin Zone > Setup > Configuration > Site options > SEO (because we're giving everyone the choice whether to use transliteration or to keep the Cyrillic). These changes will also be in RC30 when we release it.


#1664
Thank you very much, but how do I do an upgrade? If I use the quick installer, I will have to erase my site.

#1665
I have just provided this for test purposes; please test on a separate test site.

#1708
Did you get a chance to test? I was hoping for some testing before releasing the RC, in case I accidentally broke anything or didn't get what you needed right.

#1868
All works well! Thank you.
