#3587 - Internationalised e-mail addresses and URLs
* The Composr webmaster controls whether monikers are generated using Unicode or transliteration.
That said, our URL-encoded URLs to downloads may overflow the available database field space. In that case we bend the rules and allow non-ASCII URLs to be saved into our database instead. That is the best compromise in such a case and has no practical bugs associated with it.
Additionally, we can transliterate. On old PHP versions on Windows we have to transliterate filenames (and hence the URLs to those files) because PHP has no Unicode filesystem support there.
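To make the size problem concrete, here is a minimal sketch of the idea, not Composr's actual code. The function name `fit_url_for_storage`, the byte limits, and the example URL are all hypothetical. It keeps the fully encoded URL when it fits, and otherwise decodes only the non-ASCII escapes, since each such byte costs three characters when percent-encoded.

```php
<?php
// Hypothetical sketch, not Composr's implementation: keep the compliant
// percent-encoded URL when it fits, otherwise decode only the non-ASCII
// escapes (%80 to %FF) so the stored string takes fewer bytes.
function fit_url_for_storage(string $url, int $max_bytes): string
{
    if (strlen($url) <= $max_bytes) {
        return $url; // The standards-compliant form fits; keep it.
    }

    // ASCII escapes such as %20 or %2F stay encoded, so the structure
    // of the URL (path separators, query string) is not altered.
    return (string) preg_replace_callback(
        '/%([89A-Fa-f][0-9A-Fa-f])/',
        function (array $m): string {
            return chr(hexdec($m[1]));
        },
        $url
    );
}

// Assumes this source file is saved as UTF-8.
$encoded = 'https://example.com/downloads/' . rawurlencode('日本語.zip');
$short = fit_url_for_storage($encoded, 50);

echo strlen($encoded), ' bytes encoded, ', strlen($short), " bytes stored\n";
```

The sketch assumes the field limit is measured in bytes; if the column limit is in characters, the saving from decoding is larger still.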
We always transliterate directory names due to poor PHP support.
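As a rough illustration of what transliterating a filename can look like, the following is a sketch under stated assumptions. The helper name `transliterate_filename` and its small mapping table are hypothetical and are not taken from Composr; a real implementation would need a far larger table or a library.

```php
<?php
// Hypothetical helper: the mapping table is a tiny illustrative sample.
function transliterate_filename(string $name): string
{
    $map = [
        'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss',
        'é' => 'e', 'è' => 'e', 'ñ' => 'n', 'ç' => 'c',
    ];
    $ascii = strtr($name, $map);

    // Anything still outside a conservative filename alphabet
    // (including unmapped non-ASCII bytes) collapses to one underscore.
    return (string) preg_replace('/[^A-Za-z0-9._-]+/', '_', $ascii);
}

// Assumes this source file is saved as UTF-8.
echo transliterate_filename('Größe.pdf'), "\n";
echo transliterate_filename('日本語.zip'), "\n";
```

The second call shows the cost of the approach: scripts with no mapping lose all their information, which is one reason to prefer Unicode where the filesystem supports it.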
rawurlencode - PHP function for standardised URL encoding.
urlencode - PHP function for URL encoding specifically for GET parameters. It's essentially the same as rawurlencode except that spaces become "+" (it also percent-encodes "~", which rawurlencode leaves alone).
cms_urlencode - A layer around urlencode that provides Composr-specific encoding that stops Apache's mod_rewrite from corrupting certain special characters during its "smart" processing.
cms_rawurlrecode - Shortens URLs that are too long for the database by intelligently cheating in our encoding. The URLs are not technically valid but will work.
HarmlessURLCoder - Simplifies/desimplifies URLs, trading compliance for human-readability. Similar to what browsers do in their address bars. It is a non-destructive operation that doesn't allow for double encoding or double decoding. Non-Latin characters in URLs encoded with HarmlessURLCoder are much easier to use.
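For reference, the difference between the two PHP built-ins in the list above can be seen in a short example. The filename is made up for illustration; only standard PHP functions are used.

```php
<?php
$name = 'my file~1.txt';

$raw = rawurlencode($name); // RFC 3986 style, suited to path segments
$form = urlencode($name);   // form style, suited to query-string values

echo $raw, "\n";  // my%20file~1.txt
echo $form, "\n"; // my+file%7E1.txt

// The matching decoders differ in the same way: only urldecode()
// turns "+" back into a space.
echo rawurldecode('a+b'), "\n"; // a+b
echo urldecode('a+b'), "\n";    // a b
```

Mixing the two families is a common source of stray "+" characters in paths, so it helps to pick the encoder by where the value ends up in the URL.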
I'm leaving e-mail alone for now, as e-mail validation is a mess:
http://emailregex.com/email-validation-summary/
And I'm happy to reinforce the consensus of simple addresses for now.