1

Topic: Use HTML Entities or not?

I've been looking through some i18n files and noticed that some authors use HTML entities and others do not.
Since UTF-8 is supported, I wonder if the use of HTML Entities is needed at all.
I have to admit that I find this whole encoding stuff complex material.

To cut it short:
Does it matter if one uses either &eacute; or é in i18n files?
What would be best or safest to use?

2

Re: Use HTML Entities or not?

If you can ensure that:

* no Unicode-unsafe functions are used in your scripts
* files are saved correctly as UTF-8
* web sites have a meta charset="utf-8" declaration, and the visitors' browsers understand and use it correctly
* the server's locale is set to UTF-8
* and the database's collation is UTF-8

then you should be fine.

I haven't had a problem for ages :)
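
Two of the points above (files really being saved as UTF-8, and byte-oriented functions miscounting characters) can be illustrated with a small sketch. It is written in Python purely for illustration and is not taken from Wolf CMS; the helper name `is_valid_utf8` is made up for this example. In PHP the same byte/character difference shows up between `strlen` and `mb_strlen`.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


text = "café"
utf8_bytes = text.encode("utf-8")      # 'é' becomes two bytes: 0xC3 0xA9
latin1_bytes = text.encode("latin-1")  # 'é' becomes one byte: 0xE9

print(is_valid_utf8(utf8_bytes))    # True
print(is_valid_utf8(latin1_bytes))  # False: a lone 0xE9 is not valid UTF-8

# A byte-oriented length (what a Unicode-unsafe function sees) differs from
# the character count as soon as non-ASCII text is involved.
print(len(text))        # 4 characters
print(len(utf8_bytes))  # 5 bytes
```

A check like this could be run over the i18n files before committing them, assuming they are meant to be UTF-8 throughout.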

Last edited by andrej (2011-05-25 23:50)

3

Re: Use HTML Entities or not?

Agree with andrej.
I think it's easier if you only use the entities; some editors work well with this.
Usually we cannot guarantee most of the points he mentioned.
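
To make the trade-off concrete, here is a minimal sketch (Python, for illustration only; the sample strings are made up) showing that an entity and the literal character decode to the same text, and that non-ASCII characters can be rewritten as numeric character references so the file stays pure ASCII.

```python
import html

entity_form = "Caf&eacute;"
literal_form = "Café"

# Decoding the entity yields the same character as the literal form.
print(html.unescape(entity_form) == literal_form)  # True

# Rewriting non-ASCII characters as numeric character references keeps the
# file pure ASCII, so it survives a wrong or missing charset declaration.
ascii_safe = literal_form.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_safe)                                  # Caf&#233;
print(html.unescape(ascii_safe) == literal_form)   # True
```

Keep in mind that entities are only decoded where the string is rendered as HTML; in a plain-text context the reader would see the raw `&eacute;`.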

4

Re: Use HTML Entities or not?

More specifically for Wolf CMS and PHP in order of priority:

andrej wrote:

* The http server sends back a default encoding of utf-8
* Files are saved correctly as utf-8 without BOM (byte order mark)
* Database's collation is utf-8
* No unicode unsafe functions are used in your scripts
* Web sites have a meta charset="utf-8" declaration and that the visitors' browsers understand and use it correctly

The locale of the server doesn't matter. The reason why that *tends* to work is because the default encoding usually is equal to the current locale of the server. It is perfectly valid for the server to have a locale that is different from the default encoding.

As for the BOM... PHP can't handle the BOM correctly: the bytes at the start of the file are treated as output, which leads to problems such as headers that can no longer be sent. Since a lot of applications don't add the BOM, things may work fine depending on which editor you use. However, making the distinction is important since some programs do add the BOM, leading to confusion.
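
A quick way to see whether an editor added a BOM is to look at the first three bytes of the file. The sketch below (Python, for illustration; `strip_utf8_bom` and the sample contents are hypothetical) detects and removes a UTF-8 BOM from raw file contents.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


without_bom = b"<?php echo 'ok';"
with_bom = codecs.BOM_UTF8 + without_bom  # what a BOM-adding editor would save

print(with_bom[:3])                                 # b'\xef\xbb\xbf'
print(strip_utf8_bom(with_bom) == without_bom)      # True
print(strip_utf8_bom(without_bom) == without_bom)   # True, nothing to strip
```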

The META charset is actually usually only used as a fallback in case the HTTP server does not send an encoding in its Content-Type header. It then prevents the browser from having to guess the encoding.
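
The fallback order described here can be sketched roughly as follows. This is a simplified illustration in Python, not what any browser actually implements (real browsers follow a more elaborate detection algorithm); the function names and the 1024-byte scan window are assumptions made for the example.

```python
import re
from email.message import Message
from typing import Optional

# Matches both <meta charset="..."> and the http-equiv form with "charset=..."
META_CHARSET = re.compile(rb"""<meta[^>]*charset=["']?([A-Za-z0-9_-]+)""", re.IGNORECASE)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased, or None if absent


def charset_from_meta(body: bytes) -> Optional[str]:
    """Look for a meta charset declaration near the start of the document."""
    match = META_CHARSET.search(body[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def resolve_charset(content_type: Optional[str], body: bytes) -> Optional[str]:
    """Prefer the HTTP header; fall back to the meta declaration."""
    return charset_from_header(content_type) or charset_from_meta(body)


page = b'<html><head><meta charset="utf-8"></head></html>'
print(resolve_charset("text/html; charset=ISO-8859-1", page))  # iso-8859-1
print(resolve_charset("text/html", page))                      # utf-8
print(resolve_charset(None, b"<p>hi</p>"))                     # None
```

The first call shows why the server setting matters most: when the header names a charset, the meta declaration in the page is never consulted.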

(btw: PHP's reliance on which locales are installed on the server is ridiculous) :)

Wolf CMS founder and lead developer