02-22-2010, 07:11 AM   #1
asdfghjkl
 
Status: Guest
Posts: n/a



advantages/disadvantages

Hi,

I've always used:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

on my web pages. I have done so for years and never questioned it. However, now that the text editor I use has updated itself, it complains about the encoding if I use the above meta tag. I have noticed that a lot of people now use utf-8 instead, and if I use that, my text editor no longer generates a warning.

Is there one in particular I should be using? I'm guessing UTF-8 is a much more up-to-date character set. Should I be using that instead? Are there any advantages/disadvantages to using one or the other?
02-23-2010, 03:52 AM   #2
c69
 
Status: On a mission
Join Date: Dec 2009
Location: Kiev, Ukraine
Posts: 32





Yes, UTF-8 will save you a lot of hair ;) Use it whenever you can.

Finally, after years of inertia, the world is switching to Unicode, and UTF-8 will probably pass the 50% global usage mark this year, according to Google:
http://googleblog.blogspot.com/2010/...50-of-web.html
12-16-2011, 05:07 AM   #3
Mallory
 
Status: Guest
Posts: n/a




UTF-8 is a superset of ASCII. Since a plain ASCII string is also a valid UTF-8 string, no conversion needs to be done for existing ASCII text. Software designed for traditional code-page-specific character sets can generally be used with UTF-8 with few or no changes.
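
A quick way to see the superset relationship for yourself. This is only a sketch in Python (my choice of language, the thread doesn't name one), and the sample string is made up:

```python
# Hypothetical sample text; any pure-ASCII string behaves the same way.
text = "Hello, world!"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# For ASCII-only text the two encodings are byte-for-byte identical.
same_bytes = ascii_bytes == utf8_bytes

# Bytes written as ASCII decode as UTF-8 with no conversion step.
round_trip = ascii_bytes.decode("utf-8")

# Only characters outside ASCII need more than one byte in UTF-8.
e_acute = "é".encode("utf-8")  # two bytes: 0xC3 0xA9

print(same_bytes, round_trip, e_acute)
```

So existing ASCII pages can be relabelled as UTF-8 as they are. Pages that already contain non-ASCII iso-8859-1 bytes are a different story and would need to be re-saved as UTF-8.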

Sorting of UTF-8 strings using standard byte-oriented sorting routines will produce the same results as sorting them based on Unicode code points. (This has limited usefulness, though, since it is unlikely to represent the culturally acceptable sort order of any particular language or locale.) For the sorting to work correctly, the bytes must be treated as unsigned values.
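
To illustrate the sorting point, here is a small Python sketch with a made-up word list. It relies on two things I'm confident of: Python compares str values by code point, and it compares bytes values as unsigned integers:

```python
# Hypothetical word list mixing ASCII, Latin-1 range, CJK and an emoji.
words = ["zebra", "Äpfel", "apple", "日本", "😀", "éclair"]

# Python orders str values by Unicode code point.
by_code_point = sorted(words)

# Python orders bytes values as unsigned bytes, which is the
# "treated as unsigned" requirement mentioned above.
by_utf8_bytes = sorted(words, key=lambda w: w.encode("utf-8"))

orders_match = by_code_point == by_utf8_bytes

print(by_code_point)
print(orders_match)
```

Note that "zebra" lands before "Äpfel" in both orders, which is exactly the kind of result no dictionary would accept. For a real locale-aware order you would need proper collation instead.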

UTF-8 and UTF-16 are the standard encodings for XML documents. All other encodings must be specified explicitly, either externally or through a text declaration.

Any byte-oriented string searching algorithm can be used with UTF-8 data (as long as one ensures that the inputs consist only of complete UTF-8 characters). Care must be taken with regular expressions and other constructs that count characters, however.
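
A rough Python sketch of both halves of that paragraph: a plain byte search works, but the position it returns is a byte offset, not a character count. The strings are made up for the example:

```python
# Hypothetical haystack and needle, both encoded from complete characters.
haystack = "naïve café".encode("utf-8")
needle = "café".encode("utf-8")

# bytes.find is an ordinary byte-oriented search.
byte_offset = haystack.find(needle)

# The offset counts bytes. "ï" takes two bytes, so the character
# index is one less than the byte offset here.
char_index = len(haystack[:byte_offset].decode("utf-8"))

byte_length = len(haystack)
char_length = len(haystack.decode("utf-8"))

print(byte_offset, char_index, byte_length, char_length)
```

This is where the "constructs that count characters" warning bites: anything that slices or counts by position has to know whether it is working in bytes or in characters.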

UTF-8 strings can be fairly reliably recognized as such by a simple algorithm. That is, the probability that a string of characters in any other encoding appears as valid UTF-8 is low, diminishing with increasing string length. For instance, the octet values C0, C1, and F5 to FF never appear. For better reliability, regular expressions can be used to take into account illegal overlong and surrogate values.
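
As a sketch of the recognition idea, the Python below does not use the regular-expression approach mentioned above. It swaps in Python's built-in strict UTF-8 decoder, which (in Python 3) rejects overlong forms and encoded surrogates on its own. The function name looks_like_utf8 is my own invention:

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")  # error handling is strict by default
    except UnicodeDecodeError:
        return False
    return True


valid = looks_like_utf8("café".encode("utf-8"))
latin1 = looks_like_utf8("café".encode("iso-8859-1"))  # lone 0xE9 byte
overlong = looks_like_utf8(b"\xc0\x80")                # C0 never appears
surrogate = looks_like_utf8(b"\xed\xa0\x80")           # encoded U+D800
high_byte = looks_like_utf8(b"\xff")                   # F5 to FF never appear

print(valid, latin1, overlong, surrogate, high_byte)
```

Keep in mind that pure ASCII input passes this check too, so a True result only says the bytes are valid UTF-8, not that they were written as UTF-8 on purpose.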