
What Is The Default Charset/encoding Of Text Messages On Android Devices?

If necessary to keep it simple, I am primarily concerned with English handsets in North America. Specifically, when sending/receiving SMS and MMS messages, how are the characters en

Solution 1:

The short answer for the US is GSM 03.38, with UTF-16BE used instead when the message contains emoji or other text that GSM 03.38 cannot encode directly.

When sending/receiving SMS, the encoding is definitely not UTF-8, since UTF-8 is not supported by the SMS PDU format or by the SMPP protocol. The SMPP specification lists the encodings that are supported. Of those, the only Unicode-compatible option is UCS-2BE. My observation is that most phones (including all Android phones and iPhones) simply treat this as UTF-16BE, because that allows the complete Unicode character set (including emoji such as 🗺️).
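To make the UCS-2 versus UTF-16BE distinction concrete, here is a minimal Python sketch. The helper names `utf16be_units` and `fits_in_ucs2` are made up for this illustration and are not part of any SMS library. It shows that the world map emoji needs a surrogate pair, which strict UCS-2 cannot express but UTF-16BE can.

```python
def utf16be_units(text):
    """Return the 16-bit code units of text encoded as UTF-16BE."""
    raw = text.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


def fits_in_ucs2(text):
    """True when every character fits in one 16-bit unit (strict UCS-2)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# U+1F5FA WORLD MAP, written as an escape and without the variation selector.
world_map = "\U0001F5FA"

print([hex(u) for u in utf16be_units(world_map)])  # ['0xd83d', '0xddfa']
print(fits_in_ucs2(world_map))                     # False
print(fits_in_ucs2("hello"))                       # True
```

A receiver that decodes the payload as UTF-16BE reassembles the two units into one character; a strict UCS-2 decoder would see two unpaired surrogates.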

SMS also has a mandatory default encoding, defined by the GSM 03.38 specification, which is based on septets (7-bit code units). It allows up to 160 characters per PDU, although, as with many encodings, not every character takes exactly one code unit.
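The septet accounting can be sketched as follows. This is an illustrative check, not a full encoder: `gsm_septets` is a hypothetical helper, the tables cover the default alphabet and its common extension characters (the form feed extension is omitted), and national language shift tables are ignored.

```python
# GSM 03.38 default alphabet: each of these characters costs one septet.
GSM_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: sent as an escape septet followed by one more septet.
GSM_EXTENDED = set("^{}\\[~]|€")


def gsm_septets(text):
    """Return the septet count, or None if text is not GSM 03.38 encodable."""
    total = 0
    for ch in text:
        if ch in GSM_BASIC:
            total += 1
        elif ch in GSM_EXTENDED:
            total += 2
        else:
            return None  # the message would have to be sent as UCS-2/UTF-16BE
    return total


# 160 septets * 7 bits = 1120 bits = 140 octets of user data per PDU.
assert 160 * 7 == 140 * 8

print(gsm_septets("Hello, world!"))     # 13
print(gsm_septets("Price: 5€"))         # 10 (the euro sign costs two septets)
print(gsm_septets("Map \U0001F5FA"))    # None
```

This is why a message containing `€` or `[` can hold fewer than 160 characters, and why a single emoji switches the whole message to the 16-bit encoding.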

MMS is another animal entirely, and it isn't well supported outside of North America. For MMS, the following encodings are available (big-endian, i.e. network byte order, is assumed when not specified):

  • US-ASCII
  • ISO-8859-1
  • ISO-8859-2
  • ISO-8859-3
  • ISO-8859-4
  • ISO-8859-5
  • ISO-8859-6
  • ISO-8859-7
  • ISO-8859-8
  • ISO-8859-9
  • SHIFT-JIS
  • UTF-8
  • BIG5
  • UCS2
  • UTF-16
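As a rough illustration of how the choice among these charsets affects the payload, the sketch below encodes one string with Python's standard codecs. The codec names are Python's aliases standing in for a few of the charsets listed above, and `encoded_sizes` is a hypothetical helper; nothing here reflects how a particular MMS client picks its charset.

```python
# Python codec names standing in for a few of the MMS charsets listed above.
CODECS = ["us-ascii", "iso-8859-1", "shift_jis", "utf-8", "big5", "utf-16-be"]


def encoded_sizes(text):
    """Map codec name -> byte length, or None when the codec cannot encode text."""
    sizes = {}
    for name in CODECS:
        try:
            sizes[name] = len(text.encode(name))
        except UnicodeEncodeError:
            sizes[name] = None
    return sizes


sizes = encoded_sizes("café")
print(sizes["us-ascii"])    # None (é is not ASCII)
print(sizes["iso-8859-1"])  # 4
print(sizes["utf-8"])       # 5
print(sizes["utf-16-be"])   # 8
```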

MMS, however, isn't typically used unless you send a very long message (longer than 4 PDUs, which is 560 bytes on Android) or your message includes an image or something else that cannot be encoded as a plain SMS.
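The arithmetic behind that threshold can be sketched like this. The 140-octet payload and the 4-PDU figure come from the text above; the per-part limits of 153 septets and 67 16-bit units assume a 6-octet concatenation header in every part of a multipart message. `sms_parts` is a hypothetical helper, and the point at which a messaging app actually switches to MMS may differ by app and carrier.

```python
import math

PDU_USER_DATA_BYTES = 140  # user data octets in one SMS PDU


def sms_parts(units, seven_bit):
    """Estimate how many PDUs a message needs.

    units: septets (GSM 03.38) or 16-bit code units (UCS-2/UTF-16BE).
    Assumes a 6-octet concatenation header in each part once the message
    no longer fits in a single PDU.
    """
    single, multi = (160, 153) if seven_bit else (70, 67)
    if units <= single:
        return 1
    return math.ceil(units / multi)


print(4 * PDU_USER_DATA_BYTES)  # 560
print(sms_parts(160, True))     # 1
print(sms_parts(161, True))     # 2
print(sms_parts(612, True))     # 4
print(sms_parts(613, True))     # 5
print(sms_parts(71, False))     # 2
```

Under these assumptions, a GSM 03.38 message fits in 4 PDUs up to 612 septets, while a message sent as UTF-16BE reaches the same limit at 268 code units.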

Also worth noting: MMS is much slower than SMS because it uses the SMTP protocol with special addressing (not based on DNS) and special multipart content types (see MM4 for details).
