Understanding the SMS character limit is one of the most practical things a marketer can do to control campaign costs. A single misplaced emoji or special character can double the number of message segments your campaign consumes — and at scale, that translates directly into thousands of dollars in avoidable spend. This guide breaks down exactly how SMS character counting works, how encoding determines your limits, how messages get split into segments, and what you can do to optimize every send.
The Basics: SMS Character Limits and Why They Exist
The SMS protocol was designed in the 1980s with a fixed payload size of 140 bytes per message. That constraint has never changed. What has changed is how those 140 bytes get used, depending on the character encoding applied to your message content.
There are two primary encoding standards used in SMS:
- GSM-7 — A 7-bit encoding that supports 128 standard characters (Latin alphabet, digits, common punctuation). Because each character uses 7 bits, you can fit 160 characters into a single 140-byte SMS.
- UCS-2 — A 16-bit encoding used for anything outside GSM-7, including non-Latin scripts, special symbols, and emoji. At 16 bits per code unit, you are limited to 70 characters per message, and most emoji count as two because they are sent as UTF-16 surrogate pairs.
The encoding is not something you choose manually. It is determined automatically by the content of your message. If every character in your text falls within the GSM-7 character set, the message is encoded as GSM-7. The moment you include a single character outside that set — a curly quote, an emoji, a character from a non-Latin alphabet — the entire message falls back to UCS-2 encoding.
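As a rough illustration of this fallback rule, here is a minimal Python sketch. The names `GSM7_BASIC`, `GSM7_EXTENDED`, and `detect_encoding` are illustrative and not taken from any particular library, and the character tables are transcribed from the GSM 03.38 default alphabet, so check them against your provider's documentation before relying on them.

```python
# GSM 03.38 default alphabet (basic set, without the escape code itself).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: still GSM-7, but each character needs an escape prefix.
GSM7_EXTENDED = set("^{}\\[~]|€")


def detect_encoding(text: str) -> str:
    """Return "GSM-7" if every character is in the GSM-7 tables, else "UCS-2"."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        return "GSM-7"
    return "UCS-2"


print(detect_encoding("Sale today - 30% off!"))   # plain hyphen
print(detect_encoding("Sale today \u2014 30% off!"))  # em dash
```

The check is all-or-nothing on purpose: one character outside both tables is enough to change the encoding of the whole message.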
For a deeper look at how these two encoding standards differ and the practical implications for deliverability, see our guide on GSM-7 vs UCS-2 encoding and how it affects SMS cost and deliverability.
What Is an SMS Message Segment?
When your message exceeds the single-message character limit (160 for GSM-7 or 70 for UCS-2), it gets split into multiple segments. Each segment is transmitted as a separate SMS and billed individually by your carrier or messaging provider. The recipient's phone reassembles the segments into a single message, so the end user typically sees one continuous text — but you pay for each segment.
Here is where it gets counterintuitive: when a message is split into multiple segments, a portion of each segment's payload is reserved for a User Data Header (UDH). The UDH contains metadata that tells the receiving device how to reassemble the segments in the correct order. This header consumes 6 bytes per segment, which reduces the available character space.
Character Limits Per Segment
| Encoding | Single Message | Per Segment (Multi-Part) | UDH Overhead |
|---|---|---|---|
| GSM-7 | 160 characters | 153 characters | 7 characters (6 bytes) |
| UCS-2 | 70 characters | 67 characters | 3 characters (6 bytes) |
This means a 161-character GSM-7 message does not cost "one segment plus a tiny bit." It costs two full segments, with the second segment carrying just 8 characters of actual content. Understanding this threshold is essential for cost control.
How Segment Count Scales with Message Length
The following table shows how many segments are consumed at various message lengths for both encoding types. This is the reference table worth keeping handy when planning campaigns.
| Segments | GSM-7 Characters | UCS-2 Characters |
|---|---|---|
| 1 | 1–160 | 1–70 |
| 2 | 161–306 | 71–134 |
| 3 | 307–459 | 135–201 |
| 4 | 460–612 | 202–268 |
| 5 | 613–765 | 269–335 |
Notice the non-obvious math. Two GSM-7 segments give you 306 characters (153 × 2), not 320. Three UCS-2 segments give you 201 characters (67 × 3), not 210. Every segment of a multi-part message pays the UDH overhead, so the lost capacity grows with each additional segment.
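The table above follows from one small rule, sketched here in Python. The function name `segments_for` and the `LIMITS` dictionary are illustrative. The sketch takes a count of encoded units (not raw characters) and ignores the boundary edge cases that can occasionally cost one extra unit per segment.

```python
import math

# (single-message limit, per-segment limit once a UDH is required)
LIMITS = {"GSM-7": (160, 153), "UCS-2": (70, 67)}


def segments_for(units: int, encoding: str) -> int:
    """Number of billed segments for a message of `units` encoded characters."""
    single, multi = LIMITS[encoding]
    if units <= single:
        return 1  # fits in one SMS, no concatenation header needed
    return math.ceil(units / multi)


for units in (160, 161, 306, 307):
    print(units, "GSM-7 ->", segments_for(units, "GSM-7"))
```

The jump from one segment to two happens at 161 or 71 units, but every later threshold is a multiple of 153 or 67.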
The Hidden Cost Multiplier: Encoding Fallback
The most common source of unexpected SMS costs is unintentional UCS-2 encoding. A message that fits comfortably in one GSM-7 segment can balloon to two or three UCS-2 segments when a non-GSM character sneaks in.
Characters That Trigger UCS-2 Fallback
These are the most frequent culprits in marketing messages:
- Curly (smart) quotes — “, ”, ‘, ’ — These are automatically inserted by word processors, Google Docs, and many CMS platforms. The GSM-7 set only includes straight quotes (", ').
- Em dashes and en dashes — — and – — GSM-7 supports the hyphen-minus (-) but not typographic dashes.
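- Emoji — All emoji require UCS-2 (technically UTF-16), and most occupy two 16-bit units each. A single 😊 at the end of a 140-character message brings the total to 142 units, which exceeds the 134 that fit in two UCS-2 segments, so one segment becomes three.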
- Non-Latin characters — Chinese, Arabic, Cyrillic, Korean, and other scripts require UCS-2.
- Special symbols — Characters like ©, ®, ™, •, and the ellipsis character (…) fall outside GSM-7.
A Real-World Cost Scenario
Consider a promotional message that reads:
Flash sale — 30% off everything today! Use code SAVE30 at checkout. Shop now: https://example.com/sale
This message is 102 characters. With a standard hyphen instead of the em dash, it fits in a single GSM-7 segment. But that em dash (—) forces UCS-2 encoding, and at 102 characters under UCS-2 (above the 70-character single-message limit), the message consumes two segments instead of one.
Now scale that across a campaign. Sending to 500,000 subscribers at $0.01 per segment:
| Scenario | Segments per Message | Total Segments | Total Cost |
|---|---|---|---|
| GSM-7 (hyphen) | 1 | 500,000 | $5,000 |
| UCS-2 (em dash) | 2 | 1,000,000 | $10,000 |
A single character — one that most marketers would not even notice — doubled the campaign cost. This is not a theoretical risk. It happens routinely when copy is drafted in tools that auto-format punctuation.
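To reproduce this scenario yourself, the following Python sketch measures the example message both ways and prices the campaign. The helper `segment_count` and the constants are illustrative. The audience size and rate are the ones assumed in the scenario above.

```python
import math

MESSAGE = (
    "Flash sale \u2014 30% off everything today! "
    "Use code SAVE30 at checkout. Shop now: https://example.com/sale"
)
AUDIENCE = 500_000
RATE = 0.01  # dollars per segment, as assumed in the scenario


def segment_count(units: int, single: int, multi: int) -> int:
    """Segments needed given the single-message and multi-part limits."""
    return 1 if units <= single else math.ceil(units / multi)


# The em dash (U+2014) is outside GSM-7, so the text is sent as UCS-2 (70 / 67).
with_em_dash = segment_count(len(MESSAGE), 70, 67)
# With a hyphen instead, every character is GSM-7 (160 / 153).
with_hyphen = segment_count(len(MESSAGE.replace("\u2014", "-")), 160, 153)

for label, segs in (("hyphen", with_hyphen), ("em dash", with_em_dash)):
    print(f"{label}: {segs} segment(s), ${segs * AUDIENCE * RATE:,.0f}")
```

Using `len()` works here only because the message contains no GSM-7 extended characters and no emoji. Either would change the unit count.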
The GSM-7 Extended Character Set
There is an additional nuance worth understanding. The GSM-7 standard includes a base character set and an extended character set. Characters in the extended set — including common symbols like {, }, [, ], |, \, ^, ~, and € — are encoded using an escape sequence that consumes two character slots instead of one.
This means a message containing the euro sign (€) uses two of your 160-character budget for that single symbol. It still encodes as GSM-7 (so you avoid the UCS-2 penalty), but your effective character count is reduced. If your message is right at the 160-character boundary, a few extended characters could push you into a second segment.
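A minimal Python sketch of this double-counting, assuming the text has already been confirmed as GSM-7. The name `gsm7_units` is illustrative, and the extended set below lists only the symbols named above.

```python
# GSM-7 extension-table characters: each costs an escape slot plus its own slot.
GSM7_EXTENDED = set("^{}\\[~]|€")


def gsm7_units(text: str) -> int:
    """Count the 7-bit slots used by text already known to be GSM-7."""
    return len(text) + sum(1 for ch in text if ch in GSM7_EXTENDED)


print(gsm7_units("Price: 20€"))       # 10 visible characters
print(gsm7_units("a" * 159 + "€"))    # 160 visible characters
```

The second example looks like exactly 160 characters on screen, yet it needs 161 slots and therefore spills into a second segment.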
Practical Strategies for SMS Character Optimization
Reducing segment count is one of the highest-leverage optimizations available in SMS marketing. Below are concrete techniques that deliver measurable savings.
1. Validate Encoding Before Sending
Never send a campaign without confirming the encoding and segment count. Platforms like Trackly include GSM-7 encoding validation and segment counting directly in the message composer, so you can see exactly how many segments a message will consume before it goes out. This prevents the kind of surprise cost overruns described above.
If your platform does not offer this, open-source libraries and online tools can parse your message against the GSM-7 character table and flag any characters that would trigger UCS-2 fallback.
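If you would rather script the check, a validator along these lines is one possibility. This is a sketch, not a reference implementation: `find_non_gsm` and `GSM7_CHARS` are illustrative names, and the table is transcribed from the GSM 03.38 default alphabet plus the extended symbols discussed earlier.

```python
import unicodedata

# Basic GSM-7 alphabet plus the extension-table symbols.
GSM7_CHARS = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
    "^{}\\[~]|€"
)


def find_non_gsm(text: str):
    """Return (index, character, Unicode name) for each character outside GSM-7."""
    return [
        (i, ch, unicodedata.name(ch, "UNNAMED"))
        for i, ch in enumerate(text)
        if ch not in GSM7_CHARS
    ]


for index, ch, name in find_non_gsm("Don\u2019t miss out"):
    print(f"position {index}: {ch!r} ({name})")
```

Reporting the position and Unicode name makes it easier to spot look-alike characters, such as a curly apostrophe that is nearly indistinguishable from a straight one in most fonts.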
2. Draft in Plain Text Editors
Avoid composing SMS copy in word processors, Google Docs, or rich-text editors. These tools automatically convert straight quotes to curly quotes, hyphens to em dashes, and three periods to an ellipsis character. Instead, draft in a plain text editor (Notepad, VS Code, Sublime Text) or directly in your SMS platform's composer.
3. Replace Non-GSM Characters with GSM Equivalents
Most non-GSM characters have visually similar GSM-7 alternatives:
| Non-GSM Character | GSM-7 Replacement |
|---|---|
| “ ” (curly quotes) | " (straight quotes) |
| ‘ ’ (curly apostrophes) | ' (straight apostrophe) |
| — (em dash) | - (hyphen) or -- (double hyphen) |
| – (en dash) | - (hyphen) |
| … (ellipsis character) | ... (three periods) |
| • (bullet) | - or * (hyphen or asterisk) |
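The substitutions in this table are easy to automate. The sketch below uses Python's built-in `str.translate`. The mapping covers only the characters listed above, and `to_gsm_friendly` is an illustrative name.

```python
# Map each non-GSM character from the table to a GSM-7 stand-in.
REPLACEMENTS = str.maketrans({
    "\u201c": '"',    # left curly double quote
    "\u201d": '"',    # right curly double quote
    "\u2018": "'",    # left curly single quote
    "\u2019": "'",    # right curly single quote / apostrophe
    "\u2014": "-",    # em dash
    "\u2013": "-",    # en dash
    "\u2026": "...",  # ellipsis character
    "\u2022": "-",    # bullet
})


def to_gsm_friendly(text: str) -> str:
    """Swap common typographic characters for GSM-7 equivalents."""
    return text.translate(REPLACEMENTS)


print(to_gsm_friendly("\u201cSale\u201d \u2014 don\u2019t wait\u2026"))
```

Note that replacing the ellipsis character adds two characters to the message, so re-check the length after cleaning.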
4. Be Strategic About Emoji
Emoji can improve engagement rates, but they come with a real cost. A single emoji forces UCS-2 encoding and cuts your single-message limit from 160 to 70 characters. For a typical marketing message of 120–150 characters, that means jumping from one segment to two or even three.
The decision to use emoji should be deliberate and data-driven. Run A/B tests comparing emoji vs. non-emoji variants, measuring not just click-through rate but cost-adjusted ROI. If an emoji variant gets a 5% higher CTR but costs 100% more in segments, the math may not favor it. For more on calculating these tradeoffs, see our breakdown of how to calculate and maximize SMS marketing ROI.
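One way to frame that comparison is cost per click. The Python sketch below uses made-up numbers purely for illustration: a 4.0% CTR for the plain variant, a 4.2% CTR (5% higher) for the emoji variant, and $0.01 per segment. The function name `cost_per_click` is also illustrative.

```python
def cost_per_click(audience: int, segments: int, rate: float, ctr: float) -> float:
    """Campaign cost divided by expected clicks."""
    cost = audience * segments * rate
    clicks = audience * ctr
    return cost / clicks


# Hypothetical inputs, not measured results.
plain = cost_per_click(audience=100_000, segments=1, rate=0.01, ctr=0.040)
emoji = cost_per_click(audience=100_000, segments=2, rate=0.01, ctr=0.042)

print(f"plain: ${plain:.3f} per click")
print(f"emoji: ${emoji:.3f} per click")
```

Under these assumptions, the emoji variant would need roughly double the click-through rate of the plain variant just to break even on cost per click.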
5. Shorten URLs
URLs are often the longest single element in an SMS. A raw URL like https://www.example.com/collections/summer-sale?utm_source=sms&utm_medium=text&utm_campaign=july consumes 96 characters, well over half a GSM-7 segment. Using a link shortener or custom short domain can reduce that to 20–25 characters.
Trackly's built-in link tracking with custom short domains handles this automatically, generating short tracked URLs that preserve click attribution without consuming excessive character space.
6. Write Tighter Copy
SMS copywriting is a discipline unto itself. Every word must earn its place. Some techniques for reducing character count without losing meaning:
- Remove filler words ("just," "really," "very," "that")
- Use numerals instead of spelled-out numbers ("30" not "thirty")
- Use common abbreviations where appropriate ("info" not "information")
- Lead with the value proposition, not the greeting
- Cut redundant phrases ("free gift" → "gift," "new innovation" → "innovation")
For a comprehensive guide to writing effective SMS copy within tight character constraints, see our SMS creative copywriting guide.
7. Use Merge Fields Carefully
Dynamic personalization fields like {first_name} add variable length to your messages. A message that is 155 characters with a 4-letter name becomes 163 characters with a 12-letter name — pushing it from one segment to two. When using merge fields, calculate your segment count based on the longest plausible value, not the average.
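A worst-case check can be sketched as follows. It assumes the rendered message is pure GSM-7 with no extended characters, so `len()` equals the unit count. The names `worst_case_segments` and `gsm7_segments`, and the sample template, are hypothetical.

```python
import math


def gsm7_segments(length: int) -> int:
    """Segments for a GSM-7 message with no extended characters."""
    return 1 if length <= 160 else math.ceil(length / 153)


def worst_case_segments(template: str, longest_values: dict) -> int:
    """Render the template with the longest plausible values and count segments."""
    rendered = template.format(**longest_values)
    return gsm7_segments(len(rendered))


# 151 characters of fixed text around the merge field (filler stands in for copy).
template = "Hi {first_name}, " + "x" * 146

print(worst_case_segments(template, {"first_name": "Anna"}))          # 4-letter name
print(worst_case_segments(template, {"first_name": "Maximilienne"}))  # 12-letter name
```

Running this against the longest value actually present in your subscriber list gives a more reliable budget figure than an average-length estimate.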
Segment Counting and Concatenation: Technical Details
For teams building SMS systems or integrating with APIs, understanding the technical mechanics of message concatenation is important for accurate cost forecasting.
How the UDH Works
When a message exceeds the single-segment limit, the sending platform splits it into parts and prepends a User Data Header to each segment. The UDH contains three key pieces of information:
- Reference number — A unique identifier linking all segments of the same message
- Total parts — The total number of segments in the concatenated message
- Part number — The sequence position of this particular segment
The UDH consumes 6 bytes (48 bits) of the 140-byte payload. For GSM-7, where each character uses 7 bits, this translates to approximately 7 characters of lost capacity per segment (48 ÷ 7, rounded up). For UCS-2, where each character uses 16 bits, it translates to 3 characters (48 ÷ 16).
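For readers building their own integrations, here is a sketch of the common 8-bit-reference form of this header along with the capacity arithmetic. The function name `concat_udh` is illustrative. Most messaging APIs build this header for you, so treat this as an explanation of the byte layout rather than code you need to ship.

```python
def concat_udh(reference: int, total: int, part: int) -> bytes:
    """Build the 6-byte concatenation header (8-bit reference variant)."""
    if not (0 <= reference <= 255 and 1 <= part <= total <= 255):
        raise ValueError("values out of range for an 8-bit concatenation header")
    return bytes([
        0x05,       # header length: five bytes follow
        0x00,       # element identifier: concatenated message, 8-bit reference
        0x03,       # element data length: three bytes follow
        reference,  # shared by every segment of the same message
        total,      # total number of segments
        part,       # 1-based position of this segment
    ])


PAYLOAD_BITS = 140 * 8
UDH_BITS = 6 * 8
gsm7_capacity = (PAYLOAD_BITS - UDH_BITS) // 7    # 7 bits per character
ucs2_capacity = (PAYLOAD_BITS - UDH_BITS) // 16   # 16 bits per character

print(concat_udh(reference=0x2A, total=3, part=1).hex())
print(gsm7_capacity, ucs2_capacity)
```

The integer division shows where the per-segment limits come from: 1,072 remaining bits hold 153 GSM-7 characters or 67 UCS-2 characters.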
Segment Splitting and Character Boundaries
Messages must be split at character boundaries, not byte boundaries. For GSM-7, this is straightforward since most characters occupy a single 7-bit slot (septet). However, GSM-7 extended characters (which use an escape sequence) cannot be split across segments: the escape character and the following character must stay in the same segment. This can occasionally reduce the effective capacity of a segment by one additional character.
For UCS-2, surrogate pairs (used for most emoji and every other character above U+FFFF) must also be kept together, which can reduce effective segment capacity by one character in edge cases.
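The UCS-2 case can be sketched in a few lines of Python. This is an illustrative splitter (the name `split_ucs2` is hypothetical) for a message already known to need multiple parts. It counts 16-bit units per character and never lets a surrogate pair straddle two segments.

```python
from typing import List


def split_ucs2(text: str, per_segment: int = 67) -> List[str]:
    """Split text into multi-part UCS-2 segments without breaking surrogate pairs."""
    segments, current, used = [], [], 0
    for ch in text:
        width = 2 if ord(ch) > 0xFFFF else 1  # 16-bit units this character needs
        if used + width > per_segment:
            # Close the segment early rather than split the character in half.
            segments.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += width
    if current:
        segments.append("".join(current))
    return segments


parts = split_ucs2("a" * 66 + "😊" + "b")
print([len(p) for p in parts])
```

In this example the first segment carries only 66 units because the emoji needs two and only one slot remains, which is the one-character capacity loss described above. The same idea applies to GSM-7 escape sequences.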
Calculating SMS Campaign Costs at Scale
Accurate cost forecasting requires multiplying three variables: audience size, segments per message, and cost per segment. Below is a framework for modeling campaign costs across different message configurations.
Cost Modeling Example
Assume a list of 250,000 subscribers and a per-segment cost of $0.0085 (an illustrative rate for US domestic SMS; actual pricing varies by provider and volume):
| Message Type | Characters | Encoding | Segments | Total Segments | Campaign Cost |
|---|---|---|---|---|---|
| Short promo | 95 | GSM-7 | 1 | 250,000 | $2,125 |
| Standard promo | 155 | GSM-7 | 1 | 250,000 | $2,125 |
| Long promo | 280 | GSM-7 | 2 | 500,000 | $4,250 |
| Short promo + emoji | 95 | UCS-2 | 2 | 500,000 | $4,250 |
| Standard promo + emoji | 155 | UCS-2 | 3 | 750,000 | $6,375 |
The difference between the most efficient message (95 GSM-7 characters, one segment) and the least efficient (155 UCS-2 characters, three segments) is a 3x cost multiplier for the same audience, or $4,250 per send. Over a year of weekly campaigns, that adds up to roughly $221,000.
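The figures in this table can be regenerated with a short script, which is handy when you want to plug in your own list size and rate. The sketch below uses `Decimal` to avoid floating-point rounding in currency. `campaign_cost` and `LIMITS` are illustrative names, and the character counts are treated as encoded units.

```python
import math
from decimal import Decimal

# (single-message limit, per-segment limit for multi-part messages)
LIMITS = {"GSM-7": (160, 153), "UCS-2": (70, 67)}


def campaign_cost(units: int, encoding: str, audience: int, rate: Decimal):
    """Return (segments per message, total campaign cost)."""
    single, multi = LIMITS[encoding]
    segments = 1 if units <= single else math.ceil(units / multi)
    return segments, segments * audience * rate


AUDIENCE = 250_000
RATE = Decimal("0.0085")  # dollars per segment, as assumed in the example

for units, encoding in [(95, "GSM-7"), (155, "GSM-7"), (280, "GSM-7"),
                        (95, "UCS-2"), (155, "UCS-2")]:
    segments, cost = campaign_cost(units, encoding, AUDIENCE, RATE)
    print(f"{units:>3} {encoding}: {segments} segment(s), ${cost:,.2f}")
```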
A/B Testing for Segment Efficiency
Character optimization should not come at the expense of performance. The goal is to find the message length and format that maximizes ROI — which means balancing segment cost against conversion rate.
A structured approach to testing:
- Test one-segment vs. two-segment variants — Write two versions of the same offer: one that fits in a single GSM-7 segment and one that uses two segments with more detail. Measure conversion rate and calculate cost-per-conversion for each.
- Test emoji vs. no-emoji — Compare a GSM-7 message with a UCS-2 emoji variant. Track whether the engagement lift from emoji justifies the segment cost increase.
- Test short URL vs. long URL — Measure whether branded long URLs perform differently from short URLs in terms of click-through rate.
Trackly's A/B testing and algorithmic creative selection capabilities make this kind of testing straightforward. You can set up multiple message variants with different character counts and encoding profiles, and the platform will automatically allocate more traffic to the top-performing creative based on real-time results.
Common Pitfalls and How to Avoid Them
Copy-Paste from External Sources
Copying text from websites, emails, or documents almost always introduces non-GSM characters. Curly quotes and special dashes are the most common offenders. Always paste into a plain text editor first, or use your SMS platform's encoding validator to catch these before sending.
Template Variables Exceeding Estimates
If your message template has 148 characters of fixed text plus a {first_name} field, and you estimated names averaging 5 characters, you are at 153 characters, safely within one segment. A subscriber named "Christopher" adds 11 characters instead, pushing the total to 159. Still safe. However, a slightly longer name combined with a longer offer code could push past 160. Always test with maximum-length values.
Carrier-Added Content
Some carriers and aggregators append opt-out language (e.g., "Reply STOP to unsubscribe") to messages. This appended text counts toward your character and segment totals. If your carrier adds 30 characters of compliance text, your effective single-segment limit drops to 130 characters for GSM-7. Check with your provider to understand whether this applies to your traffic.
Invisible Characters
Zero-width spaces, byte order marks, and other invisible Unicode characters can hide in copied text. These trigger UCS-2 encoding without any visible indication in the message. A robust encoding validator will flag these characters.
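A simple way to surface these characters is to look for the Unicode "format" category (Cf), which includes zero-width spaces, byte order marks, and soft hyphens. The sketch below uses Python's standard `unicodedata` module, and `find_invisible` is an illustrative name. It is a heuristic: other invisible characters exist outside this category, and zero-width joiners inside emoji sequences will also be flagged.

```python
import unicodedata


def find_invisible(text: str):
    """Return (index, code point, name) for each Unicode format character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


sample = "Sale\u200b now"  # contains a zero-width space after "Sale"
for index, code_point, name in find_invisible(sample):
    print(f"position {index}: {code_point} {name}")
```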
Building a Character Optimization Workflow
For teams sending SMS at scale, character optimization should be a systematic part of the campaign workflow, not an afterthought. Here is a practical process:
- Draft in a plain text environment — Avoid rich-text editors for initial composition.
- Run encoding validation — Use your platform's GSM-7 checker to identify any non-GSM characters. Trackly's deliverability tools flag these automatically in the composer.
- Check segment count with dynamic fields — Calculate the worst-case segment count using the longest plausible merge field values.
- Review shortened URLs — Ensure link shortening is applied and the shortened URL is included in the character count.
- Calculate projected cost — Multiply segments × audience size × per-segment rate to confirm the campaign is within budget.
- A/B test when in doubt — If you are debating between a one-segment and two-segment version, test both and let the data decide.
The most cost-effective SMS campaigns are not necessarily the shortest. They are the ones where every character is intentional, encoding is validated before send, and segment count is a deliberate choice rather than an accident.
Key Takeaways
- A single GSM-7 SMS supports 160 characters. A single UCS-2 SMS supports only 70.
- Multi-part messages lose 7 (GSM-7) or 3 (UCS-2) characters per segment to concatenation headers.
- One non-GSM character — a curly quote, an em dash, an emoji — forces the entire message into UCS-2, potentially doubling or tripling segment count.
- At scale, encoding mistakes translate to thousands of dollars in unnecessary spend per campaign.
- Always validate encoding and preview segment count before sending. Platforms with built-in GSM-7 validation and segment counters eliminate guesswork.
- A/B test message length and emoji usage to find the cost-adjusted performance optimum for your audience.
If you are looking for a platform that provides full visibility into encoding, segment counts, and cost projections before you hit send, Trackly's deliverability tools are built for this kind of optimization.