The 5 Thai tones, with audio for every one

14 min read

If you've already read the tones primer, you know the shape of the problem. Thai has five tones, each one changes the meaning of a syllable, and English speakers tend to get them wrong in predictable ways.

This piece is the reference. One section per tone — what it sounds like, what shape it traces, the minimal pairs that catch you out, and the specific exercise that fixes it. Audio for every example. Bookmark this and come back to it; nothing about tones gets fixed in one read.

How to read this piece

Each tone has the same four parts:

  1. The shape. What the pitch contour looks like, in plain language.
  2. The example word. A common Thai word with that tone. Audio.
  3. The minimal pair. A second word that differs only in tone — the kind of pair that gets you misunderstood.
  4. The fix. The specific thing English speakers do wrong on this tone, and the drill that retrains it.

There's a final section on tones in connected speech — the bit textbooks skip, and the part that actually trips up real conversation.

A note on transliteration. The pronunciations below use phonetic-tone notation: ǎ for rising, â for falling, à for low, á for high, plain a for mid. Long vowels are written doubled (aa). If you've been reading romanised Thai elsewhere and the marks look different, we wrote about why. Same word, different system.
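
For readers who keep their word lists in a script or spreadsheet, the notation above can be read mechanically. The sketch below is only an illustration of the convention as described here: the function name `tone_of` is hypothetical, and it assumes each syllable carries at most one tone mark.

```python
import unicodedata

# Combining marks used by the notation described above.
# Assumption: one tone mark per syllable, no mark means mid tone.
TONE_MARKS = {
    "\u030c": "rising",   # caron, as in ǎ
    "\u0302": "falling",  # circumflex, as in â
    "\u0300": "low",      # grave, as in à
    "\u0301": "high",     # acute, as in á
}

def tone_of(syllable: str) -> str:
    """Return the tone name encoded in a romanised syllable."""
    # NFD splits "ǎ" into "a" plus a separate combining mark.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "mid"

for word in ["taa", "kài", "mâi", "máa", "kǎao"]:
    print(word, tone_of(word))
```

Decomposing with NFD means the lookup works whether the accented vowel was typed as one character or as a base letter plus a combining mark.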

Tone 1 — Mid

The shape

Flat. Steady. Held at your normal speaking pitch from start to finish, neither rising nor falling. If you said the syllable on a single piano note, that's mid tone.

This is the easy one. English doesn't have tones, but English speakers default to roughly mid pitch when they're reading text aloud. So mid tone is the one you'll already produce by accident, most of the time.

Example

ตา (taa) — eye

Listen for: a single steady pitch, no movement up or down. If your "taa" rises at the end like an unfinished question, you've drifted off mid into rising-tone territory.

Minimal pair

  • ตา (taa, mid) — eye
  • ตี (tee, mid) — to hit. Not a true minimal pair (ต่า doesn't exist as a word), but a useful comparison: same tone, different vowel. A pure-mid drill.

The fix

Most English speakers don't break mid tone itself. They break what surrounds it, by drifting into a question-like rise on the last syllable of a sentence. The fix here isn't tone-specific; it's prosody-wide. Read sentences aloud and listen for the unwanted rise at the end.

Drill: hum your normal speaking pitch on a single sustained note. Then say taa on that exact note. Then say taa-tee-too as three syllables in a row, all on that same pitch. Boring is correct.

Tone 2 — Low

The shape

Steady, like mid, but pitched at the bottom of your speaking range. Imagine the lowest note you can comfortably say without dropping into a growl. That's low tone.

The crucial detail: low tone doesn't move. It's flat-low, not falling-low and not sad-and-descending. English speakers tend to render it as a melancholy descent from mid, which is wrong; that's actually closer to falling tone (Tone 3).

Example

ไข่ (kài) — egg

Listen for: a low, flat pitch. The syllable starts low and stays low. No descent, no contour, no sigh.

Minimal pair

  • ไข่ (kài, low) — egg
  • ไข้ (kâi, falling) — fever

Same consonant, same vowel, same finishing position. Different tone, completely different word. If you mistake your friend's fever for breakfast, this is why.

The fix

The English speaker's instinct is to mix in some descent — to make low tone feel like the end of a sad sentence. Don't. Hold the pitch.

Drill: hum the lowest note in your speaking range. Stay on that note. Now say kài while holding that pitch. Repeat ten times. Then say a sequence of low-tone words back-to-back: kài, kài, kài. The pitch should be identical on each one. If it dips, you're falling, not low.

Tone 3 — Falling

The shape

Starts high — higher than your normal speaking pitch. Drops fast to low. The whole movement happens within a single syllable. It feels physical, almost percussive.

Falling tone in Thai is the closest analogue to the English "no!" said with energy — the kind where pitch starts high and crashes down. Not a bored "no, whatever" — that's mid-with-a-sigh, which is wrong. The right falling tone has weight at the start.

Example

ไม่ (mâi) — not / no

Listen for: a clear high-to-low movement. The "ai" diphthong starts at the top of your range and ends at the bottom. The whole thing takes about a third of a second.

Minimal pair

  • ไม่ (mâi, falling) — not / no
  • ไหม (mǎi, rising) — question particle ("Is it...?")

Same vowel, completely different tone. gin mâi and gin mǎi are different sentences — one is a fragment ("not eat"), the other is a complete question ("are you eating?"). The contour is doing all the disambiguation.

The fix

English speakers fall too gently — they begin at mid pitch and descend slightly, which sounds like Tone 1 with a sigh. The fix is to start higher than feels natural.

Drill: say the English exclamation "no!" and really commit, like someone just suggested something terrible. Notice the pitch starts high. Now apply that same starting pitch to mâi. Then taper it down. The start of the syllable should feel almost too high.

A second drill: alternate mâi (falling) with mǎi (rising) ten times. The contour shapes are mirror images of each other; training one trains the other.

Tone 4 — High

The shape

Pitched higher than mid, held there. A flat, elevated tone. Like singing a single note at the top of your speaking range.

This one is rare in everyday Thai — most words you'll meet use mid, low, falling, and rising. High tone shows up in certain particles and a smaller set of vocabulary, and it's often confused with rising tone (Tone 5) by learners because both are above mid.

The difference: high tone stays high. Rising tone arrives high.

Example

ม้า (máa) — horse

Listen for: a syllable held at an elevated pitch, flat, no movement. If your máa starts low and rises up, you've crossed into rising tone.

Minimal pair

  • ม้า (máa, high) — horse
  • มา (maa, mid) — come
  • หมา (mǎa, rising) — dog

The famous trio. Three syllables that sound nearly identical to an untrained English ear, three completely different words. If you've ever seen a tonal-language YouTube clip about "horse, dog, come", this is the one.

The fix

The instinct on high tone is to start in your mid-range and rise up — which produces rising tone, not high tone. The fix: start high and stay there. Don't move.

Drill: pick a high note in your speaking range. Hum it. Now say máa on that note without changing pitch. Then say máa, máa, máa — three consecutive high tones, identical pitch on each. If you can hear yourself "leading in" with a lower pitch and rising, slow down and sing the syllable purely on the high note.

Then alternate: máa (high) → mǎa (rising) → máa (high) → mǎa (rising). Feel the difference between starting-high and arriving-high.

Tone 5 — Rising

The shape

Starts low, moves up to high. A clear upward contour over the syllable. The pitch should genuinely rise, not just sit at mid.

The trap: English uses rising pitch to signal questions. So when an English speaker tries to produce rising tone, the result sounds like a question even when the sentence is a statement. The contour shape is right, but the pragmatic reading is wrong.

Example

ขาว (kǎao) — white

Listen for: a clear arc from low pitch to high pitch over the syllable. The "aao" portion should be doing the bulk of the rise.

Minimal pair

  • ขาว (kǎao, rising) — white
  • ข้าว (kâao, falling) — rice
  • ข่าว (kàao, low) — news

Three words, three tones, all on the same vowel. Rice, white, news — three things you'll talk about in any first week in Thailand. Get the tones wrong and you're saying "white fried" when you wanted "fried rice".

The fix

The "rising tone sounds like a question" problem doesn't go away by trying to suppress the rise — the rise is necessary, the tone literally is rising. Instead, separate the contour from the question-feeling.

Drill: practise rising tone on declarative sentences. kǎao at the end of a sentence is a fact: "It is white". Read whole sentences aloud where the last syllable carries rising tone, and force yourself to stop the sentence there — no question intonation on top of the tonal rise. Just: rise, period.

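A second drill: follow the rising-tone word with a confirming particle, as in kǎao châi mái ("it's white, right?"). The rising tone sits on kǎao; the question feeling is in mái. This is the question particle ไหม, written mǎi above, which is usually pronounced with a high tone in casual speech. Keeping these separate in your mouth trains the discrimination.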
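
The five shapes described in the sections above can be laid side by side as start and end pitch levels. This is a rough sketch, not measured phonetics: the 1 to 5 scale and the exact numbers are assumptions chosen to match the plain-language descriptions in this piece, and the helper names are hypothetical.

```python
# (start, end) pitch on an illustrative scale:
# 1 = bottom of your speaking range, 3 = normal pitch, 5 = top.
CONTOURS = {
    "mid": (3, 3),
    "low": (1, 1),
    "falling": (5, 1),
    "high": (5, 5),
    "rising": (1, 5),
}

def is_level(tone: str) -> bool:
    """True when the tone is held flat rather than moving."""
    start, end = CONTOURS[tone]
    return start == end

def same_ending(a: str, b: str) -> bool:
    """True when two tones finish at the same pitch."""
    return CONTOURS[a][1] == CONTOURS[b][1]

def is_mirror(a: str, b: str) -> bool:
    """True when one contour is the other run backwards."""
    return CONTOURS[a] == CONTOURS[b][::-1]

print(same_ending("high", "rising"))   # both finish high
print(same_ending("low", "falling"))   # both finish low
print(is_mirror("falling", "rising"))  # opposite directions
```

Seen this way, the two confusions flagged earlier share a cause: each pair finishes at the same pitch, and only the starting point tells them apart.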

The thing textbooks skip — tones in connected speech

Single-word tone production is the easy part. The hard part is keeping tones in shape across a sentence.

Three things happen to tones in connected Thai that don't show up in word lists:

Sandhi — tones change in compounds

When two words combine into a compound, the first word's tone sometimes shifts. The classic example is saw-ǎat (clean) — written as if both syllables carry their declared tone, but in fluent speech the first syllable assimilates.

This isn't a thing you need to memorise rules for. It's a thing you need to hear hundreds of times until your ear stops expecting word-list-style pronunciations from connected speech.

Stress patterns flatten weak syllables

Thai isn't a stress-timed language the way English is, but unstressed syllables in fluent speech do compress. The tone shape becomes shorter; the contour gets sketched rather than fully drawn.

This is normal and native. You shouldn't try to slow down to enforce textbook contours — that's how you sound like a textbook learner. Trust the rhythm; the contours come back when the syllable carries weight.

Particles often soften their tones

Polite particles like khrap and kha technically have specific tones (high and falling, respectively). In real speech, especially in casual register, they often flatten toward mid. This is fine, and trying to enforce the "correct" tone on a polite particle in casual speech can sound stilted.

The lesson: train tone production hard on stressed words, then let it relax in particles and unstressed syllables. The shape is right when you say a tonal word in isolation; the shape can shift when the word is one of many.

How to use this article

You won't fix all five tones in one session. Or one week. The realistic timeline is six to twelve weeks of consistent imitation-and-recording practice before tones stop being the conscious bottleneck.

The pattern that works:

  1. Pick one tone per week. Drill it on three or four words. Record yourself.
  2. The next week, add the second tone. Drill both. Record both. Listen back-to-back with native audio.
  3. By week five, you've touched every tone. Then start cycling through pairs — falling-vs-low, rising-vs-high — to train discrimination.
  4. By week ten, start mixing tones in short sentences and listening for tone integrity in connected speech.
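
If it helps to see the schedule written out week by week, the steps above can be sketched as a small generator. The tone order (the order of the sections in this piece), the alternation between the two pairs, and the name `weekly_plan` are all assumptions made for illustration.

```python
# Assumed order: the order the tones appear in this article.
TONES = ["mid", "low", "falling", "high", "rising"]
# The two discrimination pairs named in step 3.
PAIRS = [("falling", "low"), ("rising", "high")]

def weekly_plan(weeks: int = 10) -> list[str]:
    """Return one practice focus per week."""
    plan = []
    for week in range(1, weeks + 1):
        if week <= len(TONES):
            # Weeks 1-5: add one tone per week, keep drilling the earlier ones.
            plan.append("drill " + ", ".join(TONES[:week]))
        elif week < weeks:
            # Middle weeks: alternate between the two pairs.
            a, b = PAIRS[(week - len(TONES) - 1) % len(PAIRS)]
            plan.append(f"discriminate {a} vs {b}")
        else:
            # Final week: tones in connected speech.
            plan.append("mix tones in short sentences")
    return plan

for number, focus in enumerate(weekly_plan(), start=1):
    print(f"week {number}: {focus}")
```

Treat the output as a starting template; if one pair keeps tripping you up, repeat that week rather than moving on.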

This article is the reference you come back to. Read once for orientation; come back when you find yourself getting one tone wrong in real conversation.

The pronunciation mistakes piece covers the six most common tone errors English speakers make, with fixes. The tones primer covers why tones are the bottleneck in the first place. Together with this reference, that's the tones cluster on ThaiDai.

If you want the audio drilled into a daily-practice loop rather than read on a page, the getting-started guide walks through the ThaiDai practice deck — every word has native-trained Thai audio and tone-marked romanisation so the pitch shape is encoded in the spelling itself.

