
Kikitori: Japanese listening – the Dolly Sentences Method – Technical how-to-do-it

I have talked a bit about the dolly sentences method of kikitori or Japanese listening study. Or rather, general Japanese study with an emphasis on listening. Now we are going to look at the technicalities of how it is actually done.

I am using a Mac, and as we will see, there is a certain advantage in that, but for the most part the method will be similar for other operating systems.

First, in your Anki you need to install an addon (Tools > Addons > Browse & install) – one or both of Awesome TTS and Google TTS. Once you have installed it/them, you will see one or two speaker icons added to the top bar of the add card window:

[Image: kikitori-japanese-listening-2]

If you click one of these it will give a window like this:

[Image: kikitori-japanese-listening-3]

Here you can type or paste the text you want spoken and click the preview button. If you get silence, it is probably because you haven’t changed the language. The language drop-down must be set to Japanese for the Google synthesizer, or to Kyoko for OSX’s voice system (if you are not using an Apple device, don’t worry about Kyoko).

The text will be read back to you. You may need to make some changes. Sometimes the synthesizer will read kanji incorrectly. Kyoko – while in most respects the best consumer-level voice synthesizer available – is particularly bad about this. She reads 人形 as ひとかたち, for example – particularly galling to a doll. If that happens, just re-spell the word in kana (にんぎょう in this case). You may also need to add or delete commas to get the sentence read in a way that is clear and understandable.
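If you are on a Mac, you can also check how Kyoko will read a sentence without going through Anki at all, by calling the system voice directly. This is just a convenience sketch using macOS’s built-in say command; the function name and output file are my own placeholders, and it assumes the Kyoko voice is installed:

```python
import subprocess

def preview_kyoko(text, outfile=None):
    """Speak `text` with macOS's Kyoko voice; optionally save it to a file.

    Assumes macOS with the Japanese (Kyoko) voice installed.
    """
    cmd = ["say", "-v", "Kyoko"]
    if outfile:
        # `say` writes AIFF or M4A depending on the extension; convert to MP3
        # separately (e.g. with ffmpeg) if your player needs MP3.
        cmd += ["-o", outfile]
    cmd.append(text)
    subprocess.run(cmd, check=True)

# Hear how Kyoko reads a sentence before pasting it into the add-on:
preview_kyoko("人形はお茶を飲みます。")
```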

When you are happy with the synthesis, just hit OK and the audio will be inserted into whichever field your cursor was in when you opened the box (I have an Audio field on my cards, as you can see in the first screenshot).

This is really all there is to adding spoken sentences to Anki. For my method I have the audio file play on both the front and the back of the card.
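In card-template terms, that just means the audio field appears in both the front and the back template. If you ever wanted to set up such a deck programmatically rather than through the add-card window, a rough sketch with the third-party genanki library (not something this tutorial uses; the IDs, field names, and sound file are placeholders) might look like this:

```python
import genanki

# Sketch of a note type with the audio field on both sides of the card.
model = genanki.Model(
    1607392319,
    "Dolly Sentence (sketch)",
    fields=[
        {"name": "Expression"},
        {"name": "Furigana"},
        {"name": "Audio"},
    ],
    templates=[{
        "name": "Listening",
        # Front: the sound plus the Japanese text.
        "qfmt": "{{Audio}}<br>{{Expression}}",
        # Back: the front side again (so the audio replays) plus furigana.
        "afmt": "{{FrontSide}}<hr id=\"answer\">{{Furigana}}{{Audio}}",
    }],
)

deck = genanki.Deck(2059400110, "Dolly Sentences (sketch)")
deck.add_note(genanki.Note(
    model=model,
    fields=["人形はお茶を飲みます。", "にんぎょうはおちゃをのみます。", "[sound:example.mp3]"],
))
genanki.Package(deck).write_to_file("dolly_sentences_sketch.apkg")
```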

You can then harvest the sentences to put them on your MP3 player in accordance with the Dolly Sentences Method. It really is easier than you might think. First you need to find them, and they are in your Anki folder in a sub-folder called collection.media. Here it is (you can click the image to enlarge it):

[Image: kikitori-japanese-listening-1]

Let’s look at the red rings.

1. (top and bottom) shows you the file-path on a Mac. It will be similar on Windows. Just search “collection.media” if you have trouble finding it.

2. shows you the actual sentences. They are small MP3 files. If you are using Mac OSX’s Kyoko voice, the title of the file is conveniently the sentence itself.

3. If you are using the Google voice synthesizer the title is a lengthy code. But don’t worry because:

4. You can always use the date of the file to show you what you have added recently.

What you will do is simply copy-drag all your recently-added sentences into a folder and add this to your MP3 player. It really is as simple as that.
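If you would rather script the harvesting than drag files by hand, something along these lines works. It is only a sketch: the profile name “User 1”, the destination folder, and the seven-day window are my own placeholders, and the Anki2 path is one common Mac location, so check ring 1 in the screenshot for yours:

```python
import shutil
import time
from pathlib import Path

# One common Mac location of the media folder (ring 1); "User 1" is a
# placeholder profile name. On Windows, search for "collection.media"
# and paste your own path here.
MEDIA_DIR = Path.home() / "Library/Application Support/Anki2/User 1/collection.media"
DEST_DIR = Path.home() / "Desktop/sentences-for-mp3-player"
DAYS = 7  # harvest anything added in the last week (ring 4: go by file date)

DEST_DIR.mkdir(parents=True, exist_ok=True)
cutoff = time.time() - DAYS * 24 * 60 * 60

for mp3 in MEDIA_DIR.glob("*.mp3"):
    if mp3.stat().st_mtime >= cutoff:
        shutil.copy2(mp3, DEST_DIR / mp3.name)
        print("copied", mp3.name)
```

Then just copy the destination folder onto your MP3 player as before.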


Update: Google TTS no longer works with Anki, but Awesome TTS now gives access to the vastly superior Acapela TTS engine among others. This also has the advantage that you now only need one addon to choose between Acapela and Apple’s Kyoko voice, if you have her. You can also now choose human-readable filenames for everything. The instructions in this tutorial still apply.


Here is a sample, complete with recommended 3-second break, to show how the sentences actually sound:

These sentences are spoken by Mac OSX’s Kyoko voice, which, apart from her problems with Kanji reading, is in my view the best Japanese voice synthesizer available. The third sentence is spoken by the Google synthesizer, so you can hear the difference.

I use about 95% Kyoko with the Google alternative for the minority of occasions when Kyoko really won’t read a sentence well (this also mixes up the speech a bit, which I think is good). Google’s synthesizer will be installed automatically when you install the Google TTS addon. If you are using an i-device (iPad, iPhone etc) you should be able to use Kyoko too, though I am not certain about this (please let me know in the comments if you find out).

Kyoko speaks well and naturally for the most part, and actually knows the difference in tone between many Japanese homophones. For example if you type 奇怪 kikai (strange, mysterious) and 機械 kikai (machine), Kyoko will pronounce them each with the correct syllable raised, which is what differentiates them in spoken Japanese.

If you end a sentence with a ?, Kyoko will raise her tone into a question intonation very naturally. The Google synthesizer does not do this, nor is it aware of tone differences between homophones. On the other hand, type a ! and Kyoko ends the sentence with a funny noise, and she makes far more kanji errors than Google. Neither of these problems really matters (just avoid ! and re-spell mispronounced kanji in kana).
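If you are pasting a lot of sentences, both workarounds are easy to apply in one pass before the text ever reaches the synthesizer. A small illustrative sketch; the respelling table is hypothetical and you would fill it with whatever readings your synthesizer actually gets wrong:

```python
# Pre-process a sentence before pasting it into the TTS box:
# drop the "!" that trips Kyoko up, and re-spell kanji she misreads.
RESPELLINGS = {
    "人形": "にんぎょう",  # otherwise read as ひとかたち
}

def prepare_for_tts(sentence):
    for bang in ("!", "！"):
        sentence = sentence.replace(bang, "")
    for kanji, kana in RESPELLINGS.items():
        sentence = sentence.replace(kanji, kana)
    return sentence

print(prepare_for_tts("人形が来た！"))  # -> にんぎょうが来た
```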

The Apple synthesizer is considerably ahead of Google’s alternative and yet is in some minor respects surprisingly unpolished. But if you have a device that supports her you should definitely use her.

So there you have the technical aspects of the Dolly Kikitori sentences method. If you have any questions or want to share your experiences, please use the comments section below. For the method itself, please go here.

Kikitori – the Dolly Sentences Japanese Listening Method

[Image: kikitori-japanese-listening]

I was a little hesitant in writing about my kikitori Japanese listening sentences method, because it may be somewhat idiosyncratic. However, it is working well and friends have taken some interest, so I’ll go ahead.

I have read about the 10,000 sentences method of Japanese learning which is recommended by some immersion-inclined sites. Frankly, I could never fully understand it — but then I am just a doll. I fiddled with it for some time and never really got to grips with it.

I did, however, like the idea of learning Japanese in sentences rather than just words. After all, that is how children learn language, and it gives one the feel of what just “sounds right”, rather than merely knowing grammar rules. I am by no means saying one shouldn’t know grammar rules — often one needs to — but I have always argued that grammar is a quick-and-dirty shortcut by which adult learners half-learn a language from the outside rather than actually knowing inside what feels right. Shortcuts can be good. They can even be necessary. But you don’t know a language till you can feel it. You only know about it.

My new assault on the sentences method came about partly as a result of my looking for new ways to improve my kikitori — Japanese listening. I started turning the sentences into digitized speech and putting them in Anki. I would review with my eyes shut and only count myself correct if I got the sentence first time without looking. If I couldn’t, I would give it a second hearing, and if I still couldn’t get it I would open my eyes to see the Japanese text. Only as a last resort do I turn the card over to see the furigana. This rarely happens (after all I can both see and hear the text if I need to open my eyes). The very last resort is to scroll down to where I have (sometimes — when I think I might need it) hidden a translation. This I try not to use and rarely do.

There is a second phase to this method, and that is putting the sentences on an MP3 player. I then play them on a random loop in any spare time (when cooking or walking, for example, and often when resting). I use this a lot, which means I get a lot of exposure to the sentences.

I put a three-second gap between sentences. This is the most my player allows (annoyingly, I don’t think iPods have a means of doing this at all). This gives me a little time to think about a sentence after hearing it, and I think this is important. It is true that in the wild you don’t get any thinking time. But if you are at the stage when you can’t catch much in the wild (in anime or regular non-foreigner-directed conversation), this is what you need in order to get there.
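If your player cannot insert gaps at all (the iPod problem just mentioned), one workaround is to bake the three seconds of silence into a single file before loading it. A rough sketch using the third-party pydub library, which needs ffmpeg installed; the folder name matches the harvesting sketch in the technical companion article and is only an example:

```python
import random
from pathlib import Path

from pydub import AudioSegment  # third-party; requires ffmpeg for MP3 work

SENTENCE_DIR = Path.home() / "Desktop/sentences-for-mp3-player"
GAP = AudioSegment.silent(duration=3000)  # the three-second thinking gap

files = list(SENTENCE_DIR.glob("*.mp3"))
random.shuffle(files)  # approximate the random loop

combined = AudioSegment.empty()
for mp3 in files:
    combined += AudioSegment.from_mp3(str(mp3)) + GAP

combined.export(str(SENTENCE_DIR / "sentences_with_gaps.mp3"), format="mp3")
```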

In those three seconds you do certain things and one of them in particular is, I believe, fundamentally important to kikitori or hearing Japanese (or any other language). You correct what you hear. In our familiar language I believe we do this all the time. We hear the word “bubble”, realize that doesn’t fit the sentence we are hearing, and correct it to “double”. We hear the word “wise” and correct it to “wives” (or if we actually don’t understand the context, we don’t — which is why so many people make the blooper “old wise tales” in writing — Google finds over four thousand instances of “old wise tales”).

This common slip (and many others) underlines my point. It really isn’t easy, even in one’s native language, to tell “wise” from “wives”. Ninety-nine percent of the time we understand what the sentence should be and correct mishearings so fast we don’t even know we’ve done it. It is one of the key subliminal skills that makes kikitori — in any language — possible.

With a three-second gap between sentences, we are able to perform this correction-hearing in slow-ish motion, which, at this stage, we need to.

Now, as I have pointed out before, language consists to a very large extent of set phrases and collocations. Words go together in the same groups most of the time. That is a large part of the reason that kikitori is actually possible in any language.

Hearing sentences and auto-correcting (in slow motion at first) lets one go through the same process a child goes through. She hears words together. At first she mispronounces them, and even when she knows what a common word-group means she may not fully understand what the component words are. Slowly it all starts to make sense.

During sentence-listening we think we hear “kaishite”, for example, and realize it must be “taishite”. We also start to get a feel for the fact that, in the hundreds of similarly constructed sentences we will hear in our Japanese-language life, “kaishite” is actually going to be “taishite”. After a while it won’t even matter, because just as in reading we don’t need to see (and, as studies have shown, don’t even look at) all the letters, so in kikitori we don’t need to hear all the sounds. We get the pattern and fill in or auto-correct the gaps. If we don’t know or fully understand a phrase (such as “old wives’ tales”), even in our first language, we can’t auto-correct it, and we may go through life hearing it wrongly, as many people in fact do.

The vital point to grasp here is that while our natural, “naive” view of native-language kikitori is that we hear correctly and therefore understand, to a large extent the reverse is true: we understand and therefore hear correctly. Of course both are going on at once, and it is the interplay of the two that makes language-understanding possible.

With one phrase (like old wives’ tales) mishearing doesn’t matter very much. In fact we end up knowing what the phrase means even while consistently mishearing it. But when, in Japanese, we are faced with dozens of phrases that we can’t auto-correct, or can’t auto-correct quickly enough to keep up, then we can’t understand what is being said.

So the three seconds between sentences gives us a kind of middle ground. Doing the sentences in Anki we can ponder the sound at our own pace. In the wild we have almost no time. On the MP3 player we have three seconds to auto-correct anything in the sentence that needs it, as well as to muse on the grammar, realize, perhaps on the 30th hearing, “ah, so that’s why…” and so on.

These are all things a child does. Those of us who grow up continually pondering the ins and outs of language are probably more childish than odd. Children have to do a lot of that for the first several years. Some of us just never stop.

There are many important and interlocking benefits to this sound-sentence method. When we learn vocabulary via Anki, we only know the definitions of words — not how they are actually used. Now when I am going through my vocab Anki I am continually stopping with “that one needs a few sentences”. Once I have become familiar with several sentences using the word, I am much clearer on its range of uses and its nuances. I am also much less likely to forget the word.

In doing all this we are going through the process a child goes through. We are learning how words fit together, what they imply, what their near neighbors are likely to be in a sentence. We are also building up a fund of examples in our mind which we will use, sometimes consciously, but often — and this is where language starts to become natural — unconsciously, to compare with new sentences and new uses of the same words in different contexts. You build up your feel for the language. You start to hear what “just sounds right” without necessarily knowing why.

Surprisingly, hearing the sentences can even help with kanji, since one will sometimes in the Three Seconds think「あぁ。それは緊張の緊ですね」(“Ah, that’s the 緊 kin of kinchou, isn’t it”). Because that is part of how Japanese words fit together and mean what they mean.

Currently I am at 1,600 sentences using this method and I am finding it extremely useful, not only for kikitori but for every aspect of Japanese.

The “throw ’em in at the deep end” school may complain about the three-second recognition time, but I am not suggesting that sentences should be our only listening practice. Full-speed native Japanese materials should definitely be used. But using this method, I think you will find that your ability to process that full-speed Japanese progresses a lot more rapidly.

How do I get spoken sentences in Anki? How do they actually sound? How do I get them to my MP3 player? Find all the answers in our sister article on the technical tricks of the Japanese-listening sentences method. It’s easier than you think – even a doll can do it!