RMeCab

[Figure: graph of the most used nouns in the speech]

RMeCab is an interface from R to MeCab.

% R
> library(RMeCab)
> result <- RMeCabFreq("Abe.txt")

The code above gives you a word frequency table for the text file “Abe.txt”, which contains the Policy Speech by Prime Minister Shinzo Abe to the 186th Session of the Diet, delivered on Friday, January 24, 2014.

From the table, you can draw graphs, for example of the most used nouns in the text. The most used noun was “world”, shown in red in the graph, which appears 29 times in the speech. The other words, in descending order, are “local regions” (green), “economy” (pink), “my fellow Japanese” (blue), and so on.
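
As a rough sketch of how such a graph could be produced, the snippet below filters a frequency table down to nouns, sorts it, and draws a bar chart in base R. It does not call RMeCab itself: `freq` is a hypothetical stand-in for the result of `RMeCabFreq("Abe.txt")`, and the column names (`Term`, `Info1`, `Info2`, `Freq`) are an assumption about the shape of that result. Only the count of 29 for “world” comes from the speech; the other counts are placeholders, and the terms are written as English glosses where the real table would hold the Japanese words.

```r
# Hypothetical stand-in for the data frame returned by RMeCabFreq("Abe.txt").
# Column names are assumed. Only the count of 29 for "world" is taken from
# the text above; every other number here is a placeholder.
freq <- data.frame(
  Term  = c("world", "local regions", "economy", "do", "of"),
  Info1 = c("名詞", "名詞", "名詞", "動詞", "助詞"),  # part of speech: noun, verb, particle
  Info2 = c("一般", "一般", "一般", "自立", "連体化"),  # part-of-speech subcategory
  Freq  = c(29, 20, 18, 50, 120),
  stringsAsFactors = FALSE
)

# Keep only the rows tagged as nouns ("名詞").
nouns <- freq[freq$Info1 == "名詞", ]

# Sort the nouns by descending frequency and keep at most the top ten.
nouns <- nouns[order(nouns$Freq, decreasing = TRUE), ]
top_nouns <- head(nouns, 10)

# Draw one colored bar per noun.
barplot(top_nouns$Freq,
        names.arg = top_nouns$Term,
        col = rainbow(nrow(top_nouns)),
        ylab = "Frequency")
```

With a real result from RMeCabFreq, the first assignment would be replaced by `freq <- result`, provided the column names match the assumption above.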

MeCab

[Image: boiled wakame]

MeCab (mekabu) is the name of the lower part of “wakame”, an edible seaweed, but it is also “Yet Another Part-of-Speech and Morphological Analyzer”, a tool for Japanese text segmentation.

Let’s see what happens when you feed a Japanese sentence meaning “My name is Mike.” into MeCab.

[Screenshot: MeCab output for the sentence]