Name: Suzume
Author: libraz

Question 1

What is Suzume?

Accepted Answer

Suzume is a lightweight, feature-driven Japanese tokenizer that runs on WebAssembly. Unlike dictionary-based analyzers such as MeCab, it works without large dictionary files and is robust to unknown words. It runs in browsers, Node.js, Deno, and Bun.

Question 2

How does Suzume handle unknown words?

Accepted Answer

Suzume generates candidates from character patterns (kanji sequences, katakana sequences, alphanumeric compounds) and evaluates them alongside dictionary entries using the Viterbi algorithm, which selects the lowest-cost sequence of candidates covering the text. This makes it robust to neologisms and domain-specific terms.
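The idea can be illustrated with a small, self-contained sketch. This is not Suzume's implementation: the names (`Candidate`, `buildLattice`, `viterbi`, `segment`), the cost constants, and the toy dictionary are all hypothetical. It also uses word costs only, whereas real analyzers typically add connection costs between neighboring tokens.

```typescript
// Illustrative sketch only: hypothetical names and costs, not Suzume's internals.
interface Candidate {
  start: number; // index of the first character (inclusive)
  end: number; // index after the last character (exclusive)
  surface: string;
  cost: number; // lower is preferred
  source: "dict" | "pattern" | "fallback";
}

type CharClass = "kanji" | "katakana" | "alnum" | "other";

const PATTERN_COST = 4; // assumed cost of a pattern-generated candidate
const FALLBACK_COST = 10; // assumed cost of a single-character fallback

function classify(ch: string): CharClass {
  if (/[\u4E00-\u9FFF]/.test(ch)) return "kanji";
  if (/[\u30A0-\u30FF]/.test(ch)) return "katakana";
  if (/[A-Za-z0-9]/.test(ch)) return "alnum";
  return "other";
}

// Collect dictionary hits, same-class character runs, and single-character fallbacks.
function buildLattice(chars: string[], dictionary: Map<string, number>): Candidate[] {
  const candidates: Candidate[] = [];
  for (let i = 0; i < chars.length; i++) {
    for (const [word, cost] of dictionary) {
      const len = Array.from(word).length;
      if (chars.slice(i, i + len).join("") === word) {
        candidates.push({ start: i, end: i + len, surface: word, cost, source: "dict" });
      }
    }
    const cls = classify(chars[i]);
    if (cls !== "other") {
      let j = i + 1;
      while (j < chars.length && classify(chars[j]) === cls) j++;
      const surface = chars.slice(i, j).join("");
      candidates.push({ start: i, end: j, surface, cost: PATTERN_COST, source: "pattern" });
    }
    // The fallback guarantees that every position can be reached.
    candidates.push({ start: i, end: i + 1, surface: chars[i], cost: FALLBACK_COST, source: "fallback" });
  }
  return candidates;
}

// Minimum-cost path from position 0 to `length` (Viterbi with word costs only).
function viterbi(length: number, candidates: Candidate[]): Candidate[] {
  const best = new Array<number>(length + 1).fill(Infinity);
  const back = new Array<Candidate | null>(length + 1).fill(null);
  best[0] = 0;
  for (let pos = 0; pos < length; pos++) {
    if (best[pos] === Infinity) continue;
    for (const c of candidates) {
      if (c.start !== pos) continue;
      const total = best[pos] + c.cost;
      if (total < best[c.end]) {
        best[c.end] = total;
        back[c.end] = c;
      }
    }
  }
  // Walk the back-pointers from the end of the text to recover the best path.
  const path: Candidate[] = [];
  let pos = length;
  while (pos > 0) {
    const c = back[pos];
    if (c === null) throw new Error(`no path reaches position ${pos}`);
    path.push(c);
    pos = c.start;
  }
  return path.reverse();
}

function segment(text: string, dictionary: Map<string, number>): Candidate[] {
  const chars = Array.from(text);
  return viterbi(chars.length, buildLattice(chars, dictionary));
}

const toyDictionary = new Map<string, number>([["東京", 2], ["で", 1], ["を", 1], ["使う", 2]]);
console.log(segment("東京でChatGPTを使う", toyDictionary).map((c) => `${c.surface}/${c.source}`).join(" "));
// prints: 東京/dict で/dict ChatGPT/pattern を/dict 使う/dict
```

In this toy run, "ChatGPT" is not in the dictionary, yet the alphanumeric-run candidate is cheaper than seven single-character fallbacks, so it wins as a single token.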

Question 3

Can I use Suzume in the browser?

Accepted Answer

Yes, Suzume runs entirely in the browser via WebAssembly. No server required. You can load it from npm or directly from a CDN like esm.sh. The entire package is under 450KB gzipped.

Question 4

How do I add custom words to Suzume?

Accepted Answer

Use loadUserDictionary() to add custom words at runtime. Format: "word,pos" (e.g., "ChatGPT,noun"). You can add brand names, technical terms, or domain-specific vocabulary without rebuilding the dictionary.
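As a rough sketch, entries in the "word,pos" format can be assembled with a small helper before they are loaded. The `UserEntry` type and `toDictionaryLines` function are hypothetical helpers, not part of Suzume, and the assumption that multiple entries are passed as one newline-separated string is not confirmed by the answer above.

```typescript
// Hypothetical helper: builds "word,pos" lines for a user dictionary.
interface UserEntry {
  surface: string; // the word as it appears in text
  pos: string; // part of speech, e.g. "noun"
}

function toDictionaryLines(entries: UserEntry[]): string {
  return entries
    .map(({ surface, pos }) => {
      // A comma or newline inside a field would break the simple CSV-like format.
      if (/[,\n]/.test(surface) || /[,\n]/.test(pos)) {
        throw new Error(`field contains a comma or newline: ${surface},${pos}`);
      }
      return `${surface},${pos}`;
    })
    .join("\n");
}

const dictionaryText = toDictionaryLines([
  { surface: "ChatGPT", pos: "noun" },
  { surface: "TypeScript", pos: "noun" },
]);
console.log(dictionaryText);

// Assumed usage (argument shape not confirmed here):
// suzume.loadUserDictionary(dictionaryText);
```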

Question 5

What is the difference between Suzume and MeCab?

Accepted Answer

MeCab requires large dictionary files (50MB+) and server-side installation. Suzume uses feature-based analysis with a minimal dictionary, runs on WebAssembly in the browser, and handles unknown words gracefully. Choose Suzume for client-side processing without server infrastructure.

Question 6

How is Suzume different from kuromoji.js?

Accepted Answer

kuromoji.js requires downloading a 20MB+ dictionary on first load, which slows the initial page load. Suzume is under 450KB gzipped, so there is far less to download before analysis can start. Suzume also handles unknown words better and has a simpler API.

Question 7

Can I use Suzume for SEO keyword extraction?

Accepted Answer

Yes, Suzume can extract nouns and compound words from Japanese text, which makes it well suited to auto-tagging blog posts, generating hashtags, or building keyword analysis tools, all without server infrastructure.
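A minimal sketch of the keyword step, assuming the analyzer returns tokens with a surface form and a part-of-speech tag. The `Token` shape and the `extractKeywords` helper are hypothetical; the sample tokens are hand-written stand-ins for analyzer output rather than real results.

```typescript
// Hypothetical token shape; the actual fields returned by Suzume may differ.
interface Token {
  surface: string;
  pos: string;
}

// Rank nouns by frequency and return the most frequent ones.
function extractKeywords(tokens: Token[], limit = 5): string[] {
  const counts = new Map<string, number>();
  for (const token of tokens) {
    if (token.pos.toUpperCase() !== "NOUN") continue;
    // Skip single-character nouns, which tend to be too generic as tags.
    if (Array.from(token.surface).length < 2) continue;
    counts.set(token.surface, (counts.get(token.surface) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1]) // stable sort: ties keep first-seen order
    .slice(0, limit)
    .map(([surface]) => surface);
}

// Hand-written sample standing in for analyzer output.
const sampleTokens: Token[] = [
  { surface: "東京", pos: "NOUN" },
  { surface: "の", pos: "PARTICLE" },
  { surface: "天気", pos: "NOUN" },
  { surface: "と", pos: "PARTICLE" },
  { surface: "東京", pos: "NOUN" },
  { surface: "の", pos: "PARTICLE" },
  { surface: "観光", pos: "NOUN" },
];
console.log(extractKeywords(sampleTokens)); // [ '東京', '天気', '観光' ]
```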

Question 8

Is Suzume suitable for production use?

Accepted Answer

Yes, Suzume is production-ready. It is compiled from C++ to WebAssembly for near-native performance, includes full TypeScript support, and works in all modern browsers, Node.js, Deno, and Bun.

Question 9

Does Suzume work offline?

Accepted Answer

Yes, once loaded, Suzume works completely offline. All processing happens locally in the browser or runtime. No API calls or internet connection required after initial load.

Question 10

How do I install Suzume?

Accepted Answer

Install via npm: npm install @libraz/suzume. Then import and use: const { Suzume } = await import("@libraz/suzume"); const suzume = await Suzume.create(); const result = suzume.analyze("日本語テキスト");

Field	Required	Description
`surface form`	Yes	The word as it appears in text
`part of speech`	Yes	Part of speech

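Field	Required	Description
`surface form`	Yes	The word as it appears in text
`part of speech`	Yes	Part of speech
`cost`	No	Word cost (lower values are more likely to be selected)
`base form`	No	Dictionary form / base form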

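Value	Description	Japanese name
`NOUN`	Noun, proper noun	名詞
`VERB`	Verb	動詞
`ADJ`	Adjective	形容詞
`ADV`	Adverb	副詞
`PARTICLE`	Particle	助詞
`AUX`	Auxiliary verb	助動詞
`PRON`	Pronoun	代名詞
`DET`	Adnominal (pre-noun adjectival)	連体詞
`CONJ`	Conjunction	接続詞
`INTJ`	Interjection	感動詞
`PREFIX`	Prefix	接頭辞
`SUFFIX`	Suffix	接尾辞
`SYMBOL`	Symbol	記号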
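The field and part-of-speech tables above can be turned into a small validator for dictionary lines. This is a sketch under stated assumptions: the comma-separated column order (surface form, part of speech, cost, base form) is taken from the order of the table rows, matching is case-insensitive because the FAQ example uses lowercase "noun" while the table lists uppercase values, and `DictionaryEntry`, `parseEntry`, and the cost value in the usage example are hypothetical.

```typescript
// Part-of-speech values copied from the table above.
const POS_VALUES = [
  "NOUN", "VERB", "ADJ", "ADV", "PARTICLE", "AUX", "PRON",
  "DET", "CONJ", "INTJ", "PREFIX", "SUFFIX", "SYMBOL",
] as const;
type Pos = (typeof POS_VALUES)[number];

// Hypothetical in-memory shape for one dictionary line.
interface DictionaryEntry {
  surface: string;
  pos: Pos;
  cost?: number;
  lemma?: string; // base (dictionary) form
}

// Assumed column order: surface,pos[,cost[,lemma]]
function parseEntry(line: string): DictionaryEntry {
  const [surface = "", rawPos = "", rawCost = "", lemma = ""] = line
    .split(",")
    .map((field) => field.trim());
  if (surface === "") throw new Error(`missing surface form: "${line}"`);

  const pos = rawPos.toUpperCase();
  if (!(POS_VALUES as readonly string[]).includes(pos)) {
    throw new Error(`unknown part of speech "${rawPos}" in "${line}"`);
  }

  const entry: DictionaryEntry = { surface, pos: pos as Pos };
  if (rawCost !== "") {
    const cost = Number(rawCost);
    if (!Number.isFinite(cost)) throw new Error(`invalid cost "${rawCost}" in "${line}"`);
    entry.cost = cost;
  }
  if (lemma !== "") entry.lemma = lemma;
  return entry;
}

console.log(parseEntry("ChatGPT,noun"));
console.log(parseEntry("食べた,VERB,5000,食べる"));
```

Validating lines up front like this surfaces typos in the part-of-speech column before a dictionary is loaded, rather than leaving them to show up as odd tokenization later.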

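User Dictionary

Loading at Runtime

Format

Basic Format

Full Format

Part-of-Speech Values

Examples

IT Terms

Brand Names

Compound Words

Verbs with Conjugation

Adjusting Cost

Use Cases

Search Indexing

Chat Applications

E-commerce Sites

Best Practices

Binary Dictionary

.dic Format Overview

Persistence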
