Name: Suzume
Author: libraz

Question 1

What is Suzume?

Accepted Answer

Suzume is a lightweight, feature-driven Japanese tokenizer that runs on WebAssembly. Unlike dictionary-based analyzers such as MeCab, it works without large dictionary files and is robust to unknown words. It runs in browsers, Node.js, Deno, and Bun.

Question 2

How does Suzume handle unknown words?

Accepted Answer

Suzume generates candidates from character patterns (kanji sequences, katakana sequences, alphanumeric compounds) and evaluates them alongside dictionary entries using the Viterbi algorithm. This makes it robust to neologisms and domain-specific terms.
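
To make the idea concrete, here is a minimal, self-contained sketch of a Viterbi search over a lattice that mixes dictionary entries with pattern-generated candidates. It is not Suzume's implementation: the `Candidate` shape, the `bestPath` function, and all costs are hypothetical, and connection costs between adjacent candidates are left out for brevity.

```typescript
// Hypothetical candidate covering the character span [start, end).
// None of these names or costs come from Suzume itself.
interface Candidate {
  start: number; // inclusive character offset
  end: number; // exclusive character offset
  surface: string;
  cost: number; // lower is better; the values below are made up
  source: "dictionary" | "pattern";
}

// Return the minimum-cost sequence of candidates covering 0..length,
// or an empty array if no sequence covers the whole input.
function bestPath(length: number, candidates: Candidate[]): Candidate[] {
  const best: number[] = new Array<number>(length + 1).fill(Infinity);
  const back: (Candidate | null)[] = new Array<Candidate | null>(length + 1).fill(null);
  best[0] = 0;

  // Every candidate ending at position p starts before p, so visiting
  // candidates by ascending start guarantees best[start] is final.
  const sorted = [...candidates].sort((a, b) => a.start - b.start);
  for (const c of sorted) {
    const total = best[c.start] + c.cost;
    if (total < best[c.end]) {
      best[c.end] = total;
      back[c.end] = c;
    }
  }

  // Walk the back-pointers from the end of the input to the start.
  const path: Candidate[] = [];
  let pos = length;
  while (pos > 0) {
    const c = back[pos];
    if (!c) return [];
    path.unshift(c);
    pos = c.start;
  }
  return path;
}

// Toy lattice for "東京スカイツリー" (8 characters).
const lattice: Candidate[] = [
  { start: 0, end: 2, surface: "東京", cost: 1, source: "dictionary" },
  { start: 2, end: 5, surface: "スカイ", cost: 2, source: "dictionary" },
  { start: 5, end: 8, surface: "ツリー", cost: 2, source: "dictionary" },
  { start: 2, end: 8, surface: "スカイツリー", cost: 3, source: "pattern" },
];

const segmentation = bestPath(8, lattice).map((c) => c.surface);
console.log(segmentation);
```

With these made-up costs the katakana-run candidate wins (total cost 4 against 5 for the dictionary-only split), so the result is 東京 followed by スカイツリー. This shows how a pattern-generated candidate can compete with dictionary entries in the same search.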

Question 3

Can I use Suzume in the browser?

Accepted Answer

Yes, Suzume runs entirely in the browser via WebAssembly. No server required. You can load it from npm or directly from a CDN like esm.sh. The entire package is under 350KB gzipped.

Question 4

How do I add custom words to Suzume?

Accepted Answer

Use `loadUserDictionary()` to add custom words at runtime. Each entry has the form "word,pos" (e.g., "ChatGPT,noun"). You can add brand names, technical terms, or domain-specific vocabulary without rebuilding the dictionary.
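
As a rough sketch, a small helper can build the dictionary text from structured entries. Only the "word,pos" entry format comes from the answer above; the `UserEntry` type, the `toUserDictionary` helper, and the assumption that multiple entries are separated by newlines are illustrative and should be checked against the package documentation.

```typescript
// Hypothetical entry shape; only the "word,pos" format is documented above.
interface UserEntry {
  word: string;
  pos: string; // e.g. "noun", as in the "ChatGPT,noun" example
}

// Serialize entries as "word,pos" lines.
// ASSUMPTION: one entry per line; the separator is not stated in the text.
function toUserDictionary(entries: UserEntry[]): string {
  return entries
    .map(({ word, pos }) => {
      const w = word.trim();
      // A comma or newline inside the word would corrupt the line format.
      if (w === "" || w.includes(",") || w.includes("\n")) {
        throw new Error("Invalid dictionary word: " + JSON.stringify(word));
      }
      return w + "," + pos.trim();
    })
    .join("\n");
}

const userDictionary = toUserDictionary([
  { word: "ChatGPT", pos: "noun" },
  { word: "WebAssembly", pos: "noun" },
]);

// With a Suzume instance (not executed here):
//   suzume.loadUserDictionary(userDictionary);
```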

Question 5

What is the difference between Suzume and MeCab?

Accepted Answer

MeCab requires large dictionary files (50MB+) and server-side installation. Suzume uses feature-based analysis with a minimal dictionary, runs on WebAssembly in the browser, and handles unknown words gracefully. Choose Suzume for client-side processing without server infrastructure.

Question 6

How is Suzume different from kuromoji.js?

Accepted Answer

kuromoji.js requires downloading a 20MB+ dictionary on first load, which slows the initial page load. Suzume is under 350KB gzipped, so there is far less to download before first use. Suzume also handles unknown words better and has a simpler API.

Question 7

Can I use Suzume for SEO keyword extraction?

Accepted Answer

Yes, Suzume can extract nouns and compound words from Japanese text, making it well suited to auto-tagging blog posts, generating hashtags, or building keyword analysis tools, all without server infrastructure.
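
The API reference lists `generateTags(text)` for this purpose. For cases where more control is needed, the sketch below shows the manual route of counting nouns. It assumes `analyze` returns an array of morphemes with the `surface` and `pos` properties listed in the API reference. The `MorphemeLike` type and `topNouns` helper are hypothetical, and the sample data is hand-written rather than real Suzume output.

```typescript
// Minimal subset of the documented Morpheme properties used here.
interface MorphemeLike {
  surface: string;
  pos: string; // e.g. "NOUN", "PARTICLE"
}

// Count noun surfaces and return the most frequent ones first.
function topNouns(morphemes: MorphemeLike[], limit: number): string[] {
  const counts = new Map<string, number>();
  for (const m of morphemes) {
    if (m.pos !== "NOUN") continue;
    counts.set(m.surface, (counts.get(m.surface) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1]) // highest count first
    .slice(0, limit)
    .map(([surface]) => surface);
}

// Hand-written sample standing in for the result of suzume.analyze(...).
const sample: MorphemeLike[] = [
  { surface: "猫", pos: "NOUN" },
  { surface: "と", pos: "PARTICLE" },
  { surface: "犬", pos: "NOUN" },
  { surface: "と", pos: "PARTICLE" },
  { surface: "猫", pos: "NOUN" },
];

const keywords = topNouns(sample, 2);
console.log(keywords);
```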

Question 8

Is Suzume suitable for production use?

Accepted Answer

Yes, Suzume is production-ready. It is compiled from C++ to WebAssembly for near-native performance, includes full TypeScript support, and works in all modern browsers, Node.js, Deno, and Bun.

Question 9

Does Suzume work offline?

Accepted Answer

Yes, once loaded, Suzume works completely offline. All processing happens locally in the browser or runtime. No API calls or internet connection are required after the initial load.

Question 10

How do I install Suzume?

Accepted Answer

Install via npm: `npm install @libraz/suzume`. Then import and use it: `const { Suzume } = await import("@libraz/suzume"); const suzume = await Suzume.create(); const result = suzume.analyze("日本語テキスト");`
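
The API reference also lists a `destroy()` method and a Memory Management section, which suggests that instances should be released when no longer needed. A minimal sketch of that pattern follows. It is written against a small structural type so that it does not depend on the package's own typings. The `Analyzer` type and `analyzeOnce` helper are hypothetical, and the return type of `analyze` is left generic because it is not stated here.

```typescript
// Structural type covering only the two documented methods used here.
interface Analyzer<T> {
  analyze(text: string): T;
  destroy(): void;
}

// Analyze one text and always release the analyzer afterwards,
// even if analyze() throws.
function analyzeOnce<T>(analyzer: Analyzer<T>, text: string): T {
  try {
    return analyzer.analyze(text);
  } finally {
    analyzer.destroy();
  }
}

// Usage with the real package (not executed here):
//   const { Suzume } = await import("@libraz/suzume");
//   const result = analyzeOnce(await Suzume.create(), "日本語テキスト");
```

For repeated analysis it is likely cheaper to keep one instance alive and call `destroy()` once at shutdown. The one-shot helper above only illustrates the `try`/`finally` shape.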

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `wasmPath` | `string` | `undefined` | Custom path to the WASM file |
| `preserveVu` | `boolean` | `true` | Preserve ヴ (don't normalize to ビ etc.) |
| `preserveCase` | `boolean` | `true` | Preserve case (don't lowercase ASCII) |
| `preserveSymbols` | `boolean` | `false` | Preserve symbols/emoji in output |
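
The table can be read as "any option you omit takes its default". The sketch below transcribes it into a type and shows the effective values for a partial options object. `SuzumeOptionsSketch`, `DEFAULTS`, and `withDefaults` are illustrative names. The package ships its own TypeScript types, and this is not how the library resolves options internally.

```typescript
// Transcription of the options table; illustrative only.
interface SuzumeOptionsSketch {
  wasmPath?: string;
  preserveVu?: boolean;
  preserveCase?: boolean;
  preserveSymbols?: boolean;
}

// Defaults as listed in the table (wasmPath defaults to undefined).
const DEFAULTS = {
  preserveVu: true,
  preserveCase: true,
  preserveSymbols: false,
};

// Overlay the caller's options on top of the documented defaults.
function withDefaults(options: SuzumeOptionsSketch = {}) {
  return { ...DEFAULTS, ...options };
}

const effective = withDefaults({ preserveSymbols: true });
console.log(effective);

// With the real package (not executed here):
//   const suzume = await Suzume.create({ preserveSymbols: true });
```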

| Property | Type | Description | Example |
| --- | --- | --- | --- |
| `surface` | `string` | Surface form as it appears in text | `"食べ"` |
| `pos` | `string` | Part of speech in English | `"VERB"` |
| `baseForm` | `string` | Dictionary/base form | `"食べる"` |
| `reading` | `string` | Reading in katakana | `"タベ"` |
| `posJa` | `string` | Part of speech in Japanese | `"動詞"` |
| `conjType` | `string \| null` | Conjugation type (for verbs/adjectives) | `"一段"` |
| `conjForm` | `string \| null` | Conjugation form | `"連用形"` |
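
For orientation, the property table can be transcribed as a TypeScript interface and exercised with the example values from the table. `MorphemeSketch` and `formatMorpheme` are hypothetical names. The package's own `Morpheme` type should be preferred in real code.

```typescript
// Transcription of the property table; illustrative only.
interface MorphemeSketch {
  surface: string;
  pos: string;
  baseForm: string;
  reading: string;
  posJa: string;
  conjType: string | null;
  conjForm: string | null;
}

// Render one morpheme on a single line; conjugation info is appended
// only when both conjugation fields are present.
function formatMorpheme(m: MorphemeSketch): string {
  const conj = m.conjType && m.conjForm ? ` [${m.conjType} ${m.conjForm}]` : "";
  return `${m.surface} (${m.reading}) ${m.pos}/${m.posJa} <- ${m.baseForm}${conj}`;
}

// Built from the example column of the table.
const tabe: MorphemeSketch = {
  surface: "食べ",
  pos: "VERB",
  baseForm: "食べる",
  reading: "タベ",
  posJa: "動詞",
  conjType: "一段",
  conjForm: "連用形",
};

console.log(formatMorpheme(tabe));
```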

| `pos` | `posJa` | Description |
| --- | --- | --- |
| `NOUN` | 名詞 | Nouns |
| `VERB` | 動詞 | Verbs |
| `ADJ` | 形容詞 | Adjectives |
| `ADV` | 副詞 | Adverbs |
| `PARTICLE` | 助詞 | Particles |
| `AUX` | 助動詞 | Auxiliary verbs |
| `PRON` | 代名詞 | Pronouns |
| `DET` | 連体詞 | Adnominal adjectives |
| `CONJ` | 接続詞 | Conjunctions |
| `INTJ` | 感動詞 | Interjections |
| `PREFIX` | 接頭辞 | Prefixes |
| `SUFFIX` | 接尾辞 | Suffixes |
| `SYMBOL` | 記号 | Symbols |
| `OTHER` | その他 | Other/Unknown |

API Reference

Suzume Class

`Suzume.create(options?)`

`analyze(text)`

`generateTags(text)`

`loadUserDictionary(data)`

`version`

`destroy()`

Morpheme Interface

Properties

Part of Speech Values

Error Handling

Memory Management
