About

Hear, speak, read, and remember Taiwan.

OpenFormosa is a Taiwan-rooted open AI foundation-model initiative. The goal is not a model that merely answers in Traditional Chinese, but an AI infrastructure that understands Taiwan's context, voices, culture, documents, and everyday expression.

Why "OpenFormosa"?

The name carries the whole idea: an open, inspectable Taiwan AI foundation model — built from Taiwan, facing the world.

Open · 開放

Open source, open collaboration, open data governance, and a technical approach that can be inspected. Taiwan's local AI should not depend only on closed APIs, and its key language and speech capabilities should not be owned entirely by a few overseas platforms.

Formosa · 福爾摩沙

The beautiful island and its diversity: Traditional Chinese, Taigi, Hakka, Indigenous languages, Bopomofo, Tailo, local speech, internet language, historical memory, and the natural environment, all woven together.

OpenFormosa = an open Taiwan AI foundation model. Not a single product, but a model family, a data-engineering effort, a culture-preservation effort, and open infrastructure.

Why the jia-zhi bag?

The woven market bag is an everyday Taiwanese object: cheap, durable, and instantly recognizable. It is a fitting symbol for what OpenFormosa wants to build.

Carries · 可承載

It carries Taiwan's corpora, voices, culture, documents, knowledge, and applications.

Circulates · 可流通

It lets model capability flow to researchers, developers, enterprises, and educators.

Reusable · 可重複使用

Not a one-off demo, but a base model to fine-tune, deploy, distill, and extend over and over.

Local, not closed · 在地但不封閉

A Taiwan symbol that does not shut others out — rooted in Taiwan, facing the world.

Woven · 編織感

Its weave stands for a lattice of tokens and interwoven data: text, language, speech, and signals woven into capability.

Taiwan is not a translation patch

Traditional Chinese is only the surface — Taiwan's accents, scripts, local terms, and documents all have their own texture.

The blind spot

Generic models often understand Traditional Chinese only as text. OpenFormosa focuses on the real texture of Taiwan: accents, code-switching, local terms, Bopomofo, Taigi, receipts, menus, public notices, and the living culture that appears in audio and documents.

OpenFormosa's approach

OpenFormosa publishes training recipes, model cards, benchmark methods, and release evidence, so Taiwan-local capability can be inspected, not just asserted.

Taiwan context is foundation-model work.

This is not about translating a generic model into Traditional Chinese. Taiwan terms, speech, documents, and cultural memory need to be designed into the tokenizer, training recipe, evaluation suite, and task adapters from the beginning.
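
One way to make "designed into the tokenizer" concrete is a coverage check: how much of a Taiwan-specific text can a given vocabulary represent directly, and which Bopomofo symbols does it miss? The sketch below is illustrative only. The function names, the sample text, and the toy vocabulary are assumptions made for this example, not part of any OpenFormosa tooling. It relies on the Unicode Bopomofo block (U+3100 to U+312F) and the Bopomofo Extended block (U+31A0 to U+31BF).

```python
from typing import List, Set

# Unicode blocks: Bopomofo (U+3100..U+312F) and Bopomofo Extended (U+31A0..U+31BF).
BOPOMOFO_RANGES = ((0x3100, 0x312F), (0x31A0, 0x31BF))


def is_bopomofo(ch: str) -> bool:
    """Return True if the single character falls in a Bopomofo block."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in BOPOMOFO_RANGES)


def char_coverage(text: str, vocab: Set[str]) -> float:
    """Fraction of non-whitespace characters that appear in the vocabulary."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 1.0  # nothing to cover
    return sum(c in vocab for c in chars) / len(chars)


def missing_bopomofo(text: str, vocab: Set[str]) -> List[str]:
    """Bopomofo symbols used in the text but absent from the vocabulary."""
    return sorted({c for c in text if is_bopomofo(c) and c not in vocab})


# Hypothetical sample: a toy vocabulary that knows the Han characters
# but none of the Bopomofo symbols.
sample_text = "ㄅㄆㄇ 台灣"
toy_vocab = {"台", "灣"}
coverage = char_coverage(sample_text, toy_vocab)
gaps = missing_bopomofo(sample_text, toy_vocab)
```

A real tokenizer works on subword units rather than single characters, so this character-level check is only a first-pass signal of what a vocabulary leaves out.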

Belief 01

Not just Chinese

A model can write Traditional Chinese and still miss Taiwan: local institutions, place names, speech habits, internet tone, addresses, forms, and public-sector language.

Belief 02

Small can be infrastructure

A compact 1B-class model is not a toy if it is cheap to run, easy to fine-tune, deployable in private environments, and useful across ASR, TTS, OCR, RAG, and education workflows.

Belief 03

Open builds trust

Open means people can inspect model cards, evaluation sets, tokenizer choices, training recipes, benchmark artifacts, and release notes instead of guessing what happened.

Belief 04

Adapters, not chaos

Speech and document models should share a Taiwan language backbone, while ASR, TTS, and OCR keep their own adapters, heads, and evaluation rules.
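
The shared-backbone idea can be sketched as a small registry: one backbone, and one adapter per task that carries its own head and its own evaluation rule. This is a minimal sketch under stated assumptions, not OpenFormosa's actual architecture. The class names, the backbone identifier, the head names, and the metric names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class TaskAdapter:
    task: str    # e.g. "asr", "tts", "ocr"
    head: str    # hypothetical name of the task-specific output head
    metric: str  # hypothetical name of the task-specific evaluation rule


@dataclass
class ModelFamily:
    backbone: str  # the single shared language backbone
    adapters: Dict[str, TaskAdapter] = field(default_factory=dict)

    def register(self, adapter: TaskAdapter) -> None:
        """Attach one adapter per task; reject duplicates to avoid ambiguity."""
        if adapter.task in self.adapters:
            raise ValueError(f"adapter for {adapter.task!r} already registered")
        self.adapters[adapter.task] = adapter

    def plan(self, task: str) -> str:
        """Describe which head and evaluation rule a task uses on the backbone."""
        adapter = self.adapters[task]
        return f"{self.backbone} -> {adapter.head} (evaluated by {adapter.metric})"


# Hypothetical wiring: every task reuses the same backbone.
family = ModelFamily(backbone="taiwan-language-backbone")
family.register(TaskAdapter("asr", "ctc_head", "character_error_rate"))
family.register(TaskAdapter("tts", "acoustic_head", "mean_opinion_score"))
family.register(TaskAdapter("ocr", "text_line_head", "character_accuracy"))
```

Keeping the evaluation rule on the adapter rather than on the backbone reflects the point above: the tasks share language knowledge, but each is judged by its own standard.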

OpenFormosa is an open AI model family rooted in Taiwan. Inspired by the jia-zhi bag, we weave Taiwan's voices, texts, images, and memories into open AI infrastructure.

We build compact, deployable, Taiwan-native language and multimodal models that understand Traditional Chinese, Taiwanese Mandarin, local speech, Bopomofo, Taigi, Taiwanese internet language, documents, and cultural context — so the world can see that Taiwan is not only using AI, but building its own.

— Hear Taiwan, speak Taiwan, read Taiwan, and remember Taiwan.