Is Generative AI Weak Against "No" (の)? The Surprising Reason Japanese Prompts Fail

Why avoiding consecutive "no" (の) particles can markedly improve generative AI response accuracy

Introduction: Why are Japanese prompts difficult?

"Tell me about the color of the leaves of the apple tree" and "Tell me about the 'leaf color' of the 'apple tree'."
Have you ever tried to see what difference these two prompts make in the AI's response?

In fact, this small difference can noticeably affect how accurately the AI interprets the request.
If you feel that "Japanese prompts don't work well" or "I can't get the expected responses," the cause might lie in the linguistic characteristics of the Japanese language.

In this article, I use specific examples to explain why simply being mindful of punctuation and of brackets or quotation marks can markedly improve your interactions with AI.

AI Confusion Caused by the Japanese Particle "No"

Humans vs. AI: Differences in Language Understanding

In the Case of Humans

  • Understand meaning by integrating context, experience, and common sense.
  • Can infer the "general meaning" even from ambiguous expressions.

In the Case of AI (Large Language Models, LLMs)

  • Calculate meaning based on statistical relationships between words.
  • Determine "which words to focus on" using the Attention mechanism.
  • Without a clear structure, accuracy can drop because attention is spread across too many elements.

Problems with Consecutive "No"

The Japanese structure "A no B no C" (A の B の C) is difficult for AI to process for the following reasons:

  1. Ambiguity of Relationships: The modification relationships between A, B, and C are unclear.
  2. Dispersion of Attention: It is difficult to judge which element should be emphasized.
  3. Context Dependency: Interpretation relies heavily on the surrounding context.
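
To make the first point concrete, here is a minimal sketch that enumerates every way a chain of nouns joined by の could be grouped into nested pairs. The function name `bracketings` is hypothetical, and the sketch counts purely structural groupings: real Japanese grammar and common sense rule some of them out, so treat the count as an upper bound on the ambiguity rather than a linguistic analysis.

```python
from functools import lru_cache


def bracketings(nouns):
    """Return every way to group a の-chain into nested binary pairs."""
    nouns = tuple(nouns)

    @lru_cache(maxsize=None)
    def build(start, end):
        # A single noun needs no brackets.
        if end - start == 1:
            return (nouns[start],)
        results = []
        # Try every split point; each side is grouped recursively.
        for mid in range(start + 1, end):
            for left in build(start, mid):
                for right in build(mid, end):
                    results.append(f"[{left} の {right}]")
        return tuple(results)

    return list(build(0, len(nouns)))


# The four nouns from the Aomori example used later in this article.
chain = ["青森", "農家", "リンゴ", "収穫"]
for reading in bracketings(chain):
    print(reading)
print(len(bracketings(chain)))  # 5 structural groupings for four nouns
```

Three nouns allow two groupings and four nouns allow five, so each additional の adds candidate readings that the reader, human or model, has to choose between.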

Practice: Improving Accuracy through Structuring

Experimental Example: Comparing Apple Prompts

Let's compare how the AI's responses actually change with the following three prompts.

Pattern 1: Ambiguous Prompt

青森の農家のリンゴの収穫について教えて
(Literal gloss: "Tell me about the harvest of the apples of the farmers of Aomori.")

How the AI Likely Processes It

  • The four elements "Aomori," "Farmer," "Apple," and "Harvest" are strung together with no visible hierarchy.
  • It is difficult to judge which element is the primary information.
  • The answer tends to be generic.

Pattern 2: Partial Structuring

青森の"農家のリンゴの収穫"について教えて

How the AI Likely Processes It

  • The location "Aomori" is separated from the activity "Farmer's apple harvest."
  • The quoted phrase is recognized as a specific activity associated with a location.
  • Answers containing more specific regional information can be expected.

Pattern 3: Full Structuring

"青森の農家"による"リンゴの収穫"について教えて

How the AI Likely Processes It

  • "Aomori farmer" (subject) and "apple harvest" (action) are clearly separated.
  • Processing emphasizes the relationship between the subject and the action.
  • More logical and structured answers can be expected.
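
The difference between the three patterns can be made visible with a short sketch that pulls out whatever the quotation marks enclose. The helper `quoted_groups` is a hypothetical name, and the code only shows the structure that is explicit in the text. It does not reproduce what a language model does internally.

```python
import re

# Matches text enclosed in ASCII double quotes, as used in the patterns above.
QUOTED = re.compile(r'"([^"]+)"')


def quoted_groups(prompt):
    """Return the chunks that the prompt marks explicitly with quotes."""
    return QUOTED.findall(prompt)


patterns = [
    '青森の農家のリンゴの収穫について教えて',
    '青森の"農家のリンゴの収穫"について教えて',
    '"青森の農家"による"リンゴの収穫"について教えて',
]
for prompt in patterns:
    print(quoted_groups(prompt))
```

Pattern 1 yields no explicit groups, Pattern 2 yields one, and Pattern 3 yields two, which mirrors the flat, partial, and full structuring described above.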

Effective Use of Brackets, Quotation Marks, and Punctuation

1. Clarification through Grouping

❌ システムの設計の問題の解決方法
⭕ "システムの設計"における"問題の解決方法"
(Gloss: "how to solve problems" in "system design")

2. Explicit Prioritization

❌ 東京の会社の新しい製品の開発状況
⭕ "東京の会社"が手がける"新しい製品の開発状況"
(Gloss: "the development status of the new product" handled by "the Tokyo company")

3. Organizing Modification Relationships

❌ 最新の人工知能の技術の応用例
⭕ "最新の人工知能技術"の応用例
(Gloss: application examples of "the latest artificial intelligence technology")

Why is This Method Effective?

Impact on Tokenization

Large language models process text by breaking it down into units called "tokens."
Brackets, quotation marks, and punctuation usually survive tokenization as tokens of their own, so the model can use them as cues that mark the boundaries of "semantic chunks."

Concentration of Attention Weights

Structured prompts are expected to have the following effects on the AI's attention mechanism:

  • Concentration of attention on relevant elements
  • Reduction of unnecessary associations that act as noise
  • More precise understanding of context
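
For readers unfamiliar with what "concentrated" versus "dispersed" attention means numerically, the following toy sketch applies softmax, the function that turns attention scores into weights, to two sets of scores. The scores are invented for illustration and are not taken from any real model, so the sketch shows only the arithmetic. It does not demonstrate that quoting a phrase produces such scores.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(weights):
    """Shannon entropy in bits; higher means more evenly spread weights."""
    return -sum(w * math.log2(w) for w in weights if w > 0)


# Invented scores for four tokens (assumption, for illustration only).
flat_scores = [1.0, 1.0, 1.0, 1.0]    # no element stands out
peaked_scores = [3.0, 1.0, 1.0, 1.0]  # one element clearly stands out

flat = softmax(flat_scores)
peaked = softmax(peaked_scores)
print([round(w, 2) for w in flat], round(entropy(flat), 2))
print([round(w, 2) for w in peaked], round(entropy(peaked), 2))
```

When all scores are equal, the weights are spread evenly and entropy is at its maximum. When one score dominates, most of the weight lands on that element and entropy falls.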

Practical Techniques You Can Use Starting Today

1. "No" Chain Analysis

Identify parts of your prompts where the particle "no" (の) occurs three or more times in a row, and group them using brackets or quotation marks.
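
This check can be roughly automated. The sketch below is a heuristic, not a parser: it treats whitespace, punctuation, brackets, and quotes as group boundaries (an assumed list) and flags any run containing three or more の without a boundary in between. It also counts の that is not the linking particle (for example inside ので or もの), so a morphological analyzer would be more accurate. The name `find_no_chains` is hypothetical.

```python
import re

# One segment: characters that are neither の nor an assumed boundary
# (whitespace, punctuation, brackets, quotes).
_SEGMENT = r"""[^の\s、。,."'“”「」『』()（）\[\]]+"""


def find_no_chains(prompt, min_count=3):
    """Return runs with at least `min_count` の and no boundary between them."""
    pattern = re.compile(rf"(?:{_SEGMENT}の){{{min_count},}}")
    return [match.group(0) for match in pattern.finditer(prompt)]


print(find_no_chains('青森の農家のリンゴの収穫について教えて'))
print(find_no_chains('"青森の農家"による"リンゴの収穫"について教えて'))
```

The first prompt is flagged because three の appear back to back, while the restructured prompt is not, since the quotation marks break the chain into two short groups.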

2. Clarifying Subject and Predicate

Clearly indicate "who" is doing "what" to "whom" by enclosing each in brackets or quotation marks.
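
As a small sketch of this technique, the hypothetical helper below assembles a prompt in the shape of Pattern 3, with the actor and the action each enclosed in quotation marks. It covers only "who" and "what"; the function name, its parameters, and the default request phrase are assumptions made for illustration.

```python
def actor_action_prompt(actor, action, request="について教えて"):
    """Build a prompt that quotes the actor and the action separately."""
    # による ("by") makes the actor-action relationship explicit.
    return f'"{actor}"による"{action}"{request}'


print(actor_action_prompt("青森の農家", "リンゴの収穫"))
# "青森の農家"による"リンゴの収穫"について教えて
```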

3. Step-by-Step Structuring

For long prompts, break them down into semantic units and organize them accordingly.

Conclusion: A Prompt is a Blueprint

A prompt is not just a simple question or command.
It is a "blueprint" for collaborating with AI.

While the flexibility of the Japanese language is convenient for humans, it can lead to processing difficulties for AI.
However, by adding visual cues such as punctuation, brackets, and quotation marks, you can significantly reduce this difficulty.

Things to try today:

  • Check your existing prompts for consecutive "no" (の) instances.
  • Try structuring them using brackets or quotation marks.
  • Observe and record changes in the responses.

Please share your prompt improvement experiences in the comments.
Let's all aim to improve the accuracy of Japanese prompts together!


Coming Next
"Explaining 'Transformers' at the Heart of the AI Revolution in a Way Anyone Can Understand"


This article is a refined version of a past post from my blog, rewritten for Japanese readers.

https://imkohenauser.com/on-punctuation-and-parentheses-in-japanese-prompts/

If you have any other insights, please share them in the comments.

Discussion

Kohen, or Kohei Saito

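The Basic Rule: Write Long Prompts "Like Code"

Among AI developers and experienced users in the English-speaking world, it is common to structure prompts as if they were program code. This makes it easier for the AI to understand the intent of the prompt accurately and to generate the expected output.

Let's look at the specific techniques.

  • XML tags or JSON format: Use tags such as <task>, <context>, and <persona>, or a format like {"role": "...", "content": "..."}, to clearly delimit each element of the prompt.
  • Special symbols: Use Markdown symbols such as ###, ---, and * to separate sections and organize lists.
  • Specifying a role (persona): Giving the AI a specific role, as in "You are a helpful assistant." or "Act as a professional copywriter.", lets you control the tone and style of the response.

The Format Is Up to You

That said, the format is up to you. What matters is that the intent of the prompt "gets across" to the AI.
Also, in the future, we may no longer need a keyboard at all.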

<theme>
...
</theme>

The theme is as follows:
...
End of theme.

<テーマ>
...
</テーマ>

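All of these are delimiters that let the AI recognize that "the enclosed part is information about the theme." Which format you use can be decided by personal preference or by your team's shared conventions.

The two most important points are:

  • Consistency: Using the same format for the same kind of prompt helps the AI learn the pattern and leads to stable output.
  • Clarity: Choose a logical, simple format whose structure and intent are immediately understandable even to a human reader.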
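
One way to keep the format consistent is to generate the delimiters with a small helper instead of typing them by hand. The sketch below is an illustration under assumptions: the function names `wrap` and `build_prompt` and the section contents are hypothetical, and any tag name (English or Japanese) can be used as long as it is applied the same way every time.

```python
def wrap(tag, content):
    """Enclose content in a matching pair of opening and closing tags."""
    return f"<{tag}>\n{content.strip()}\n</{tag}>"


def build_prompt(sections):
    """Join (tag, content) pairs, always using the same delimiter format."""
    return "\n".join(wrap(tag, content) for tag, content in sections)


# Hypothetical sections, for illustration only.
prompt = build_prompt([
    ("persona", "You are a helpful assistant."),
    ("theme", "Apple harvesting by farmers in Aomori"),
])
print(prompt)
```

Because every section goes through the same function, the opening and closing tags always match and the layout stays identical across prompts.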