iTranslated by AI

The content below is an AI-generated translation. This is an experimental feature, and may contain errors. View original article
🔐

Considerations for Using AI in Business: Navigating Contracts, Data Privacy Laws, and Technical Architecture

に公開

Introduction

As Generative AI, MCP (Model Context Protocol), and agent-based development tools rapidly permeate business operations, many businesses seem to lack clear answers to questions such as "Is it okay to put client information into AI?", "Is a local LLM safe?", and "Is email integration really okay?".

This article is an attempt to organize the issues likely to be encountered when using AI in business operations from three perspectives: contracts, the Act on the Protection of Personal Information (APPI), and technical configuration. The intended audience includes small and medium-sized enterprises, freelancers, information security personnel, and executives considering AI adoption.

Note that this article was created while collecting, organizing, and editing information using generative AI services (see "About the Creation Process of this Article" at the end for details). Given that the theme itself is AI, I felt it was appropriate to be transparent about the creation process.

Chapter 1: The Basic Premise — Distinguishing Between Three Layers

Issues when using AI for business do not seem to be resolved by a single law or a single contract. It is likely that at least three layers are operating simultaneously.

First, there is the layer of the Act on the Protection of Personal Information (APPI). Issues such as obtaining the individual's consent, supervision of outsourced parties, cross-border transfers, and understanding the external environment are involved.

Second, there is the layer of contracts. The interpretation of "confidential information," "third-party disclosure," and "sub-outsourcing" in NDAs, basic business agreements, privacy policies, and terms of service become the issues.

Third, there is the layer of technical configuration. Data residency, the entity processing the data, and the implementation differences between APIs, MCP, agents, and local LLMs are involved.

These are not independent but seem to be intertwined. Changing the technical configuration changes the organization of APPI compliance, and changing contract terms changes the permissible technical configuration. Deciding "whether to use AI" cannot be judged in a single step; it seems necessary to evaluate it in multiple stages.

Chapter 2: The Boundary Between "Cloud Exceptions" and "Outsourcing"

When using cloud services for business, the most important distinction in terms of the Act on the Protection of Personal Information (APPI) seems to be the difference between "Cloud Exceptions" and "Outsourcing."

The Concept of Cloud Exceptions

In the Q&A 7-53 of the Personal Information Protection Commission's guidelines, it is indicated that if a business operator "is not to handle the personal data in question," it does not constitute "provision" or "outsourcing" in the first place. Typically, cloud storage services are considered to potentially fall under this category.

To meet this, two conditions appear to be necessary: that the service provider's contract stipulates that they will not handle the personal data stored on the servers, and that appropriate access control is implemented.

If these are met, it is interpreted that neither the individual's consent nor the obligation to supervise outsourced parties arises.

Organizing as Outsourcing

If a business operator actively processes data, the cloud exception cannot be used, and it must be organized as outsourcing under Article 27, Paragraph 5, Item 1 of the APPI. In this case, while individual consent is not required, the obligation to supervise the outsourced party (Article 25) is considered to arise.

Perspectives to Distinguish the Two

The key to distinguishing the two is likely whether the service is "merely storing" or "processing" the data. If data is simply being placed there, it is storage; if it is being actively interpreted, analyzed, or converted, it is processing.

Storage services are generally considered closer to storage. Accounting SaaS likely falls under processing as it involves calculation and aggregation. Generative AI can clearly be said to perform processing (inference and generation). Business card management services are also considered to fall under processing because they perform OCR and data conversion.

Operating without clarifying this distinction carries the risk of treating what should be organized as outsourcing under a cloud exception, thereby leading to the failure to obtain individual consent or fulfill the obligation to supervise the outsourced party.

Chapter 3: How Generative AI Services Should Be Organized

When using generative AI services for business, the current conventional understanding within the framework mentioned above is that it should be organized as "outsourcing." This is because AI actively analyzes and processes input data to generate outputs, making it difficult to claim that the service provider "does not handle" the data.

Commercial Plans Are Effectively the Minimum Requirement

Organizing this as outsourcing requires the safety management system of the outsourced party as a prerequisite. In the case of generative AI services, commercial plans (API, Team, Enterprise, etc.) are considered the de facto minimum requirement. Specific elements likely to be required include:

  • Contractual guarantees that input data will not be used for model training.
  • Limitation of data retention periods (limited to the scope necessary for operation).
  • Conclusion of a DPA (Data Processing Addendum).
  • Acquisition of certifications such as SOC 2 Type II or ISO 27001.
  • Technical safety management measures such as TLS encryption for communication.

Consumer versions (Free/Pro, etc.) do not meet these conditions and are generally unsuitable for business use.

Issues That Remain Even with Commercial Plans

This is a point often misunderstood. It is safe to assume that using a commercial plan does not solve everything.

Issues regarding Article 28 of the APPI (Cross-border transfer): Since many AI operators run on US infrastructure, it is considered necessary to respond via either obtaining individual consent, establishing a system conforming to standards, or domestic processing (via cloud, as described later).

Issues regarding third-party disclosure under NDA: Contractually, the AI provider is considered a "third party" as a separate legal entity. It seems necessary to organize this by obtaining prior consent from the client, comprehensive consent, or by relying on industry practice.

Relation to Information Security Regulations: If a client has explicitly stated that "use of external AI is prohibited," it is highly likely that this would constitute a violation, even if a commercial plan is used.

In short, my personal view is that a commercial plan is merely a "condition to stand at the starting line of AI utilization," and it does not mean that "you can use it freely just because you have introduced it."

Chapter 4: Is Local LLM a Panacea?

One might intuitively think, "What if we use a local LLM (a large language model running within our own company environment) that doesn't send data externally?" Technically, with an offline-complete local LLM (a combination of an open-source model and an inference server), it is possible to guarantee that data does not leave the organization.

However, there seem to be several points to keep in mind.

Points to Verify Technically

  • Source and verification of the model download.
  • Storage location for inference logs and cache.
  • Unintended behavior of backups and cloud synchronization settings.
  • Presence of error reporting or telemetry transmission.
  • Communication specifications of the frontend tools.

Since configurations that perform some form of communication, even if advertised as "local," are not rare, it is essential to technically eliminate these risks.

Contractual Issues

Even if there is no external transmission, there remain points where contractual violations could occur.

Prohibition of unauthorized use: Does the purpose of use in the contract include "processing by AI"? Whether AI input falls within the scope of business objectives needs separate consideration.

How to handle confidential information: If there are "restrictions on reproduction, alteration, or analysis," it could potentially be interpreted that inputting data into an LLM technically constitutes reproduction, alteration, or the creation of derivative works.

Interpretation of "AI utilization prohibited": Is the client against "data leaving the organization" or "processing by AI itself"? If it is the latter, even a local implementation could be out of scope.

Handling of derivatives and outputs: It appears necessary to organize how the output of the LLM is handled contractually.

Chapter 5: Reality of Agent-based Tools — External Transmission Occurs Even When Running Locally

Recent AI development support tools and agent-based CLIs are often distributed as "locally running applications." It is easy to conclude that "it is safe because it is local," but the reality appears to be different.

The Core of an Agent Likely Relies on External APIs

It is believed that the internal structure of agent tools typically follows this loop:

  1. The local side assembles a prompt (system prompt + history + tool definitions + user input) and sends it to an external API.
  2. The API returns a response (determining whether a tool usage request or termination is needed).
  3. In case of a tool usage request, the local side executes file read/write operations according to the instructions in the response.
  4. The result is added to the prompt, and the process returns to step 1.

In other words, it seems that "what to do next," "which tool to use," and "whether to continue processing" are all decided by an external API's LLM, and the local side is merely an I/O harness executing those instructions.

What Is Being Sent Externally?

As soon as the local side reads a file, its content is included in the prompt as part of an API request and sent externally. Similarly, command execution results, search results, and responses from MCP tools are structured in a way that the necessary scope for the agent's context is passed to the external API.

Backend Options

Many agent tools seem to be designed to allow switching the inference backend. For instance, by using options to call models via major cloud vendors, it may be possible to choose a structure where data remains within the boundaries of your own company's cloud account. Under enterprise requirements, this might be a more realistic option.

Chapter 6: Organizing by Data Location — Options via Cloud Vendors

If we reorganize the structure of AI usage from the perspective of data residency, I think some options start to become clearer.

In Case of Direct API Usage

Data flow: In-house environment → Internet → AI provider's infrastructure (mainly the US) → Model inference → Response.

This is a structure where the data location falls within the "AI provider's management boundary," and the third parties involved are the AI provider (+ the cloud vendor as the hosting platform).

In Case of Using via Cloud Vendors

Data flow: In-house environment → Service endpoint within your company's cloud account → Model inference under that cloud vendor's management → Response.

Although it depends on each company's policy, data should be contained within the cloud vendor's management boundary. I understand it is explained that the data does not reach the AI provider itself (only the model is provided to the vendor).

Potential Impact on Contractual Arrangements

Using cloud vendors may offer the following advantages:

  • Since many companies already have the cloud service on their allowlist, it may eliminate the need for new vendor vetting.
  • By specifying the region, you may be able to keep data residency domestic, making it easier to avoid issues regarding cross-border transfers.
  • It may be possible to handle it within existing cloud contracts, potentially eliminating the need to conclude new DPAs.
  • It is possible to build configurations that avoid internet transit, such as using VPC Endpoints.

These advantages might be particularly significant for businesses serving financial, medical, or public sector clients.

Chapter 7: Why Are Business Card Management Services Widely Accepted?

Business card management services, which are widely used in the industry, are likely classified as "delegation-type" services in terms of structure. Because the business operator actively processes the user's email data to extract information, it is difficult to apply the "cloud exception."

So why are they widely accepted in the industry? I think there are a few reasons.

  1. Predictability of processing scope: The use cases are fixed, such as "digitizing business cards" and "extracting email signatures," creating a structure where it is clear what will be processed.

  2. Accumulation of industry practice: It can be interpreted that an industry practice has been formed over many years that "business card management SaaS is considered to have comprehensive consent as a delegation." They are reportedly on the allowlists of many companies' information security policies.

  3. Contractual structure of operators: It appears that a contractual structure optimized for Japan's Act on the Protection of Personal Information and industry practices has been established, such as domestic operators, domestic server processing, and the acquisition of various certifications.

  4. Granularity of processed data: The design handles only signature information, which is limited and business-customarily shared.

Contrast with Generative AI Usage

Generative AI services are structurally similar "delegation-type" services, but the level of acceptance seems to differ significantly due to the differences mentioned above.

  • Processing scope: Generative AI enables "all kinds of text processing," making it difficult to predict.
  • Industry practice: It is a new category that has spread rapidly in a few years, and practices are still in the process of being formed.
  • Contractual structure: Most are US-based operators, making it easy for cross-border transfer issues to inevitably arise.
  • Granularity of processed data: It can handle any information entered arbitrarily by the user.

In other words, attempts to justify generative AI usage within "the same framework as business card management SaaS" may have structural similarities, but they lack persuasive power due to differences in industry practices and the maturity of contractual arrangements, so I believe it depends on future industry trends.

Chapter 8: Issues When Handling Information Entrusted to You

AI usage for company information might be within a scope you can freely decide according to your own internal policy. However, when handling information entrusted by other companies, the other party's rules will likely take precedence.

Typical Scenarios

If you have concluded an NDA with a client, the following items are considered typical prohibitive clauses:

First, prohibition of disclosure to third parties. Since AI operators are separate legal entities under contract, they would most likely be classified as "third parties" in principle.

Second, prohibition of use for purposes other than intended. Generally, entrusted information should only be used within the scope of business purposes defined in the contract.

Third, restrictions on re-delegation or processing delegation. Acts of entrusting the processing of confidential information to third parties often require prior consent.

It is safer to assume that a policy stating "our company is free to use trusted AI services" applies only to our own information, and such a stance does not hold water for items entrusted by clients.

The Weight of "Confidential Information = Any Information Learned During Business"

Depending on the NDA, extremely broad definitions of confidentiality, such as "confidential information = any information learned during the course of business," are sometimes adopted. In this case, it must be assumed that all of the following are treated as confidential:

  • Contact information and account details of business partners
  • Transaction terms (amounts, payment dates, items)
  • The existence of the business relationship itself
  • Past email exchanges

Independent interpretations such as "it's not confidential due to industry practice" or "the amount should be fine" are unlikely to hold up in the event of a dispute.

The Unique Domain of Email Processing

Email processing is one of the areas where issues are most concentrated in business AI usage. The reasons can be organized as follows:

  • Email bodies are free-form text, making it difficult to predict confidentiality.
  • Confidential information can be contained in senders, recipients, subjects, and attachments.
  • It contains information sent by the other party (i.e., things the other party sent "to your company").
  • In operations that reference long threads, the volume of information becomes vastly larger.

It can be said that the act of passing email content to generative AI involves many issues in terms of both contracts and the Personal Information Protection Act. In practice, operational design is likely necessary, such as excluding external business emails, limiting targets to internal information, excluding highly confidential information even within the company, making human review mandatory, or providing prior notice when incorporating generative AI into one's own service's email processing.

Chapter 9: What Should Be Disclosed in the Privacy Policy?

When using cloud services or AI services for business, disclosure in the privacy policy becomes a key point.

Elements Likely Required by Law

First, understanding of the external environment. If you handle personal data using the services of a business operator located in a foreign country, you are required to understand the system of that foreign country and disclose that you have implemented safety management measures in accordance with it (Act on the Protection of Personal Information, Articles 23, 32(1)(iv); Cabinet Order Article 10(i)).

Second, response to cross-border transfer (provision to a third party in a foreign country). You must disclose the name of the destination country, the protection system of that country, and the measures taken by the recipient, and respond either by obtaining the individual's consent or by establishing a system that meets the standards.

Third, external transmission regulations under the Telecommunications Business Act. This refers to the notification and disclosure obligations when transmitting user information to third parties through cookies or similar technologies.

Potential for Different Disclosure Levels Depending on "Storage Only" vs. "Processing Involved"

If the cloud exception applies (meaning the business operator does not handle the data), compliance with Article 28 is considered unnecessary. However, understanding the external environment under Article 23 and its disclosure are considered to remain necessary.

Elements that should be disclosed are said to include the following:

  • The name of the foreign country where the personal data is handled
  • An overview of that country's protection system
  • Safety management measures in that country

It is considered that listing individual business names is not necessarily required.

Looking at published privacy policies, the following trends seem to exist:

  • Large enterprise trend: Often uses abstract expressions without listing specific business names.
  • Mid-sized IT/SaaS business trend: Tends to list specific business names.
  • AI product business trend: Examples showing detailed disclosure of AI business names are seen.

Businesses will need to decide at what level of granularity to disclose information, based on their company size, business type, and track record of passing business partner audits.

Clauses Worth Considering

For businesses that handle entrusted information, it may be worth considering the inclusion of clauses such as the following:

Note that in cases where we handle personal or confidential information entrusted by customers, if restrictions regarding the use of generative AI services are stipulated in the contract with the customer or by the customer's instructions, such contracts and instructions shall take precedence, and that information shall be excluded from the scope of generative AI use based on this policy.

Having this might allow the company to point to a policy that explicitly prioritizes contracts when asked by a client, "You use AI, so is our information safe?"

Chapter 10: Implementation Patterns — Structurally Managing Risks in Agent Design

When using agent-based AI tools for business, I believe it is effective to structure risks within the technical architecture.

Direction of Hybrid Configurations

One approach is to segment agents by use case and assign models and skill visibility to each.

Agents for External Information Only: Realized using local LLMs prepared in a dedicated environment, utilizing web search or SaaS integration, so that even if risks like prompt injection manifest, there is no sensitive data to leak.

General-Purpose Task Agents: Utilize cloud-based frontier models or local LLMs to improve internal business efficiency. These are separated from the "External Information Only" agent and its local LLM so that the agent cannot communicate directly with them, thus protecting against attackers.

Implementation Notes

Local LLM resistance to prompt injection. Small models or quantized models are considered to have lower resistance than cloud models. I believe a design that limits the range of damage by narrowing down the specific information processing is safer.

Chapter 11: Summary — A Proposed Judgment Framework for Business AI

Let's organize the issues discussed so far into a framework that can be used for practical judgment. Following these questions in order may be one approach to deciding how to utilize AI.

Q1: What is the information being handled?

  • Is it only in-house company information?
  • Does it include information entrusted by other companies?
  • Does it contain personal information (customer personal info, employee personal info, third-party personal info)?
  • Does it contain sensitive personal information?
  • Does it contain business confidential information?

Q2: Are there any contractual restrictions?

  • Is it explicitly prohibited by NDAs or basic transaction agreements with clients?
  • Breadth of confidentiality definitions (e.g., "any information learned during business")?
  • Existence of clauses permitting sub-delegation?
  • Restrictions on the purpose of use?

Q3: How is it organized under the Personal Information Protection Act?

  • If personal information is included, is it within the scope of the purpose of use?
  • Is the contractual structure organized as an entrustment?
  • Cross-border transfer applicability and response (individual consent, system meeting standards, or domestic processing)?
  • Understanding the external environment and disclosure in the privacy policy?

Q4: Is the technical architecture appropriate?

  • Are commercial plans, DPA, and non-use for training guaranteed?
  • Does data residency meet the requirements?
  • Is the input range minimized?
  • Is human review incorporated?

Q5: What are the operational safeguards?

  • Disclosure in the privacy policy
  • Development of internal guidelines
  • Limitation of access privileges
  • Acquisition of audit logs
  • Handling of departing employees

Checking these in order should help reduce the chance of overlooking issues.

In a Nutshell

先行導入(Early introduction) driven by convenience may create heavy debt in terms of contract and policy development later. I believe that advancing both simultaneously is the fastest path in the end.

Supplement: Limitations and Creation Process of This Article

Expert verification is considered essential, especially in the following areas:

  • Interpretation of individual contract clauses
  • Specific wording in privacy policies
  • Compliance with industry regulations (finance, medical, public, professional services, etc.)
  • Specific procedures for cross-border transfer response (obtaining individual consent, building a system that meets standards)
  • Response to incidents

About the Creation Process of This Article

This article was created by utilizing generative AI services to collect information, organize issues, consider structure, and edit text during the process of organizing points I was personally handling in my work, and finally by confirming and adjusting the content. I have continuously used generative AI for referencing official documents and explanatory articles, comparative analysis of issues, and refining the text. Since the theme of this article itself is "AI usage in business," I felt it was appropriate to disclose the creation process itself as useful decision-making material for the reader.

Note that because output from generative AI can contain factual errors or logical leaps, when using the content of this article in practice, it is strongly recommended to check primary information from the links to reference materials.

I hope this article serves as a useful "roadmap of issues" in judging business AI usage.

Discussion