The following are summaries of Otter transcripts, generated with the text-unicorn model.
3/24: Added summaries generated by Claude 3.
Some notes have been added by hand.
The presentation discusses the scalability of a platform. It starts by providing an example of a company that has been growing its business, customer base, and engineering team. As a result of this growth, the company's platform has become more complex and difficult to manage. The presentation then discusses the importance of measuring the efficiency of a platform in order to ensure that it is scalable. It also presents four key design principles that can be used to build a scalable platform:
Provide sensible capacity defaults: when an engineer fills in a provisioning form, any accepted value should be at least large enough for them to operate their services.
Make the platform flexible enough to fulfill the needs of different teams.
Collaborate closely with users and stakeholders so that the platform reflects the actual needs of product engineers and other stakeholders.
Enable delegation and efficient completion of actions.
The presentation concludes by emphasizing the importance of business buy-in, a collaborative and enduring culture, and a team that is focused on delivering value in order to create a successful platform.
Keynote: Keep Calm and Keep Coding: How To Not Panic When Big CVEs Drop - Brandon Lum, OSS Security Software Engineer, Google
In this presentation, Brandon Lum discusses the importance of maintaining a software bill of materials (SBOM) and how it can help organizations reduce panic during high-severity vulnerability disclosures. The talk begins with a brief overview of the Log4Shell vulnerability and how it caused widespread panic due to the lack of visibility into affected systems. Lum then discusses the importance of knowing what software is in use within an organization and how an SBOM can help with this. He provides some tips on how to create an SBOM and how it can be used to track and manage software vulnerabilities. Lum also discusses the importance of communicating with product owners and other stakeholders about vulnerabilities and how to manage the associated risk. Finally, he provides some resources for further reading on the topic. Key takeaways from the presentation include:
The importance of maintaining an SBOM
How an SBOM can help reduce panic during high-severity vulnerability disclosures
Tips on how to create an SBOM
The importance of communicating with product owners and other stakeholders about vulnerabilities
Resources for further reading on the topic
https://docs.guac.sh/setup/
Keynote: Sustainability Chronicles: Innovate Through Green Technology with Kepler and KEDA - Katie Gamanji, Senior Field Engineer, Apple
The presentation is about the importance of considering the environmental impact of infrastructure and introducing sustainability into the cloud-native landscape. The speaker, Katie Gamanji, begins by discussing the growing number of projects within the CNCF landscape and the need to consider sustainability as part of the decision-making process. She explains that the tech sector is responsible for 1.4% of global greenhouse emissions and that this number could grow to 10% if no action is taken.
Gamanji then introduces the concept of "Green FinOps" - a new school of thought that encapsulates all the tooling, processes, culture, and behavioral changes required to reduce the environmental impact of digital infrastructure. She outlines a four-stage strategy for integrating sustainability into day-to-day operations, including:
Awareness: Introducing sustainability to all stakeholders and discussing the organization's carbon footprint and energy consumption.
Discovery: Using sustainability calculators and carbon footprint calculators to create a baseline.
Implementation: Implementing sustainability practices and measuring their impact.
Repetition: Identifying anomalies and addressing them, ensuring that sustainability goals are met.
Gamanji then introduces Kepler - a tool that allows users to measure and visualize the carbon footprint of their Kubernetes clusters. She explains how Kepler works and how it can be used to identify and address emissions.
Finally, Gamanji discusses how users can use KEDA to scale applications based on carbon emissions. She explains how KEDA can be configured to scale applications up or down based on carbon intensity, helping to reduce the environmental impact of their infrastructure.
In conclusion, Gamanji emphasizes the importance of considering sustainability in the cloud-native landscape and highlights the tools and strategies that can be used to reduce the environmental impact of digital infrastructure.
https://github.com/sustainable-computing-io/kepler
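The carbon-aware scaling idea can be sketched as a KEDA ScaledObject driven by a carbon-intensity signal exposed through Prometheus. The deployment name, Prometheus address, metric name (`carbon_intensity_gco2_per_kwh`), and threshold below are illustrative assumptions, not values from the talk:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: carbon-aware-scaler
spec:
  scaleTargetRef:
    name: my-app              # hypothetical deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # assumed Prometheus address
        query: avg(carbon_intensity_gco2_per_kwh)          # hypothetical metric, e.g. derived from Kepler data
        threshold: "200"
```

Note that a plain Prometheus trigger scales out as the query value rises; scaling *down* under high carbon intensity requires inverting the query or using a dedicated carbon-aware operator on top of KEDA.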
Distributed Tracing Integration with OpenTelemetry and Knative - Kevin Dubois, Red Hat
From OpenTracing and OpenCensus to OpenTelemetry
OpenTelemetry components:
Specification, Instrumentation, Collector
Tracing serverless applications
Knative (deploy, autoscaling, monitoring)
Tracing serverless Java applications
Handling Billions of Metrics with Prometheus and Thanos - Ravi Hari & Amit Auddy, Intuit
Q&A-style summary on metrics collection and alerting in Kubernetes:
What are the different ways to collect metrics in Kubernetes?
What metrics should I collect?
How do I set up alerting for Kubernetes metrics?
There are several ways to collect metrics in Kubernetes, including:
Using metrics-server: Metrics-server is a cluster-wide aggregator of resource usage data. It is a recommended add-on for Kubernetes clusters and is installed by default on many cloud-based Kubernetes services. Metrics-server collects metrics from the kubelet on each node in the cluster and exposes them through the Metrics API.
Using Prometheus: Prometheus is an open-source systems monitoring and alerting toolkit. It can be used to collect metrics from Kubernetes nodes, pods, and services. Prometheus can be deployed as a standalone application or as part of a larger monitoring stack.
Using Grafana: Grafana is an open-source data visualization and monitoring tool. It can be used to create dashboards and graphs of Kubernetes metrics. Grafana does not collect metrics itself; it is typically paired with Prometheus as a data source to display them.
The metrics you should collect will depend on the specific needs of your Kubernetes cluster. Some commonly collected metrics include:
Node metrics: These metrics provide information about the health and performance of the nodes in your cluster. Node metrics include CPU usage, memory usage, and disk usage.
Pod metrics: These metrics provide information about the health and performance of the pods in your cluster. Pod metrics include CPU usage, memory usage, and network traffic.
Service metrics: These metrics provide information about the health and performance of the services in your cluster. Service metrics include latency, throughput, and error rates.
To set up alerting for Kubernetes metrics, you can use a monitoring tool such as Prometheus or Grafana. These tools allow you to define alerts based on specific metrics. When a metric exceeds a threshold, the monitoring tool can send an alert to a notification channel, such as email, Slack, or PagerDuty.
Here are some tips for setting up alerting for Kubernetes metrics:
Define alerts based on metrics that are relevant to the performance of your cluster.
Set thresholds that are appropriate for your cluster.
Use a notification channel that is appropriate for your team.
Test your alerts to make sure they are working properly.
By following these tips, you can set up alerting for Kubernetes metrics that will help you to keep your cluster running smoothly.
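The tips above can be made concrete with an alerting rule. The following is a minimal sketch using the Prometheus Operator's PrometheusRule CRD; the threshold, duration, and labels are illustrative choices, not values from the talk:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-cpu-alerts
  namespace: monitoring
spec:
  groups:
    - name: node.rules
      rules:
        - alert: NodeHighCPU
          # fire when average CPU usage on a node stays above 90% for 10 minutes
          expr: |
            100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Node {{ $labels.instance }} CPU usage above 90%"
```

Routing the fired alert to email, Slack, or PagerDuty is then configured separately in Alertmanager.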
The presentation also stressed choosing the right tool for the right purpose: evaluate different tools and solutions against your specific needs.
Here are some additional benefits of using open source tools for observability in Kubernetes:
Cost-effectiveness: Open source tools are often free to use, which can save you a significant amount of money compared to proprietary solutions.
Flexibility: Open source tools are typically more flexible than proprietary solutions. This means that you can customize them to meet your specific needs.
Community support: Open source tools have a large and active community of users and developers who can provide support and assistance.
Innovation: Open source tools are constantly being updated and improved by the community. This means that you can always have access to the latest features and functionality.
Here are some examples of open source tools that can be used for observability in Kubernetes:
Logging: Elasticsearch, Logstash, Kibana (ELK) stack is a popular open source logging solution that can be used to collect, store, and analyze logs from Kubernetes clusters.
Metrics: Prometheus is a popular open source metrics collection and monitoring system that can be used to collect metrics from Kubernetes clusters.
Tracing: Jaeger is a popular open source distributed tracing system that can be used to trace requests through Kubernetes clusters.
These are just a few examples of the many open source tools that can be used for observability in Kubernetes. By carefully evaluating your needs and choosing the right tools, you can improve the observability of your Kubernetes clusters and make it easier to troubleshoot and resolve issues.
Multi-Region, Multi-Primary, and Eventually Consistent? You're Crazy - Bryan Oliver, Thoughtworks
This presentation focuses on the use of Open Policy Agent (OPA) and Istio multi-cluster to accomplish a globally distributed, multi-region, asynchronous, multi-primary deployment of APIs. This is a complex use case that is relevant in specific scenarios such as financial systems and auction-based applications.
OPA ensures that writes are only accepted from the right locations. This means that even if a request is routed to the correct place, it will still be validated by OPA to ensure that it is valid.
Istio multi-cluster is used to route requests to the correct regions. This is done using destination rules and virtual services. Destination rules define the criteria for targeting specific services, and virtual services define the rules for routing requests.
This solution leans on cloud-native primitives. There are other ways to tackle the problem, such as using managed AWS services, but with this approach developers do not have to write any routing code in their applications.
Lastly, it is highly recommended to use global service load balancers or DNS accelerators to get requests into your network as quickly as possible. This solution does not do that for you.
If you want to take advantage of a multi-region primary database, multi-region clusters can be used.
Writes should go through Pods close to the datastore, so they are routed with a Destination Rule.
Destination Rule
https://istio.io/latest/docs/reference/config/networking/destination-rule/#Subset
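A minimal sketch of the routing described above, assuming Pods are labeled by region; the service name, host, and region labels are hypothetical:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: orders-by-region
spec:
  host: orders.default.svc.cluster.local   # hypothetical service
  subsets:
    - name: us-east
      labels:
        region: us-east    # Pods near the us-east primary datastore
    - name: eu-west
      labels:
        region: eu-west
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: orders-writes
spec:
  hosts:
    - orders.default.svc.cluster.local
  http:
    - match:
        - method:
            exact: POST    # treat POSTs as writes for this sketch
      route:
        - destination:
            host: orders.default.svc.cluster.local
            subset: us-east   # send writes to the subset near the primary
```

OPA would still validate each write on arrival, so a mis-routed request is rejected rather than applied.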
OPA Envoy plugin
https://www.openpolicyagent.org/docs/latest/envoy-introduction/
Vanguard's multi region approach
https://medium.com/vanguard-technology/vanguards-global-multi-region-approach-c7bb22e7317a
GoTo Financial’s Story: Towards 10k ArgoCD Apps to Support Billions of $ Transactions - Giri Kuncoro & Giovan Isa Musthofa, GoTo Financial
The presentation is about how the company scaled Argo CD to support 9,000 applications. The presentation starts with a brief introduction of the company and the number of applications it manages. It then describes the challenges that the company faced as it scaled Argo CD, including slow UI loading times, high CPU utilization, and connection issues. The presentation then goes on to describe the solutions that the company implemented to address these challenges. These solutions include enabling GZIP compression, using selectors to filter applications, and scaling the app controller horizontally. The presentation concludes with a discussion of the company's plans for the future, including migrating to a multi-cluster model and using Argo CD's new features.
Argo CD performance degradation under centralized management
gzip compression
Kubernetes CPU limits (CFS throttling)
repo-server replica HPA
timeouts
repository caching
Going forward:
- A hybrid model using Akuity
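The repo-server autoscaling mentioned in these notes can be sketched as a standard `autoscaling/v2` HorizontalPodAutoscaler targeting the Argo CD repo-server Deployment; the replica bounds and CPU target below are illustrative, not the values GoTo used:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: argocd-repo-server
  namespace: argocd
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: argocd-repo-server
  minReplicas: 2          # illustrative floor
  maxReplicas: 10         # illustrative ceiling
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU passes 70%
```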
https://akuity.io/
Cilium Cluster Mesh: The eBPF-Powered Multicluster Solution for Kubernetes - Abdul Basit, Rakuten Symphony
The presentation is about eBPF-based multi-cluster networking with Cilium.
Cilium is a Kubernetes networking solution that provides networking, security, visibility, and load balancing for applications running on Kubernetes. Cilium is based on eBPF, a technology that enables the dynamic insertion of code into the Linux kernel without having to recompile it.
Cilium can be used to create a single logical cluster of multiple Kubernetes clusters. This can be useful for organizations that want to run their applications on multiple clouds or data centers.
Cilium can also be used to improve the availability of applications. If an application is running on multiple clusters, Cilium can automatically failover to another cluster if one cluster becomes unavailable.
Cilium can also be used to improve the security of applications. Cilium can enforce network policies that restrict which applications can communicate with each other. Cilium can also encrypt traffic between applications.
Finally, Cilium can be used to improve the observability of applications. Cilium provides metrics and logs that can be used to troubleshoot application problems.
The presentation then goes on to demonstrate how to use Cilium to create a multi cluster network. The presentation also demonstrates how to use Cilium to improve the availability, security, and observability of applications.
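Connecting clusters into a mesh starts with giving each cluster a unique name and ID via Cilium's Helm values; the cluster name below is illustrative:

```yaml
# Helm values for the first cluster (each cluster needs a unique name and an id from 1-255)
cluster:
  name: cluster-east
  id: 1
clustermesh:
  useAPIServer: true   # expose the clustermesh API server so other clusters can connect
```

After both clusters are installed with distinct IDs and a shared CA, the `cilium clustermesh connect` CLI command links them, making services discoverable and failover possible across the mesh.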
Trust in Honk, Tie up Your Yaml: A Kpack Experience - Ram Iyengar, Cloud Foundry Foundation
:::
kpack is a Kubernetes-native automation tool that builds OCI-compatible container images automatically from source code. It is an alternative to Dockerfile-based builds and provides a more repeatable way of building containers. kpack also integrates with a variety of security mechanisms, such as SBOM generation, digital signatures, and isolated build environments.
The presentation also included a demo of kpack in action. The demo showed how to use kpack to take source code from a GitHub repo, create a container out of it, sign the container, and upload it to Docker Hub. The demo also showed how kpack rebuilds and re-verifies the container after a change has been made to the source code.
Overall, the presentation provided a good overview of kpack and its features. It also showed how kpack can be used to improve the security of the software supply chain.
:::
https://github.com/buildpacks-community/kpack
https://slsa.dev/
https://tanzu.vmware.com/content/blog/introducing-kpack-a-kubernetes-native-container-build-service
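The demo flow maps onto kpack's `Image` resource, which watches a Git repository and rebuilds the container image on changes. The registry tag, service account, builder, and repo URL below are hypothetical placeholders:

```yaml
apiVersion: kpack.io/v1alpha2
kind: Image
metadata:
  name: sample-app
spec:
  tag: docker.io/example/sample-app      # hypothetical registry destination
  serviceAccountName: kpack-sa           # SA holding registry/git credentials
  builder:
    kind: ClusterBuilder
    name: default
  source:
    git:
      url: https://github.com/example/sample-app   # hypothetical repo
      revision: main                               # kpack rebuilds when this ref changes
```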
The presentation is about building a production-grade Kubernetes environment from scratch. The presenter discusses the challenges of building such an environment, the architecture of the platform, and the security aspects of the platform. The presentation also discusses how the platform can be used to deploy applications in a secure and compliant manner.
The platform is built on a foundation of Terraform and Kubernetes. Terraform is used to provision the infrastructure for the Kubernetes clusters, and Kubernetes is used to orchestrate the containers that run the applications. The platform also includes a number of open source components that are used to provide additional features and functionality.
The platform is designed to be secure and compliant with a number of regulatory requirements. The platform includes a number of security controls that are used to protect the environment from attacks. The platform also includes a number of features that are used to help users comply with regulatory requirements.
The platform is designed to be easy to use. Users can provision new Kubernetes clusters in a matter of minutes. Users can also deploy applications to the Kubernetes clusters using a simple and intuitive interface.
The platform is a valuable tool for organizations that are looking to build a production-grade Kubernetes environment. The platform is secure, compliant, and easy to use.
WebAssembly for AI Infra : A Lightweight, Fast, and Secure Alternative Approach - Miley Fu, Second State Inc
The presentation is about how to use WebAssembly to run lightweight, fast, and secure AI inference. The speaker first introduced the limitations of Python and Docker for AI inference, and then explained why Rust and WebAssembly are the right solutions. They also showed the performance and memory benefits of using Rust and WebAssembly, and how to run a large language model on a local computer. Finally, the speaker demonstrated how to build a serverless AI app in just a few minutes with Rust and WebAssembly.
The main points of the presentation are:
Python has limitations for AI inference, such as performance, memory management, and portability.
Docker also has limitations, such as long cold start time, large disk space, and security risks.
Rust and WebAssembly are the right solutions for AI inference because they offer performance, memory safety, and portability.
WebAssembly is a new technology that is gaining traction for cloud computing. It is lightweight, fast, and secure.
With Rust and WebAssembly, you can run a large language model on your local computer.
You can also build a serverless AI app in just a few minutes with Rust and WebAssembly.
Python has performance problems; Mojo is reportedly 32x faster.
WasmEdge is a high-level VM.
A point that attracts developers is the fast loading (startup) time.
https://www.secondstate.io/articles/wasm-runtime-agi/
https://www.cncf.io/blog/2023/06/06/a-chatgpt-powered-code-reviewer-bot-for-open-source-projects/
Tried Llama2 7B on an M1 Mac.
Install WasmEdge with LLM support enabled:
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml
https://wasmedge.org/docs/start/install/
curl -LO https://github.com/second-state/llama-utils/raw/main/chat/llama-chat.wasm
Download the Llama2 7B Chat model, then run:
wasmedge --dir .:. --nn-preload default:GGML:AUTO:llama-2-7b-chat-q5_k_m.gguf llama-chat.wasm
llama-chat.wasm is obtained from the llama-utils repository below:
https://github.com/LlamaEdge/LlamaEdge.git
Discussion