👏
Phi-3-vision-128k-instructをAzure AI Studio上で触ってみる
やること
Phi-3-vision-128k-instructをAzure AI Studio上で触ってみる
Phi-3-vision-128k-instructとは?
2023/5/22のMicrosoft Buildで発表されたSLMです。
言語と視覚機能を備えた4.2Bのマルチモーダルモデルです。
詳細なことは、下記のブログを参照してください。
とても細かくわかりやすく説明されています。
検証手順
- Azure AI studioを開く
- 左タブの「Model catalog」をクリック
- 「Phi-3-vision-128k-instruct」と検索し、「Phi-3-vision-128k-instruct」をクリック
- 表示されたことを確認
画像の説明
- 以下の画像で検証
- 返答を確認
The image displays a logo consisting of a stylized eye with a blue and orange
color scheme, accompanied by the text 'Headwaters' in a bold, sans-serif font.
The eye is positioned to the left of the text, and the overall design is simple
and modern.
- 日本語で返答を求めてみる
- 違う画像を投げる(エスコンフィールド)
- 返答を確認
The image shows an indoor sports facility with a large, open-air roof
structure. The facility is equipped with a baseball field, including
a pitcher's mound and a batter's box. The field is surrounded by a fence,
and there are advertisements and sponsor logos visible on the walls.
The sky is overcast, and the lighting suggests it's either dawn or dusk.
- 違う画像を投げる(梅田駅の顔認証改札)
- 返答を確認
The image depicts an airport terminal with a check-in counter.
There are two people standing in front of the counter,
and a third person is walking past. The terminal has a digital display
board with various flight information, and there are signs indicating
'Keep Right' and 'Exit'. The floor is tiled,
and there are escalators in the background.
OCR
- 以下の画像でOCRしてみる
- 返答を確認
The image shows a webpage with a profile for 'takekawa tomoki' who is described
as a Microsoft Data & AI developer. The profile includes social media
statistics with 598 likes, 25 followings, and 22 followers. Below the profile, there are sections for Articles, Scraps, and Comments, with the number of
articles being 105, scraps 3, and comments not visible.
The main content of the page includes a series of articles or posts with
the label 'TECH' and titles related to technology and Microsoft products
such as Azure, GPT-40 API, Python, Azure AI Document Intelligence,
SQL Database, Open AI, GPT-40, and Azure Developer CLI. Each post includes a thumbnail image, a title, a brief
まとめ
Phi-3-vision-128k-instructを触ってみました。
他のSLMと比べて、性能/レスポンス速度が高いなーと思いました。
Jetson orin nanoで動くのか気になりました。今後やります。
Discussion