Trying OCR (Optical Character Recognition) on Scanned Paper Documents with Low-Cost Alibaba Cloud
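Just the other day, I was asked for advice on speeding up the work of digitizing a large volume of paper documents. Because the layout and image quality of the scanned data vary widely, manual visual checking cannot be eliminated, so the hope was to cope by improving the data-entry tool.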
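The preconditions for this request were:
- A feature that calculates how closely the text entered in the form matches existing data already exists
- A feature for selecting a region of an uploaded image is already implemented
So the thinking was that reducing the number of fields operators have to type by hand should shorten the work by a reasonable amount.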
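To make the idea concrete, here is a minimal sketch of how OCR output could plug into those two existing features. It assumes OCR boxes shaped like the `ocrLocations` entries in the response shown later in this article, and it assumes that `x`/`y` are the top-left corner of each box. The helper names `boxes_in_region` and `similarity` are hypothetical, and `difflib.SequenceMatcher` is only a stand-in for the tool's own match-score logic, which is not described here.

```python
from difflib import SequenceMatcher


def boxes_in_region(locations, left, top, right, bottom):
    """Return OCR boxes whose centre lies inside the selected rectangle."""
    picked = []
    for box in locations:
        # Assumption: x/y are the top-left corner, w/h the box size in pixels.
        cx = box["x"] + box["w"] / 2
        cy = box["y"] + box["h"] / 2
        if left <= cx <= right and top <= cy <= bottom:
            picked.append(box)
    return picked


def similarity(candidate, existing):
    """Similarity ratio in [0, 1]; a stand-in for the tool's own match score."""
    return SequenceMatcher(None, candidate.strip(), existing.strip()).ratio()


# Three boxes copied from the response shown later in this article.
locations = [
    {"h": 20.0, "text": "香川小豆島", "w": 111.0, "x": 148.0, "y": 344.0},
    {"h": 19.0, "text": "台風第8号", "w": 95.0, "x": 346.0, "y": 344.0},
    {"h": 18.0, "text": "29", "w": 21.0, "x": 567.0, "y": 345.0},
]

# Hypothetical rectangle that an operator selected on the image.
selected = boxes_in_region(locations, left=330, top=335, right=460, bottom=370)
text = "".join(box["text"] for box in selected)
print(text, similarity(text, "台風第8号"))
```

The recognized text would pre-fill the form field, and the operator would only need to correct it when the score is low.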
While looking for an affordable OCR service, I learned that Content Moderation v1.0, offered on Alibaba Cloud, includes an OCR feature.
The price per 1,000 images is
0.504 USD or less (Shanghai region)
0.850 USD or less (Singapore region)
which is quite affordable.
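As a rough illustration of what those rates mean for a batch job, the following sketch computes an upper-bound cost. The image count of 50,000 is a made-up figure, and because the listed prices are "or less", the real bill may come out lower.

```python
# Listed upper-bound prices per 1,000 images, in USD.
PRICE_PER_1000_USD = {
    "shanghai": 0.504,
    "singapore": 0.850,
}


def estimate_cost_usd(image_count, price_per_1000_usd):
    """Upper-bound cost for scanning image_count images."""
    return image_count / 1000 * price_per_1000_usd


# Hypothetical batch size, chosen only for illustration.
image_count = 50_000
for region, price in PRICE_PER_1000_USD.items():
    print(region, round(estimate_cost_usd(image_count, price), 2))
```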
I was curious what it would be like, so I tried it out by following this procedure.
1. Create a Python virtual environment
I use uv.
% uv init ocrtest && cd ocrtest
Initialized project `ocrtest` at `/Users/user/ocrtest`
2. Install the Python SDK
The following are required:
- https://pypi.org/project/aliyun-python-sdk-core-v3/
- https://pypi.org/project/aliyun-python-sdk-green/
- The zip file mentioned in the installation instructions here
I added the SDK packages with uv add:
% uv add aliyun-python-sdk-core-v3 aliyun-python-sdk-green
Using CPython 3.11.10 interpreter at: /Users/user/.asdf/installs/python/3.11.10/bin/python
Creating virtual environment at: .venv
Resolved 8 packages in 440ms
Built aliyun-python-sdk-core==2.16.0
Prepared 2 packages in 2.09s
Installed 7 packages in 76ms
+ aliyun-python-sdk-core==2.16.0
+ aliyun-python-sdk-core-v3==2.13.33
+ aliyun-python-sdk-green==3.6.6
+ cffi==1.17.1
+ cryptography==43.0.3
+ jmespath==0.10.0
+ pycparser==2.22
I then copied everything under extension in the downloaded zip file into
./.venv/lib/python3.11/site-packages/aliyunsdkgreen/request/
3. Set the access key
Set the ID and the secret created in AccessKey Management as environment variables.
% export ALIBABA_CLOUD_ACCESS_KEY_ID=<AccessKey ID>
% export ALIBABA_CLOUD_ACCESS_KEY_SECRET=<AccessKey Secret>
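Before calling the API, it can save some confusion to confirm that both variables are actually visible to Python. This is a small optional check, not part of the official procedure. The helper name `missing_credentials` is hypothetical.

```python
import os

REQUIRED = ("ALIBABA_CLOUD_ACCESS_KEY_ID", "ALIBABA_CLOUD_ACCESS_KEY_SECRET")


def missing_credentials(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED if not environ.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Credentials found.")
```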
4. Enable Content Moderation
I could not find it in the console, so I went to Enable Now from the product page on the official site. Pressing Activate Now on the next screen completes the activation.
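5. Get sample code from the OpenAPI Portal
Copy and paste the code from this page.
Below is the sample code with the request body added (the body assignment and the request.set_content call). Note: the image URL was borrowed from a Google image search.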
#!/usr/bin/env python
#coding=utf-8
import os
from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.request import CommonRequest
from aliyunsdkcore.auth.credentials import AccessKeyCredential
from aliyunsdkcore.auth.credentials import StsTokenCredential
# Please ensure that the environment variables ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET are set.
credentials = AccessKeyCredential(os.environ['ALIBABA_CLOUD_ACCESS_KEY_ID'], os.environ['ALIBABA_CLOUD_ACCESS_KEY_SECRET'])
# use STS Token
# credentials = StsTokenCredential(os.environ['ALIBABA_CLOUD_ACCESS_KEY_ID'], os.environ['ALIBABA_CLOUD_ACCESS_KEY_SECRET'], os.environ['ALIBABA_CLOUD_SECURITY_TOKEN'])
client = AcsClient(region_id='cn-shanghai', credential=credentials)
request = CommonRequest()
request.set_accept_format('json')
request.set_method('POST')
request.set_protocol_type('https') # https | http
request.set_domain('green-cip.ap-southeast-1.aliyuncs.com')
request.set_version('2018-05-09')
request.add_header('Content-Type', 'application/json')
request.set_uri_pattern('/green/image/scan')
# Request body added to the generated sample: run the "ocr" scene on one image URL.
body = '''{
  "scenes": ["ocr"],
  "tasks": [
    {
      "url": "https://www.bousai.go.jp/kaigirep/hakusho/h14/bousai2002/html/hyo/img/hy120502.jpg"
    }
  ]
}'''
request.set_content(body.encode("utf-8"))
response = client.do_action_with_exception(request)
# python2: print(response)
print(str(response, encoding = 'utf-8'))
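When more than one image needs to be sent, writing the JSON by hand gets error-prone. The following is a minimal sketch that builds the same body with `json.dumps`. The helper name `build_scan_body` is hypothetical, and only the `scenes`, `tasks` and `url` fields already used in the script above are assumed.

```python
import json


def build_scan_body(image_urls, scenes=("ocr",)):
    """Build the UTF-8 encoded JSON body with one task per image URL."""
    payload = {
        "scenes": list(scenes),
        "tasks": [{"url": url} for url in image_urls],
    }
    return json.dumps(payload).encode("utf-8")


body = build_scan_body([
    "https://www.bousai.go.jp/kaigirep/hakusho/h14/bousai2002/html/hyo/img/hy120502.jpg",
])
print(body.decode("utf-8"))
```

The result can be passed directly to `request.set_content(body)` in place of the hand-written string.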
6. Run it with uv run
% uv run hello.py
The response, formatted as JSON:
{
"code": 200,
"data": [
{
"code": 200,
"extras": {},
"msg": "OK",
"results": [
{
"label": "ocr",
"ocrData": [
"表2-5-21,近年主土砂災害による死者·行方不明者状況,死,年月,原因,発生箇所,行方不明者数,兵庫表六甲,昭和42.7,中豪雨,島 呉市周辺,集中豪雨,88,黑川村,集中豪雨,31,43.8,阜白河村,台凰第7号,104,44.6,鹿児島鹿児島市周边,ラスがけ崩れ,児島鹿児島市周辺,シラスがけ崩れ,47,46.8,三重尾鹫·熊野, 集中豪雨,36,本天草周辺,47.7, 集中豪雨,115,高知土佐山田町, 集中豪雨,60,49.7,香川小豆島,台風第8号,29,50.8,青森岩木山, 集中豪雨,高知仁淀川周辺,台風第5号,68,シラスがけ崩れ,51.6,鹿児島鹿児島市周辺,32,小豆島等全国,台風第17号,53.5,新潟妙高高原町,融雪,56.8,長野字原,台風第15号, 10,長崎等全国,57.7,集中豪雨,259,三重等全国,風第10号,島根等全国,58.7,集中豪雨,94,59.6,本五木町,梅雨前線, 16,長野王滝村,地震,29,60.2,新潟青梅町 ,青梅,地すべり,長野長野市,梅雨前線,26,61.7,鹿児島鹿児島市,シラスがけ崩れ,18,広島加計町,梅雨前線,63.7,10,平成元.7,福井越前町,岩石の落下,熊本一の宮町,2.7,梅雨前,11,鹿児島瀬戸内町,台風第19号,11,5.8,鹿児島鹿児島市,豪雨,47,兵庫西宫市,阪神·淡路大震災,7.1,34,新潟·長野,融雪·降水,8.12,鹿児島出水市,梅雨前線,9.7,21,11.6,広島等全国,梅雨前線,24,注:死者·行方不明者数が10人以上もの,者"
],
"ocrLocations": [
{
"h": 19.0,
"text": "表2-5-21",
"w": 106.0,
"x": 21.0,
"y": 2.0
},
{
"h": 22.0,
"text": "近年主土砂災害による死者·行方不明者状況",
"w": 417.0,
"x": 147.0,
"y": 2.0
},
{
"h": 18.0,
"text": "死",
"w": 21.0,
"x": 516.0,
"y": 49.0
},
{
"h": 20.0,
"text": "年月",
"w": 37.0,
"x": 51.0,
"y": 58.0
},
{
"h": 19.0,
"text": "原因",
"w": 38.0,
"x": 403.0,
"y": 58.0
},
{
"h": 21.0,
"text": "発生箇所",
"w": 80.0,
"x": 198.0,
"y": 58.0
},
{
"h": 20.0,
"text": "行方不明者数",
"w": 115.0,
"x": 514.0,
"y": 69.0
},
{
"h": 22.0,
"text": "兵庫表六甲",
"w": 111.0,
"x": 141.0,
"y": 96.0
},
{
"h": 20.0,
"text": "昭和42.7",
"w": 91.0,
"x": 24.0,
"y": 97.0
},
{
"h": 20.0,
"text": "中豪雨",
"w": 76.0,
"x": 347.0,
"y": 97.0
},
{
"h": 20.0,
"text": "島 呉市周辺",
"w": 108.0,
"x": 165.0,
"y": 123.0
},
{
"h": 19.0,
"text": "集中豪雨",
"w": 76.0,
"x": 347.0,
"y": 124.0
},
{
"h": 17.0,
"text": "88",
"w": 22.0,
"x": 567.0,
"y": 126.0
},
{
"h": 18.0,
"text": "黑川村",
"w": 73.0,
"x": 184.0,
"y": 152.0
},
{
"h": 19.0,
"text": "集中豪雨",
"w": 76.0,
"x": 347.0,
"y": 152.0
},
{
"h": 19.0,
"text": "31",
"w": 20.0,
"x": 567.0,
"y": 152.0
},
{
"h": 20.0,
"text": "43.8",
"w": 52.0,
"x": 57.0,
"y": 179.0
},
{
"h": 20.0,
"text": "阜白河村",
"w": 101.0,
"x": 154.0,
"y": 179.0
},
{
"h": 19.0,
"text": "台凰第7号",
"w": 92.0,
"x": 346.0,
"y": 179.0
},
{
"h": 18.0,
"text": "104",
"w": 27.0,
"x": 559.0,
"y": 180.0
},
{
"h": 18.0,
"text": "44.6",
"w": 56.0,
"x": 57.0,
"y": 208.0
},
{
"h": 21.0,
"text": "鹿児島鹿児島市周边",
"w": 175.0,
"x": 146.0,
"y": 206.0
},
{
"h": 19.0,
"text": "ラスがけ崩れ",
"w": 132.0,
"x": 346.0,
"y": 207.0
},
{
"h": 20.0,
"text": "児島鹿児島市周辺",
"w": 174.0,
"x": 154.0,
"y": 233.0
},
{
"h": 18.0,
"text": "シラスがけ崩れ",
"w": 131.0,
"x": 348.0,
"y": 235.0
},
{
"h": 18.0,
"text": "47",
"w": 20.0,
"x": 567.0,
"y": 235.0
},
{
"h": 19.0,
"text": "46.8",
"w": 51.0,
"x": 59.0,
"y": 235.0
},
{
"h": 19.0,
"text": "三重尾鹫·熊野",
"w": 142.0,
"x": 152.0,
"y": 261.0
},
{
"h": 20.0,
"text": " 集中豪雨",
"w": 80.0,
"x": 346.0,
"y": 261.0
},
{
"h": 18.0,
"text": "36",
"w": 21.0,
"x": 567.0,
"y": 262.0
},
{
"h": 20.0,
"text": "本天草周辺",
"w": 122.0,
"x": 153.0,
"y": 288.0
},
{
"h": 19.0,
"text": "47.7",
"w": 54.0,
"x": 57.0,
"y": 289.0
},
{
"h": 19.0,
"text": " 集中豪雨",
"w": 77.0,
"x": 347.0,
"y": 289.0
},
{
"h": 18.0,
"text": "115",
"w": 27.0,
"x": 560.0,
"y": 290.0
},
{
"h": 21.0,
"text": "高知土佐山田町",
"w": 148.0,
"x": 142.0,
"y": 315.0
},
{
"h": 20.0,
"text": " 集中豪雨",
"w": 80.0,
"x": 347.0,
"y": 316.0
},
{
"h": 17.0,
"text": "60",
"w": 21.0,
"x": 567.0,
"y": 318.0
},
{
"h": 19.0,
"text": "49.7",
"w": 53.0,
"x": 58.0,
"y": 344.0
},
{
"h": 20.0,
"text": "香川小豆島",
"w": 111.0,
"x": 148.0,
"y": 344.0
},
{
"h": 19.0,
"text": "台風第8号",
"w": 95.0,
"x": 346.0,
"y": 344.0
},
{
"h": 18.0,
"text": "29",
"w": 21.0,
"x": 567.0,
"y": 345.0
},
{
"h": 19.0,
"text": "50.8",
"w": 52.0,
"x": 57.0,
"y": 371.0
},
{
"h": 19.0,
"text": "青森岩木山",
"w": 113.0,
"x": 145.0,
"y": 371.0
},
{
"h": 19.0,
"text": " 集中豪雨",
"w": 79.0,
"x": 347.0,
"y": 371.0
},
{
"h": 20.0,
"text": "高知仁淀川周辺",
"w": 150.0,
"x": 144.0,
"y": 398.0
},
{
"h": 19.0,
"text": "台風第5号",
"w": 98.0,
"x": 346.0,
"y": 399.0
},
{
"h": 19.0,
"text": "68",
"w": 21.0,
"x": 567.0,
"y": 399.0
},
{
"h": 18.0,
"text": "シラスがけ崩れ",
"w": 132.0,
"x": 345.0,
"y": 427.0
},
{
"h": 19.0,
"text": "51.6",
"w": 56.0,
"x": 58.0,
"y": 427.0
},
{
"h": 21.0,
"text": "鹿児島鹿児島市周辺",
"w": 188.0,
"x": 143.0,
"y": 426.0
},
{
"h": 18.0,
"text": "32",
"w": 21.0,
"x": 567.0,
"y": 428.0
},
{
"h": 20.0,
"text": "小豆島等全国",
"w": 116.0,
"x": 142.0,
"y": 454.0
},
{
"h": 20.0,
"text": "台風第17号",
"w": 98.0,
"x": 346.0,
"y": 454.0
},
{
"h": 18.0,
"text": "53.5",
"w": 58.0,
"x": 58.0,
"y": 482.0
},
{
"h": 20.0,
"text": "新潟妙高高原町",
"w": 153.0,
"x": 143.0,
"y": 481.0
},
{
"h": 18.0,
"text": "融雪",
"w": 40.0,
"x": 346.0,
"y": 482.0
},
{
"h": 18.0,
"text": "56.8",
"w": 55.0,
"x": 57.0,
"y": 510.0
},
{
"h": 20.0,
"text": "長野字原",
"w": 95.0,
"x": 144.0,
"y": 509.0
},
{
"h": 19.0,
"text": "台風第15号",
"w": 95.0,
"x": 347.0,
"y": 509.0
},
{
"h": 19.0,
"text": " 10",
"w": 22.0,
"x": 567.0,
"y": 510.0
},
{
"h": 20.0,
"text": "長崎等全国",
"w": 95.0,
"x": 145.0,
"y": 536.0
},
{
"h": 19.0,
"text": "57.7",
"w": 57.0,
"x": 58.0,
"y": 537.0
},
{
"h": 18.0,
"text": "集中豪雨",
"w": 76.0,
"x": 347.0,
"y": 538.0
},
{
"h": 18.0,
"text": "259",
"w": 29.0,
"x": 558.0,
"y": 538.0
},
{
"h": 20.0,
"text": "三重等全国",
"w": 98.0,
"x": 142.0,
"y": 564.0
},
{
"h": 19.0,
"text": "風第10号",
"w": 95.0,
"x": 347.0,
"y": 565.0
},
{
"h": 19.0,
"text": "島根等全国",
"w": 96.0,
"x": 144.0,
"y": 591.0
},
{
"h": 19.0,
"text": "58.7",
"w": 56.0,
"x": 58.0,
"y": 592.0
},
{
"h": 19.0,
"text": "集中豪雨",
"w": 76.0,
"x": 347.0,
"y": 592.0
},
{
"h": 17.0,
"text": "94",
"w": 20.0,
"x": 567.0,
"y": 593.0
},
{
"h": 18.0,
"text": "59.6",
"w": 54.0,
"x": 59.0,
"y": 620.0
},
{
"h": 18.0,
"text": "本五木町",
"w": 94.0,
"x": 161.0,
"y": 620.0
},
{
"h": 20.0,
"text": "梅雨前線",
"w": 79.0,
"x": 346.0,
"y": 619.0
},
{
"h": 19.0,
"text": " 16",
"w": 20.0,
"x": 568.0,
"y": 620.0
},
{
"h": 18.0,
"text": "長野王滝村",
"w": 116.0,
"x": 142.0,
"y": 648.0
},
{
"h": 19.0,
"text": "地震",
"w": 42.0,
"x": 347.0,
"y": 648.0
},
{
"h": 17.0,
"text": "29",
"w": 20.0,
"x": 568.0,
"y": 649.0
},
{
"h": 18.0,
"text": "60.2",
"w": 55.0,
"x": 58.0,
"y": 676.0
},
{
"h": 19.0,
"text": "新潟青梅町 ",
"w": 114.0,
"x": 142.0,
"y": 675.0
},
{
"h": 8.0,
"text": "青梅",
"w": 51.0,
"x": 196.0,
"y": 680.0
},
{
"h": 19.0,
"text": "地すべり",
"w": 74.0,
"x": 346.0,
"y": 675.0
},
{
"h": 20.0,
"text": "長野長野市",
"w": 104.0,
"x": 150.0,
"y": 702.0
},
{
"h": 19.0,
"text": "梅雨前線",
"w": 77.0,
"x": 346.0,
"y": 702.0
},
{
"h": 18.0,
"text": "26",
"w": 20.0,
"x": 568.0,
"y": 704.0
},
{
"h": 19.0,
"text": "61.7",
"w": 58.0,
"x": 57.0,
"y": 730.0
},
{
"h": 21.0,
"text": "鹿児島鹿児島市",
"w": 152.0,
"x": 142.0,
"y": 729.0
},
{
"h": 19.0,
"text": "シラスがけ崩れ",
"w": 130.0,
"x": 349.0,
"y": 730.0
},
{
"h": 17.0,
"text": "18",
"w": 20.0,
"x": 569.0,
"y": 732.0
},
{
"h": 20.0,
"text": "広島加計町",
"w": 103.0,
"x": 146.0,
"y": 757.0
},
{
"h": 18.0,
"text": "梅雨前線",
"w": 76.0,
"x": 347.0,
"y": 758.0
},
{
"h": 19.0,
"text": "63.7",
"w": 56.0,
"x": 57.0,
"y": 758.0
},
{
"h": 19.0,
"text": "10",
"w": 20.0,
"x": 569.0,
"y": 758.0
},
{
"h": 20.0,
"text": "平成元.7",
"w": 90.0,
"x": 25.0,
"y": 785.0
},
{
"h": 19.0,
"text": "福井越前町",
"w": 102.0,
"x": 146.0,
"y": 785.0
},
{
"h": 18.0,
"text": "岩石の落下",
"w": 97.0,
"x": 347.0,
"y": 786.0
},
{
"h": 20.0,
"text": "熊本一の宮町",
"w": 122.0,
"x": 145.0,
"y": 811.0
},
{
"h": 19.0,
"text": "2.7",
"w": 53.0,
"x": 63.0,
"y": 813.0
},
{
"h": 19.0,
"text": "梅雨前",
"w": 77.0,
"x": 347.0,
"y": 813.0
},
{
"h": 17.0,
"text": "11",
"w": 18.0,
"x": 569.0,
"y": 814.0
},
{
"h": 20.0,
"text": "鹿児島瀬戸内町",
"w": 145.0,
"x": 145.0,
"y": 839.0
},
{
"h": 19.0,
"text": "台風第19号",
"w": 97.0,
"x": 347.0,
"y": 839.0
},
{
"h": 18.0,
"text": "11",
"w": 18.0,
"x": 569.0,
"y": 841.0
},
{
"h": 19.0,
"text": "5.8",
"w": 52.0,
"x": 64.0,
"y": 867.0
},
{
"h": 21.0,
"text": "鹿児島鹿児島市",
"w": 143.0,
"x": 147.0,
"y": 866.0
},
{
"h": 19.0,
"text": "豪雨",
"w": 42.0,
"x": 346.0,
"y": 868.0
},
{
"h": 19.0,
"text": "47",
"w": 20.0,
"x": 568.0,
"y": 868.0
},
{
"h": 20.0,
"text": "兵庫西宫市",
"w": 102.0,
"x": 149.0,
"y": 894.0
},
{
"h": 20.0,
"text": "阪神·淡路大震災",
"w": 151.0,
"x": 346.0,
"y": 894.0
},
{
"h": 19.0,
"text": "7.1",
"w": 52.0,
"x": 64.0,
"y": 895.0
},
{
"h": 18.0,
"text": "34",
"w": 20.0,
"x": 568.0,
"y": 896.0
},
{
"h": 20.0,
"text": "新潟·長野",
"w": 99.0,
"x": 143.0,
"y": 921.0
},
{
"h": 18.0,
"text": "融雪·降水",
"w": 92.0,
"x": 347.0,
"y": 923.0
},
{
"h": 19.0,
"text": "8.12",
"w": 56.0,
"x": 64.0,
"y": 923.0
},
{
"h": 20.0,
"text": "鹿児島出水市",
"w": 136.0,
"x": 142.0,
"y": 948.0
},
{
"h": 19.0,
"text": "梅雨前線",
"w": 79.0,
"x": 346.0,
"y": 949.0
},
{
"h": 18.0,
"text": "9.7",
"w": 55.0,
"x": 63.0,
"y": 950.0
},
{
"h": 16.0,
"text": "21",
"w": 18.0,
"x": 568.0,
"y": 951.0
},
{
"h": 18.0,
"text": "11.6",
"w": 60.0,
"x": 59.0,
"y": 977.0
},
{
"h": 19.0,
"text": "広島等全国",
"w": 118.0,
"x": 126.0,
"y": 976.0
},
{
"h": 20.0,
"text": "梅雨前線",
"w": 80.0,
"x": 345.0,
"y": 976.0
},
{
"h": 18.0,
"text": "24",
"w": 19.0,
"x": 568.0,
"y": 978.0
},
{
"h": 17.0,
"text": "注:死者·行方不明者数が10人以上もの",
"w": 276.0,
"x": 8.0,
"y": 1009.0
},
{
"h": 21.0,
"text": "者",
"w": 19.0,
"x": 610.0,
"y": 48.0
}
],
"rate": 99.91,
"scene": "ocr",
"suggestion": "review"
}
],
"taskId": "imgUrq$d$Frwm5cmPPNEAXjR-1AGTRx",
"url": "https://www.bousai.go.jp/kaigirep/hakusho/h14/bousai2002/html/hyo/img/hy120502.jpg"
}
],
"msg": "OK",
"requestId": "5035B49E-3D57-3D78-A79E-992E57ACE0FA"
}
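To use this response from the data-entry tool, the text fragments have to be pulled out of the nested structure. Here is a minimal sketch that does so, using only field names visible in the response above (`code`, `data`, `results`, `scene`, `ocrLocations`, `text`, `url`). Treating any `code` other than 200 as a failure is an assumption, and the function name `extract_ocr_text` is hypothetical.

```python
import json


def extract_ocr_text(response_bytes):
    """Map each image URL to the list of text fragments found by the "ocr" scene."""
    payload = json.loads(response_bytes)
    if payload.get("code") != 200:
        raise RuntimeError(f"request failed: {payload.get('code')} {payload.get('msg')}")

    texts = {}
    for task in payload.get("data", []):
        if task.get("code") != 200:
            # Skip images that could not be processed.
            continue
        for result in task.get("results", []):
            if result.get("scene") != "ocr":
                continue
            texts[task["url"]] = [
                loc["text"].strip() for loc in result.get("ocrLocations", [])
            ]
    return texts


# A heavily trimmed copy of the response above, kept only to exercise the function.
sample = json.dumps({
    "code": 200,
    "data": [{
        "code": 200,
        "results": [{
            "scene": "ocr",
            "ocrLocations": [
                {"h": 19.0, "text": "表2-5-21", "w": 106.0, "x": 21.0, "y": 2.0},
                {"h": 20.0, "text": " 集中豪雨", "w": 80.0, "x": 346.0, "y": 261.0},
            ],
        }],
        "url": "https://www.bousai.go.jp/kaigirep/hakusho/h14/bousai2002/html/hyo/img/hy120502.jpg",
    }],
    "msg": "OK",
}).encode("utf-8")

ocr_texts = extract_ocr_text(sample)
for url, fragments in ocr_texts.items():
    print(url, fragments)
```

In the script from step 5, the `response` returned by `client.do_action_with_exception(request)` could be passed in place of `sample`.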
Getting this far was very easy.
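The input image is clean, so this is to be expected, but I thought the recognition accuracy was not bad.
With actual scanned images the results may not be this tidy, but I do not think other services would differ much.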
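Since a human will still review the results, it may help to present the recognized fragments row by row rather than as one long comma-separated string. The sketch below groups `ocrLocations` boxes into rows by their `y` coordinate. It assumes `x`/`y` are top-left pixel coordinates, and the 8-pixel tolerance is a guess based on the row spacing in this sample, not a documented value. The function name `group_into_rows` is hypothetical.

```python
def group_into_rows(locations, tolerance=8.0):
    """Group OCR boxes into rows of text, ordered top-to-bottom and left-to-right."""
    rows = []
    for box in sorted(locations, key=lambda b: b["y"]):
        # A box joins the current row when its top edge is close to the row's first box.
        if rows and abs(box["y"] - rows[-1]["y"]) <= tolerance:
            rows[-1]["boxes"].append(box)
        else:
            rows.append({"y": box["y"], "boxes": [box]})
    return [
        [b["text"].strip() for b in sorted(row["boxes"], key=lambda b: b["x"])]
        for row in rows
    ]


# Boxes copied from two table rows of the response above.
locations = [
    {"h": 19.0, "text": "49.7", "w": 53.0, "x": 58.0, "y": 344.0},
    {"h": 20.0, "text": "香川小豆島", "w": 111.0, "x": 148.0, "y": 344.0},
    {"h": 19.0, "text": "台風第8号", "w": 95.0, "x": 346.0, "y": 344.0},
    {"h": 18.0, "text": "29", "w": 21.0, "x": 567.0, "y": 345.0},
    {"h": 19.0, "text": "50.8", "w": 52.0, "x": 57.0, "y": 371.0},
    {"h": 19.0, "text": "青森岩木山", "w": 113.0, "x": 145.0, "y": 371.0},
    {"h": 19.0, "text": " 集中豪雨", "w": 79.0, "x": 347.0, "y": 371.0},
]

table_rows = group_into_rows(locations)
for row in table_rows:
    print(row)
```

Skewed or warped scans would likely need a larger tolerance or a deskewing step first, which is outside the scope of this sketch.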
It is nice that the pricing is low and that it is easy to get started.
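I actually wanted to try the OCR API, but as of October 2024 it does not appear to be offered on Alibaba Cloud (international edition). It is considerably more capable than the OCR in Content Moderation v1.0, though its pricing may be correspondingly higher.
Being able to run operations suited to the task, such as RecognizeTableOcr and RecognizeJanpanese, is appealing. If it becomes available in the future, I would definitely like to try it.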