🌊

As of Prefect 2.19.3, creating a Deployment no longer requires the flow's Python dependencies to be installed.

Published on 2024/06/01

TL;DR

https://github.com/PrefectHQ/prefect/issues/9512#issue-1704218542
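The issue above was resolved in 2.19.3, so I tested the behavior.
With this change, a CI/CD pipeline no longer has to run the Deployment step in an "everything installed" execution environment.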

Prerequisites

Verification steps

  1. Install two Prefect client environments with venv

    python3 -m venv 2.19.2
    . 2.19.2/bin/activate
    pip install prefect==2.19.2
    deactivate
    
    python3 -m venv 2.19.3
    . 2.19.3/bin/activate
    pip install prefect==2.19.3
    deactivate
    
  2. Log in to Prefect Cloud from either environment to set where the deployment will be registered

    . 2.19.2/bin/activate
    prefect cloud login
    cat $HOME/.prefect/profiles.toml
    
  3. Create a flow.py like the following

    from prefect import flow, task
    
    import dask # Library not installed in the environment we deploy from
    
    @task
    def sub():
        print("sub task", flush=True)
    
    @flow
    def main():
        _ = sub()
    # Note: declaring main with async def here makes the parsing fail
    
    if __name__ == "__main__":
        main()
    
  4. First, confirm that the script cannot be run as-is

    python flow.py
    Traceback (most recent call last):
      File "/root/2.19.2/flow.py", line 4, in <module>
        import dask # Library not installed in the environment we deploy from
        ^^^^^^^^^^^
    ModuleNotFoundError: No module named 'dask'
    
  5. In each environment, create prefect.yaml, then rename and edit it

    . ./2.19.2/bin/activate
    prefect init
    # Select "No, I'll use the default deployment configuration."
    mv prefect.yaml 2.19.2.yaml
    
    2.19.2.yaml
    # Welcome to your prefect.yaml file! You can use this file for storing and managing
    # configuration for deploying your flows. We recommend committing this file to source
    # control along with your flow code.
    
    # Generic metadata about this project
    name: 2.19.2
    prefect-version: 2.19.2
    
    # build section allows you to manage and build docker images
    build: null
    
    # push section allows you to manage if and how this project is uploaded to remote locations
    push: null
    
    # pull section allows you to provide instructions for cloning this project in remote locations
    pull:
    - prefect.deployments.steps.set_working_directory:
        directory: /root/2.19.2
    
    # the deployments section allows you to provide configuration for deploying flows
    deployments:
    - name: mytest
      version: null
      tags: []
      description: null
      schedule: {}
      flow_name: null
      entrypoint: flow.py:main
      parameters: {}
      work_pool:
        name: null
        work_queue_name: null
        job_variables: {}
    
    . ./2.19.3/bin/activate
    prefect init
    # Select "No, I'll use the default deployment configuration."
    mv prefect.yaml 2.19.3.yaml
    
    2.19.3.yaml
    # Welcome to your prefect.yaml file! You can use this file for storing and managing
    # configuration for deploying your flows. We recommend committing this file to source
    # control along with your flow code.
    
    # Generic metadata about this project
    name: 2.19.3
    prefect-version: 2.19.3
    
    # build section allows you to manage and build docker images
    build: null
    
    # push section allows you to manage if and how this project is uploaded to remote locations
    push: null
    
    # pull section allows you to provide instructions for cloning this project in remote locations
    pull:
    - prefect.deployments.steps.set_working_directory:
        directory: /root/2.19.3
    
    # the deployments section allows you to provide configuration for deploying flows
    deployments:
    - name: mytest
      version: null
      tags: []
      description: null
      schedule: {}
      flow_name: null
      entrypoint: flow.py:main
      parameters: {}
      work_pool:
        name: null
        work_queue_name: null
        job_variables: {}
    
  6. Run prefect deploy in each environment

    . ./2.19.2/bin/activate
    prefect deploy --prefect-file 2.19.2.yaml
    
    . ./2.19.3/bin/activate
    prefect deploy --prefect-file 2.19.3.yaml
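
The last step can also be scripted, for example for a CI job. The following is a minimal sketch, assuming a POSIX layout where each venv exposes its own `bin/prefect` executable, so no `activate` step is needed. The helper names `deploy_command` and `run_deploy` are hypothetical and not part of Prefect. Note that `prefect deploy` asked interactive questions in this experiment, so `run_deploy` would still need a terminal or pre-filled configuration.

```python
import subprocess
from pathlib import Path

# The two venv directories created in step 1.
VERSIONS = ["2.19.2", "2.19.3"]


def deploy_command(venv_dir: str, prefect_file: str) -> list[str]:
    """Build the prefect deploy command using the venv's own prefect executable."""
    prefect_bin = Path(venv_dir) / "bin" / "prefect"
    return [str(prefect_bin), "deploy", "--prefect-file", prefect_file]


def run_deploy(version: str) -> int:
    """Run prefect deploy for one version and return its exit code."""
    result = subprocess.run(deploy_command(version, f"{version}.yaml"))
    return result.returncode


# Only build and print the commands here; call run_deploy(version) to execute one.
commands = [deploy_command(v, f"{v}.yaml") for v in VERSIONS]
for command in commands:
    print(" ".join(command))
```

Calling the executable inside the venv directly keeps the two Prefect versions isolated without changing the shell's state between runs.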
      

Results

With 2.19.2

? Would you like to use an existing deployment configuration? [Use arrows to move; enter to select; n
to select none]
┏━━━┳━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃   ┃ Name   ┃ Entrypoint   ┃ Description ┃
┡━━━╇━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ > │ mytest │ flow.py:main │             │
└───┴────────┴──────────────┴─────────────┘
    No, configure a new deployment
Traceback (most recent call last):
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/root/2.19.2/flow.py", line 3, in <module>
    import dask # Library not installed in the environment we deploy from
    ^^^^^^^^^^^
ModuleNotFoundError: No module named 'dask'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/cli/_utilities.py", line 42, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/utilities/asyncutils.py", line 304, in coroutine_wrapper
    return call()
           ^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/_internal/concurrency/calls.py", line 432, in __call__
    return self.result()
           ^^^^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/_internal/concurrency/calls.py", line 318, in result
    return self.future.result(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/_internal/concurrency/calls.py", line 179, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/_internal/concurrency/calls.py", line 389, in _run_async
    result = await coro
             ^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/cli/deploy.py", line 428, in deploy
    await _run_single_deploy(
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/client/utilities.py", line 100, in with_injected_client
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/cli/deploy.py", line 485, in _run_single_deploy
    flow = await run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/utilities/asyncutils.py", line 136, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/flows.py", line 1668, in load_flow_from_entrypoint
    flow = import_object(entrypoint)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/utilities/importtools.py", line 201, in import_object
    module = load_script_as_module(script_path)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/2.19.2/lib/python3.11/site-packages/prefect/utilities/importtools.py", line 164, in load_script_as_module
    raise ScriptError(user_exc=exc, path=path) from exc
prefect.exceptions.ScriptError: Script at 'flow.py' encountered an exception: ModuleNotFoundError("No module named 'dask'")
An exception occurred.
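
The traceback shows the script being loaded as a module (`load_script_as_module`, then `exec_module`), which executes its top-level `import` statements. The following sketch reproduces that general mechanism with the standard library only. It is an illustration, not Prefect's code, and `not_installed_dependency_xyz` is a hypothetical module name chosen so that the import fails.

```python
import importlib.util
import tempfile
from pathlib import Path

# Stand-in for flow.py: a top-level import of a module that is not installed.
SCRIPT = "import not_installed_dependency_xyz\n\ndef main():\n    pass\n"


def load_script(path: Path):
    """Load a script as a module, which executes its top-level statements."""
    spec = importlib.util.spec_from_file_location("__flow_script__", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # top-level imports run here
    return module


def try_load(source: str) -> str:
    """Write source to a temporary file, try to load it, and report the outcome."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "flow.py"
        path.write_text(source)
        try:
            load_script(path)
        except ModuleNotFoundError as exc:
            return f"failed: {exc}"
    return "loaded"


outcome = try_load(SCRIPT)
print(outcome)
```

Any approach that executes the file this way needs every imported library to be present, even if the caller only wants the flow's name.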

With 2.19.3

prefect deploy --prefect-file 2.19.3.yaml 
? Would you like to use an existing deployment configuration? [Use arrows to move; enter to select; n
to select none]
┏━━━┳━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃   ┃ Name   ┃ Entrypoint   ┃ Description ┃
┡━━━╇━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ > │ mytest │ flow.py:main │             │
└───┴────────┴──────────────┴─────────────┘
    No, configure a new deployment
? Would you like to configure schedules for this deployment? [y/n] (y): n
? Which work pool would you like to deploy this flow to? [Use arrows to move; enter to select]
┏━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃   ┃ Work Pool Name ┃ Infrastructure Type ┃ Description ┃
┡━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ > │ CloudManaged   │ prefect:managed     │             │
└───┴────────────────┴─────────────────────┴─────────────┘
? Would you like to build a custom Docker image for this deployment? [y/n] (n): n
╭───────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Deployment 'main/mytest' successfully created with id 'REDACTED'.     │
╰───────────────────────────────────────────────────────────────────────────────────────────────────╯

View Deployment in UI: 
https://app.prefect.cloud/account/<REDACTED>

? Would you like to save configuration for this deployment for faster deployments in the future? 
[y/n]: y
? Found existing deployment configuration with name: mytest and entrypoint: flow.py:main in the 
prefect.yaml file at 2.19.3.yaml. Would you like to overwrite that entry? [y/n]: y

Deployment configuration saved to 2.19.3.yaml! You can now deploy using this deployment configuration
with:

        $ prefect deploy -n mytest

You can also make changes to this deployment configuration by making changes to the YAML file.

To schedule a run for this deployment, use the following command:

        $ prefect deployment run 'main/mytest'

The deployment was created. As it stands, though, the flow will definitely not run.

/images/articles/prefect_2_19_3/cloud_deploy.png
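
One way a tool can learn a flow's name without the dependencies installed is to parse the source instead of executing it. The sketch below uses Python's `ast` module to find functions decorated with `@flow`. It is an assumption about the general technique, not a description of how Prefect 2.19.3 implements it, and the helpers `decorator_name` and `find_flow_functions` are hypothetical.

```python
import ast

# The flow.py from step 3, held as a string so nothing is imported or executed.
FLOW_SOURCE = '''
from prefect import flow, task

import dask  # not installed where the deployment is created

@task
def sub():
    print("sub task", flush=True)

@flow
def main():
    _ = sub()
'''


def decorator_name(node: ast.expr) -> str:
    """Return the bare name of a decorator such as @flow, @flow(...), or @prefect.flow."""
    if isinstance(node, ast.Call):
        node = node.func
    if isinstance(node, ast.Attribute):
        return node.attr
    if isinstance(node, ast.Name):
        return node.id
    return ""


def find_flow_functions(source: str) -> list[str]:
    """List top-level functions decorated with a decorator named 'flow'."""
    tree = ast.parse(source)
    names = []
    for node in tree.body:
        # Sync and async functions are distinct node types in the AST.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if any(decorator_name(d) == "flow" for d in node.decorator_list):
                names.append(node.name)
    return names


flows = find_flow_functions(FLOW_SOURCE)
print(flows)
```

Because `ast.parse` only builds a syntax tree, the `import dask` line is never executed. The sketch checks both function node types, which says nothing about how Prefect itself treats `async def` flows.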
