
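Python Quick Reference (Notes)

Published 2022/08/29

These are personal notes, but I hope they are useful to someone.

Official reference

Python latest stable release document

Operations

Operators

Operator precedence
For the quotient and remainder of x divided by y, the returned values satisfy x == (x // y) * y + x % y. (// rounds toward negative infinity, so results with negative operands are not intuitive.)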

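Operation	Notation
Assignment, arithmetic, exponentiation	=, +, -, *, /, **
Quotient and remainder	//, %
Concatenation	+
Comparison	==, !=, <, >, <=, >=, is, is not, in, not in
Logical	and, or, not
Bitwise	&, |, ^, ~, <<, >>
Augmented assignment	+=, //=, **=, >>=, ^=, etc.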
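
As a small sketch of the quotient and remainder rule above: floor division rounds toward negative infinity, so the remainder takes the sign of the divisor. The helper name check_identity is made up for this illustration.

```python
def check_identity(x: int, y: int) -> bool:
    """Return True when x == (x // y) * y + x % y holds."""
    return x == (x // y) * y + x % y

print(7 // 2, 7 % 2)      # 3 1
print(-7 // 2, -7 % 2)    # -4 1  (not -3 and -1)
print(7 // -2, 7 % -2)    # -4 -1
print(divmod(-7, 2))      # (-4, 1), quotient and remainder in one call
print(check_identity(-7, 2), check_identity(7, -2))  # True True
```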

Strings

Regular expressions require importing the standard module re.

r"a\tb"  # a\tb
var = "def"
f"abc{var}ghi"  # abcdefghi
var = "{}def{}"
var.format("abc","ghi")  # abcdefghi
var = "foobarbaz"
var.split("b")  # ['foo', 'ar', 'az']
var.replace("b", "a")  # fooaaraaz
"  a\tb\n".strip()  # a	b

vars = "var"
f"{vars:_>8}"  # right-align: _____var
f"{vars:_^8}"  # center: __var___
f"{vars:_<8}"  # left-align: var_____
vari = 1234
f"{vari:08}"  # zero padding: 00001234
f"{vari:,}"  # thousands separator: 1,234
f"{vari:b}"  # binary: 10011010010
f"{vari:o}"  # octal: 2322
f"{vari:x}"  # hexadecimal: 4d2
f"{vari:#b}"  # binary with prefix: 0b10011010010
f"{vari:#o}"  # octal with prefix: 0o2322
f"{vari:#x}"  # hexadecimal with prefix: 0x4d2
varf = 12.34567
f"{varf:+.1f}"  # explicit plus sign: +12.3
f"{varf:.3f}"  # decimal places: 12.346
f"{varf:.3g}"  # significant digits: 12.3
f"{varf:.3e}"  # exponent notation: 1.235e+01
f"{varf:.2%}"  # percent: 1234.57%

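Variables

Variables are dynamically typed. Type hints add static type information (checked by external tools, not enforced at runtime).
Complete guide to the typing module (3.10 compatible)
Constants are declared with the Final type annotation.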

from typing import Final
URL_STR: Final[str] = r"https://www.google.com/search?q=foo+bar+baz"

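Scope

Namespaces and scopes
About Python scopes
Names in outer scopes can be read. To rebind them, use global or nonlocal.

LEGB
- Local scope
  - A name belongs to the local scope only when it is defined inside a function
- Enclosing (function's) scope
  - Similar in spirit to protected, with the nested function playing the role of the derived class
  - This scope makes closures possible
- Global scope
  - Variables written at the top level of the same module (file)
  - Global variables written in another module are called module variables
  - They can be referenced as attributes of the module object
- Built-in scope
  - For built-in functions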
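
A minimal sketch of the scopes listed above, showing nonlocal for the enclosing scope (a closure) and global for the module scope. All names here (counter, make_counter, bump_global) are hypothetical.

```python
counter = 0  # global scope

def make_counter():
    count = 0  # enclosing scope, seen from increment()

    def increment():
        nonlocal count  # rebind the variable in the enclosing scope
        count += 1
        return count

    return increment  # the returned function is a closure over count

def bump_global():
    global counter  # rebind the module-level variable
    counter += 1
    return counter

inc = make_counter()
first = inc()            # 1
second = inc()           # 2, state is kept inside the closure
bumped = bump_global()   # 1
print(first, second, bumped, len("abc"))  # len is found in the built-in scope
```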

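Type list

The decimal type requires importing the standard module decimal, and dates and times require the standard module datetime.

Numeric types
- Integer int()
  - The range is unlimited as far as memory allows
- Boolean bool()
  - Numbers: bool(0) is False, any other number is truthy
  - None: bool(None) is False (note that None == False evaluates to False)
  - Collections: zero elements is falsy, anything else is truthy
    - Strings: bool("") is False, others are truthy
    - Lists: bool([]) is False, others are truthy
- Real number float()
  - Double-precision floating point
  - Exponent notation such as 1e1 and 1e-1 is also possible
- Complex number complex()
  - Internally represented by two double-precision floating-point numbers
  - Real part (z.real)
  - Imaginary part (z.imag)
Text sequence type ordered
- String str() immutable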
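
A quick sketch of the truthiness rules for bool() noted above. Truthiness (what if and bool() see) is not the same as equality with True or False; the sample values are arbitrary.

```python
values = [0, 1, -1, 0.0, None, "", "a", [], [0], {}, ()]
truthiness = {repr(v): bool(v) for v in values}

print(bool(0), bool(2))       # False True
print(2 == True)              # False: truthy, but not equal to True
print(bool(None))             # False
print(None == False)          # False: None is falsy, but not equal to False
print(bool([]), bool([0]))    # False True: only the element count matters
```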

Sequence types ordered

List list() mutable

lst = [1,2,3,2,1]
lst.append(4)  # [1, 2, 3, 2, 1, 4]
lst.pop()      # [1, 2, 3, 2, 1]
lst.remove(2)  # [1, 3, 2, 1]
lst.reverse()  # [1, 2, 3, 1]

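Tuple tuple() immutable

tpl = ()    # tuple with 0 elements
tpl = (1,)  # tuple with 1 element
tpl = (1,2) # tuple with 2 elements
var, _ = tpl  # multiple assignment (unpacking); _ is conventionally used for discarded values; var == 1

Range range() immutable
- Values are generated on demand, which saves memory

Binary sequence types ordered
- Bytes bytes() immutable
- bytearray bytearray() mutable
- Memory view memoryview() immutable/mutable
Set types unordered
- set set() mutable
- frozenset frozenset() immutable
Mapping type ordered
- Dictionary dict() mutable
  - Keeps insertion order (Python 3.7+); the practical difference from OrderedDict is whether element order is also compared in equality tests
  - dict1.update(dict2) merges dict2 into dict1 (existing keys are overwritten)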
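
The two dict notes above can be sketched as follows. The variable names and values are made up for illustration.

```python
from collections import OrderedDict

# Plain dicts ignore order when compared
d1 = {"a": 1, "b": 2}
d2 = {"b": 2, "a": 1}
same_dict = d1 == d2          # True

# OrderedDicts compare order as well
od1 = OrderedDict(a=1, b=2)
od2 = OrderedDict(b=2, a=1)
same_ordered = od1 == od2     # False

# update() merges in place; existing keys are overwritten
merged = {"a": 1, "b": 2}
merged.update({"b": 20, "c": 30})
print(same_dict, same_ordered, merged)  # True False {'a': 1, 'b': 20, 'c': 30}
```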

Index access for sequence types

Indexing starts at 0
- Negative values count from the end

Extended access with slices

lst = [0,1,2,3,4,5,6,7,8,9,10]
lst[2:8] == [2, 3, 4, 5, 6, 7]  # True
lst[2:8:2] == [2, 4, 6]  # True
lst[::3] == [0, 3, 6, 9]  # True
lst[::-1] == [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]  # True
lst[:] == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # True
lst is not lst[:]  # True

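Control flow

# Conditional branching
if boolean:
    pass
elif boolean:
    pass
else:
    pass
# ----------
# match requires Python 3.10+. A bare name in a case is a capture pattern,
# so compare against literals or dotted names (e.g. Color.RED).
match var:
    case 1:
        pass
    case 2 | 3:
        pass
    case _:
        pass
# ----------
match var:
    case x if boolean:  # capture pattern with a guard
        pass
    case _:
        pass

# Loops, continue, and break
for i in iterable:
    if boolean:
        continue
else:
    pass  # runs when the loop was not ended by break
# ----------
while boolean:
    break
else:
    pass  # runs when the loop was not ended by break

# Exception handling
try:
    pass
except (ZeroDivisionError, TypeError) as e:
    pass  # list subclasses first: ZeroDivisionError is an ArithmeticError
except ArithmeticError as e:
    pass
except Exception as e:
    pass  # any other exception
else:
    pass  # runs when no exception was raised
finally:
    pass  # always runs last

# __enter__ and __exit__ are called automatically
with open(path) as f:
    f.read()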
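
The else clauses above are easy to misread, so here is a runnable sketch of for/else and try/except/else/finally. The function names find_first_negative and safe_divide are hypothetical.

```python
def find_first_negative(numbers):
    """Return the first negative number, or None if the loop never breaks."""
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Reached only when the loop finished without break
        result = None
    return result

def safe_divide(x, y):
    """Return the quotient and the list of clauses that ran."""
    steps = []
    try:
        value = x / y
    except ZeroDivisionError:
        value = None
        steps.append("except")
    else:
        steps.append("else")      # no exception was raised
    finally:
        steps.append("finally")   # always runs
    return value, steps

print(find_first_negative([3, -1, -5]))  # -1
print(find_first_negative([1, 2]))       # None
print(safe_divide(6, 3))                 # (2.0, ['else', 'finally'])
print(safe_divide(1, 0))                 # (None, ['except', 'finally'])
```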

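Code units

Functions

Argument definitions

def func1(arg1, arg2=0):
- arg1 is a required parameter, usually passed positionally
- arg2 is an optional parameter with a default value, often passed as a keyword argument
  - The default value is evaluated only once, when the function is defined
  - If the default value is mutable, its state persists between calls, which can cause unexpected behavior
- Positional arguments come before keyword arguments
def func2(*args, **kwargs):
- *: receives extra positional arguments as a tuple
- **: receives extra keyword arguments as a dictionary
- func2(1,2,3,a=4,b=6) gives args==(1,2,3) and kwargs=={"a":4,"b":6}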
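
A sketch of the mutable-default pitfall and of *args/**kwargs described above. The function names are made up; using None as a sentinel is one common workaround, not the only one.

```python
def append_bad(item, bucket=[]):
    # The same list object is reused on every call
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    # None as a sentinel: a fresh list is created per call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

def collect(*args, **kwargs):
    # args is a tuple, kwargs is a dict
    return args, kwargs

print(append_bad(1))   # [1]
print(append_bad(2))   # [1, 2]  state leaked from the previous call
print(append_good(1))  # [1]
print(append_good(2))  # [2]
print(collect(1, 2, 3, a=4, b=6))  # ((1, 2, 3), {'a': 4, 'b': 6})
```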

Lambda expressions

lambda arg1, arg2: expression

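Built-in functions likely to be used

print(*args, sep=" ", end="\n", file=sys.stdout, flush=False)
- Writes to standard output
- The sep argument changes the separator and end changes the line ending
len(s)
- Counts the elements of a collection
range([start, ]stop[, step])
- Generates an arithmetic sequence on demand
enumerate(iterable[, start=0])
- Yields (index, value), ... for the iterable
zip(*iterables[, strict=False])
- Yields (iter1val1,iter2val1,...), (iter1val2,iter2val2,...), ... for *iterables
- With strict=True (Python 3.10+), an exception is raised if the iterables differ in length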
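
A short sketch of enumerate and zip from the list above, using made-up sample data. It shows only the default zip behavior (stopping at the shortest input) so that it also runs on versions without the strict parameter.

```python
names = ["a", "b", "c"]
scores = [10, 20, 30]

indexed = list(enumerate(names, start=1))
paired = list(zip(names, scores))
truncated = list(zip(names, [1, 2]))  # stops at the shortest iterable

print(indexed)    # [(1, 'a'), (2, 'b'), (3, 'c')]
print(paired)     # [('a', 10), ('b', 20), ('c', 30)]
print(truncated)  # [('a', 1), ('b', 2)]
print(list(range(0, 10, 3)))  # [0, 3, 6, 9]
```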

open(file, mode="r", buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

with open(path) as f:
    var = f.read()  # whole file: line1\nline2\nline3\n
    f.seek(0)  # rewind; otherwise the next read starts at end of file
    var = f.readlines()  # ["line1\n", "line2\n", "line3\n"]
    f.seek(0)
    while True:
        line = f.readline()  # line1\n
        if not line:
            break
with open(path, mode="w") as f:
    f.write("line1\nline2\nline3\n")  # line1\nline2\nline3\n
    f.writelines(["line1", "line2", "line3"])  # line1line2line3 (no newlines are added)

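mode	Behavior
r	File must exist; read-only
w	Creates the file; overwrites (truncates) an existing file
a	Creates the file; appends to an existing file
r+	File must exist; read and write, overwriting character by character from the start

Create directories with os.makedirs()
Check whether a file exists with os.path.isfile()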

all(iterable)
- True if every element of iterable is truthy
any(iterable)
- True if any element of iterable is truthy

Comprehensions

list
- [i+2 for i in iterable]
- [i+2 for i in iterable if i % 2 == 0]
- ["even" if i % 2 == 0 else "odd" for i in iterable]
- [x for row in matrix for x in row]
  - matrix == [[1, 2, 3], [4, 5, 6], [7, 8, 9]]; result == [1, 2, 3, 4, 5, 6, 7, 8, 9]
tuple
- tuple(i+2 for i in iterable)
dict
- {v: i for i, v in enumerate(iterable)}
set
- {i+2 for i in iterable}
generator
- (i+2 for i in iterable)
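
The comprehension forms above, run against concrete inputs. The choice of range(5) and "abc" is arbitrary sample data.

```python
iterable = range(5)
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

shifted = [i + 2 for i in iterable]                      # [2, 3, 4, 5, 6]
even_shifted = [i + 2 for i in iterable if i % 2 == 0]   # [2, 4, 6]
labels = ["even" if i % 2 == 0 else "odd" for i in iterable]
flat = [x for row in matrix for x in row]                # outer loop first
index_of = {v: i for i, v in enumerate("abc")}           # {'a': 0, 'b': 1, 'c': 2}
unique = {i % 2 for i in iterable}                       # {0, 1}

lazy = (i + 2 for i in iterable)  # generator: nothing is computed yet
total = sum(lazy)                 # 20; the generator is now exhausted
print(labels)  # ['even', 'odd', 'even', 'odd', 'even']
print(flat)    # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```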

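Classes

What other languages call properties and methods are collectively called attributes
Names starting with _ signal private intent (not enforced)
Names starting with __ are private (name-mangled)
__method__ names are special methods (magic methods)
Property accessors can be created with the @property and @name.setter decorators
functools.singledispatch can overload functions but does not work directly on methods; methods need functools.singledispatchmethod (Python 3.8+) (IDE completion does not work well)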

from typing import Final

class FooBar():
    CLASS_CONST: Final[int] = 100
    def __init__(self):  # constructor
        self.var = None  # variable for instance
    def __del__(self):  # destructor
        pass
    def func(self):  # method
        pass
    @classmethod
    def class_(cls):
        pass
    @staticmethod
    def static_():
        pass

class BarBaz(FooBar):  # inherit
    def __init__(self):
        super().__init__()  # super must be called to get the proxy object
    def func(self, var):
        pass

from dataclasses import dataclass
@dataclass
class DataStruct():
    data1: int
    data2: str = "optionizing and default string"
varc = DataStruct(1, "foo")
varc == DataStruct(1)  # False
varc == DataStruct(1, "foo")  # True
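
The class notes above mention @property and @name.setter without an example, so here is a minimal sketch. The Temperature class and its validation rule are hypothetical.

```python
class Temperature:
    def __init__(self, celsius: float):
        self._celsius = celsius  # leading underscore: private by convention

    @property
    def celsius(self) -> float:
        # Getter: accessed as t.celsius, without parentheses
        return self._celsius

    @celsius.setter
    def celsius(self, value: float) -> None:
        # Setter: runs on assignment, so it can validate the value
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

t = Temperature(20.0)
t.celsius = 25.5
print(t.celsius)  # 25.5
```

Assigning an invalid value such as t.celsius = -300 raises ValueError and leaves the stored value unchanged.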

Exceptions

Python exception handling
Class hierarchy of built-in exceptions

BaseException
- SystemExit
- KeyboardInterrupt
- GeneratorExit
- Exception
  - StopIteration
  - StopAsyncIteration
  - ...
- ...

Commonly used standard modules

This assumes external modules such as numpy, pandas, sklearn, matplotlib, PIL, requests, lxml, and sqlalchemy are also used.

re Regular expressions

import re
var = r"address1@example.com, address2@example.com, address3@example.co.jp"
p = re.compile(r"([a-z0-9]+)@([a-z]+)\.com")  # 0-9 is needed to match the digit in "address1"
m = p.search(var)  # first match only
m.start()  # 0
m.end()  # 20
m.group()  # address1@example.com
for m in p.finditer(var):
    m.group()  # address1@example.com ; address2@example.com

re.sub(r"([a-z0-9]+)@([a-z]+)\.com", "sample", var)  # sample, sample, address3@example.co.jp

datetime Dates and times

from datetime import date, datetime, timedelta
date(year, month, day)  # constructor
date.fromisoformat(date_string)  # constructor; for other formats use datetime.strptime(...).date()
date.today()
date.weekday()  # Mon: 0, Tue: 1, ..., Sun: 6
date.isoformat()  # what str(date) returns
date.strftime(format)  # formatted output
datetime(year, month, day, hour=0, minute=0, second=0, microsecond=0, tzinfo=None, *, fold=0)  # constructor
datetime.fromisoformat(date_string)  # constructor; for other formats use strptime
datetime.now()
datetime.weekday()  # Mon: 0, Tue: 1, ..., Sun: 6
datetime.isoformat()  # str(datetime) is isoformat(sep=" ")
datetime.strftime(format)  # formatted output, e.g. %Y-%m-%d %H:%M:%S
datetime.date()  # converts to date
timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)  # constructor
timedelta.total_seconds()

# Supported arithmetic; also works with date in place of datetime
datetime_add = datetime + timedelta
datetime_minus = datetime - timedelta
timedelta = datetime1 - datetime2
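
The listing above uses class names as placeholders, so here is a runnable sketch of the same operations with concrete values. The dates are arbitrary.

```python
from datetime import date, datetime, timedelta

start = datetime(2022, 8, 29, 9, 0, 0)
end = start + timedelta(hours=1, minutes=30)   # datetime + timedelta
elapsed = end - start                          # datetime - datetime gives a timedelta

print(elapsed.total_seconds())                 # 5400.0
print(start.weekday())                         # 0 (Monday)
print(start.strftime("%Y-%m-%d %H:%M:%S"))     # 2022-08-29 09:00:00
print(str(start) == start.isoformat(sep=" "))  # True

due = date.fromisoformat("2022-08-29") + timedelta(days=3)
print(due)                                     # 2022-09-01

# Non-ISO formats go through datetime.strptime, then .date()
parsed = datetime.strptime("2022/08/29", "%Y/%m/%d").date()
print(parsed)                                  # 2022-08-29
```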

time UNIX time (in seconds)

Handy for simple elapsed-time measurement

import time
time1 = time.time()
time_diff = time.time() - time1
f"{time_diff:.3f} sec"

csv Delimited text files

import csv
with open("./sample_r.csv", newline="") as f:
    for line in csv.reader(f, delimiter="\t"):
        print(line)  # ["1a", "1b", "1c"] ; ["2a", "2b", "2c"]
    f.seek(0)  # rewind; the reader above has consumed the file
    lst = [line for line in csv.reader(f, delimiter="\t")]  # [["1a", "1b", "1c"], ["2a", "2b", "2c"]]
with open("./sample_w.csv", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")  # writerow/writerows are methods of a writer object
    writer.writerow(["1a", "1b", "1c"])
    writer.writerows([["2a", "2b", "2c"], ["3a", "3b", "3c"]])

json JSON data

import json
vars = json.JSONEncoder().encode(["⭐", {"A": 1, "B": 2, "C": [3, 4, 5]}, {"A": "AA", "B": "BB"}, [10, 11]])
# ["\u2b50", {"A": 1, "B": 2, "C": [3, 4, 5]}, {"A": "AA", "B": "BB"}, [10, 11]]
varj = json.loads(vars)
# ['⭐', {'A': 1, 'B': 2, 'C': [3, 4, 5]}, {'A': 'AA', 'B': 'BB'}, [10, 11]]
with open("./sample_r.json") as f:
    json.load(f)  # load (not loads) reads from a file object
vars = json.dumps(varj)  # dumps returns a str
with open("./sample_w.json", "w", encoding="utf-8") as f:
    json.dump(varj, f, ensure_ascii=False, indent=4)  # dump (not dumps) writes the object to a file

logging Log output

import json
from logging import getLogger, config

with open("./log_config.json") as f:
    log_config = json.load(f)

config.dictConfig(log_config)
logger = getLogger(__name__)
logger.debug("debug message")
logger.info("info message")
logger.warning("warning message")
logger.error("error message")

log_config.json
{
    "version": 1,
    "disable_existing_loggers": false,
    "formatters": {
        "simple": {
            "format": "%(asctime)s %(name)s:%(lineno)s %(funcName)s [%(levelname)s]: %(message)s"
        }
    },

    "handlers": {
        "consoleHandler": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "simple",
            "stream": "ext://sys.stdout"
        },
        "fileHandler": {
            "class": "logging.FileHandler",
            "level": "INFO",
            "formatter": "simple",
            "filename": "./app.log"
        }
    },

    "loggers": {
        "__main__": {
            "level": "DEBUG",
            "handlers": ["consoleHandler", "fileHandler"],
            "propagate": false
        },
        "same_hierarchy": {
            "level": "DEBUG",
            "handlers": ["consoleHandler", "fileHandler"],
            "propagate": false
        },
        "lower.sub": {
            "level": "DEBUG",
            "handlers": ["consoleHandler", "fileHandler"],
            "propagate": false
        }
    },

    "root": {
        "level": "INFO"
    }
}

copy Copying

deepcopy(x[, memo])
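
A small sketch of why deepcopy matters, comparing it with a shallow copy. The dictionary contents are made up.

```python
import copy

original = {"nums": [1, 2, 3]}
shallow = copy.copy(original)      # new dict, but the inner list is shared
deep = copy.deepcopy(original)     # inner list is copied as well

original["nums"].append(4)

print(shallow["nums"])  # [1, 2, 3, 4]  follows the change
print(deep["nums"])     # [1, 2, 3]     independent
```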

os.path File paths

from os import path
path_str = r"./dir/subdir/filename.ext"
path.abspath(path_str)  # /home/me/pypj/dir/subdir/filename.ext
path.basename(path_str)  # filename.ext
path.dirname(path_str)  # ./dir/subdir
path.exists(path_str)  # True
path.isfile(path_str)  # True
path.isdir(path_str)  # False
path.islink(path_str)  # False
path.join(path.dirname(path_str), "filename2.ext")  # ./dir/subdir/filename2.ext

threading Thread-based parallelism

See the official reference

multiprocessing Process-based parallelism

See the official reference

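Other topics

Commenting out

# Single-line comment

# There is no multi-line comment syntax
# A triple-quoted multi-line string is parsed as code, so it must be syntactically valid (indentation and so on)
# The same notation is also used for docstrings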

Coding conventions

Follow PEP8 as closely as possible. (Rely on a linter/formatter.)

PascalCase
- Classes, exceptions
snake_case
- Everything else (constants are upper case only)

Reference: PEP 8 vs Google Style

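References

Python basics
Python coding hints
About recent Python
About package management tools
- Technical and historical background of pip, pipenv, and poetry, and their outlook
  - The PyPI json-api was posted as PEP691 on 2022-05-05
linter/formatter configuration and sharing tools
- Introducing "pysen", an internal tool at PFN
Why len is a function
- Counting list elements and string characters with Python's len function.
  - Internally it just calls the object's __len__ method
  - Prefix notation is more readable
  - Using len as a standard function guarantees that the return value is an int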
