Analyzing Japanese Diet Members Using the National Diet Record Search System API Ahead of the General Election
Introduction
The House of Representatives was dissolved in early 2026, and attention is focused on Diet members.
The records of Diet deliberations (the Diet minutes) can be retrieved via an API.
Example:
https://kokkai.ndl.go.jp/api/speech?any=%E7%B1%B3%E9%AB%98%E9%A8%B0&maximumRecords=100&recordPacking=json
In this article, we will use the National Diet Record Search System to summarize the utterances of specific Diet members and output what kind of politicians they are.
- Notebook for execution
- Output examples
National Diet Record Search System
API specifications are summarized below.
Supported endpoints include Meeting-level Simple Output, Meeting-level Output, and Utterance-level Output. We will use the Utterance-level Output this time.
For example, to retrieve data for "Mr. Koki Oozora" starting from January 1, 2024, you would make a request like this:
https://kokkai.ndl.go.jp/api/speech?speaker=%E5%A4%A7%E7%A9%BA%E5%B9%B8%E6%98%9F&from=2024-01-01&maximumRecords=10&recordPacking=json
The response will be a JSON like the following:
```json
{
  "numberOfRecords": 61,
  "numberOfReturn": 10,
  "startRecord": 1,
  "nextRecordPosition": 11,
  "speechRecord": [
    {
      "speechID": "121904601X00320251127_012",
      "issueID": "121904601X00320251127",
      "imageKind": "Minutes",
      "searchObject": 12,
      "session": 219,
      "nameOfHouse": "House of Representatives",
      "nameOfMeeting": "Committee on Internal Affairs and Communications",
      "issue": "No. 3",
      "date": "2025-11-27",
      "closing": null,
      "speechOrder": 12,
      "speaker": "Koki Oozora",
      "speakerYomi": "おおぞらこうき",
      "speakerGroup": "Liberal Democratic Party / Independent",
      "speakerPosition": null,
      "speakerRole": null,
      "speech": "○Member Oozora: Good morning. (omitted)...",
      "startPage": 0,
      "speechURL": "(omitted)",
      "meetingURL": "(omitted)",
      "pdfURL": null
    },
    ... (omitted)
  ]
}
```
The number of speechRecord entries retrieved at once can be controlled with the maximumRecords query parameter. The maximum value for maximumRecords is 100.
To retrieve the next page, specify the nextRecordPosition from the response in the startRecord query parameter.
https://kokkai.ndl.go.jp/api/speech?speaker=%E5%A4%A7%E7%A9%BA%E5%B9%B8%E6%98%9F&from=2024-01-01&maximumRecords=10&recordPacking=json&startRecord=11
nextRecordPosition will be null once the last page is retrieved.
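Putting the two rules together (page size via maximumRecords, next page via startRecord = nextRecordPosition), the paging loop can be sketched as follows. `fetch_page` is a stand-in for an HTTP call such as `requests.get(...).json()`, fed here with canned responses so the loop logic can be seen in isolation:

```python
def iterate_speeches(fetch_page, maximum_records=100):
    """Walk all pages of a speech query; fetch_page(start, max) is a
    stand-in for a real API request returning the parsed JSON."""
    start_record = 1
    while start_record is not None:
        page = fetch_page(start_record, maximum_records)
        yield from page.get("speechRecord", [])
        # nextRecordPosition is null (None) on the last page
        start_record = page.get("nextRecordPosition")

# Canned two-page response: three records total, two per page
pages = {
    1: {"speechRecord": [{"speechID": "a"}, {"speechID": "b"}],
        "nextRecordPosition": 3},
    3: {"speechRecord": [{"speechID": "c"}],
        "nextRecordPosition": None},
}
records = list(iterate_speeches(lambda start, _max: pages[start]))
print([r["speechID"] for r in records])  # → ['a', 'b', 'c']
```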
People other than Diet members also appear in Diet deliberations.
For example, Mr. Koki Oozora was first elected in October 2024, but he spoke as a witness/reference person in February 2024.
https://kokkai.ndl.go.jp/api/speech?speechID=121315363X00220240214_067&recordPacking=json
In this example, it is marked as "speakerPosition":"Chairperson of NPO Anata no Ibasho","speakerRole":"Witness/Reference Person".
Filtering speakers can be done using the speaker parameter. The names to be specified can be obtained from the "House of Representatives Top Page > Member Information > List of Members" section.
As of January 30, 2026, the member list is blank because of the dissolution.
Past information can be checked from web archives.
Looking at these archives, the case of Mr. Joji Uruma (漆間譲司 / うるま譲司) is interesting: his name is written in kanji in some periods and in kana in others. For members whose names appear in two notations like this, the same records can be retrieved under either name.
```
# Search by 漆間譲司 (kanji)
https://kokkai.ndl.go.jp/api/speech?speaker=%E6%BC%86%E9%96%93%E8%AD%B2%E5%8F%B8

# Search by うるま譲司 (kana)
https://kokkai.ndl.go.jp/api/speech?speaker=%E3%81%86%E3%82%8B%E3%81%BE%E8%AD%B2%E5%8F%B8
```
As of 2026/1/30, the numberOfRecords was the same for both names.
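The speaker parameter must be percent-encoded UTF-8; the two query strings above are simply the two notations of the same name run through URL encoding, which can be reproduced with the standard library:

```python
from urllib.parse import quote

# Percent-encode both notations of the member's name for the
# speaker query parameter
kanji = quote("漆間譲司")
kana = quote("うるま譲司")
print(kanji)  # → %E6%BC%86%E9%96%93%E8%AD%B2%E5%8F%B8
print(f"https://kokkai.ndl.go.jp/api/speech?speaker={kana}&recordPacking=json")
```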
Here is sample code that calls the National Diet Record Search System API from Python.
```python
import time
from typing import Iterator, Optional

import requests
from pydantic import BaseModel


class SpeakerItem(BaseModel):
    speaker: Optional[str] = None
    speakerYomi: Optional[str] = None
    speechID: str
    issueID: str
    imageKind: str
    searchObject: int
    session: int
    nameOfHouse: Optional[str] = None
    nameOfMeeting: Optional[str] = None
    issue: Optional[str] = None
    date: Optional[str] = None
    speechOrder: int
    speakerGroup: Optional[str] = None
    speakerPosition: Optional[str] = None
    speakerRole: Optional[str] = None
    speech: str
    startPage: int = 0
    speechURL: Optional[str] = None
    meetingURL: Optional[str] = None
    pdfURL: Optional[str] = None


# Reference
# https://kokkai.ndl.go.jp/api.html
class KokkaiSpeakerApi:
    def __init__(self):
        self.endpoint = "https://kokkai.ndl.go.jp/api/speech"
        self.next_request_time = time.time()
        self.interval_sec = 2
        self.max_records = 100

    def get_speak(self, speaker: str, from_date: Optional[str] = None, until_date: Optional[str] = None) -> Iterator[SpeakerItem]:
        with requests.Session() as session:
            # Specify User-Agent if necessary (etiquette)
            session.headers.update({
                "User-Agent": "kokkai-api-client/1.0"
            })
            params = {
                "speaker": speaker,
                "maximumRecords": self.max_records,
                "recordPacking": "json",
                "startRecord": 1
            }
            if from_date:
                params["from"] = from_date
            if until_date:
                params["until"] = until_date
            # Recursive calls might hit the stack limit depending on the number of pages, but we'll allow it
            yield from self._call_api(session, params)

    def _call_api(self, session: requests.Session, params: dict) -> Iterator[SpeakerItem]:
        current_time = time.time()
        if current_time < self.next_request_time:
            time.sleep(self.next_request_time - current_time)
        response = session.get(self.endpoint, params=params)
        self.next_request_time = time.time() + self.interval_sec
        response.raise_for_status()
        result = response.json()
        for record in result.get("speechRecord", []):
            yield SpeakerItem(**record)
        if result.get("nextRecordPosition"):
            params["startRecord"] = result["nextRecordPosition"]
            yield from self._call_api(session, params)
```
Usage example:
```python
from_date = "2022-01-01"
until_date = "2026-01-31"
ctrl = KokkaiSpeakerApi()
for rec in ctrl.get_speak("うるま譲司", from_date=from_date, until_date=until_date):
    print(rec)
```
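KokkaiSpeakerApi spaces its requests by tracking a next_request_time timestamp. The same throttle can be isolated as a small class (`Throttle` is an illustrative name; a short 0.1 s interval is used here just to keep the demo fast, where the article's client uses 2 s):

```python
import time


class Throttle:
    """Ensure successive calls are at least interval_sec apart,
    mirroring the next_request_time logic in the API client."""
    def __init__(self, interval_sec: float):
        self.interval_sec = interval_sec
        self.next_request_time = time.monotonic()

    def wait(self):
        now = time.monotonic()
        if now < self.next_request_time:
            time.sleep(self.next_request_time - now)
        # Schedule the earliest time the next call may proceed
        self.next_request_time = time.monotonic() + self.interval_sec


throttle = Throttle(0.1)
start = time.monotonic()
for _ in range(3):
    throttle.wait()  # the first call passes immediately
elapsed = time.monotonic() - start
print(round(elapsed, 1))  # roughly 0.2: two enforced waits of ~0.1 s
```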
Summary of Diet Members' Utterances
Now, let's try creating summaries of Diet members' utterances.
In this instance, we will retrieve the lists of members as of April 2025 and January 2024 from the web archive and save them as CSV files.
We collect utterances from "2022-01-01" to "2026-01-31" for each member using the Utterance-level Output of the National Diet Record Search System.
The collected content is stored in DuckDB as a cache.
Utterances of Diet members are aggregated per meeting, and an LLM is used to generate a summary of the points made by the member in each meeting.
The generated summaries are also stored in DuckDB as a cache.
Finally, we use the LLM to synthesize the meeting summaries to output a profile of what kind of person the member is.
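The flow above amounts to a two-level map-reduce: summarize each meeting's speeches, then summarize the meeting summaries into a profile. A sketch with a stub in place of the LLM call (`summarize_member` and `stub` are illustrative names, not part of the article's code):

```python
def summarize_member(speeches_by_meeting, summarize):
    """Map step: one summary per meeting. Reduce step: a profile
    synthesized from all meeting summaries. `summarize` stands in
    for an LLM call."""
    meeting_summaries = {
        meeting: summarize("\n".join(speeches))
        for meeting, speeches in speeches_by_meeting.items()
    }
    profile = summarize("\n".join(meeting_summaries.values()))
    return meeting_summaries, profile


stub = lambda text: text[:20]  # stand-in for the LLM call
meetings = {"総務委員会 第3号": ["発言A", "発言B"]}
summaries, profile = summarize_member(meetings, stub)
print(profile)
```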
How to Create the Cache
Create the DuckDB cache as follows.
DuckDB operation code
```python
import duckdb


class SpeakerDbItem(SpeakerItem):
    memberID: int
    speechToken: str

    @classmethod
    def from_speaker(cls, item: SpeakerItem, member_id: int, speech_token: str) -> "SpeakerDbItem":
        return cls(**item.model_dump(), memberID=member_id, speechToken=speech_token)


class SpeakerDbScoreItem(SpeakerDbItem):
    score: float


class IssueItem(BaseModel):
    issueID: str
    nameOfHouse: Optional[str] = None
    nameOfMeeting: Optional[str] = None
    issue: Optional[str] = None
    date: Optional[str] = None


class SpeakerIssueSummaryItem(IssueItem):
    memberID: int
    summary: str


class KokkaiDbRepository:
    def __init__(self):
        self.db_path = "./kokkai.duckdb"

    def setup(self, is_update_fts: bool = True):
        with self.connect() as conn:
            conn.execute("INSTALL fts;")
            conn.execute("LOAD fts;")
            # Member table
            conn.execute("""
                CREATE SEQUENCE IF NOT EXISTS member_id_seq START 1;
                CREATE TABLE IF NOT EXISTS members (
                    id BIGINT PRIMARY KEY DEFAULT nextval('member_id_seq'),
                    name VARCHAR NOT NULL,
                    kana VARCHAR NOT NULL,
                    kaiha VARCHAR
                );
            """)
            # Speaker table
            conn.execute("""
                CREATE TABLE IF NOT EXISTS speakers (
                    speechID VARCHAR NOT NULL PRIMARY KEY,
                    memberID BIGINT,
                    issueID VARCHAR NOT NULL,
                    imageKind VARCHAR,
                    searchObject INTEGER,
                    session INTEGER,
                    nameOfHouse VARCHAR,
                    nameOfMeeting VARCHAR,
                    issue VARCHAR,
                    date VARCHAR,
                    speechOrder INTEGER,
                    speakerGroup VARCHAR,
                    speakerPosition VARCHAR,
                    speakerRole VARCHAR,
                    speech TEXT,
                    speechToken TEXT,
                    FOREIGN KEY (memberID) REFERENCES members(id)
                );
            """)
            # Summary of speakers per meeting
            conn.execute("""
                CREATE TABLE IF NOT EXISTS speaker_summaries (
                    memberID BIGINT,
                    issueID VARCHAR NOT NULL,
                    nameOfHouse VARCHAR,
                    nameOfMeeting VARCHAR,
                    issue VARCHAR,
                    date VARCHAR,
                    summary TEXT,
                    FOREIGN KEY (memberID) REFERENCES members(id)
                );
            """)
            # Recreate indices
            if is_update_fts:
                self.update_fts_index(conn)

    def update_fts_index(self, conn):
        try:
            conn.execute("PRAGMA drop_fts_index('speakers');")
        except duckdb.Error as e:
            print(e)
        conn.execute("""
            PRAGMA create_fts_index(
                'speakers',
                'speechID',
                'speechToken'
            );
        """)

    def connect(self):
        return duckdb.connect(self.db_path)

    def get_member_id(self, conn, name: str) -> Optional[int]:
        res = conn.execute(
            "SELECT id FROM members WHERE name = ?;", (name,)
        ).fetchone()
        return res[0] if res else None

    def append_member(self, conn, name: str, kana: str, kaiha: Optional[str] = None) -> int:
        res = conn.execute("""
            INSERT INTO members (name, kana, kaiha)
            VALUES (?, ?, ?)
            RETURNING id;
        """, (name, kana, kaiha)).fetchone()
        return res[0]

    def append_speaker(self, conn, items: list[SpeakerDbItem]):
        rows = [(
            item.speechID, item.memberID, item.issueID, item.imageKind,
            item.searchObject, item.session, item.nameOfHouse,
            item.nameOfMeeting, item.issue, item.date, item.speechOrder,
            item.speakerGroup, item.speakerPosition, item.speakerRole,
            item.speech, item.speechToken,
        ) for item in items]
        conn.executemany("""
            INSERT INTO speakers (
                speechID, memberID, issueID, imageKind, searchObject,
                session, nameOfHouse, nameOfMeeting, issue, date,
                speechOrder, speakerGroup, speakerPosition, speakerRole,
                speech, speechToken
            ) VALUES ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? );
        """, rows)

    def get_issue_speeches(self, conn, speaker: str, issue_id: str, limit: int) -> Iterator[SpeakerDbItem]:
        member_id = self.get_member_id(conn, speaker)
        if member_id is None:
            return
        rows = conn.execute("""
            SELECT
                speechID, memberID, issueID, imageKind, searchObject,
                session, nameOfHouse, nameOfMeeting, issue, date,
                speechOrder, speakerGroup, speakerPosition, speakerRole,
                speech, speechToken
            FROM speakers
            WHERE memberID = ? AND issueID = ?
            ORDER BY speechID ASC
            LIMIT ?;
        """, (member_id, issue_id, limit)).fetchall()
        for row in rows:
            yield SpeakerDbItem(
                speechID=row[0], memberID=row[1], issueID=row[2],
                imageKind=row[3], searchObject=row[4], session=row[5],
                nameOfHouse=row[6], nameOfMeeting=row[7], issue=row[8],
                date=row[9], speechOrder=row[10], speakerGroup=row[11],
                speakerPosition=row[12], speakerRole=row[13],
                speech=row[14], speechToken=row[15],
            )

    def get_speaker_issues(self, conn, speaker: str) -> list[IssueItem]:
        member_id = self.get_member_id(conn, speaker)
        if member_id is None:
            return []
        rows = conn.execute("""
            SELECT DISTINCT issueID, issue, nameOfHouse, nameOfMeeting, date
            FROM speakers
            WHERE memberID = ?;
        """, (member_id,)).fetchall()
        return [
            IssueItem(
                issueID=row[0], issue=row[1], nameOfHouse=row[2],
                nameOfMeeting=row[3], date=row[4],
            )
            for row in rows
        ]

    def get_speaker_issue_summary(self, conn, member_id: int, issueID: str) -> Optional[SpeakerIssueSummaryItem]:
        row = conn.execute("""
            SELECT memberID, issueID, issue, nameOfHouse, nameOfMeeting, date, summary
            FROM speaker_summaries
            WHERE memberID = ? AND issueID = ?;
        """, (member_id, issueID)).fetchone()
        if not row:
            return None
        return SpeakerIssueSummaryItem(
            memberID=row[0], issueID=row[1], issue=row[2],
            nameOfHouse=row[3], nameOfMeeting=row[4], date=row[5],
            summary=row[6],
        )

    def append_speaker_issue_summary(self, conn, item: SpeakerIssueSummaryItem):
        conn.execute("""
            INSERT INTO speaker_summaries (
                memberID, issueID, issue, nameOfHouse, nameOfMeeting, date, summary
            ) VALUES ( ?, ?, ?, ?, ?, ?, ? );
        """, (
            item.memberID, item.issueID, item.issue, item.nameOfHouse,
            item.nameOfMeeting, item.date, item.summary,
        ))

    def search_speeches(self, conn, speaker: str, query: str, limit: int) -> Iterator[SpeakerDbScoreItem]:
        member_id = self.get_member_id(conn, speaker)
        if member_id is None:
            return
        rows = conn.execute("""
            SELECT * FROM (
                SELECT
                    speechID, memberID, issueID, imageKind, searchObject,
                    session, nameOfHouse, nameOfMeeting, issue, date,
                    speechOrder, speakerGroup, speakerPosition, speakerRole,
                    speech, speechToken,
                    fts_main_speakers.match_bm25(speechID, ?) AS score
                FROM speakers
                WHERE memberID = ?
            ) t
            WHERE score IS NOT NULL
            ORDER BY score DESC
            LIMIT ?;
        """, (query, member_id, limit)).fetchall()
        for row in rows:
            yield SpeakerDbScoreItem(
                speechID=row[0], memberID=row[1], issueID=row[2],
                imageKind=row[3], searchObject=row[4], session=row[5],
                nameOfHouse=row[6], nameOfMeeting=row[7], issue=row[8],
                date=row[9], speechOrder=row[10], speakerGroup=row[11],
                speakerPosition=row[12], speakerRole=row[13],
                speech=row[14], speechToken=row[15], score=row[16],
            )


db_repository = KokkaiDbRepository()
db_repository.setup(False)
```
Usage example:
```python
with db_repository.connect() as conn:
    for row in db_repository.search_speeches(
        conn,
        speaker="玉木雄一郎",
        query="社会保障",
        limit=10,
    ):
        print(row)
```
- The `members` table is a list of Diet members.
- The `speakers` table acts as a cache for the utterance-level output from the National Diet Record Search System. A `speechToken` column is provided to allow BM25 searching.
  - The `speechToken` index is recreated whenever the data is updated[1].
- The `speaker_summaries` table stores the summaries of utterances made by the relevant member for each meeting.
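The `speechToken` column feeds DuckDB's fts extension, which ranks matches with BM25. As intuition for what `match_bm25` computes, here is a toy Okapi BM25 scorer in pure Python (illustrative only, not DuckDB's implementation):

```python
import math


def bm25_score(query_terms, doc_tokens, corpus, k1=1.2, b=0.75):
    """Toy Okapi BM25 over pre-tokenized documents: rarer terms get
    higher idf, and term frequency is damped by document length."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        df = sum(term in d for d in corpus)  # document frequency
        if df == 0:
            continue
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
        tf = doc_tokens.count(term)  # term frequency in this document
        denom = tf + k1 * (1 - b + b * len(doc_tokens) / avg_len)
        score += idf * tf * (k1 + 1) / denom
    return score


# Three toy "speeches", tokenized the way speechToken stores them
corpus = [["社会", "保障", "改革"], ["税", "制度"], ["社会", "福祉"]]
hit = bm25_score(["社会", "保障"], corpus[0], corpus)
miss = bm25_score(["社会", "保障"], corpus[1], corpus)
print(hit > miss)  # → True
```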
Summarizing Utterances
We use OpenAI to summarize the utterances with the following code.
get_issue_summary generates a summary of a Diet member's remarks for each meeting.
get_speaker_summary generates a description of the member's profile based on the meeting summaries.
```python
import random
import time
from typing import List

from openai import OpenAI, RateLimitError


class OpenAiService:
    def __init__(self):
        # Requires environment variable OPENAI_API_KEY
        self.client = OpenAI()
        self.poll_interval_seconds: float = 10.0
        # Used to comply with the limit of the embeddings batch (max 50,000 inputs)
        self.max_inputs_per_batch: int = 50_000

    def get_chat(self, messages: list[dict], model: str = "gpt-5-nano"):
        response = self.client.chat.completions.create(
            messages=messages,
            model=model,
            temperature=1,
        )
        return response.choices[0].message.content


class KokkaiOpenAIService(OpenAiService):
    def get_issue_summary(self, issue_speeches: List[str]) -> str:
        speech = "\n".join(issue_speeches)[:80000]  # Limit to about 80,000 characters
        messages = [
            {
                "role": "system",
                "content": (
                    "You are summarizing Diet utterances. A Diet member made the following remarks. "
                    "Briefly summarize the main points of the member's questioning within 600 characters."
                ),
            },
            {"role": "user", "content": speech},
        ]
        return self.get_chat(messages)

    def get_speaker_summary(self, issue_summaries: List[str]) -> str:
        content = "\n".join(issue_summaries)[:80000]  # Limit to about 80,000 characters
        messages = [
            {
                "role": "system",
                "content": (
                    "You are summarizing the policies and ideas of Diet members for election purposes. The member has conducted the following questions in the Diet. "
                    "Organize which policies they prioritize in a way that is easy for voters to understand, and summarize it within 400 characters for a flyer."
                ),
            },
            {"role": "user", "content": content},
        ]
        return self.get_chat(messages)
```
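The imports of `RateLimitError` and `random` in the listing above suggest retrying rate-limited calls with backoff. A generic retry-with-jitter sketch of that pattern (the helper name `with_retries` and the use of `RuntimeError` in place of a real API error are illustrative):

```python
import random
import time


def with_retries(fn, max_attempts=5, base_delay=0.01, retry_on=(RuntimeError,)):
    """Retry fn with exponential backoff plus full jitter, the usual
    way to handle transient rate-limit errors from an API."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise
            # exponential backoff with full jitter
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))


calls = {"n": 0}

def flaky():
    # Fails twice, then succeeds, imitating a rate-limited endpoint
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"

result = with_retries(flaky)
print(result)  # → ok
```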
Service for Registering and Summarizing Utterances
We will create a service that integrates the DuckDB, OpenAI, and National Diet Record Search System API operation classes implemented so far.
- `import_speakers` collects utterance content using the National Diet Record Search System API. During this process, it performs morphological analysis to split the text into words and registers the result in DuckDB.
- `make_speaker_issue_summary` creates summaries of a specified person's remarks for each meeting using OpenAI.
- `make_speaker_text` further summarizes the results of `make_speaker_issue_summary` with OpenAI and outputs an overview of the person.
Implementation of KokkaiSpeakerService
```python
from sudachipy import dictionary, tokenizer


class KokkaiSpeakerService:
    def __init__(self, db_repository: KokkaiDbRepository, kokkai_api: KokkaiSpeakerApi, open_ai_service: KokkaiOpenAIService):
        self.db_repository = db_repository
        self.kokkai_api = kokkai_api
        self.open_ai_service = open_ai_service
        self.tokenizer = dictionary.Dictionary().create()

    def import_speakers(self, speaker_name: str, speaker_kana: str, speaker_kaiha: str, from_date: Optional[str] = None, until_date: Optional[str] = None):
        with self.db_repository.connect() as conn:
            member_id = self.db_repository.get_member_id(conn, speaker_name)
            if member_id is not None:
                print(f"{speaker_name} is already registered")
                return
            conn.execute("BEGIN;")
            member_id = self.db_repository.append_member(conn, speaker_name, speaker_kana, speaker_kaiha)
            items = []
            for item in self.kokkai_api.get_speak(speaker_name, from_date, until_date):
                tokens = " ".join(
                    m.surface() for m in self.tokenizer.tokenize(item.speech, tokenizer.Tokenizer.SplitMode.C)
                )
                items.append(SpeakerDbItem.from_speaker(item, member_id, tokens))
                # Commit in batches of 100 speeches
                if len(items) >= 100:
                    self.db_repository.append_speaker(conn, items)
                    conn.commit()
                    conn.execute("BEGIN;")
                    items = []
            if items:
                self.db_repository.append_speaker(conn, items)
            conn.commit()
            self.db_repository.update_fts_index(conn)

    def make_speaker_issue_summary(self, speaker_name: str) -> list[SpeakerIssueSummaryItem]:
        result = []
        with self.db_repository.connect() as conn:
            member_id = self.db_repository.get_member_id(conn, speaker_name)
            if member_id is None:
                print(f"{speaker_name} is not registered")
                return result
            for issue in self.db_repository.get_speaker_issues(conn, speaker_name):
                issue_summary = self.db_repository.get_speaker_issue_summary(conn, member_id, issue.issueID)
                if issue_summary:
                    print(f"Summary for {speaker_name}'s {issue.issueID} has already been created")
                    result.append(issue_summary)
                    continue
                speeches = [
                    speech.speech
                    for speech in self.db_repository.get_issue_speeches(conn, speaker_name, issue.issueID, 100)
                ]
                summary = self.open_ai_service.get_issue_summary(speeches)
                issue_summary = SpeakerIssueSummaryItem(
                    **issue.model_dump(),
                    memberID=member_id,
                    summary=summary,
                )
                self.db_repository.append_speaker_issue_summary(conn, issue_summary)
                # Commit each one to avoid wasting OpenAI usage in case of an abnormal exit
                conn.commit()
                result.append(issue_summary)
        return result

    def make_speaker_text(self, speaker_name: str, output_path: str):
        items = self.make_speaker_issue_summary(speaker_name)
        summaries = [item.summary for item in items]
        summary = self.open_ai_service.get_speaker_summary(summaries)
        with open(output_path, "w") as f:
            f.write("## List of Questions\n")
            for item in items:
                one_line = item.summary.replace("\n", "")  # keep each entry on one line
                f.write(f" - {item.date} {item.nameOfHouse} {item.nameOfMeeting} {item.issue} {one_line}\n")
            f.write("## Summary\n")
            f.write(f"{summary}\n")
```
Example of registering utterances
```python
import csv
import re

from tqdm import tqdm

from_date = "2022-01-01"
until_date = "2026-01-31"

db_repository = KokkaiDbRepository()
db_repository.setup(False)
open_ai_service = KokkaiOpenAIService()
service = KokkaiSpeakerService(db_repository, KokkaiSpeakerApi(), open_ai_service)

with open("2025_members.csv") as f:
    reader = csv.reader(f)
    next(reader)  # Skip header
    for row in tqdm(reader):
        # Strip whitespace; [:-1] drops the trailing honorific from the name column
        name = re.sub(r"\s+", "", row[0])[:-1]
        kana = re.sub(r"\s+", "", row[1])
        kaiha = re.sub(r"\s+", "", row[2])
        service.import_speakers(name, kana, kaiha, from_date, until_date)
```
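The name cleanup deserves a note: assuming the member list renders names with internal whitespace and a trailing honorific 君 (my reading of the `[:-1]` slice), the normalization works like this (the raw cell value below is hypothetical):

```python
import re

# Hypothetical raw cell as it might appear in the members CSV:
# padded with a space and ending in the honorific 君
raw_name = "逢沢 一郎君"
name = re.sub(r"\s+", "", raw_name)[:-1]  # remove whitespace, then the honorific
print(name)  # → 逢沢一郎
```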
Example of summarizing utterances:
```python
db_repository = KokkaiDbRepository()
db_repository.setup(False)
open_ai_service = KokkaiOpenAIService()
service = KokkaiSpeakerService(db_repository, KokkaiSpeakerApi(), open_ai_service)

speaker = "平将明"
service.make_speaker_text(speaker, output_path=f"./outputs/{speaker}.txt")
```
Evaluation of Execution Results
It took about 24 minutes to process Mr. Masaaki Taira. For a full-scale execution, it would be desirable to consider parallelization.
Also, the cost for the four people output this time was about $0.26 using gpt-5-nano.
Results for Mr. Masaaki Taira
Prioritizing the realization of a digital society centered on My Number and the promotion of EBPM, with a view to expanding online procedures, popularizing Myna insurance cards, and consolidating paper insurance cards. Accelerating local government standardization and migration to the Government Cloud, and promoting data utilization and DX in collaboration with local governments and the private sector. Strengthening cyber security, regulatory reform, and AI governance, and ensuring personal information protection and transparency. Also promoting international cooperation in disaster DX, Web3, and data distribution.
Since he is a former Minister for Digital Transformation, the results seem reasonable as he often mentioned terms like "Web3," "cyber security," "AI," and "Government Cloud."
Mr. Koki Oozora
Placing measures against loneliness and isolation at the core of the nation, balancing prevention-oriented legislative design with field operations. Strengthening cooperation between wide-area counters centered on 24-hour anonymous chats and regional support, ensuring the sustainability of NPOs with stable financial resources, optimizing AI risk assessment and standby response, collaborating with schools, local welfare commissioners, and companies, and enhancing education, public relations, online connectivity, support for Japanese nationals overseas, operation guidelines based on field realities, and inter-local government cooperation. Regarding the stimulation of timber demand, advocating for "wood power" and pushing the private sector by expanding mid-to-high-rise wooden structures and providing corporate incentives. Managing the free provision of stockpiled rice to children's cafeterias with a focus on fairness, data utilization, and future expansion. Member Oozora emphasizes improving budget accuracy and accountability, fair burden of reception fees and measures against non-payment, asset transparency and archive utilization, internal control, creating "places to belong," and outcome evaluation.
Since he is in his first term, he has attended few meetings, but perhaps because he spoke as a witness/reference person in his capacity as the "Chairperson of NPO Anata no Ibasho," content related to loneliness and isolation issues has appeared.
The topics of stockpiled rice and timber utilization were unexpected given his career, but it seems he actually belonged to the Committee on Agriculture, Forestry and Fisheries, so the result appears to be correct.
Mr. Yuichiro Tamaki
Mr. Tamaki emphasizes concretizing the emergency clause as a control clause to prevent abuse of power, making constitutional amendment mandatory for term extensions, and strictly limiting the duration and scope of emergency sessions. Regarding the National Referendum Act, he ensures fairness through online advertising regulations and fact-checking. Proposing education bonds, expansion of scholarships, and expansion of basic deductions, with priority on wage increases and education investment. Advocating for the immediate implementation of gasoline tax cuts and securing financial resources, legal positioning of the Self-Defense Forces and consolidation of right of self-defense arguments, and promotion of digital fundamental rights and online Diet sessions. Calling for a thorough investigation of the slush fund scandal and the restoration of political ethics.
As an opposition party leader with a long career, he tends to have a large volume of utterances.
Features such as the gasoline tax cut appear clearly, as they were often in the news, so the results seem generally appropriate.
The topic of the emergency clause likely appeared because deliberations during the COVID-19 pandemic were included.
Ms. Chinami Nishimura
This member promotes both convenience and fairness, including the use of pre-marital surnames, the appropriate utilization of civil trial information, and the reform of judicial administration while concurrently examining three civil law amendments. Legislating and ensuring transparency for victim relief and asset preservation related to the former Unification Church. Strengthening supervision in the fields of public safety and human rights, accelerating criminal proceedings using ICT, improving awareness and operation of Myna insurance cards, and maintaining universal health insurance. Also promoting employment reforms such as rectifying wage disparities, childcare, and nursing care leave.
As a member of the CDP elected seven times, her volume of utterances is high.
Also, utterances from her time as the Chairperson of the House of Representatives Committee on Judicial Affairs are mixed in.
The mention of "reform of judicial administration" seems to reflect her characteristics as a former committee chair and a law graduate.
Regarding Myna insurance cards, her stance seems consistent with her blog posts and party statements, where she expressed "doubts about digitalization that is predicated on the use of My Number cards." [2]
Looking closely at the summaries of individual meetings, she also spoke about liquefaction in the Noto Peninsula earthquake related to her constituency, but that did not appear in the overall summary.
Conclusion
Within the scope of the experiment, the "summaries" produced seem generally appropriate.
I believe the results were good in that they extracted unexpected details of work that could not be predicted simply by reading Wikipedia.
However, there are currently the following drawbacks:
- No countermeasures for same-name individuals.
- It might be more appropriate to separate and summarize utterances based on roles (Diet member, witness/reference person, Minister, etc.).
- The summarization speed with OpenAI is slow. Parallelization should be considered for processing all members.
- Balance between cost and OpenAI's accuracy.
- BM25 support is not being utilized.
- I collected members serving their second term in the House of Representatives, but members who transferred from the House of Councillors were not taken into account, so the adequacy of the collected volume needs to be verified.
- It would be preferable to refer more strongly to recent utterances during summarization.
- It would be better to instruct the summary prompt on what to prioritize. For members with many utterances, the result might be filled with non-essential information.
That being said, it seems useful enough to create summaries of Diet members in one's own constituency.
[1] From the Full-Text Search extension documentation: "The FTS index will not update automatically when the input table changes. A workaround of this limitation can be recreating the index to refresh."

[2] Action Log: Cancellation of Myna Insurance Card Now Possible