iTranslated by AI
4 Lessons from My First Experience with Design and Transaction Processing in Python
Introduction
When you're busy with daily tasks, it's easy to focus too much on results and neglect reflection.
In this era of advanced AI, I feel like I'm not accumulating skills if I don't reflect, and I have a sense of crisis that nothing will remain in the long run.
So, I decided to look back at my work regularly.
I want to focus especially on things I've touched for the first time and turn them into my own skills.
Lessons Learned
- I reached the simple but fundamental conclusion that when automating, you should aim to "eliminate human intervention to the limit."
- I reconfirmed that design is all about "considering the 'purpose' and 'goal' and what to do to achieve them," something often repeated in the business world.
- Instead of needlessly increasing effort, it seems easier to think by considering variables and constants, separating "places that only need to be checked once" from "places that require monthly verification."
- I learned about the great development principle: "Garbage In, Garbage Out."
What I Did
Transaction Processing
This is a process I wrote so often that I got bored of it when I was a Java engineer.
In Java, I had a vague understanding of writing it because it was in the design document and necessary, but I suddenly wondered how to write it in Python.
In fact, I didn't even know how to connect Python to a database in the first place.
Coincidentally, a task came up to replace part of an existing table on Snowflake with other data, so I racked my brain to see if I could implement transaction processing.
Since I was using Snowflake, I first needed to establish a connection.
import snowflake.connector
conn = snowflake.connector.connect(
user='YOUR_USER',
password='YOUR_PASSWORD',
account='YOUR_ACCOUNT',
warehouse='COMPUTE_WH',
database='MY_DB',
schema='PUBLIC',
role='SYSADMIN',
)
# 2. Create cursor
cur = conn.cursor()
Something like this.
conn is the communication path with the database itself, serving as the pipeline connecting Snowflake and Python in this case.
cur is the operation window for sending SQL statements to the DB and receiving results. It's like a means to perform processing through conn.
I learned that you can execute SQL processes from Python by running cur.execute(sql).
I wrote the following code to execute the transaction processing.
delete_sql = f"""
DELETE FROM DB_name.Schema_name.Table_name
WHERE xxx IN ('xxx', 'yyy', 'zzz');
"""
insert_sql = f"""
INSERT INTO DB_name.Schema_name.Table_name (
xxx, -- List the columns to replace
xxx,
xxx,
xxx,
xxx,
xxx
)
SELECT
xxx, -- Select the same columns
xxx,
xxx,
xxx,
xxx,
xxx
FROM
DB_name.Schema_name.Table_name2
WHERE
xxx IN ('xxx', 'yyy', 'zzz');
"""
try:
print("Starting transaction")
cur.execute("BEGIN;") # Signal to start the transaction
print("Deleting existing data from 3 media...")
cur.execute(delete_sql)
print("Deletion complete")
print("Inserting data from Table 2...")
cur.execute(insert_sql)
cur.execute("COMMIT;") # The process is finalized only at this point
print("Success: Transaction committed.")
except Exception as e:
cur.execute("ROLLBACK;") # Roll back if any error occurs
print(f"Error: Transaction rolled back. {e}")
By doing the above, you can avoid a situation where columns were deleted but could not be updated.
While transaction syntax varies by database, it seems achievable by inserting commands like BEGIN; before and after the SQL processing.
Designing a Quality Assurance Mechanism
Creating a design document.
It sounds like a real "upstream" task, doesn't it?
When I was a Java engineer, I had never created one from scratch; I only occasionally made minor fixes when changes occurred during the implementation phase.
So, I didn't know how to write one, nor did I know the format used for my current project.
In that state, I was given the task with a broad instruction: "Think of a mechanism where errors won't occur and implement it."
While receiving such an abstract assignment made me feel somewhat trusted, which was nice, I was also quite panicked given my lack of experience.
However, since I was entrusted with the implementation as well, I figured that the most important thing for this task was to "align common understanding before implementation." I decided that a rigid design document like a Java requirements definition wasn't necessary and chose to summarize it in a document instead.
What I wrote was roughly as follows:
- Create a mechanism to prevent issues in stored data and a mechanism to detect them if they occur.
- Separate the logic into "(1) places to verify logic validity just once" and "(2) places to monitor continuously every month," incorporating only (2) into the regular execution flow.
- For (1), create test cases covering all expected scenarios and verify that the behavior is as intended. After that, assume the same process will run and don't touch it unless there's a problem.
- For (2), follow the GIGO principle to check if correct data is flowing into the process. Design it so that the validation is automated, and if everything is "True," then there's no issue.
One thing I think I did well was not making the monthly processes too complex.
When told to "make sure there are no errors," double-checks or multi-point verifications are usually the first things that come to mind. However, doing that every time is exhausting in terms of man-hours, and more importantly, once you get used to it, you tend to get sloppy, which makes human errors more likely.
By considering this, I was able to simplify things: check the critical processes only once, and after that, assume the process is correct and focus only on the quality of the data. Furthermore, instead of manual checks, I focused on just seeing the results executed by code.
I think it's good to keep in mind the opposite of a "Human-in-the-Loop" (HITL) approach: thoroughly reducing human intervention in areas where decision-making is unnecessary.
Thoughts
I used to have the concern that "I don't know how to handle design or catch up because there's no boss or senior engineer around," but once I was told to just give it a try, I realized that I could somehow manage it.
I feel like I was overthinking that there must be a single "correct answer."
In the end, it's about thinking about what I want to achieve and just doing what's necessary for that. I feel that even if I move to another organization, it's simply a matter of doing the same thing while following their rules and formats.
Lately, I've been taking initiative to do various things, such as volunteering for tasks or giving presentations, to gain exposure to different types of work. I'm truly feeling that "making things" is much more enjoyable than "being made to work."
Since writing things down in an article helps them stay in my head, I plan to continue writing regularly.
Discussion