How I Built an E-book Capture Tool with Claude as a Python Beginner
Introduction
I have almost no experience with programming.
However, driven solely by the desire to "save the e-books I bought to my own computer," I built a Python tool together with an AI.
As a result, I completed a tool that works perfectly.
In this article, I will honestly share "how I gave instructions to the AI" and "what obstacles I encountered." I hope this serves as a reference for others who are in the same boat: "I have something I want to make, but I can't write code."
What I Built
General-Purpose E-book Capture Tool
- Automatically turns pages in an e-book viewer while saving screenshots.
- Compiles images into a PDF at the end.
- Operates via a GUI (just by pressing buttons).
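To make the first bullet concrete, here is a minimal sketch of a capture loop. It is an illustration rather than the code in the repository: the function names `capture_pages` and `run_on_real_screen`, the `page_0001.png` naming scheme, and the assumption that the viewer turns pages with the right arrow key are all hypothetical. The loop takes the screenshot and page-turn actions as parameters so that it can be tried without a real screen.

```python
import time
from pathlib import Path
from typing import Callable


def capture_pages(
    grab_png: Callable[[], bytes],
    turn_page: Callable[[], None],
    out_dir: Path,
    max_pages: int,
    wait_seconds: float = 1.0,
) -> list[Path]:
    """Save one screenshot per page, turning the page after each capture."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index in range(1, max_pages + 1):
        path = out_dir / f"page_{index:04d}.png"
        path.write_bytes(grab_png())
        saved.append(path)
        turn_page()
        time.sleep(wait_seconds)  # give the viewer time to render the next page
    return saved


def run_on_real_screen(out_dir: Path, max_pages: int) -> list[Path]:
    """Wire the loop to mss and pyautogui (imported lazily; needs a desktop session)."""
    import mss
    import mss.tools
    import pyautogui

    with mss.mss() as sct:
        monitor = sct.monitors[1]  # index 1 is the first physical monitor

        def grab_png() -> bytes:
            shot = sct.grab(monitor)
            # With no output path, to_png returns the PNG data as bytes.
            return mss.tools.to_png(shot.rgb, shot.size)

        # Assumption: the viewer advances one page on the right arrow key.
        return capture_pages(grab_png, lambda: pyautogui.press("right"), out_dir, max_pages)
```

Calling `run_on_real_screen(Path("captures"), 10)` would capture ten pages from the primary monitor, provided the viewer window has keyboard focus.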
It is available on GitHub:
What I Did
Step 1: First, I Created a Specification Document
Rather than having the AI write code right away, I first created a specification document summarizing "what I want to build."
By bouncing ideas back and forth with Claude, I nailed down the following details:
- What functions are necessary?
- What settings should be configurable in the GUI?
- Where should files be saved?
- How should the end of the book be detected?
By creating a solid specification document, my instructions to the AI remained consistent. I believe this was the most important step.
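As an example of the kind of decision the spec has to pin down, take the last question, how to detect the end of the book. One possible rule, shown here as a minimal sketch and not necessarily the rule the tool uses, is: "if the screen stops changing for several page turns in a row, the book has ended." The class name `EndOfBookDetector`, the helper `frame_difference`, and the default threshold values are assumptions made for illustration.

```python
class EndOfBookDetector:
    """Report the end once the page stops changing for several turns in a row."""

    def __init__(self, threshold: float = 1.0, required_repeats: int = 3) -> None:
        self.threshold = threshold            # differences below this count as "unchanged"
        self.required_repeats = required_repeats
        self._repeats = 0

    def update(self, difference: float) -> bool:
        """Feed the difference between the previous and the current capture."""
        if difference < self.threshold:
            self._repeats += 1
        else:
            self._repeats = 0                 # the page changed, so start counting again
        return self._repeats >= self.required_repeats


def frame_difference(previous_path: str, current_path: str) -> float:
    """Mean absolute pixel difference of two same-sized screenshots (OpenCV, imported lazily)."""
    import cv2

    previous = cv2.imread(previous_path, cv2.IMREAD_GRAYSCALE)
    current = cv2.imread(current_path, cv2.IMREAD_GRAYSCALE)
    if previous is None or current is None:
        raise FileNotFoundError("could not read one of the screenshots")
    return float(cv2.absdiff(previous, current).mean())
```

Requiring several unchanged captures in a row, rather than one, guards against stopping early when a page turn is merely slow to render.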
Step 2: I Built Incrementally in Phases
Trying to have everything built at once leads to complex code and tons of bugs.
So, I divided the project into 5 phases and proceeded while verifying the operation at each stage.
| Phase | Content |
|---|---|
| Phase 1 | GUI, Capture, Saving, Test Mode |
| Phase 2 | PDF Generation, Two-Page Spread Splitting |
| Phase 3 | Image-based Termination Detection (OpenCV) |
| Phase 4 | Text-based Termination Detection (OCR) |
| Phase 5 | Help Screen, Log Improvements |
Since I confirmed "it works!" at each phase before moving on, I knew exactly where a problem occurred if something went wrong.
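To give a sense of how small a single phase can be, here is a rough sketch of the two Phase 2 features using Pillow. It is an illustration under assumptions, not the tool's actual code: the function names `spread_boxes` and `images_to_pdf` and the `right_to_left` option are hypothetical.

```python
from pathlib import Path


def spread_boxes(
    width: int, height: int, right_to_left: bool = False
) -> list[tuple[int, int, int, int]]:
    """Return crop boxes (left, upper, right, lower) for both pages of a spread, in reading order."""
    middle = width // 2
    left_page = (0, 0, middle, height)
    right_page = (middle, 0, width, height)
    # For right-to-left books the right-hand page is read first.
    return [right_page, left_page] if right_to_left else [left_page, right_page]


def images_to_pdf(image_paths: list[Path], pdf_path: Path) -> None:
    """Combine page images into one PDF with Pillow (imported lazily)."""
    from PIL import Image

    if not image_paths:
        raise ValueError("no images to combine")
    # PDF pages cannot carry an alpha channel, so convert everything to RGB.
    pages = [Image.open(path).convert("RGB") for path in image_paths]
    pages[0].save(pdf_path, "PDF", save_all=True, append_images=pages[1:])
```

Each box from `spread_boxes` can be passed to Pillow's `Image.crop` to cut one page out of a spread screenshot. Splitting at the exact midpoint assumes the viewer centers the spread, which is not guaranteed.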
Step 3: Giving Instructions to Claude Code
I used Claude Code for the actual code generation.
The instructions themselves were simple. For Phase 1, my prompt looked like this:
Please implement Phase 1.
[Implementation Scope]
・tkinter GUI
・Screenshot acquisition
・Image saving
・Stop button
・Test mode
[Things NOT to implement yet]
・PDF generation
・OCR
・Termination detection
[Important]
Please prioritize the "minimum viable product that definitely works" first.
The key point is to explicitly state "things not to implement yet." If you don't, the AI will try to implement everything at once, and things will quickly get out of hand.
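For readers curious what a Phase 1 scope this narrow can look like in code, here is a minimal sketch of a tkinter window with a working stop button. It is not the code Claude Code generated; the names `CaptureWorker` and `build_window` are hypothetical, and the capture step is passed in as a plain function. The capture loop runs in a background thread so the window stays responsive, and a `threading.Event` carries the stop request.

```python
import threading
from typing import Callable, Optional


class CaptureWorker:
    """Run a capture function repeatedly in the background until stopped."""

    def __init__(self, capture_one_page: Callable[[], None], interval_seconds: float = 1.0) -> None:
        self._capture_one_page = capture_one_page
        self._interval = interval_seconds
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None
        self.pages_done = 0

    def start(self) -> None:
        if self._thread is not None and self._thread.is_alive():
            return  # already running; ignore a second click on Start
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()  # returns once the current page has finished

    def _run(self) -> None:
        while not self._stop.is_set():
            self._capture_one_page()
            self.pages_done += 1
            # wait() returns early when stop() is called, unlike time.sleep()
            self._stop.wait(self._interval)


def build_window(worker: CaptureWorker):
    """Create a two-button window (tkinter imported lazily; needs a desktop session)."""
    import tkinter as tk

    root = tk.Tk()
    root.title("Capture (Phase 1 sketch)")
    tk.Button(root, text="Start", command=worker.start).pack(fill="x")
    tk.Button(root, text="Stop", command=worker.stop).pack(fill="x")
    return root
```

On a desktop, `build_window(CaptureWorker(my_capture_function)).mainloop()` would show the window, where `my_capture_function` stands for whatever saves one screenshot and turns the page.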
Challenges Faced
Installing Tesseract
The OCR function depends on a separate program called Tesseract. Even after I installed it, OCR failed with an error because the PATH was not set correctly, so the tool could not find Tesseract.
Solution: I resolved this by setting the PATH environment variable in PowerShell.
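A startup check can surface this kind of problem with a clear message instead of a confusing OCR error. The following is a minimal sketch, not part of the tool as described: the helper names `find_tesseract` and `configure_pytesseract` are hypothetical, and any fallback locations are supplied by the caller. It relies on `pytesseract.pytesseract.tesseract_cmd`, the setting pytesseract provides for pointing at the executable directly.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable, Optional


def find_tesseract(
    candidates: Iterable[str] = (),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the Tesseract executable: search PATH first, then the fallback locations."""
    on_path = which("tesseract")
    if on_path:
        return on_path
    for candidate in candidates:
        if Path(candidate).is_file():
            return candidate
    return None


def configure_pytesseract(candidates: Iterable[str] = ()) -> str:
    """Point pytesseract at the executable, or fail with a readable message."""
    import pytesseract  # imported lazily so the search above works without it

    path = find_tesseract(candidates)
    if path is None:
        raise RuntimeError("Tesseract not found: install it or add its folder to PATH")
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

On Windows, a call such as `configure_pytesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])` would cover a common install location; that path is an assumption and may differ on your machine.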
Handling Two-Page Spreads
When I split a two-page spread into single pages, only the first page came out misaligned. The cause was the way the e-book viewer application displays its pages.
Solution: I changed my workflow to capture with the two-page spread display turned off.
Reflections
I barely wrote any code myself, yet I successfully completed a tool that works as expected.
I realized that the key is not just to outsource everything to AI, but to think carefully about "what I want to build" and communicate that clearly. If you properly create a specification, divide the work into phases, and verify operations at each step, even a beginner can build tools with AI.
Technologies Used
- Python / tkinter
- mss (for screenshots)
- Pillow (for image processing)
- pyautogui (for sending keyboard input)
- opencv-python (for image recognition)
- pytesseract (for OCR)
- Claude Code (for code generation)
Important Notes
- This is a tool created for personal learning purposes.
- It is intended for personal use of content you have purchased yourself.
- Please check the terms of service for each e-book service and use this tool at your own risk.