Translated by AI
Trying out Open Interpreter 01 Light with M5 Atom Echo

Introduction
I thought the Open Interpreter 01 Light was only sold in the US. However, since the hardware configuration is public and it seemed possible to try it out with an M5 Atom Echo, I decided to give it a go.
Note added 2024/05/07: I had been writing "Lite," but "Light" is correct.
Initially, I planned to run it in a GUI environment within Docker based on the article by the great master Karaage below. However, at the time of writing, 01 itself seems to focus on Mac operations, and adjusting the communication between the Docker environment and 01 Light looked tedious, so I installed it directly on my Mac.
If you are installing 01 directly, please be mindful of security and safety, as it might clutter your PC environment or involve microcontroller-related work.
What is 01?
It is apparently called 01 ("O-One"). It is open-source software (OSS) released by Open Interpreter, specialized for controlling a PC by voice. For direct voice input from a PC without an ESP32 device such as the Atom Echo, the following article by Nike-chan is easy to understand. It seems to be different from the one below. The white round device that Killian from Open Interpreter is holding in the video below is the ESP32-based device called 01 Light. They are very generous to keep both the hardware and the software open.
Motivation
Since having a child, the time I can spend on my computer is truly limited. I simply thought it would be great if I could remotely operate my computer with natural language voice while taking care of my child.
I know things like Mac's Voice Control exist, but I wanted to see what 01 is like.
Verification Environment
- PC: M2 MacBook
- OS: macOS Sonoma 14.4.1
- Python: 3.11.5
Implementation
0. Procuring M5 Atom Echo and other parts
I bought all the components listed in the official Bill of Materials (BOM) on impulse, but if you just want to get it running for now, you can manage with just an M5 Atom Echo.
The other items are batteries and switches for mobile operation, and amplifiers for extending the microphone and speaker. For practical use, I feel that an amplifier and a separate speaker would be desirable.
By the way, I was able to source all the parts listed above from Mouser and Marutsu. The total cost for everything was around 10,000 yen. Since I couldn't find the exact same battery, I purchased a similar one (as of April 1, 2024).
1. Environment Preparation
The basic flow in the YouTube video below was easy to understand.
First, follow the official documentation to install the necessary libraries for Mac from the terminal.
I will proceed assuming brew is already installed.
$ brew install portaudio ffmpeg cmake
Next, clone the 01 GitHub repository and move to the software directory.
$ git clone https://github.com/OpenInterpreter/01.git
$ cd 01/software
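Before moving on, it can help to confirm that the command-line tools used in this article are actually on your PATH. The following is only a minimal sketch: the tool list is my own reading of the steps here (brew, git, ffmpeg, cmake, and the poetry command used later in Step 3), and the helper names `missing_tools` and `report` are hypothetical, not part of 01.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executables this walkthrough relies on (an assumption based on the steps
# in this article). portaudio is installed as a library, so it is not
# checked here.
REQUIRED_TOOLS = ["brew", "git", "ffmpeg", "cmake", "poetry"]


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


def report(missing: List[str]) -> str:
    """Turn the list of missing tools into a one-line status message."""
    if not missing:
        return "All required tools were found."
    return "Missing tools: " + ", ".join(missing)
```

Running `print(report(missing_tools(REQUIRED_TOOLS)))` in the same terminal shows which tools, if any, still need to be installed. The `which` parameter exists only so the lookup can be swapped out when testing.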
For now, let's proceed to the setup for the M5 Atom Echo side.
2. M5 Atom Echo Side (01 Client) Setup
Download the Arduino IDE version suitable for your machine, install it, and launch it.
- Go to Tools > Board > Boards Manager, search for "ESP32", and install it.
- Go to Tools > Manage Libraries, and install "M5Atom", "WebSockets by Markus Sattler", and "Async TCP".
- Select Tools > Board > ESP32 Arduino > M5Stack Atom.
Connect the M5 Atom Echo (ESP32) to your computer. Open the file 01/software/source/clients/esp32/src/client.ino (located in the 01 folder cloned in Step 1) in the Arduino IDE. Click Verify (the checkmark icon), and once "Done compiling" is displayed, click Upload (the arrow icon).
If you encounter any issues at this stage, consider consulting Claude Opus, GPT-4, or Gemini 1.5 Pro. Depending on your environment, you might need to adjust settings such as the upload speed.

If it displays "Done uploading," the process was successful. You can now disconnect the USB cable from the Atom side.

3. MacBook Side (01 Server) Setup
Now, return to the terminal to set up the server.
While in the software directory from Step 1, run the following command to install the project and its dependencies into a virtual environment managed by Poetry.
$ poetry install
In the YouTube video mentioned earlier, it's only touched upon briefly, but we'll now set up the OpenAI API. Enter your own OpenAI API KEY.
*It seems like you can also use local LLMs or Claude, but I haven't tried those yet.
$ export OPENAI_API_KEY=sk-xxxxxxxxxxx
One more thing: the 01 server needs an IP address at startup, so use the following command to check yours.
$ ifconfig
In the output, look for the inet address under en0: and make a note of it.
It's this part of the video.
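If the ifconfig output is too long to scan by eye, the same lookup can be scripted. This is a rough sketch that assumes the macOS-style layout, where each interface starts with an unindented header such as "en0:" followed by indented detail lines. The function names are hypothetical.

```python
import subprocess
from typing import Optional


def inet_for_interface(ifconfig_output: str, interface: str = "en0") -> Optional[str]:
    """Return the first IPv4 address listed under the given interface, if any."""
    current = None
    for line in ifconfig_output.splitlines():
        if line and not line[0].isspace():
            # Header lines look like "en0: flags=8863<UP,...> mtu 1500".
            current = line.split(":", 1)[0]
            continue
        fields = line.split()
        # "inet" is IPv4; "inet6" lines are skipped by the exact match.
        if current == interface and len(fields) >= 2 and fields[0] == "inet":
            return fields[1]
    return None


def current_address(interface: str = "en0") -> Optional[str]:
    """Run ifconfig and extract the address; only works where ifconfig exists."""
    output = subprocess.run(
        ["ifconfig"], capture_output=True, text=True, check=True
    ).stdout
    return inet_for_interface(output, interface)
```

Calling `current_address()` on the Mac should return the same value you would otherwise read off by hand. If your Mac uses a different interface for Wi-Fi, pass its name instead of "en0".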

Now, finally, start the 01 server.
$ poetry run 01 --server --server-host [IP address you noted earlier] --server-port 10001 --model gpt-4-turbo --stt-service openai
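To avoid typos in the long launch command, it can be assembled programmatically. The sketch below only reproduces the flags shown in the command above. The helper names and the placeholder check are my own assumptions, not something 01 provides.

```python
import shlex
from typing import List, Mapping


def build_server_command(
    host: str,
    port: int = 10001,
    model: str = "gpt-4-turbo",
    stt_service: str = "openai",
) -> List[str]:
    """Assemble the launch command from this article as an argument list."""
    return [
        "poetry", "run", "01", "--server",
        "--server-host", host,
        "--server-port", str(port),
        "--model", model,
        "--stt-service", stt_service,
    ]


def as_shell_line(command: List[str]) -> str:
    """Render the argument list as a single line that can be pasted into a shell."""
    return shlex.join(command)


def preflight(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems that would make the launch fail early."""
    problems = []
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        problems.append("OPENAI_API_KEY is not set in this shell.")
    elif set(key.removeprefix("sk-")) <= {"x"}:
        # Catches the "sk-xxxxxxxxxxx" placeholder used in this article.
        problems.append("OPENAI_API_KEY still looks like the placeholder.")
    return problems
```

For example, `preflight(os.environ)` (after `import os`) lists anything to fix first, and `as_shell_line(build_server_command("192.168.1.23"))` prints the full command for a made-up address. Run the resulting command from the software directory, as in the manual step.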
Once you see the following state, the server preparation is complete.

4. Connecting the Client and Server
Connect the M5 Atom Echo, which was prepared in Step 2, to a power source.
There is no need to connect it to the Mac via USB at this time; just plug it into a USB power source or a battery.
Communication between the Mac and M5 Atom Echo from here on will be via Wi-Fi.
On a PC or smartphone other than the Mac used for the server setup, go to the Wi-Fi connection screen. You should see an entry named "01-Light"; connect to it.

A screen like the one below will then appear. Enter the SSID and password of the Wi-Fi network that the server-configured Mac is connected to, then press "Connect".

On the next screen, enter the IP address noted in Step 3 followed by the port number (default: 10001) and press "Connect"!

If the M5 Atom Echo's LED changes from pink to blue and "connection open" is displayed in the Mac terminal, the setup is successfully completed!
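If the LED stays pink, one thing worth ruling out is whether the server port is reachable at all from another machine on the same Wi-Fi network. The sketch below is a generic TCP check and not part of 01. It only tells you that something is listening on the address and port, not that it is the 01 server.

```python
import socket


def can_connect(host: str, port: int = 10001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Run `can_connect("<IP address noted in Step 3>")` from a second PC. A result of False suggests a wrong address, a stopped server, or a firewall in between, rather than a problem on the Atom Echo side.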

Operation Check
Once you've made it this far, let's test the operation.
While holding down the large button on the Atom Echo, try asking questions like "What is today's date?" or "What time is it?"
It often fails to recognize Japanese correctly, so if you are confident in your English, you might have better success using English.
It's a success if 01 executes a script to display the date and time on the terminal and responds with voice through the Atom Echo! Great job. You can stop the 01 server by pressing Ctrl+C.

Afterward, I tried to launch an app by saying "Open Music," but while 01 appeared to be executing something on the terminal, it didn't perform the intended action (without any error messages).
In that case, go to System Settings > Privacy & Security > Accessibility, and turn on "Allow the terminal to control your computer". In my case, I was able to perform various voice operations, such as launching apps, after restarting my Mac or the terminal.
Anyway, as of the time of writing, 01 is designed with English in mind, so Japanese instructions often fail. After quite a bit of trial and error, my API costs jumped to about $20 in one go, lol.
Conclusion
This was my first time using the Arduino IDE, but I learned a lot by consulting GPT-4 and Claude 3 and researching on Discord, and I managed to get it working.
To be honest, at this stage, it feels like mastering Mac's Voice Control might be more cost-effective. However, Mike Bird from Open Interpreter left a passionate comment saying a special announcement is coming soon, so I'll stay hopeful.
Also, since the underlying mechanism seems to be generating and executing code like PyAutoGUI from natural language via LLM, I'm imagining how this could be applied to controlling Raspberry Pi or other IoT devices with natural language.
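To make that idea concrete, here is a toy sketch of the "natural language in, generated code out, execute it" loop. It is not how 01 is actually implemented: `fake_llm` is a hypothetical stand-in that returns canned Python instead of calling a model, and nothing here touches PyAutoGUI or real hardware.

```python
import contextlib
import io


def fake_llm(instruction: str) -> str:
    """Stand-in for a real LLM call: returns Python source for a known request."""
    canned = {
        "what is today's date?": (
            "import datetime\n"
            "print(datetime.date.today().isoformat())"
        ),
    }
    fallback = "print('Sorry, I cannot do that yet.')"
    return canned.get(instruction.strip().lower(), fallback)


def run_generated_code(source: str) -> str:
    """Execute generated source and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # Running model-generated code is risky; a real system would need
        # sandboxing or a confirmation step before this call.
        exec(source, {"__name__": "generated"})
    return buffer.getvalue().strip()
```

Swapping the canned dictionary for a real model call, and the printed date for commands sent to a Raspberry Pi or another device, is roughly the kind of extension I have in mind.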
Additionally, since this was my first time handling M5, please let me know if there are any points that need correction!
That's all for now. Cheers!! 🍺