
Announcing After-School Arrival and Departure Emails with Google Home

The after-school care center my child attends has a system that sends emails to parents when children enter or leave. From what I've looked into, quite a few after-school programs and cram schools seem to have similar systems.

I work from home, but since I'm not always looking at my phone, I thought it would be convenient if Google Home could notify me when an email arrives, so I decided to build it.
(I hear they are called Google Nest these days instead of Google Home, but for the sake of clarity, I will refer to them as Google Home in this article.)

Google Home doesn't have an API for speaking arbitrary text, but it can play audio data sent over the Google Cast protocol. The Cast protocol only works inside the local network, so it cannot be used directly from an external server such as one in the cloud. Therefore, I set up a Raspberry Pi within the local network to stream the Cast data to the Google Home's IP address.

The program I created is available here:

https://github.com/fujikky/gmail-google-home-notify

Loading the Configuration File

I consolidated the Google OAuth client credentials, email search queries, and text to be read aloud into a configuration file named config.js. Since it contains sensitive information, config.js is added to .gitignore. The idea is to copy config.sample.js and modify the necessary parts.

config.js
// @ts-check

/**
 * @type {import("./src/config").Config}
 */
const config = {
  // Information for the OAuth client created in Google Cloud Console
  googleClientId: "xxxxxxx",
  googleClientSecret: "xxxxxxx",
  // Google Home's IP address
  // Can be checked from device information in the Home app
  googleHomeIp: "192.168.x.x",
  // Language to speak
  speechLanguage: "ja",
  // From email addresses to filter
  // If empty, no filtering is performed
  emails: ["user1@example.com", "user2@example.com"],
  // List of regular expressions to match the email body and functions to create the speech text
  // If empty, the entire email body will be read aloud
  conditions: [
    {
      regex: /\d\d?\d\d?日 (\d\d:\d\d):\d\d に◯◯学童に到着しました。/,
      speechText: (match) => `${match[1]}に学童に到着しました。`,
    },
    {
      regex: /\d\d?\d\d?日 (\d\d:\d\d):\d\d に◯◯学童を出発しました。/,
      speechText: (match) => `${match[1]}に学童を出発しました。`,
    },
  ],
};

module.exports = config;

Since config.js is listed in .gitignore, it does not exist in a fresh clone of the repository, and a static import of it would fail type checking and the build. Therefore, I defined a function that loads it with a dynamic import at runtime. If the file does not exist at that point, the import throws an error.

config.ts
import path from "node:path";

export type SpeechTextCondition = {
  readonly regex: RegExp;
  readonly speechText: (match: RegExpMatchArray) => string;
};

// Configuration file type
export type Config = {
  readonly googleClientId: string;
  readonly googleClientSecret: string;
  readonly googleHomeIp: string;
  readonly speechLanguage: string;
  readonly emails: readonly string[];
  readonly conditions: readonly SpeechTextCondition[];
};

const CONFIG_PATH = path.join(process.cwd(), "config.js");

// Use dynamic import to avoid type errors
export const loadConfig = async () =>
  (await import(CONFIG_PATH)).default as Config;
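
Because `as Config` is only a compile-time assertion, a malformed config.js would only show up later as a confusing runtime error. As a minimal sketch (not part of the repository), a runtime check could be added at load time. The helper name `assertConfig` and its error messages are assumptions; the types simply mirror the ones defined above so that the block is self-contained.

```typescript
// Mirrors the types from config.ts so the sketch is self-contained.
type SpeechTextCondition = {
  readonly regex: RegExp;
  readonly speechText: (match: RegExpMatchArray) => string;
};

type Config = {
  readonly googleClientId: string;
  readonly googleClientSecret: string;
  readonly googleHomeIp: string;
  readonly speechLanguage: string;
  readonly emails: readonly string[];
  readonly conditions: readonly SpeechTextCondition[];
};

const isRecord = (value: unknown): value is Record<string, unknown> =>
  typeof value === "object" && value !== null;

// Hypothetical helper: throws a descriptive error if the value is not a Config.
const assertConfig = (value: unknown): Config => {
  if (!isRecord(value)) throw new Error("config.js must export an object");

  const stringKeys = [
    "googleClientId",
    "googleClientSecret",
    "googleHomeIp",
    "speechLanguage",
  ];
  for (const key of stringKeys) {
    if (typeof value[key] !== "string") {
      throw new Error(`config.${key} must be a string`);
    }
  }

  const emails = value["emails"];
  if (!Array.isArray(emails) || !emails.every((e) => typeof e === "string")) {
    throw new Error("config.emails must be an array of strings");
  }

  const conditions = value["conditions"];
  const validConditions =
    Array.isArray(conditions) &&
    conditions.every(
      (c) =>
        isRecord(c) &&
        c["regex"] instanceof RegExp &&
        typeof c["speechText"] === "function"
    );
  if (!validConditions) {
    throw new Error("config.conditions must be an array of { regex, speechText }");
  }

  // Every field was checked above, so the cast is backed by a runtime check.
  return value as unknown as Config;
};
```

With such a helper, loadConfig could return `assertConfig((await import(CONFIG_PATH)).default)` instead of casting the imported value directly.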

The main code is written in TypeScript and has a transpiled dist folder for execution, but the configuration file is written in JS so that it can run without needing transpilation. By adding // @ts-check, type checking is performed on a per-file basis. Also, by separating tsconfig.json, you can run type checks in the editor or CI while excluding it from the transpilation target.

tsconfig.json
{
  "extends": "@tsconfig/node16-strictest/tsconfig.json",
  "compilerOptions": {
    "outDir": "dist"
  },
  "include": ["src", "typings"]
}
tsconfig.tsc.json
{
  "extends": "./tsconfig.json",
  "compilerOptions": {
    "noEmit": true
  },
  "include": ["config.js", "config.sample.js"]
}

Authenticating with the Gmail API

To handle Google authentication, we create an authentication module as follows. As a side note, the fs/promises module provides Promise-based versions of most of the standard fs callback methods. This is very convenient as it removes the need for util.promisify.

By setting redirectUri to urn:ietf:wg:oauth:2.0:oob, you authenticate in the browser and then paste the displayed authorization code into the console. Google has deprecated this out-of-band (OOB) flow, but it still seemed to be available for an OAuth client in test mode when I built this. It was useful during development, so I hope it remains available.

https://takuya-1st.hatenablog.jp/entry/2022/03/14/171939

authorize.ts
import fs from "node:fs/promises";
import path from "node:path";
import readline from "node:readline";

import type { Credentials } from "google-auth-library";
import { google } from "googleapis";

import type { Config } from "./config";

// File to save token information
const TOKEN_PATH = path.join(process.cwd(), ".credentials.json");

// Accept user input
const prompt = (message: string) =>
  new Promise<string>((resolve) => {
    const rl = readline.createInterface({
      input: process.stdin,
      output: process.stdout,
    });
    rl.question(message, (answer) => {
      resolve(answer.trim());
      rl.close();
    });
  });

const loadTokens = async (): Promise<Credentials | null> => {
  try {
    return JSON.parse(await fs.readFile(TOKEN_PATH, "utf-8"));
  } catch {
    return null;
  }
};

export const authorize = async (config: Config) => {
  const client = new google.auth.OAuth2({
    clientId: config.googleClientId,
    clientSecret: config.googleClientSecret,
    redirectUri: "urn:ietf:wg:oauth:2.0:oob",
  });

  // Write to file if token information is updated
  client.on("tokens", async (tokens) => {
    // Refresh token might not be included, so merge with existing token file content
    const oldTokens = (await loadTokens()) || {};
    await fs.writeFile(TOKEN_PATH, JSON.stringify({ ...oldTokens, ...tokens }));
  });

  // Load the token file
  let tokens = await loadTokens();

  // If there is no token file, prompt for authentication in the browser
  if (!tokens) {
    const url = client.generateAuthUrl({
      access_type: "offline",
      scope: "https://www.googleapis.com/auth/gmail.readonly",
    });
    const code = await prompt(
      `${url}\n\n` +
        "Open the URL above in your browser, then paste the resulting authorization code below: "
    );
    // Receive the authorization code copied and pasted by the user from the console
    const result = await client.getToken(code);
    tokens = result.tokens;
  }

  client.setCredentials(tokens);

  return client;
};

Fetching New Emails Matching Specific Conditions in Gmail

Once authentication is complete, we search the mailbox for the relevant messages using the Gmail API. By passing a search query to the q parameter, you get the same results from the API as you would by typing the query into the search field of the Gmail web interface.

The from: query is used to filter the sender's email address. In my case, since there were systems for sending entry/exit emails for other extracurricular activities besides after-school care, I made it possible to accept multiple email addresses. To perform an OR search for multiple email addresses, it seems you can write it like {from:user1@example.com from:user2@example.com}.

The mechanism for fetching new emails can also be expressed with a query. Using the after: query, you can specify "items newer than the specified date". I only discovered this while researching, but while usually only "dates" like after:2022/06/01 can be specified, you can also specify the "time" by passing a UNIX timestamp.

https://www.labnol.org/internet/gmail-search-tips/29206/

Therefore, I devised a mechanism to fetch new emails without missing any by saving the date and time of the last email check and passing that timestamp into the after: query.
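
To make the resulting query string concrete, here is a small sketch that assembles it the same way fetchMessage.ts below does, but as a pure function. The name `buildSearchQuery` is hypothetical, and the timestamp is just a sample value (2022-06-01 00:00:00 UTC).

```typescript
// Hypothetical helper that mirrors how the search query is assembled.
const buildSearchQuery = (emails: readonly string[], after: number): string => {
  const parts: string[] = [];
  if (emails.length > 0) {
    // Braces group the from: terms into an OR search.
    parts.push(`{${emails.map((mail) => `from:${mail}`).join(" ")}}`);
  }
  // A UNIX timestamp in seconds narrows the search down to a point in time.
  parts.push(`after:${after}`);
  return parts.join(" ");
};

console.log(
  buildSearchQuery(["user1@example.com", "user2@example.com"], 1654041600)
);
// → {from:user1@example.com from:user2@example.com} after:1654041600

console.log(buildSearchQuery([], 1654041600));
// → after:1654041600
```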

Finally, since gmail.users.messages.list(), which retrieves the message list, only returns message IDs, an additional request gmail.users.messages.get() is made to retrieve the message content.

fetchMessage.ts
import { google } from "googleapis";

import { authorize } from "./authorize";
import type { Config } from "./config";

export const fetchMessage = async (after: number, config: Config) => {
  // Set the authenticated OAuth2 client to the Gmail API
  const client = await authorize(config);
  const gmail = google.gmail({ version: "v1", auth: client });

  const mutableQuery: string[] = [];
  if (config.emails.length > 0) {
    // Construct the filter query for "From" email addresses
    const fromQuery = config.emails.map((mail) => `from:${mail}`).join(" ");
    mutableQuery.push(`{${fromQuery}}`);
  }
  // Construct the query for the received time filter
  mutableQuery.push(`after:${after}`);

  // Retrieve the message list
  const result = await gmail.users.messages.list({
    userId: "me",
    maxResults: 1,
    q: mutableQuery.join(" "),
  });

  const messageId = result.data.messages?.[0]?.id;
  if (!messageId) return null;

  // Retrieve message details
  const message = await gmail.users.messages.get({
    userId: "me",
    id: messageId,
  });

  return message.data;
};

Loading and Saving the Time the Email was Received

I will create a process to save and load the UNIX timestamp of the most recent time an email was received in a file named .timestamp.

timestamp.ts
import fs from "node:fs/promises";
import path from "node:path";

const TIMESTAMP_PATH = path.join(process.cwd(), ".timestamp");

export const createTimestamp = (beforeMinutes: number = 0) => {
  const d = new Date();
  d.setMinutes(d.getMinutes() - beforeMinutes);
  return Math.floor(d.getTime() / 1000);
};

export const loadTimestamp = async () => {
  try {
    const file = await fs.readFile(TIMESTAMP_PATH, "utf-8");
    const ts = parseInt(file, 10);
    if (Number.isNaN(ts)) throw new Error(`${file} is not a number`);
    return ts;
  } catch (e) {
    const ts = createTimestamp(5);
    await saveTimestamp(ts);
    return ts;
  }
};

export const saveTimestamp = async (ts: number) => {
  await fs.writeFile(TIMESTAMP_PATH, String(ts));
};

With this, the timestamp is written to the file only when there is a new email. If there are no new emails, the timestamp saved at the last hit is reused no matter how many times the script is executed, so no email can slip through between two checks. The one exception is the first run: when .timestamp does not exist yet, loadTimestamp() falls back to a timestamp five minutes before the current time and saves it.
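
The bookkeeping rule can be sketched as two pure functions, which makes it easier to reason about than the file-based version. This is only an illustration under the assumption that all values are UNIX timestamps in seconds; the names `searchFrom` and `nextStored` are hypothetical and do not exist in the repository.

```typescript
// Look-back window used when no timestamp has been stored yet (5 minutes).
const FIRST_RUN_LOOKBACK_SECONDS = 5 * 60;

// Where to search from: the stored value, or a short look-back on the first run.
const searchFrom = (stored: number | null, now: number): number =>
  stored ?? now - FIRST_RUN_LOOKBACK_SECONDS;

// What to store afterwards: advance to "now" only if a message was found.
const nextStored = (from: number, now: number, foundMessage: boolean): number =>
  foundMessage ? now : from;

// Simulate four runs, one minute apart; only the third run finds a message.
let stored: number | null = null;
const runs = [
  { now: 1000, found: false },
  { now: 1060, found: false },
  { now: 1120, found: true },
  { now: 1180, found: false },
];
for (const { now, found } of runs) {
  const from = searchFrom(stored, now);
  stored = nextStored(from, now, found);
  console.log(`now=${now} searchedFrom=${from} stored=${stored}`);
}
// → now=1000 searchedFrom=700 stored=700
// → now=1060 searchedFrom=700 stored=700
// → now=1120 searchedFrom=700 stored=1120
// → now=1180 searchedFrom=1120 stored=1120
```

The search window keeps growing until a message is found, and only then does its start move forward.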

index.ts
import { loadConfig } from "./config";
import { fetchMessage } from "./fetchMessage";
import { createTimestamp, loadTimestamp, saveTimestamp } from "./timestamp";

(async () => {
  const config = await loadConfig();

  // Create the timestamp to be saved for the next run
  const nextTs = createTimestamp();
  // Load the timestamp saved in the file
  const ts = await loadTimestamp();
  // Fetch message
  const message = await fetchMessage(ts, config);
  if (!message) return;

  // TODO: Make Google Home speak based on the email content

  // Save the timestamp only if a message was retrieved
  await saveTimestamp(nextTs);
})();

Creating the Text for Google Home to Speak from the Message Content

Reading the email subject or body aloud as-is would be the simplest implementation, but then things like the signature at the end of the body would be read aloud too. So I decided to use regular expressions to extract only the matching part of the email and read that.
Handling the full email body is cumbersome because you have to deal with rich text and Base64 encoding, so I use message.snippet instead. This field returns a short excerpt of the body as a plain string, regardless of whether the email is rich text or plain text.

https://developers.google.com/gmail/api/reference/rest/v1/users.messages?hl=en#Message.FIELDS.snippet
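
For comparison, here is a rough sketch of what reading the full body would involve instead of using snippet. It assumes the Gmail API shape in which a message payload is a tree of MIME parts whose body data is base64url-encoded; the `MessagePart` type below is a hand-written subset for illustration, and `extractPlainText` is a hypothetical helper. It ignores HTML-only emails and non-UTF-8 charsets, which is part of why snippet is the easier choice here.

```typescript
// Minimal subset of a Gmail message part, declared locally for this sketch.
type MessagePart = {
  readonly mimeType?: string;
  readonly body?: { readonly data?: string };
  readonly parts?: readonly MessagePart[];
};

// Depth-first search for the first text/plain part, decoding its base64url data.
const extractPlainText = (part: MessagePart): string | null => {
  if (part.mimeType === "text/plain" && part.body?.data) {
    return Buffer.from(part.body.data, "base64url").toString("utf-8");
  }
  for (const child of part.parts ?? []) {
    const text = extractPlainText(child);
    if (text !== null) return text;
  }
  return null;
};

// Hand-built sample payload; a real one would come from messages.get().
const samplePayload: MessagePart = {
  mimeType: "multipart/alternative",
  parts: [
    {
      mimeType: "text/html",
      body: { data: Buffer.from("<p>hello</p>", "utf-8").toString("base64url") },
    },
    {
      mimeType: "text/plain",
      body: { data: Buffer.from("hello", "utf-8").toString("base64url") },
    },
  ],
};

console.log(extractPlainText(samplePayload));
// → hello
```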

createSpeechText.ts
import type { gmail_v1 } from "googleapis";

import type { Config } from "./config";

export const createSpeechText = (
  message: gmail_v1.Schema$Message,
  config: Config
) => {
  const snippet = message.snippet;
  if (!snippet) return null;

  if (config.conditions.length === 0) {
    return snippet;
  }

  for (const condition of config.conditions) {
    const match = snippet.match(condition.regex);
    if (!match) continue;

    return condition.speechText(match);
  }
  return null;
};
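
To see how the conditions behave, here is a self-contained sketch of the same first-match-wins loop applied to plain strings. The sample snippets and the shortened regular expressions are made up for illustration; real notification emails will be worded differently, and `toSpeechText` is a hypothetical name.

```typescript
type SpeechTextCondition = {
  readonly regex: RegExp;
  readonly speechText: (match: RegExpMatchArray) => string;
};

// Sample conditions in the style of config.sample.js (wording is illustrative).
const conditions: readonly SpeechTextCondition[] = [
  {
    regex: /(\d\d:\d\d):\d\d に◯◯学童に到着しました。/,
    speechText: (match) => `${match[1]}に学童に到着しました。`,
  },
  {
    regex: /(\d\d:\d\d):\d\d に◯◯学童を出発しました。/,
    speechText: (match) => `${match[1]}に学童を出発しました。`,
  },
];

// Same loop as createSpeechText: the first matching condition wins.
const toSpeechText = (snippet: string): string | null => {
  for (const condition of conditions) {
    const match = snippet.match(condition.regex);
    if (match) return condition.speechText(match);
  }
  return null;
};

// Trailing text such as a footer is ignored because only the match is used.
console.log(
  toSpeechText("6月1日 15:02:33 に◯◯学童に到着しました。 このメールは送信専用です。")
);
// → 15:02に学童に到着しました。

// Emails that match no condition produce no speech text.
console.log(toSpeechText("来週の予定についてのお知らせ"));
// → null
```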

Making Google Home Speak

I decided to use a library called google-home-player, which easily handles creating audio data from text and streaming it to Google Cast.

When initializing GoogleHomePlayer, you need to provide the Google Home's IP address. You can check the IP address by launching the Home app on your smartphone, going to the settings screen for the target Google Home device, and displaying "Device information."

import GoogleHomePlayer from "google-home-player";

const googleHome = new GoogleHomePlayer("192.168.x.x", "ja");
await googleHome.say("Hello");

Finally, combine everything to complete it.

index.ts
import GoogleHomePlayer from "google-home-player";

import { loadConfig } from "./config";
import { createSpeechText } from "./createSpeechText";
import { fetchMessage } from "./fetchMessage";
import { createTimestamp, loadTimestamp, saveTimestamp } from "./timestamp";

(async () => {
  const config = await loadConfig();

  const nextTs = createTimestamp();
  const ts = await loadTimestamp();

  const message = await fetchMessage(ts, config);
  if (!message) return;

  const text = createSpeechText(message, config);
  if (!text) return;

  const googleHome = new GoogleHomePlayer(
    config.googleHomeIp,
    config.speechLanguage
  );
  await googleHome.say(text);

  await saveTimestamp(nextTs);
})();

Running on Raspberry Pi

For the actual device, I used a Raspberry Pi 3 Model B+ that I had on hand. I installed Raspberry Pi OS Lite, set up Wi-Fi and SSH, and made it accessible via SSH from my PC.

I developed this with Node.js v16 (LTS), so I installed the same version on the Raspberry Pi. The version in the default package repository tends to be a bit old, so I installed v16 from NodeSource. I also installed git and yarn.

$ curl -fsSL https://deb.nodesource.com/setup_16.x | sudo -E bash -
$ sudo apt-get install -y nodejs
$ sudo apt-get install -y git
$ sudo npm install -g yarn

Clone the repository and edit the configuration file.

$ git clone https://github.com/fujikky/gmail-google-home-notify.git
$ cd gmail-google-home-notify
$ cp config.sample.js config.js
$ pico config.js

After that, install the dependencies and try running it once.

$ yarn --production
$ yarn start

After you authenticate with Google in your browser and enter the authorization code into the console, the process continues. On the first run, the timestamp starts five minutes before the current time, so normally nothing is read aloud, and .timestamp is created. From then on, running yarn start periodically reads aloud emails that arrived after the time stored in .timestamp.

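For periodic execution, cron is the way to go, so I also created a wrapper script for it. The program resolves config.js, .credentials.json, and .timestamp relative to the current directory (process.cwd()), and cron does not start jobs in the repository directory, so the script changes into that directory before running yarn start.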

start.sh
#!/bin/bash -eu

# Resolve the directory containing this script (quoted so paths with spaces work)
BASEDIR=$(cd "$(dirname "$0")" && pwd)

cd "$BASEDIR"
yarn start >> /var/log/gmail-google-home-notify.log 2>&1
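
As an aside, a possible alternative to relying on the working directory (not what the repository does) would be to resolve these files against the location of the compiled code. A minimal sketch, assuming the compiled files live in a dist folder directly under the project root; `resolveFromProjectRoot` is a hypothetical helper.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolve a data file against the project root, taken to be
// the parent of the dist directory, instead of against process.cwd().
const resolveFromProjectRoot = (distDir: string, fileName: string): string =>
  path.join(distDir, "..", fileName);

// Example with a made-up install location.
console.log(resolveFromProjectRoot("/home/pi/gmail-google-home-notify/dist", ".timestamp"));
```

In a CommonJS build, `__dirname` could be passed as `distDir`, so the result would no longer depend on where cron starts the process. The wrapper script remains useful for redirecting the log output.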

Finally, add the cron settings.

$ sudo touch /var/log/gmail-google-home-notify.log
$ sudo chmod 666 /var/log/gmail-google-home-notify.log
$ crontab -e

To execute it every minute, the setting would look like this:

*/1 * * * * '/home/xxxx/gmail-google-home-notify/start.sh'

Now, if you just leave this running, Google Home will automatically read your emails for you! Please give it a try if you're interested!
