Monitoring and Mocking Requests with Puppeteer as a Local Proxy

For performance tuning, I wanted an experimental tool that rewrites network requests non-destructively, without touching the source code, so I could see how much LCP (Largest Contentful Paint) improves. I couldn't find a proxy tool that fit this purpose.

Existing local proxy tools often require DNS configuration, which I wanted to avoid because it can cause unintended side effects.

tl;dr

  • I intercepted and rewrote requests using Puppeteer's page.setRequestInterception(true).

How to capture request content from the browser

Test HTML

<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="UTF-8">
</head>

<body>
  <script type="module">
    const x = await fetch('https://jsonplaceholder.typicode.com/posts')
      .then(response => response.json())
    const el = document.createElement('pre')
    el.textContent = JSON.stringify(x, null, 2)
    document.body.appendChild(el)
  </script>
</body>

</html>

This is a simple HTML file that fetches JSON from jsonplaceholder and inserts it into the DOM as a pre element.

Suppose you want to intercept this request mid-flight and rewrite its content.

Puppeteer network interception

Installing Puppeteer itself is not covered here. The official guide to request interception is:

https://pptr.dev/guides/network-interception

Here is the Node script that uses this:

import puppeteer, { type HTTPRequest } from "puppeteer";

async function main() {
  const browser = await puppeteer.launch({
    headless: false,
    args: [
      "--no-sandbox",
      "--disable-setuid-sandbox",
      "--window-size=1024,768",
    ],
  });

  // Force termination with Ctrl-C
  process.on("SIGINT", () => {
    browser.close().finally(() => {
      process.exit(0);
    });
  });

  // Tracks in-flight requests: the same HTTPRequest instance is added on
  // "request" and removed on "requestfinished" / "requestfailed"
  const requests = new Set<HTTPRequest>();

  const page = (await browser.pages())[0];
  await page.setRequestInterception(true);
  page.on("request", (req) => {
    requests.add(req);
    if (req.isInterceptResolutionHandled()) return;
    if (req.url().endsWith("/posts")) {
      req.respond({
        status: 200,
        contentType: "application/json",
        headers: {
          "Access-Control-Allow-Origin": "*",
          "Content-Type": "application/json",
        },
        body: JSON.stringify([
          { id: 1, title: "title1" },
          { id: 2, title: "title2" },
          { id: 3, title: "title3" },
        ]),
      });
      return;
    }
    req.continue();
  });

  page.on("requestfinished", async (request) => {
    requests.delete(request);
    const response = request.response();
    if (!response) {
      return;
    }
  });
  page.on("requestfailed", (request) => {
    requests.delete(request);
  });

  await page.goto("http://localhost:4000/proxied.html", {
    waitUntil: "networkidle0",
  });

  console.log("[results]", requests.size);
  await browser.close();
}

main().catch(console.error);

The important part is below. When the page requests a URL ending in /posts, the handler responds immediately with mock data, so the request never reaches the real server. All other requests continue unchanged.

  const page = (await browser.pages())[0];
  await page.setRequestInterception(true);
  page.on("request", (req) => {
    // ...
    if (req.url().endsWith("/posts")) {
      req.respond({...});
      return;
    }
    req.continue();
  });
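
Once more than one URL needs mocking, the if-branch above can be generalized into a small rule table. The following is a minimal sketch under that assumption: `MockRule`, `MockResponse`, `jsonResponse`, and `findMock` are hypothetical names introduced for illustration, not part of Puppeteer. The lookup is kept as a plain function so it can be tried without launching a browser.

```typescript
// Hypothetical response shape; the fields mirror what the script above passes to req.respond().
type MockResponse = {
  status: number;
  contentType: string;
  headers: Record<string, string>;
  body: string;
};

// Hypothetical rule: a URL predicate plus a factory for the mock response.
type MockRule = {
  match: (url: string) => boolean;
  response: () => MockResponse;
};

// Builds a JSON mock response with the same CORS header as the script above.
function jsonResponse(data: unknown): MockResponse {
  return {
    status: 200,
    contentType: "application/json",
    headers: { "Access-Control-Allow-Origin": "*" },
    body: JSON.stringify(data),
  };
}

const rules: MockRule[] = [
  {
    match: (url) => url.endsWith("/posts"),
    response: () => jsonResponse([{ id: 1, title: "title1" }]),
  },
];

// Returns the mock for the first matching rule, or undefined to let the request through.
function findMock(url: string, mockRules: MockRule[]): MockResponse | undefined {
  return mockRules.find((rule) => rule.match(url))?.response();
}

// Intended use inside page.on("request", ...):
//   const mock = findMock(req.url(), rules);
//   if (mock) { req.respond(mock); return; }
//   req.continue();
```

Keeping the rules as data means a new mock can be added without touching the request handler itself.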

Initiator

Because Puppeteer runs on top of the Chrome DevTools Protocol, req.initiator() lets you retrieve the call stack that led up to the network request.

    if (req.url().endsWith("/posts")) {
      const initiator = req.initiator();
      console.log(
        "[mock]",
        req.url(),
        "by",
        // initiator?.url,
        initiator?.stack?.callFrames.map((frame) => {
          return `${frame.url}:${frame.functionName}:${frame.lineNumber}:${frame.columnNumber}`;
        })
      );
      req.respond({...});
      return;
    }

To get a named function into that stack, wrap the logic in proxied.html's script in a main function.

    async function main() {
      fetch('https://jsonplaceholder.typicode.com/posts')
        .then(response => response.json())
        .then(data => {
          const el = document.createElement('pre')
          el.textContent = JSON.stringify(data, null, 2)
          document.body.appendChild(el)
        })
    }
    main();

The log looks like this:

[mock] https://jsonplaceholder.typicode.com/posts by [
  'http://localhost:4000/proxied.html:main:9:6',
  'http://localhost:4000/proxied.html::17:4'
]
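
The second entry in that log has an empty function name (the `::` in the middle) because the call comes from top-level code. The sketch below shows one way the formatting could be pulled into a helper that labels such frames. `CallFrameLike` and `formatFrame` are hypothetical names, and the type is only a structural subset of the frames found under `initiator?.stack?.callFrames`. As far as I know, the DevTools Protocol reports line and column numbers as zero-based, so they may be one less than what an editor shows; the helper leaves them untouched.

```typescript
// Structural subset of a call frame from req.initiator()?.stack?.callFrames.
type CallFrameLike = {
  url: string;
  functionName: string;
  lineNumber: number;
  columnNumber: number;
};

// Hypothetical helper: formats one frame as "url:function:line:column".
function formatFrame(frame: CallFrameLike): string {
  // Top-level code has an empty functionName, so give it a visible label.
  const name = frame.functionName || "<anonymous>";
  return `${frame.url}:${name}:${frame.lineNumber}:${frame.columnNumber}`;
}

// Formats a whole stack, returning an empty array when no stack is available.
function formatStack(frames: CallFrameLike[] | undefined): string[] {
  return (frames ?? []).map(formatFrame);
}
```

In the handler above, `formatStack(initiator?.stack?.callFrames)` would replace the inline `map` call.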

Building a simple network profiler

To see which JS-initiated requests are slow and by how much, let's build a profiler that measures request durations grouped by initiator.

import puppeteer, { type HTTPRequest } from "puppeteer";
type Profile = Map<
  string,
  Array<[duration: number, started: number, ended: number]>
>;
function serializeHttpRequest(req: HTTPRequest): string {
  const method = req.method();
  const initiator = req.initiator();
  const firstFrame = initiator?.stack?.callFrames[0];
  if (!firstFrame) return `${method} <no-stack> ${req.url()}`;
  // firstFrame is known to be defined here, so optional chaining is unnecessary
  return `${method} ${firstFrame.url}:${firstFrame.functionName}:${firstFrame.lineNumber}:${firstFrame.columnNumber}`;
}

function serializeProfile(profile: Profile): string {
  const sortedUrls = Array.from(profile.keys()).sort();
  let s = "";
  let allTotal = 0;
  for (const url of sortedUrls) {
    const durations = profile.get(url)!;
    const reqsByStarted = durations.sort((a, b) => a[1] - b[1]);

    if (reqsByStarted.length === 1) {
      s += `${url}\t${reqsByStarted[0][0]}\n`;
      allTotal += reqsByStarted[0][0];
    } else {
      const sum = reqsByStarted.reduce((acc, [duration]) => acc + duration, 0);
      allTotal += sum;
      s += `${url}\ttotal:${sum}\n`;
      for (const [duration] of reqsByStarted) {
        s += `  ${duration}\n`;
      }
    }
  }
  s += `total:${allTotal}`;
  return s;
}

async function main(targetUrl: string) {
  const browser = await puppeteer.launch({
    headless: false,
    // slowMo: 30,
    args: [
      "--no-sandbox",
      "--disable-setuid-sandbox",
      "--window-size=1024,768",
    ],
  });
  process.on("SIGINT", () => {
    browser.close().finally(() => {
      process.exit(0);
    });
  });

  // Tracks in-flight requests: the same HTTPRequest instance is added on
  // "request" and removed on "requestfinished" / "requestfailed"
  const requests = new Set<HTTPRequest>();

  const page = (await browser.pages())[0];
  await page.setRequestInterception(true);
  // Durations grouped by initiator key (see serializeHttpRequest)
  const profiles: Profile = new Map();
  const requestStarted: Map<HTTPRequest, number> = new Map();

  page.on("request", (req) => {
    requests.add(req);
    requestStarted.set(req, performance.now());
    if (req.isInterceptResolutionHandled()) return;
    req.continue();
  });

  page.on("requestfinished", async (request) => {
    requests.delete(request);
    const started = requestStarted.get(request)!;
    const now = performance.now();
    const duration = now - started;
    const key = serializeHttpRequest(request);
    if (!profiles.has(key)) {
      profiles.set(key, []);
    }
    profiles.get(key)!.push([duration, started, now]);
  });
  page.on("requestfailed", (request) => {
    requests.delete(request);
    const started = requestStarted.get(request)!;
    const now = performance.now();
    const duration = now - started;
    const key = serializeHttpRequest(request);
    if (!profiles.has(key)) {
      profiles.set(key, []);
    }
    profiles.get(key)!.push([duration, started, now]);
  });
  await page.goto(targetUrl, {
    waitUntil: "networkidle0",
  });

  await page.waitForSelector("body");
  await page.waitForNetworkIdle();

  console.log("[serialized]", serializeProfile(profiles));
  await browser.close();
}

main("<url>").catch(console.error);

This is a quickly thrown-together script, so the code is a bit rough.
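
One rough spot is that the "requestfinished" and "requestfailed" handlers repeat the same bookkeeping. A minimal sketch of how that could be factored out follows. `recordDuration` is a hypothetical helper, and the type parameter `R` stands in for Puppeteer's `HTTPRequest` so the function can be exercised without a browser.

```typescript
type Profile = Map<
  string,
  Array<[duration: number, started: number, ended: number]>
>;

// Hypothetical helper shared by the "requestfinished" and "requestfailed" handlers.
// Returns false when no start time was recorded for the request.
function recordDuration<R>(
  profile: Profile,
  startedAt: Map<R, number>,
  request: R,
  key: string,
  now: number
): boolean {
  const started = startedAt.get(request);
  // Skip requests whose start was never observed instead of recording NaN.
  if (started === undefined) return false;
  // Release the request so the map does not keep growing while the page is open.
  startedAt.delete(request);
  const entries = profile.get(key) ?? [];
  entries.push([now - started, started, now]);
  profile.set(key, entries);
  return true;
}
```

Both handlers could then call `recordDuration(profiles, requestStarted, request, serializeHttpRequest(request), performance.now())`.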

Example run against https://www.nicovideo.jp/ranking (durations are in milliseconds):

OPTIONS <no-stack> https://api.nicoad.nicovideo.jp/v1/nicoadgroups      57.81432599999994
OPTIONS <no-stack> https://prebid-a.rubiconproject.com/event    25.908453999999892
POST https://micro.rubiconproject.com/prebid/dynamic/14490.js?key1=wwwnicovideojp:i:11:39025   total:1965.1232170000007
  157.67779999999993
  45.16228000000001
  409.00818700000036
  373.6880229999997
  271.213068
  138.20560100000012
  25.350596000000223
  137.82976099999996
  359.40771700000005
  25.330491000000166
  13.241911999999957
  9.00778100000025
POST https://resource.video.nimg.jp/web/scripts/bundle/vendor.js?1729155654::1:479756  63.3017040000002
POST https://www.googletagmanager.com/gtag/js?id=G-5LM4HED1NJ&l=NicoGoogleTagManagerDataLayer&cx=c:Jc:239:222  48.95321899999999
POST https://www.googletagmanager.com/gtag/js?id=G-5LM4HED1NJ&l=NicoGoogleTagManagerDataLayer&cx=c:Mc:240:212  72.90145000000007
POST https://www.googletagmanager.com/gtag/js?id=G-FS29H4ZGX2&l=NicoGoogleTagManagerDataLayer&cx=c:Jc:161:222  142.85180200000013
POST https://www.googletagmanager.com/gtag/js?id=G-FS29H4ZGX2&l=NicoGoogleTagManagerDataLayer&cx=c:Mc:162:212  102.04987099999994
total:23485.146703

Sorting this output by duration, or showing only the entries above some threshold, looks like it would make it quite useful.
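
As a rough sketch of that idea, the profile could be reduced to per-initiator totals, filtered by a threshold, and sorted slowest first. `summarizeProfile`, `ProfileSummary`, and the `minTotalMs` parameter are hypothetical names, not part of the script above; only the `Profile` shape is taken from it.

```typescript
type Profile = Map<
  string,
  Array<[duration: number, started: number, ended: number]>
>;

type ProfileSummary = { key: string; count: number; total: number };

// Hypothetical helper: totals each initiator key, drops entries whose total is
// below minTotalMs, and returns the rest sorted by total duration, slowest first.
function summarizeProfile(profile: Profile, minTotalMs = 0): ProfileSummary[] {
  const rows: ProfileSummary[] = [];
  profile.forEach((entries, key) => {
    const total = entries.reduce((acc, [duration]) => acc + duration, 0);
    if (total >= minTotalMs) {
      rows.push({ key, count: entries.length, total });
    }
  });
  return rows.sort((a, b) => b.total - a.total);
}
```

For example, `summarizeProfile(profiles, 100)` would keep only initiators whose requests add up to at least 100 ms.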

So, that was an example of using the Chrome DevTools Protocol, via Puppeteer, for local-proxy-like tasks.
