[Next.js] The Hell of Implementing a Single-Pane Markdown Editor from Scratch Without Using SimpleMDE

0. What I made this time

  • Demo

https://next-markdown-murex.vercel.app/

  • What I registered on npm

https://www.npmjs.com/package/@react-libraries/markdown-editor

1. The Road to Hell: Until Opening the Door

1.1. Types of Markdown editors

Generally, there are roughly two types of Markdown editors:

  • 1-pane type
    Applies decoration to the input text itself.
    The method used by Zenn.
  • 2-pane type
    Displays text and rendering results separately (e.g., side-by-side).
    The method used by Qiita, etc.

The 1-pane type is by far the harder of the two to build. Decorating text in real time as it is typed takes an unreasonable amount of effort in a web application.

1.2. How to integrate a one-pane Markdown editor

The standard choices are SimpleMDE and its improved version EasyMDE. For those using React, ReactSimpleMDE is the go-to. Effectively, there are no other options.

1.3. EasyMDE has bad compatibility with Next.js

EasyMDE tries to access the window object as soon as the package is imported. Because window is undefined on the server, this crashes Next.js during SSR. Furthermore, creating the editor requires placing a textarea in the DOM that shouldn't strictly need to be there. At runtime, EasyMDE then builds a DOM tree containing contentEditable=true next to that textarea. This is understandable for what was originally a general-purpose JavaScript library, but the tree is completely outside the virtual DOM's control, which makes it difficult to handle from React.
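
To make the failure mode concrete: on the server there is no window, so anything that touches it at import time throws before a component ever renders. Below is a minimal sketch of the usual guard. The helper name createOnClient and its env parameter are hypothetical, introduced only for illustration; they are not part of EasyMDE or Next.js.

```typescript
// Hypothetical helper, for illustration only: run a browser-only factory
// when a `window` exists, and return null otherwise (for example during SSR).
type Env = { window?: unknown };

function createOnClient<T>(
  factory: () => T,
  // Injectable so the check can be exercised outside a browser.
  env: Env = globalThis as unknown as Env
): T | null {
  if (typeof env.window === "undefined") {
    return null; // server side: skip anything that needs the DOM
  }
  return factory();
}

// Simulated server environment: the factory is never called.
const onServer = createOnClient(() => "editor", {});
// Simulated browser environment: the factory runs.
const onClient = createOnClient(() => "editor", { window: {} });
console.log(onServer, onClient);
```

A guard like this only helps for code you control. When a third-party package touches window at import time, the import itself has to be deferred to the client.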

1.4. ReactSimpleMDE has bad compatibility with Next.js

ReactSimpleMDE is a package designed to make EasyMDE easy to use from React. In terms of behavior, it simply calls EasyMDE's functions, so it also crashes during Next.js SSR. While it can be used by lazy-loading the ReactSimpleMDE package, it requires extra effort. Additionally, it has baffling behavior, such as recreating the EasyMDE object just by changing options or text content. Even if you decide to use an EasyMDE-based library for your Markdown editor, I recommend avoiding ReactSimpleMDE.

1.5. No, well, there's no other choice... I have to do it

With limited options and nowhere to turn, I decided to build it myself. It's easy to imagine that what lies ahead is a thorny path—or rather, pure hell—but since I claim "reinventing the wheel" as my trade, I have no choice but to do it.

2. Hell's Menu

2.1. It's Easy if You Just Want to Display It

Displaying Markdown in React is easy if you use react-markdown. However, because it's only for displaying, it naturally doesn't accept any input.

2.2. The Path of a One-Pane Markdown Editor

  • Create a text editing area that can be decorated with contentEditable=true.
  • Input Markdown text.
  • Analyze Markdown content.
  • Decorate text.

If you thought "Oh, that's easy" when seeing this, you're right. This part alone is easy. The real hell starts here.

  • Decoration causes the caret (cursor) to become detached.
  • Restore the caret to its original position.

Yes, we've entered the tedious part. Once the text is decorated, the DOM is rewritten and no longer matches what was there before. After decoration, you must find the node in the new DOM that corresponds to the original position and restore the caret there. The information available is the DOM structure before decoration, the node where the caret was located, and the offset within it.

To do this, you need to count the number of characters while ignoring the DOM nodes. You have to navigate through a rendered DOM tree packed with numerous HTML nodes to find that specific location. Even more terrifying are line breaks. In plain text, you can count a line break as one character by counting \n, but block-level elements like DIV or P may or may not cause a line break depending on their surrounding context.
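
The counting idea can be sketched without a browser. The example below uses a simplified stand-in for DOM nodes (the SimpleNode type and the flatOffset function are hypothetical, not the package's code) and assumes the caret sits in a text node. It walks the tree in document order and adds up text lengths until it reaches the caret's node.

```typescript
// Simplified stand-in for DOM nodes (hypothetical type, not the real DOM API).
type SimpleNode =
  | { kind: "text"; text: string }
  | { kind: "element"; tag: string; children: SimpleNode[] };

// Walk the tree in document order and return the plain-text offset of
// (target, offsetInTarget), or -1 if target is not in the tree.
function flatOffset(root: SimpleNode, target: SimpleNode, offsetInTarget: number): number {
  let count = 0;
  let result = -1;
  const visit = (node: SimpleNode): boolean => {
    if (node === target) {
      result = count + offsetInTarget;
      return true; // found: stop walking
    }
    if (node.kind === "text") {
      count += node.text.length;
      return false;
    }
    if (node.tag === "br") {
      count += 1; // a <br> counts as one newline character
      return false;
    }
    return node.children.some(visit);
  };
  visit(root);
  return result;
}

// "# Title\n**bold**" decorated as <span><h1>…</h1>\n<strong>…</strong></span>
const boldText: SimpleNode = { kind: "text", text: "**bold**" };
const root: SimpleNode = {
  kind: "element",
  tag: "span",
  children: [
    { kind: "element", tag: "h1", children: [{ kind: "text", text: "# Title" }] },
    { kind: "text", text: "\n" },
    { kind: "element", tag: "strong", children: [boldText] },
  ],
};
// Caret two characters into the bold text: 7 ("# Title") + 1 ("\n") + 2 = 10
console.log(flatOffset(root, boldText, 2));
```

Block-level elements are deliberately left out of this sketch; they are exactly the case that makes the real calculation painful.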

However, this is only the first circle of hell.

2.3. Let's Parse the Markdown for Now

Building a Markdown parser from scratch would be way beyond a joke, so I'll use unified. I'll add remark-parse as the basic Markdown parser and remark-gfm for table parsing. I won't be converting to HTML, so this is all I need. Ultimately, I'll handle outputting ReactNodes myself.

The result of passing text through unified is a syntax tree of the Markdown. The tree contains text split up for display, but I won't (or rather, can't) use it: that text turns # Title into Title, and an editor needs the original symbols. What is useful is each node's type (such as heading, strong, or emphasis) and the offsets of the range it covers in the original text.
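
As a small illustration of why the offsets are enough, the sketch below slices the original string by start and end offsets. The ranges are hand-written to resemble the position data a Markdown parser reports; they are not output captured from unified, and the TextRange type and originalText function are hypothetical names.

```typescript
// Hand-written sample resembling the position data a Markdown parser reports;
// it is not actual output captured from unified.
type TextRange = { type: string; start: number; end: number };

const source = "# Title\n\n**bold**";
const ranges: TextRange[] = [
  { type: "heading", start: 0, end: 7 },
  { type: "strong", start: 9, end: 17 },
];

// Slice the ORIGINAL text by offsets so the Markdown symbols are kept.
function originalText(src: string, range: TextRange): string {
  return src.slice(range.start, range.end);
}

const pieces = ranges.map((r) => `${r.type}: ${originalText(source, r)}`);
console.log(pieces);
```

Because the slices come from the source string, the leading # and the surrounding ** survive, which is what an editor has to display.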

2.4. Converting Parsed Content into ReactNodes

Using the position information analyzed by unified, we build a ReactNode hierarchy without compromising the original text. We simply skip elements like paragraph or text nodes that aren't necessary for decoration. Then, we create nodes at locations like headers (heading) or bold text (strong) and wrap the text inside them.

Additionally, we count the number of nodes and set key values so that changes in the node structure can be detected when the text is edited manually.

import React from "react";
import type unist from "unist";
import type { Root, Content } from "mdast";
import { unified, Processor, Compiler } from "unified";
import remarkParse from "remark-parse";
import remarkGfm from "remark-gfm";

export type VNode = { type: string; value?: unknown; start: number; end: number };

function ReactCompiler(this: Processor) {
  const expandNode = (node: Content & Partial<unist.Parent<Content>>, nodes: VNode[]) => {
    nodes.push({
      type: node.type,
      start: node.position!.start.offset!,
      end: node.position!.end!.offset!,
      value: node.type === "heading" ? node.depth : undefined,
    });
    node.children?.forEach((n) => expandNode(n, nodes));
  };
  const reactNode = (vnodes: VNode[], value: string): React.ReactNode => {
    let position = 0;
    let index = 0;
    let nodeCount = 0;
    const getNode = (limit: number): React.ReactNode => {
      const nodes = [];
      while (position < limit && index < vnodes.length) {
        const vnode = vnodes[index];
        const [start, end] = [vnode.start, vnode.end];
        if (start > limit) {
          nodes.push(value.substring(position, limit));
          position = limit;
          break;
        }
        if (position < start) {
          // Emit the plain text that precedes the next decorated range.
          // (index < vnodes.length always holds here because of the loop condition.)
          nodes.push(value.substring(position, start));
          position = start;
        } else {
          const TagName = {
            heading: "h" + vnode.value,
            strong: "strong",
            emphasis: "em",
            inlineCode: "code",
            code: "code",
            list: "code",
            table: "code",
          }[vnode.type] as keyof JSX.IntrinsicElements;
          index++;
          if (TagName) {
            if (index < vnodes.length) {
              nodes.push(React.createElement(TagName, { key: index }, getNode(end)));
            } else {
              nodes.push(React.createElement(TagName, { key: index }, value.substring(start, end)));
              position = end;
            }
          }
        }
      }
      if (position < limit) {
        nodeCount++;
        nodes.push(value.substring(position, limit));
        position = limit;
      }
      nodeCount += nodes.length;
      return nodes.length ? nodes : null;
    };
    const nodes = getNode(value.length);
    if (!nodes) return;
    return React.createElement("span", { key: nodeCount }, nodes);
  };

  const Compiler: Compiler = (tree: unist.Node & Partial<unist.Parent<unist.Node>>, value) => {
    const nodes: VNode[] = [];
    expandNode(tree as Content, nodes);
    return reactNode(
      nodes.filter((node) => !["text", "paragraph"].includes(node.type)),
      String(value)
    );
  };
  this.Compiler = Compiler;
}
const processor = unified().use(remarkParse).use(remarkGfm).use(ReactCompiler) as Processor<
  Root,
  Root,
  Root,
  React.ReactElement
>;

export const useMarkdown = (value: string) => {
  const node = React.useMemo(() => {
    return processor.processSync(value).result;
  }, [value]);
  return node;
};

2.5. Rendering the Converted ReactNode to a contentEditable Node

Set the converted ReactNode as the children of the tag where contentEditable is configured. This will display the content. Now, this is where the real hell begins. When set to contentEditable, the content can be modified manually. Since it's a text editor, this is natural, but React cannot detect changes occurring in node content due to manual modifications, so the displayed content goes haywire. You might see the same content displayed multiple times, or encounter errors because non-deletable nodes are generated. While there are ways to avoid this using dangerouslySetInnerHTML, you lose the benefits of virtual DOM diffing.

So, here is the countermeasure. If manual updates are the problem, we will manage the data ourselves using the data sent via onKeyDown and onKeyPress!

Since we need to check the input content and apply decorations anyway, we might as well check all manually entered text at the time of input. We should also handle onPaste and onDrop entirely ourselves. If you rely on the browser's default functions, you lose; you need a heart that doubts everything and trusts no one. Yes, as you proceed through hell, your heart grows weary.

const insertText = (text?: string, start?: number, end?: number) => {
  const pos = getPosition();
  const currentText = refNode.current!.innerText;
  const startPos = start !== undefined ? start : pos[0];
  const endPos = end !== undefined ? end : start !== undefined ? start : pos[1];
  pushText(
    currentText.slice(0, startPos) + (text || "") + currentText.slice(endPos, currentText.length)
  );
  property.position = startPos + (text?.length || 0);
};
const deleteInsertText = (text: string, start: number, end: number) => {
  const pos = getPosition();
  const currentText = refNode.current!.innerText;
  if (pos[0] < start) {
    const currentText2 = currentText.slice(0, start) + currentText.slice(end, currentText.length);
    pushText(
      currentText2.slice(0, pos[0]) + text + currentText2.slice(pos[1], currentText2.length)
    );
    property.position = pos[0] + text.length;
  } else {
    const currentText2 =
      currentText.slice(0, pos[0]) + text + currentText.slice(pos[1], currentText.length);
    pushText(currentText2.slice(0, start) + currentText2.slice(end, currentText2.length));
    property.position = pos[0] + text.length + start - end;
  }
};
const deleteText = (start: number, end: number) => {
  const currentText = refNode.current!.innerText;
  const text = currentText.slice(0, start) + currentText.slice(end, currentText.length);
  pushText(text);
};
const handleInput: FormEventHandler<HTMLElement> = (e) => {
  e.preventDefault();
  const currentText = e.currentTarget.innerText;
  if (!property.active) {
    pushText(currentText);
    property.position = getPosition()[0];
  }
};
const handlePaste: ClipboardEventHandler<HTMLElement> = (e) => {
  const t = e.clipboardData.getData("text/plain").replace(/\r\n/g, "\n");
  insertText(t);
  e.preventDefault();
};
const handleDragStart: DragEventHandler<HTMLDivElement> = (e) => {
  property.dragText = e.dataTransfer.getData("text/plain");
};
const handleDrop: DragEventHandler<HTMLDivElement> = (e) => {
  // Remember the current selection (the dragged text) before moving the caret.
  const p = getPosition();
  // Resolve the drop point to a Range: caretRangeFromPoint where available,
  // otherwise the non-standard rangeParent/rangeOffset fields on the event.
  let range: Range;
  if (document.caretRangeFromPoint) {
    range = document.caretRangeFromPoint(e.clientX, e.clientY)!;
  } else {
    range = document.createRange();
    range.setStart((e.nativeEvent as any).rangeParent, (e.nativeEvent as any).rangeOffset);
  }
  const sel = getSelection()!;
  sel.removeAllRanges();
  sel.addRange(range);
  const t = e.dataTransfer.getData("text/plain").replace(/\r\n/g, "\n");
  deleteInsertText(t, p[0], p[1]);
  e.preventDefault();
};
const handleKeyDown: KeyboardEventHandler<HTMLDivElement> = (e) => {
  switch (e.key) {
    case "Tab": {
      insertText("\t");
      e.preventDefault();
      break;
    }
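    case "Enter": {
      const p = getPosition();
      if (p[0] === refNode.current!.innerText.length) {
        // At the very end of the text, insert two newlines and move the caret
        // back by one so it lands between them.
        insertText("\n\n");
        property.position--;
      } else insertText("\n");
      e.preventDefault();
      break;
    }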
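    case "Backspace":
      {
        const p = getPosition();
        // With a selection, delete only the selection; otherwise delete the
        // character before the caret.
        const start = p[0] === p[1] ? Math.max(p[0] - 1, 0) : p[0];
        const end = Math.min(p[1], refNode.current!.innerText.length);
        deleteText(start, end);
        property.position = start;
        e.preventDefault();
      }
      break;
    case "Delete":
      {
        const p = getPosition();
        // With a selection, delete only the selection; otherwise delete the
        // character after the caret.
        deleteText(p[0], p[0] === p[1] ? p[1] + 1 : p[1]);
        property.position = p[0];
        e.preventDefault();
      }
      break;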
    case "z":
      if (e.ctrlKey && !e.shiftKey) {
        undoText();
      }
      break;
    case "y":
      if (e.ctrlKey && !e.shiftKey) {
        redoText();
      }
      break;
  }
};
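
The handlers above call pushText, undoText, and redoText, which the listing does not show. The following is a minimal sketch of how such a history could work, assuming two plain stacks. The TextHistory class is hypothetical and only mirrors those names; it is not the package's actual implementation.

```typescript
// Hypothetical undo/redo history. The method names mirror the handlers above,
// but the implementation is an assumption, not the package's actual code.
class TextHistory {
  private undoStack: string[] = [];
  private redoStack: string[] = [];
  private current: string;

  constructor(initial: string = "") {
    this.current = initial;
  }

  get text(): string {
    return this.current;
  }

  // Record a new state; any pending redo history becomes invalid.
  pushText(next: string): void {
    if (next === this.current) return;
    this.undoStack.push(this.current);
    this.current = next;
    this.redoStack = [];
  }

  // Step back one state, if there is one.
  undoText(): string {
    const prev = this.undoStack.pop();
    if (prev !== undefined) {
      this.redoStack.push(this.current);
      this.current = prev;
    }
    return this.current;
  }

  // Step forward one state, if there is one.
  redoText(): string {
    const next = this.redoStack.pop();
    if (next !== undefined) {
      this.undoStack.push(this.current);
      this.current = next;
    }
    return this.current;
  }
}

const history = new TextHistory("a");
history.pushText("ab");
history.pushText("abc");
console.log(history.undoText(), history.redoText());
```

Since every edit already goes through a single function that receives the full text, keeping whole-string snapshots is the simplest design. A real editor would probably also need to store the caret position with each snapshot.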

2.6. Nodes at the Cursor Position

The one browser feature we still have to rely on is caret position management. If we don't track where the caret is, our text changes will no longer line up with the actual position. Since the caret lives inside DOM nodes, we must walk the DOM to save and restore the current position. The nuisance here is block elements, which make line-break detection difficult.

  const movePosition = (editor: HTMLElement, start: number, end?: number) => {
    const selection = document.getSelection();
    if (!selection) return;
    const findNode = (node: Node, count: number): [Node | null, number] => {
      if (node.nodeType === Node.TEXT_NODE) {
        count -= node.textContent!.length;
      } else if (node.nodeName === 'BR') {
        count -= 1;
      }
      if (count <= 0) {
        return [node, (node.nodeType === Node.TEXT_NODE ? node.textContent!.length : 0) + count];
      }
      for (let i = 0; i < node.childNodes.length; i++) {
        const [n, o] = findNode(node.childNodes[i], count);
        if (n) return [n, o];
        count = o;
      }
      return [null, count];
    };
    const [targetNode, offset] = findNode(editor, start);
    const [targetNode2, offset2] = end !== undefined ? findNode(editor, end) : [null, 0];
    const range = document.createRange();
    try {
      if (targetNode) {
        range.setStart(targetNode, offset);
        if (targetNode2) range.setEnd(targetNode2, offset2);
        selection.removeAllRanges();
        selection.addRange(range);
      } else {
        range.setStart(refNode.current!, 0);
        selection.removeAllRanges();
        selection.addRange(range);
      }
    } catch (e) {
      console.error(e);
    }
  };
  const getPosition = () => {
    const selection = document.getSelection();
    if (!selection) return [0, 0] as const;
    const getPos = (end = true) => {
      const [targetNode, targetOffset] = end
        ? [selection.anchorNode, selection.anchorOffset]
        : [selection.focusNode, selection.focusOffset];
      const findNode = (node: Node) => {
        if (node === targetNode && (node !== refNode.current || !targetOffset)) {
          return [true, targetOffset] as const;
        }
        let count = 0;
        for (let i = 0; i < node.childNodes.length; i++) {
          const [flag, length] = findNode(node.childNodes[i]);
          count += length;
          if (flag) return [true, count] as const;
        }
        count +=
          node.nodeType === Node.TEXT_NODE
            ? node.textContent!.length
            : node.nodeName === 'BR' || node.nodeName === 'DIV' || node.nodeName === 'P'
            ? 1
            : 0;
        return [false, count] as const;
      };
      const p = findNode(refNode.current!);
      return p[0] ? p[1] : p[1] - 1;
    };
    // Assumed ending (the original listing is cut off here): return the two
    // offsets in ascending order so callers can treat them as [start, end].
    const [a, b] = [getPos(true), getPos(false)];
    return [Math.min(a, b), Math.max(a, b)] as const;
  };

So, here is the countermeasure. Get rid of all block elements and build everything from preformatted text (white-space: pre-wrap) and newline characters!

Since we are also managing onKeyDown ourselves, it's easy to stop the browser from inserting p or div tags and to insert newline characters instead. With that, block elements no longer exist in the caret calculations.

outline: none;
white-space: pre-wrap;
code,
p,
div,
h1,
h2,
h3,
h4,
h5,
h6 {
  display: inline;
}
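
Once every line break is a literal \n, the Enter handling shown earlier reduces to string arithmetic. The sketch below restates that rule as a pure function; applyEnter is a hypothetical name, not a function from the package. The doubled newline at the end of the text follows the original handler. My reading is that a single trailing newline would not show up as a new empty line, but that is an assumption.

```typescript
// Pure-string restatement of the Enter rule from the key handler
// (hypothetical function, for illustration only).
// Returns the new text and the new caret offset.
function applyEnter(
  text: string,
  start: number,
  end: number = start
): { text: string; caret: number } {
  // At the very end of the text the handler inserts two newlines and steps
  // the caret back by one; everywhere else it inserts a single newline.
  const newline = start === text.length ? "\n\n" : "\n";
  return {
    text: text.slice(0, start) + newline + text.slice(end),
    caret: start + 1,
  };
}

const atEnd = applyEnter("abc", 3);
const inMiddle = applyEnter("abc", 1);
console.log(JSON.stringify(atEnd), JSON.stringify(inMiddle));
```

Either way the caret ends up one character past where Enter was pressed, so no block element ever enters the calculation.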

2.7. Whoa, It's Done

There were some more complications around onDrop, but anyway, I managed to create something that works. It seems I've finally escaped hell. Now, let's create an npm package.

2.8. The ESM Hell That Wasn't Over Yet

Once I turned it into an npm package and placed it in node_modules, it stopped working in Next.js. The reason is that unified, the library I'm using for Markdown analysis, is in ESM format. Explaining this in detail would take a while, but typical Node.js packages are in CJS format, so mixing them with ESM leads to trouble. While the code lives inside the app, Webpack handles the mix nicely during bundling. Once it is split out into a separate package, it's not that simple.

As a solution, I considered two options: making imports asynchronous to call ESM from CJS, or publishing this npm package itself as ESM. I ultimately chose the latter. From there, I ran into a troublesome phenomenon: @emotion, which I was using precisely because it supports both CJS and ESM, caused bugs. When a dual-format package is imported from ESM under Next.js, what gets loaded differs between server-side and client-side processing. As a result, the imported value sometimes had a default property and sometimes didn't. Anyway, I took measures against that as well.

import type { CreateStyled } from "@emotion/styled";
import styled from "@emotion/styled";

export const Root = (typeof styled === "function"
  ? styled
  : (styled as { default: CreateStyled }).default)("div")``;
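
The same check can be written as a small reusable helper. This is a sketch generalizing the workaround above; unwrapDefault and AnyFn are hypothetical names, not part of @emotion or Next.js. It accepts either the function itself or a module-like wrapper that holds the function under default.

```typescript
// Hypothetical helper generalizing the workaround above: accept either the
// exported function itself or a wrapper object that holds it under `default`.
type AnyFn = (...args: never[]) => unknown;

function unwrapDefault<T extends AnyFn>(mod: T | { default: T }): T {
  return typeof mod === "function" ? mod : (mod as { default: T }).default;
}

// Both shapes resolve to the same function.
const increment = (x: number): number => x + 1;
const direct = unwrapDefault(increment);
const wrapped = unwrapDefault({ default: increment });
console.log(direct === wrapped);
```

This treats the symptom rather than the cause, but it keeps the interop check in one place if several dual-format imports need it.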

3. Return from Hell

It turned out to be a journey of about a week, but I managed to make it back alive. Now I have to write documentation and create additional features.

For now, what I've built is already registered on npm, as shown in the link at the beginning.

Start running with the reinvented wheel!
