Converting Speech to PDF with NextJS and ExpressJS | CSS-Tricks

August 4, 2022


With speech interfaces becoming more of a thing, it’s worth exploring some of the things we can do with speech interactions. Like, what if we could say something and have that transcribed and pumped out as a downloadable PDF?

Well, spoiler alert: we absolutely can do that! There are libraries and frameworks we can cobble together to make it happen, and that’s what we’re going to do together in this article.

These are the tools we’re using

First off, these are the two big players: Next.js and Express.js.

Next.js tacks on additional functionalities to React, including key features for building static sites. It’s a go-to for many developers because of what it offers right out of the box, like dynamic routing, image optimization, built-in domain and subdomain routing, fast refreshes, file system routing, and API routes… among many, many other things.

In our case, we definitely need Next.js for its API routes on our client server. We want a route that takes a text file, converts it to PDF, writes it to our filesystem, then sends a response to the client.

Express.js allows us to get a little Node.js app going with routing, HTTP helpers, and templating. It’s a server for our own API, which is what we’ll need as we pass and parse data between things.

We have some other dependencies we’ll be putting to use:

react-speech-recognition: A library for converting speech to text, making it available to React components.
regenerator-runtime: A library for troubleshooting the “regeneratorRuntime is not defined” error that shows up in Next.js when using react-speech-recognition.
html-pdf-node: A library for converting an HTML page or public URL into a PDF.
axios: A library for making HTTP requests in both the browser and Node.js.
cors: A library that allows cross-origin resource sharing.

Setting up

The first thing we want to do is create two project folders, one for the client and one for the server. Name them whatever you’d like. I’m naming mine audio-to-pdf-client and audio-to-pdf-server, respectively.

The fastest way to get started with Next.js on the client side is to bootstrap it with create-next-app. So, open your terminal and run the following command from your client project folder:

npx create-next-app client

Now we need our Express server. We can get it by cd-ing into the server project folder and running the npm init command. A package.json file will be created in the server project folder once it’s done.

We still need to actually install Express, so let’s do that now with npm install express. Now we can create a new index.js file in the server project folder and drop this code in there:

const express = require("express")
const app = express()

app.listen(4000, () => console.log("Server is running on port 4000"))

Ready to run the server?

node index.js

We’re going to need one more folder and one more file to move forward:

Create a components folder in the client project folder.
Create a SpeechToText.jsx file in the components subfolder.

Before we go any further, we have a little cleanup to do. Specifically, we need to replace the default code in the pages/index.js file with this:

import Head from "next/head";
import SpeechToText from "../components/SpeechToText";

export default function Home() {
  return (
    <div className="home">
      <Head>
        <title>Audio To PDF</title>
        <meta
          name="description"
          content="An app that converts audio to pdf in the browser"
        />
        <link rel="icon" href="https://css-tricks.com/favicon.ico" />
      </Head>

      <h1>Convert your speech to pdf</h1>

      <main>
        <SpeechToText />
      </main>
    </div>
  );
}

The imported SpeechToText component will eventually be exported from components/SpeechToText.jsx.

Let’s install the other dependencies

Alright, we have the initial setup for our app out of the way. Now we can install the libraries that handle the data that’s passed around.

We can install our client dependencies with:

npm install react-speech-recognition regenerator-runtime axios

Our Express server dependencies are up next, so let’s cd into the server project folder and install those:

npm install html-pdf-node cors

Probably a good time to pause and make sure the files in our project folders are intact. Here’s what you should have in the client project folder at this point:

/audio-to-pdf-web-client
├─ /components
|  └── SpeechToText.jsx
├─ /pages
|  ├─ _app.js
|  └── index.js
└── /styles
    ├─ globals.css
    └── Home.module.css

And here’s what you should have in the server project folder:

/audio-to-pdf-server
└── index.js

Building the UI

Well, our speech-to-PDF wouldn’t be all that great if there’s no way to interact with it, so let’s make a React component for it that we can call <SpeechToText>.

You can totally use your own markup. Here’s what I’ve got to give you an idea of the pieces we’re putting together:

import React from "react";

const SpeechToText = () => {
  return (
    <>
      <section>
        <div className="button-container">
          <button type="button" style={{ "--bgColor": "blue" }}>
            Start
          </button>
          <button type="button" style={{ "--bgColor": "orange" }}>
            Stop
          </button>
        </div>
        <div
          className="words"
          contentEditable
          suppressContentEditableWarning={true}
        ></div>
        <div className="button-container">
          <button type="button" style={{ "--bgColor": "red" }}>
            Reset
          </button>
          <button type="button" style={{ "--bgColor": "green" }}>
            Convert to pdf
          </button>
        </div>
      </section>
    </>
  );
};

export default SpeechToText;

This component returns a React fragment that contains an HTML <section> element that contains three divs:

.button-container contains two buttons that will be used to start and stop speech recognition.
.words has contentEditable and suppressContentEditableWarning attributes to make this element editable and suppress any warnings from React.
Another .button-container holds two more buttons that will be used to reset and convert speech to PDF, respectively.

Styling is another thing altogether. I won’t go into it here, but you’re welcome to use some styles I wrote as a starting point for your own styles/globals.css file.


html,
body {
  padding: 0;
  margin: 0;
  font-family: -apple-system, BlinkMacSystemFont, Segoe UI, Roboto, Oxygen,
    Ubuntu, Cantarell, Fira Sans, Droid Sans, Helvetica Neue, sans-serif;
}

a {
  color: inherit;
  text-decoration: none;
}

* {
  box-sizing: border-box;
}

.home {
  background-color: #333;
  min-height: 100%;
  padding: 0 1rem;
  padding-bottom: 3rem;
}

h1 {
  width: 100%;
  max-width: 400px;
  margin: auto;
  padding: 2rem 0;
  text-align: center;
  text-transform: capitalize;
  color: white;
  font-size: 1rem;
}

.button-container {
  text-align: center;
  display: flex;
  justify-content: center;
  gap: 3rem;
}

button {
  color: white;
  background-color: var(--bgColor);
  font-size: 1.2rem;
  padding: 0.5rem 1.5rem;
  border: none;
  border-radius: 20px;
  cursor: pointer;
}

button:hover {
  opacity: 0.9;
}

button:active {
  transform: scale(0.99);
}

.words {
  max-width: 700px;
  margin: 50px auto;
  height: 50vh;
  border-radius: 5px;
  padding: 1rem 2rem 1rem 5rem;
  /* ruled-paper lines: a thin blue band that fades to white, repeated per line */
  background-image: linear-gradient(to bottom, #d9eaf3 0%, #fff 4%);
  background-position: 0 4px;
  background-size: 100% 3rem;
  background-attachment: scroll;
  position: relative;
  line-height: 3rem;
  overflow-y: auto;
}

.success,
.error {
  background-color: #fff;
  margin: 1rem auto;
  padding: 0.5rem 1rem;
  border-radius: 5px;
  width: max-content;
  text-align: center;
  display: block;
}

.success {
  color: green;
}

.error {
  color: red;
}

The CSS variables in there are being used to control the background color of the buttons.

Let’s see the latest changes! Run npm run dev in the terminal and check them out.

You should see the app in the browser when you visit http://localhost:3000.

Our first speech to text conversion!

The first action to take is to import the necessary dependencies into our <SpeechToText> component:

import React, { useRef, useState } from "react";
import SpeechRecognition, {
  useSpeechRecognition,
} from "react-speech-recognition";
import axios from "axios";

Then we check if speech recognition is supported by the browser and render a notice if not supported:

const speechRecognitionSupported =
  SpeechRecognition.browserSupportsSpeechRecognition();

if (!speechRecognitionSupported) {
  return <div>Your browser does not support speech recognition.</div>;
}
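For context, the browser's native Web Speech API exposes its constructor as SpeechRecognition or, in Chromium-based browsers, as the prefixed webkitSpeechRecognition. Here is a rough sketch of that kind of feature check. It is not the library's actual implementation, and hasNativeSpeechRecognition is a hypothetical helper used only for illustration:

```javascript
// Hypothetical helper: reports whether a window-like object exposes the
// Web Speech API constructor, either unprefixed or with the webkit prefix.
function hasNativeSpeechRecognition(win) {
  if (!win) return false; // e.g. during server-side rendering in Next.js
  return Boolean(win.SpeechRecognition || win.webkitSpeechRecognition);
}

// Guarded so the call is also safe on the server, where window is undefined.
const supported = hasNativeSpeechRecognition(
  typeof window !== "undefined" ? window : undefined
);
console.log("Native speech recognition available:", supported);
```

In the component itself, the library's own browserSupportsSpeechRecognition() check remains the one to rely on.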

Next up, let’s extract transcript and resetTranscript from the useSpeechRecognition() hook:

const { transcript, resetTranscript } = useSpeechRecognition();

This is what we need for the state that handles listening:

const [listening, setListening] = useState(false);

We also need a ref for the div with the contentEditable attribute, then we need to add the ref attribute to it and pass transcript as children:

const textBodyRef = useRef(null);

…and:

<div
  className="words"
  contentEditable
  ref={textBodyRef}
  suppressContentEditableWarning={true}
>
  {transcript}
</div>

The last thing we need here is a function that triggers speech recognition, tied to the onClick event listener of our button. The button sets listening to true and makes recognition run continuously. We’ll disable the button while it’s in that state to prevent us from firing off additional events.

const startListening = () => {
  setListening(true);
  SpeechRecognition.startListening({
    continuous: true,
  });
};

…and:

<button
  type="button"
  onClick={startListening}
  style={{ "--bgColor": "blue" }}
  disabled={listening}
>
  Start
</button>

Clicking on the button should now start up the transcription.

More functions

OK, so we have a component that can start listening. But now we need it to do a few other things as well, like stopListening, resetText and handleConversion. Let’s make those functions.

const stopListening = () => {
  setListening(false);
  SpeechRecognition.stopListening();
};

const resetText = () => {
  stopListening();
  resetTranscript();
  textBodyRef.current.innerText = "";
};

const handleConversion = async () => {}

Each of the functions will be added to an onClick event listener on the appropriate buttons:

<button
  type="button"
  onClick={stopListening}
  style={{ "--bgColor": "orange" }}
  disabled={listening === false}
>
  Stop
</button>

<div className="button-container">
  <button
    type="button"
    onClick={resetText}
    style={{ "--bgColor": "red" }}
  >
    Reset
  </button>
  <button
    type="button"
    style={{ "--bgColor": "green" }}
    onClick={handleConversion}
  >
    Convert to pdf
  </button>
</div>

The handleConversion function is asynchronous because we will eventually be making an API request. The “Stop” button has the disabled attribute, which is applied when listening is false.

If we restart the server and refresh the browser, we can now start, stop, and reset our speech transcription in the browser.

Now what we need is for the app to turn that recognized speech into a PDF file. For that, we need the server-side route from Express.js.

Setting up the API route

The purpose of this route is to take a text file, convert it to a PDF, write that PDF to our filesystem, then send a response to the client.

To set up, we open the server/index.js file and import the html-pdf-node and fs dependencies that will be used to write to and read from our filesystem, along with cors.

const HTMLToPDF = require("html-pdf-node");
const fs = require("fs");
const cors = require("cors");

Next, we will set up our route:

app.use(cors())
app.use(express.json())

app.post("/", (req, res) => {
  // etc.
})

We then proceed to define the options required in order to use html-pdf-node inside the route:

let options = { format: "A4" };
let file = {
  content: `<html><body><pre style="font-size: 1.2rem">${req.body.text}</pre></body></html>`,
};

The options object accepts a value to set the paper size and style. Paper sizes follow a much different system than the sizing units we typically use on the web. For example, A4 (210 × 297 mm) is the standard paper size for letters and documents in most countries.
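To make the difference concrete, here is a small sketch that converts paper dimensions to CSS pixels, assuming the CSS convention of 96 pixels per inch. PAPER_SIZES_MM and paperSizeInPx are hypothetical names for illustration. They are not part of html-pdf-node.

```javascript
// Paper dimensions in millimetres (A4 from ISO 216; Letter is 8.5 x 11 in).
const PAPER_SIZES_MM = {
  A4: { width: 210, height: 297 },
  Letter: { width: 215.9, height: 279.4 },
};

const MM_PER_INCH = 25.4;
const CSS_PX_PER_INCH = 96; // CSS defines 1in as 96px

// Convert a named paper size to whole CSS pixels.
function paperSizeInPx(name) {
  const size = PAPER_SIZES_MM[name];
  if (!size) throw new Error(`Unknown paper size: ${name}`);
  const toPx = (mm) => Math.round((mm / MM_PER_INCH) * CSS_PX_PER_INCH);
  return { width: toPx(size.width), height: toPx(size.height) };
}

console.log(paperSizeInPx("A4")); // { width: 794, height: 1123 }
```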

The file object accepts either the URL of a public website or HTML markup. In order to generate our HTML page, we will use the html, body, and pre HTML tags and the text from the req.body.

You can apply any styling of your choice.
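One caveat: the text from req.body is interpolated into the markup as-is, so a transcript containing characters such as < or & would be parsed as HTML. A minimal sketch of escaping the text first, where escapeHtml and buildHtml are hypothetical helpers and not part of html-pdf-node:

```javascript
// Hypothetical helper: replace the characters that are special in HTML.
function escapeHtml(text) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (char) => entities[char]);
}

// Build the same html/body/pre page, but with the user text escaped.
function buildHtml(text) {
  return `<html><body><pre style="font-size: 1.2rem">${escapeHtml(text)}</pre></body></html>`;
}

console.log(buildHtml("1 < 2 & 3 > 2"));
```

The result of buildHtml could then be used as the content value of the file object.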

Next, we will add a try...catch to handle any errors that might pop up along the way:

try {

} catch (error) {
  console.log(error);
  res.status(500).send(error);
}

Next, we will use the generatePdf function from the html-pdf-node library to generate a pdfBuffer (the raw PDF file) from our file and create a unique pdfName:

HTMLToPDF.generatePdf(file, options).then((pdfBuffer) => {
  // console.log("PDF Buffer:-", pdfBuffer);
  const pdfName = "./data/speech" + Date.now() + ".pdf";

  // Next code here
});
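Note that fs.writeFile does not create missing folders, so the directory used in pdfName has to exist before the file is written. A minimal sketch using only Node's built-in modules. makePdfPath is a hypothetical helper, and the demo directory is an assumption chosen so the snippet can run anywhere:

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: make sure the target folder exists, then return a
// timestamped file path inside it.
function makePdfPath(directory) {
  fs.mkdirSync(directory, { recursive: true }); // no error if it already exists
  return path.join(directory, "speech" + Date.now() + ".pdf");
}

// Demo: a throwaway folder under the OS temp directory.
const demoDirectory = path.join(os.tmpdir(), "audio-to-pdf-demo");
console.log(makePdfPath(demoDirectory));
```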

From there, we use the filesystem module to write, read and (yes, finally!) send a response to the client app:

fs.writeFile(pdfName, pdfBuffer, function (writeError) {
  if (writeError) {
    return res
      .status(500)
      .json({ message: "Unable to write file. Try again." });
  }

  fs.readFile(pdfName, function (readError, readData) {
    if (!readError && readData) {
      // console.log({ readData });
      res.setHeader("Content-Type", "application/pdf");
      res.setHeader("Content-Disposition", "attachment");
      res.send(readData);
      return;
    }

    return res
      .status(500)
      .json({ message: "Unable to read file. Try again." });
  });
});

Let’s break that down a bit:

The writeFile function of the filesystem module accepts a file name, data, and a callback function that receives an error if there’s an issue writing to the file. If you’re working with a CDN that provides error endpoints, you could use those instead.
The readFile function of the filesystem module accepts a file name and a callback function that receives a read error as well as the read data. Once we have no read error and the read data is present, we construct and send a response to the client. Again, this can be replaced with your CDN’s endpoints if you have them.
The res.setHeader("Content-Type", "application/pdf"); tells the browser that we’re sending a PDF file.
The res.setHeader("Content-Disposition", "attachment"); tells the browser to make the received data downloadable.
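The Content-Disposition header can also carry a suggested file name through its filename parameter. A small sketch, where pdfHeaders is a hypothetical helper that returns header name and value pairs suitable for res.setHeader:

```javascript
// Hypothetical helper: headers for sending a PDF as a named download.
function pdfHeaders(filename) {
  // Strip characters that would break the quoted filename value.
  const safeName = String(filename).replace(/["\\\r\n]/g, "");
  return {
    "Content-Type": "application/pdf",
    "Content-Disposition": `attachment; filename="${safeName}"`,
  };
}

const headers = pdfHeaders("speech.pdf");
for (const [name, value] of Object.entries(headers)) {
  console.log(`${name}: ${value}`); // in the route: res.setHeader(name, value)
}
```

In this app the client names the file itself when it triggers the download, so the file name in the header is optional here.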

Since the API route is ready, we can use it in our app at http://localhost:4000. We can then proceed to the client part of our application to complete the handleConversion function.

Handling the conversion

Before we can start working on a handleConversion function, we need to create a state that handles our API requests for loading, error, success, and other messages. We’re going to use React’s useState hook to set that up:

const [response, setResponse] = useState({
  loading: false,
  message: "",
  error: false,
  success: false,
});

In the handleConversion function, we will check that the code is running in the browser (where window is defined) before running our code, and make sure the div with the editable attribute is not empty:

if (typeof window !== "undefined") {
  const userText = textBodyRef.current.innerText;
  // console.log(textBodyRef.current.innerText);

  if (!userText) {
    alert("Please speak or write some text.");
    return;
  }
}
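Because the innerText of an editable div can consist of only spaces or line breaks, a stricter check would trim the text first. A sketch with a hypothetical hasUsableText helper:

```javascript
// Hypothetical helper: true only when the text has non-whitespace content.
function hasUsableText(text) {
  return typeof text === "string" && text.trim().length > 0;
}

console.log(hasUsableText("hello world")); // true
console.log(hasUsableText("  \n\t")); // false
```

Inside handleConversion, the condition would then read if (!hasUsableText(userText)).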

We proceed by wrapping our eventual API request in a try...catch, handling any error that may arise, and updating the response state:

try {

} catch (error) {
  setResponse({
    ...response,
    loading: false,
    error: true,
    message:
      "An unexpected error occurred. Text not converted. Please try again",
    success: false,
  });
}
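The same four fields get set at every step of the request, so one option is to describe each step with a small factory function. This is an optional sketch that assumes the state shape defined earlier. loadingState, errorState and successState are hypothetical names, not part of the original component:

```javascript
// Hypothetical factories for the shapes the `response` state can take.
const loadingState = () => ({
  loading: true,
  message: "",
  error: false,
  success: false,
});

const errorState = (message) => ({
  loading: false,
  message,
  error: true,
  success: false,
});

const successState = (message) => ({
  loading: false,
  message,
  error: false,
  success: true,
});

// In the component these would be passed to setResponse, for example
// setResponse(errorState("An unexpected error occurred."));
console.log(errorState("An unexpected error occurred."));
```

Each factory returns a complete object, so no stale fields are carried over from a previous step.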

Next, we set some values for the response state and also set config for axios and make a post request to the server:

setResponse({
  ...response,
  loading: true,
  message: "",
  error: false,
  success: false,
});
const config = {
  headers: {
    "Content-Type": "application/json",
  },
  responseType: "blob",
};

const res = await axios.post(
  "http://localhost:4000",
  {
    text: textBodyRef.current.innerText,
  },
  config
);

Once we have gotten a successful response, we set the response state with the appropriate values and instruct the browser to download the received PDF:

setResponse({
  ...response,
  loading: false,
  error: false,
  message:
    "Conversion was successful. Your download will start soon...",
  success: true,
});

// convert the received data to a file
const url = window.URL.createObjectURL(new Blob([res.data]));
// create an anchor element
const link = document.createElement("a");
// set the href of the created anchor element
link.href = url;
// add the download attribute, give the downloaded file a name
link.setAttribute("download", "yourfile.pdf");
// add the created anchor tag to the DOM
document.body.appendChild(link);
// force a click on the link to start a simulated download
link.click();
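The snippet above leaves the temporary anchor and the object URL in place after the click. Here is a sketch of a variant that cleans up afterwards and labels the Blob as a PDF. downloadBlob is a hypothetical helper, and its doc and urlApi parameters are assumptions that exist only so the function can be exercised outside a browser:

```javascript
// Hypothetical helper: trigger a download for binary data, then clean up.
function downloadBlob(data, filename, doc = document, urlApi = URL) {
  const blob = new Blob([data], { type: "application/pdf" });
  const url = urlApi.createObjectURL(blob);

  // Same steps as before: create the anchor, point it at the blob, click it.
  const link = doc.createElement("a");
  link.href = url;
  link.setAttribute("download", filename);
  doc.body.appendChild(link);
  link.click();

  // Remove the temporary anchor and release the object URL.
  doc.body.removeChild(link);
  urlApi.revokeObjectURL(url);
}

// In the browser, after the axios request:
// downloadBlob(res.data, "yourfile.pdf");
```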

And we can use the following below the contentEditable div for displaying messages:

<div>
  {response.success && <i className="success">{response.message}</i>}
  {response.error && <i className="error">{response.message}</i>}
</div>

Final code

I’ve packaged everything up on GitHub so you can check out the full source code for both the server and the client.
