Saturday, August 19, 2023

Extracting Meta Tag Information Using JavaScript


Introduction

When building, analyzing, or scraping web pages, it is often necessary to extract meta tag information. These tags provide data about the HTML document, such as descriptions, keywords, author information, and more.

In this Byte, we'll explain how to extract this data using JavaScript.

Retrieving Meta Tag Data

To retrieve meta tag data, we can use the querySelector() method in JavaScript. This method returns the first Element within the document that matches the specified selector or group of selectors.

Here's an example:

let metaDescription = document.querySelector('meta[name="description"]')
                      .getAttribute("content");
console.log(metaDescription);

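In this code, we query for a meta tag whose name is "description" and then read the "content" attribute of that tag. The console will output the description of the page. Note that querySelector() returns null when no matching tag exists, in which case the getAttribute() call throws a TypeError.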
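Because a page may not define the tag you ask for, it can help to wrap the lookup above in a small null-safe helper. The following is a minimal sketch, not part of the original example: the function name getMetaContent and its optional root parameter are hypothetical, and it assumes the tag name contains no double quotes.

```javascript
// Hypothetical helper: returns the "content" of <meta name="..."> or null.
// `root` defaults to the browser's document; any object exposing
// querySelector() can be passed instead.
function getMetaContent(name, root = document) {
    const tag = root.querySelector(`meta[name="${name}"]`);
    // querySelector() returns null when nothing matches, so guard the call.
    return tag ? tag.getAttribute("content") : null;
}

// Only runs where a global document exists (for example, in a browser).
if (typeof document !== "undefined") {
    console.log(getMetaContent("description"));
}
```

Returning null instead of throwing lets the caller decide how to treat a missing tag, for example by falling back to the page title.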

Working with Open Graph (OG) Meta Tags

Open Graph meta tags are used to enrich the "preview" of a webpage on social media or in a messenger. They let you specify the title, description, and image that will be used when your page is shared.

To fetch the Open Graph title of a page, you can use the following code:

let ogTitle = document.querySelector("meta[property='og:title']")
              .getAttribute("content");
console.log(ogTitle);

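This code fetches the Open Graph title of the page and prints it to the console. Unlike standard meta tags, Open Graph tags are identified by their "property" attribute rather than "name", which is why the selector differs.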
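If you need every Open Graph value rather than just the title, one option is to match all tags whose property starts with "og:" and collect them into an object. This is a sketch under assumptions: the function name collectOpenGraph and its root parameter are hypothetical, and a later duplicate property simply overwrites an earlier one.

```javascript
// Hypothetical helper: gathers all Open Graph tags into a plain object,
// e.g. { "og:title": "...", "og:image": "..." }.
function collectOpenGraph(root = document) {
    const data = {};
    // The ^= attribute selector matches values that begin with "og:".
    for (const tag of root.querySelectorAll('meta[property^="og:"]')) {
        data[tag.getAttribute("property")] = tag.getAttribute("content");
    }
    return data;
}

// Only runs where a global document exists (for example, in a browser).
if (typeof document !== "undefined") {
    console.log(collectOpenGraph());
}
```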

Fetching Data from All Document Meta Tags

If you want to fetch data from all the meta tags in a document, you can use the getElementsByTagName() method, which returns a live HTMLCollection of elements with the given tag name.

Here's how you can do it:

let metaTags = document.getElementsByTagName("meta");

for (let i = 0; i < metaTags.length; i++) {
    console.log(metaTags[i].getAttribute("name") + " : " + metaTags[i].getAttribute("content"));
}

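This code outputs the "name" and "content" attributes of every meta tag in the document. Tags that have no "name" attribute, such as Open Graph tags that use "property" instead, are printed with a name of null.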
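Printing each tag is fine for inspection, but structured data is usually easier to work with. The sketch below builds an object keyed by each tag's "name", falling back to "property", and skips tags that have neither (such as a charset declaration). The function name metaTagsToObject and its root parameter are hypothetical.

```javascript
// Hypothetical helper: maps each meta tag's name (or property) to its content.
function metaTagsToObject(root = document) {
    const result = {};
    // Array.from() takes a static snapshot of the live HTMLCollection.
    for (const tag of Array.from(root.getElementsByTagName("meta"))) {
        const key = tag.getAttribute("name") || tag.getAttribute("property");
        if (key) {
            result[key] = tag.getAttribute("content");
        }
    }
    return result;
}

// Only runs where a global document exists (for example, in a browser).
if (typeof document !== "undefined") {
    console.log(metaTagsToObject());
}
```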

Retrieving Meta Tags Using Node.js

Up until this point, we've seen how to extract meta tag data using JavaScript in the browser. We know this because all of the examples have used the document object, which is only available in browser environments. Let's now see how you can do this from a different JavaScript runtime, like Node.

Assuming you have Node and npm on your machine, install the axios and cheerio libraries:

$ npm install axios cheerio

Link: To learn more about how to use the Axios library, read our article, Making Asynchronous HTTP Requests in JavaScript with Axios.

To learn more about Cheerio.js, see our guide, Build a Web-Scraped API with Express and Cheerio.

Load the libraries into your script using the require function:

const axios = require('axios');
const cheerio = require('cheerio');

Now we'll use Axios to fetch the web page we're interested in. axios.get() returns a promise, so make sure to handle it properly with async/await or a .then() block.

// await needs an async function when the script is loaded as CommonJS
async function fetchPage() {
    try {
        const response = await axios.get('https://example.com');

        // Extract the page data here...
    } catch (error) {
        console.error(error);
    }
}

fetchPage();

Now we'll use Cheerio.js to extract the meta tags from the HTML we've fetched. If you've ever worked with jQuery, you'll notice how similar Cheerio.js is.

// await needs an async function when the script is loaded as CommonJS
async function printMetaTags() {
    try {
        const response = await axios.get('https://example.com');
        // response.data holds the HTML body returned by the server
        const $ = cheerio.load(response.data);
        const metaTags = $('meta');

        metaTags.each((i, tag) => {
            const name = $(tag).attr('name');
            const content = $(tag).attr('content');
            console.log(`Meta name: ${name}, content: ${content}`);
        });
    } catch (error) {
        console.error(error);
    }
}

printMetaTags();

What we've done here is load the HTML response into Cheerio and then grab all of the meta tags. We looped through each one and printed its "name" and "content" attributes. You can easily modify this code to grab other attributes or structure the data as needed.
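As one example of structuring the data, the sketch below turns the tags into a plain object instead of printing them. It takes an already loaded Cheerio instance, so it has no dependency of its own. The function name cheerioMetaToObject is hypothetical, and the "property" fallback for Open Graph tags is an addition that the example above does not include.

```javascript
// Hypothetical helper: builds { key: content } from a loaded Cheerio document,
// where key is the tag's "name" or, failing that, its "property".
function cheerioMetaToObject($) {
    const result = {};
    $('meta').each((i, tag) => {
        const key = $(tag).attr('name') || $(tag).attr('property');
        const content = $(tag).attr('content');
        // attr() returns undefined for a missing attribute; skip such tags.
        if (key && content !== undefined) {
            result[key] = content;
        }
    });
    return result;
}

// Usage, assuming `html` holds the fetched page source:
//   const $ = require('cheerio').load(html);
//   console.log(cheerioMetaToObject($));
```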

Conclusion

In this Byte, we've explored how to extract meta tag information from a webpage using JavaScript. We covered how to retrieve specific meta tag data, work with Open Graph tags, and fetch data from all meta tags in a document.

We also saw how to extract meta tag information in other JavaScript runtimes, like Node.js, using the Axios and Cheerio.js libraries.
