Punycode


Background

While working on an RSS feed recently (Next.js does not have an official RSS parser plugin), I realised that the generated links containing Chinese characters failed validation. Raw UTF-8 encoded Chinese characters are not considered valid in a URL.

What I've Learned

Punycode is a special encoding used to convert Unicode characters to ASCII, which is a smaller, restricted character set. It is used to encode internationalized domain names (IDNs), that is, the host part of a URL. It is not meant for the path or query string.

https://www.punycoder.com/
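To make the scope of Punycode concrete, here is a minimal sketch using the global `URL` class available in Node.js and browsers. The helper name `hostToAscii` and the sample domain `中文.com` are illustrative assumptions, not part of the original feed code.

```javascript
// Hypothetical helper: returns the ASCII (Punycode) form of a hostname.
// The URL parser applies Punycode to the host part only.
function hostToAscii(hostname) {
  return new URL(`https://${hostname}/`).hostname;
}

console.log(hostToAscii("中文.com"));    // xn--fiq228c.com
console.log(hostToAscii("example.com")); // example.com (already ASCII, unchanged)
```

Each non-ASCII label of the domain gets the `xn--` prefix, while plain ASCII labels pass through untouched.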

How to recreate the problem?

Punycode can encode Unicode strings into ASCII symbols; however, the resulting URL does not work as expected. For example, a Chinese blog link like https://hk.news.yahoo.com/65個指明地方納入強制檢測公告-004959684.html was converted into https://hk.news.yahoo.com/xn--65-q43c84c9a92hp8grphq2u14i5uhd3aq94bo6lkm8c-004959684.html instead of the percent-encoded form https://hk.news.yahoo.com/65%E5%80%8B%E6%8C%87%E6%98%8E%E5%9C%B0%E6%96%B9%E7%B4%8D%E5%85%A5%E5%BC%B7%E5%88%B6%E6%AA%A2%E6%B8%AC%E5%85%AC%E5%91%8A-004959684.html. Punycode only applies to domain names; non-ASCII characters in the path have to be percent-encoded instead.
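The percent-encoded form can be reproduced with the built-in `encodeURI`, which leaves URL delimiters such as `:` and `/` alone and escapes non-ASCII characters as UTF-8 bytes. This is a minimal sketch using the example link from this post; the variable names are illustrative.

```javascript
// The example link from this post, with raw Chinese characters in the path.
const raw =
  "https://hk.news.yahoo.com/65個指明地方納入強制檢測公告-004959684.html";

// encodeURI percent-encodes the non-ASCII characters and keeps ":" and "/".
const encoded = encodeURI(raw);

console.log(encoded);
console.log(decodeURI(encoded) === raw); // true: the encoding is reversible
```

Note that `encodeURI` is meant for whole URLs. For a single path segment that may contain `/`, `?` or `#`, `encodeURIComponent` is usually the safer choice.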

Action Items

The following code does not work as expected. Only the variable was converted, rather than the whole URI string.

fs.readdirSync(blogPostDir)
  .map((fileName) => {
    // we need the full path of the file to be able to read it
    const fullPath = path.join(blogPostDir, fileName);
    // read the file so we can grab the front matter
    const file = fs.readFileSync(fullPath, "utf8");

    // for the RSS feed we don't need the html, we
    // just want the attributes
    const { data: frontmatter, content } = matter(file);
    const excerpt = getExcerpt(content, 800);
    const url = CONFIG.URL + `/posts/` + fileName.replace(".md", "") + `/`;
    // console.log(excerpt);
    // I want access to the fileName later on so we save it to our object
    return { ...frontmatter, fileName, excerpt, url };
  })
  // sort the items by date in descending order, feel free
  // to customize this as needed to sort your RSS items properly
  .filter((post) => post.draft === false)
  .sort((a, b) => +new Date(b.date) - +new Date(a.date))
  // loop over each blog post and add it to our RSS feed
  .forEach(({ title, date, tags, category, fileName, excerpt, url }) => {
    // title, description, and date are properties of my front matter
    // attributes. Yours might be different depending on your data structure
    feed.item({
      title,
      description: excerpt,
      url: punycode.encode(url),
      author: CONFIG.AUTHOR_NAME,
      categories: tags,
      date,
    });
  });
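
One possible fix, sketched under assumptions: build the post URL with the global `URL` class, which Punycode-encodes the host and keeps the percent-encoded path intact. The helper name `buildPostUrl` is hypothetical, and the sketch assumes the base URL (such as `CONFIG.URL`) has no path prefix and that post files end in `.md`.

```javascript
// Hypothetical helper: builds an ASCII-safe URL for a post file name.
// Assumes baseUrl is an origin like "https://example.com" (no path prefix).
function buildPostUrl(baseUrl, fileName) {
  // Strip the Markdown extension to get the slug.
  const slug = fileName.replace(/\.md$/, "");
  // Percent-encode the slug as one path segment; new URL() then
  // Punycode-encodes the host and leaves the "%XX" sequences as they are.
  return new URL(`/posts/${encodeURIComponent(slug)}/`, baseUrl).href;
}

console.log(buildPostUrl("https://example.com", "公告.md"));
// https://example.com/posts/%E5%85%AC%E5%91%8A/
console.log(buildPostUrl("https://example.com", "hello-world.md"));
// https://example.com/posts/hello-world/
```

In the feed code, the item could then use something like `url: buildPostUrl(CONFIG.URL, fileName)` in place of `punycode.encode(url)`.
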
Last Updated February 11, 2022
Tags
punycode, frontend, questions
