@argyleink
Collaborator

@argyleink argyleink commented Jan 19, 2026

If an RSS description turns out to be plain text (unformatted and hard to read), reformat it as HTML.

Summary by CodeRabbit

  • New Features
    • RSS descriptions are now automatically transformed to properly formatted HTML. Plain-text content is intelligently parsed and converted, with support for timestamp grouping, bold section headers, automatic link detection and conversion to clickable anchors, and proper HTML escaping for safe content rendering.

@vercel

vercel bot commented Jan 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| starpod | Ready | Preview, Comment | Jan 19, 2026 6:40pm |

@coderabbitai

coderabbitai bot commented Jan 19, 2026

📝 Walkthrough

A new TypeScript module provides utilities to transform plain text and RSS descriptions to HTML, including functions for escaping HTML entities, detecting HTML-like content, and converting plain text with timestamps, headers, and links. The RSS module now conditionally applies this transformation when content isn't already HTML.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| RSS Transformation Utilities: src/lib/rss-transform.ts | New module with three exported functions: escapeHtml() for safe HTML encoding, looksLikeHtml() for content-type detection, and transformPlainTextToHtml() for multi-part transformation (timestamps grouped as lists, bold headers detected, link detection with anchor tags, paragraphs as default). |
| RSS Module Integration: src/lib/rss.ts | Imports transformation utilities and adds conditional logic to check if description content resembles HTML; applies transformPlainTextToHtml() for plain text, otherwise preserves original content. |
| Unit Test Coverage: tests/unit/rss.test.ts | New comprehensive test suite validating escaping behavior, HTML detection heuristics, and multi-scenario transformation tests covering headers, timestamps, links, blank input, and proper entity escaping. |
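
The bodies of escapeHtml() and looksLikeHtml() are not shown anywhere in this thread. As a rough illustration of what helpers with those names typically do, here is a minimal sketch; the regex and the set of escaped characters are assumptions, not the code from src/lib/rss-transform.ts.

```typescript
// Hypothetical sketch: the real src/lib/rss-transform.ts is not shown in this thread.

/** Escape the characters that are significant in HTML text and attribute values. */
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

/** Heuristic (assumed): treat content as HTML if it contains something shaped like a tag. */
function looksLikeHtml(text: string): boolean {
  return /<\/?[a-z][a-z0-9]*(\s[^<>]*)?\/?>/i.test(text);
}

console.log(escapeHtml('<a href="x">Tom & Jerry</a>'));
console.log(looksLikeHtml("<p>Hello</p>"), looksLikeHtml("1 < 2 and 3 > 2"));
```

A tag-shaped heuristic like this keeps prose containing a bare "<" or ">" on the plain-text path, which matters for the truncation issue raised later in the review.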

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A hop through the plaintext, a bound through the bold,
Timestamps and headers transformed to behold,
Links wrapped in anchors, escapes nice and clean,
The finest HTML transformation I've seen!

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Plaintext RSS fixer upper' is vague and uses colloquial language without clearly specifying what is being fixed or how. | Use a more descriptive title that clarifies the specific change, such as 'Transform plaintext RSS content to HTML' or 'Add plaintext RSS to HTML conversion utilities'. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/lib/rss-transform.ts`:
- Around line 35-63: The current link-list detection treats any line containing
"http" as a list item, dropping trailing text; update the initial test and the
while loop in rss-transform.ts to only treat a line as a link-list entry when
the entire line matches either "URL" or "Label: URL" (e.g., use a stricter regex
that anchors the whole line such as matching optional "Label:" then a single URL
with no extra text). Apply the same anchored regex in both the outer if and the
while condition that builds linkItems, and then keep the existing parsing logic
(item.match(/^(.+?):\s*(https?:\/\/.+)$/) and fallback URL-only match) to render
the <ul> only for those fully-matching lines so inline sentences with trailing
text are left untouched.

In `@src/lib/rss.ts`:
- Around line 111-115: The truncation currently always runs
htmlToText(description), which strips literal '<'/'>' in plain text; change the
logic so description is truncated from htmlToText(description) only when
looksLikeHtml(description) is true, otherwise truncate the raw description
string; update the description assignment (alongside content which uses
looksLikeHtml and transformPlainTextToHtml) to use a conditional: if
looksLikeHtml(description) use htmlToText(description) for truncation, else use
description directly, then pass that result into truncate(..., 260).
🧹 Nitpick comments (1)
src/lib/rss-transform.ts (1)

27-33: Header handling doesn’t match the comment.
Line 27 says plain “Links” is supported, but the regex only accepts **...**. Either update the comment or add support for unbolded headers to avoid confusion.

💡 Minimal fix (comment alignment)
-    // Check if this is a section header (bold text like "**Links**" or just "Links")
+    // Check if this is a section header (bold text like "**Links**")
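
The header regex itself is not quoted in this thread; the review only says it accepts `**...**`. A minimal sketch of a bold-only header check, with the pattern and the function name `parseBoldHeader` both assumed for illustration, shows why a plain "Links" line would fall through:

```typescript
// Hypothetical pattern: the review only states that the real regex accepts **...**.
const boldHeader = /^\*\*(.+?)\*\*$/;

/** Return the header text for a "**Header**" line, or null for anything else. */
function parseBoldHeader(line: string): string | null {
  const match = line.trim().match(boldHeader);
  return match ? (match[1] ?? "").trim() : null;
}

console.log(parseBoldHeader("**Links**")); // header text without the asterisks
console.log(parseBoldHeader("Links"));     // null: unbolded text is not a header
```

Supporting unbolded headers would need some extra signal (for example a known list of section names), since a bare word is otherwise indistinguishable from a short paragraph.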

Comment on lines +35 to +63
    // Check if this looks like a link list item (e.g., "CodeRabbit: https://...")
    if (/:?\s*https?:\/\//.test(line)) {
      // Collect all consecutive link lines into a list
      const linkItems: string[] = [];
      while (i < lines.length && /:?\s*https?:\/\//.test(lines[i])) {
        linkItems.push(lines[i]);
        i++;
      }
      html.push('<ul>');
      linkItems.forEach(item => {
        // Parse "Label: URL" or just "URL"
        const match = item.match(/^(.+?):\s*(https?:\/\/.+)$/);
        if (match) {
          const label = escapeHtml(match[1].trim());
          const url = escapeHtml(match[2].trim());
          html.push(`<li>${label}: <a href="${url}">${url}</a></li>`);
        } else {
          const urlMatch = item.match(/(https?:\/\/.+)/);
          if (urlMatch) {
            const url = escapeHtml(urlMatch[1].trim());
            html.push(`<li><a href="${url}">${url}</a></li>`);
          } else {
            html.push(`<li>${escapeHtml(item)}</li>`);
          }
        }
      });
      html.push('</ul>');
      continue;
    }

⚠️ Potential issue | 🟡 Minor

Inline-URL sentences get coerced into link lists (drops surrounding text).
Any line containing http:// or https:// becomes a list item, so a sentence like “Sponsor: https://x (use code …)” has its trailing text folded into the link URL, and text that precedes an unlabeled URL is dropped. If that’s not intended, tighten detection to only match lines that are just a URL or “Label: URL”.
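
To make the failure mode concrete, the sketch below runs the per-item logic from the snippet above on two sample sentences. The regexes are copied from that snippet; `escapeHtml` is a minimal stand-in and the sample strings and the name `renderOriginalItem` are made up for illustration.

```typescript
// Minimal stand-in for the module's escapeHtml (assumed; its real body is not shown here).
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// The unanchored detection regex from the snippet above.
const looseLinkLine = /:?\s*https?:\/\//;

// Per-item rendering, following the forEach body in the snippet above.
function renderOriginalItem(item: string): string {
  const match = item.match(/^(.+?):\s*(https?:\/\/.+)$/);
  if (match) {
    const label = escapeHtml((match[1] ?? "").trim());
    const url = escapeHtml((match[2] ?? "").trim());
    return `<li>${label}: <a href="${url}">${url}</a></li>`;
  }
  const urlMatch = item.match(/(https?:\/\/.+)/);
  if (urlMatch) {
    const url = escapeHtml((urlMatch[1] ?? "").trim());
    return `<li><a href="${url}">${url}</a></li>`;
  }
  return `<li>${escapeHtml(item)}</li>`;
}

// Hypothetical sample lines: one with trailing text, one with leading text.
const trailing = "Sponsor: https://x.dev (use code POD)";
const leading = "Check out https://x.dev today";

console.log(looseLinkLine.test(trailing), looseLinkLine.test(leading));
console.log(renderOriginalItem(trailing));
console.log(renderOriginalItem(leading));
```

Both samples pass the loose test. In the first, "(use code POD)" ends up inside the href because `.+` runs to the end of the line; in the second, "Check out " disappears because the fallback keeps only the text from the URL onward.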

🔧 Safer link-line detection
-    if (/:?\s*https?:\/\//.test(line)) {
+    const linkLineRegex = /^(?:[^:]+:\s*)?https?:\/\/\S+\s*$/;
+    if (linkLineRegex.test(line)) {
       // Collect all consecutive link lines into a list
       const linkItems: string[] = [];
-      while (i < lines.length && /:?\s*https?:\/\//.test(lines[i])) {
+      while (i < lines.length && linkLineRegex.test(lines[i])) {
         linkItems.push(lines[i]);
         i++;
       }
       html.push('<ul>');
       linkItems.forEach(item => {
         // Parse "Label: URL" or just "URL"
-        const match = item.match(/^(.+?):\s*(https?:\/\/.+)$/);
+        const match = item.match(/^(?:([^:]+):\s*)?(https?:\/\/\S+)\s*$/);
         if (match) {
-          const label = escapeHtml(match[1].trim());
-          const url = escapeHtml(match[2].trim());
-          html.push(`<li>${label}: <a href="${url}">${url}</a></li>`);
+          const label = match[1] ? escapeHtml(match[1].trim()) : null;
+          const url = escapeHtml(match[2].trim());
+          html.push(
+            label
+              ? `<li>${label}: <a href="${url}">${url}</a></li>`
+              : `<li><a href="${url}">${url}</a></li>`
+          );
         } else {
           const urlMatch = item.match(/(https?:\/\/.+)/);
           if (urlMatch) {
             const url = escapeHtml(urlMatch[1].trim());
             html.push(`<li><a href="${url}">${url}</a></li>`);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    // Check if this looks like a link list item (e.g., "CodeRabbit: https://...")
    const linkLineRegex = /^(?:[^:]+:\s*)?https?:\/\/\S+\s*$/;
    if (linkLineRegex.test(line)) {
      // Collect all consecutive link lines into a list
      const linkItems: string[] = [];
      while (i < lines.length && linkLineRegex.test(lines[i])) {
        linkItems.push(lines[i]);
        i++;
      }
      html.push('<ul>');
      linkItems.forEach(item => {
        // Parse "Label: URL" or just "URL"
        const match = item.match(/^(?:([^:]+):\s*)?(https?:\/\/\S+)\s*$/);
        if (match) {
          const label = match[1] ? escapeHtml(match[1].trim()) : null;
          const url = escapeHtml(match[2].trim());
          html.push(
            label
              ? `<li>${label}: <a href="${url}">${url}</a></li>`
              : `<li><a href="${url}">${url}</a></li>`
          );
        } else {
          const urlMatch = item.match(/(https?:\/\/.+)/);
          if (urlMatch) {
            const url = escapeHtml(urlMatch[1].trim());
            html.push(`<li><a href="${url}">${url}</a></li>`);
          } else {
            html.push(`<li>${escapeHtml(item)}</li>`);
          }
        }
      });
      html.push('</ul>');
      continue;
    }
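
As a quick check of the anchored patterns in the suggestion above, the sketch below applies them to a few sample lines. The two regexes are taken from the suggestion; the sample strings and the helper name `parseLinkLine` are assumptions for illustration.

```typescript
// Anchored regexes copied from the suggested change above.
const linkLineRegex = /^(?:[^:]+:\s*)?https?:\/\/\S+\s*$/;
const linkPartsRegex = /^(?:([^:]+):\s*)?(https?:\/\/\S+)\s*$/;

interface LinkLine {
  label: string | null; // null when the line is a bare URL
  url: string;
}

/** Split a fully matching line into label and URL; return null for anything else. */
function parseLinkLine(line: string): LinkLine | null {
  const match = line.match(linkPartsRegex);
  if (!match) return null;
  const label = match[1] ? match[1].trim() : null;
  return { label, url: (match[2] ?? "").trim() };
}

// Hypothetical sample lines.
const samples = [
  "CodeRabbit: https://coderabbit.ai",      // label + URL: list item
  "https://example.com",                    // bare URL: list item
  "Sponsor: https://x.dev (use code POD)",  // trailing text: left as a paragraph
  "Check out https://x.dev today",          // inline URL: left as a paragraph
  "Ep 12: Notes: https://x.dev",            // label containing a colon: not matched
];

for (const line of samples) {
  console.log(linkLineRegex.test(line), parseLinkLine(line));
}
```

One consequence worth noting: because the label part is `[^:]+`, a label that itself contains a colon no longer qualifies as a link line and falls through to whatever handles ordinary text.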

Comment on lines +111 to 115
            content: looksLikeHtml(description)
              ? description
              : transformPlainTextToHtml(description),
            description: truncate(htmlToText(description), 260),
            duration: itunes_duration,

⚠️ Potential issue | 🟡 Minor

Plain-text descriptions still go through htmlToText.
When description is plain text, running htmlToText can strip literal < and > characters along with anything between them. Consider truncating the raw text when !looksLikeHtml(description).

✅ Suggested fix
-          return {
+          const isHtml = looksLikeHtml(description);
+          const descriptionText = isHtml ? htmlToText(description) : description;
+          return {
             id,
             title: `${title}`,
-            content: looksLikeHtml(description)
-              ? description
-              : transformPlainTextToHtml(description),
-            description: truncate(htmlToText(description), 260),
+            content: isHtml ? description : transformPlainTextToHtml(description),
+            description: truncate(descriptionText, 260),
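
The conditional truncation can be sketched in isolation. The real htmlToText, truncate, and looksLikeHtml used by src/lib/rss.ts are not shown in this thread, so all three helpers below are naive stand-ins, and `summarize` is a made-up name for the combined step.

```typescript
// Naive stand-ins (assumed): the project's real helpers are not shown in this thread.
function looksLikeHtml(text: string): boolean {
  return /<\/?[a-z][^<>]*>/i.test(text);
}

function htmlToText(html: string): string {
  // Strips anything between "<" and ">", which is exactly what hurts plain text.
  return html.replace(/<[^>]*>/g, "");
}

function truncate(text: string, max: number): string {
  return text.length <= max ? text : text.slice(0, max - 1) + "…";
}

/** Strip tags only when the input looks like HTML, then truncate. */
function summarize(description: string, max = 260): string {
  const text = looksLikeHtml(description) ? htmlToText(description) : description;
  return truncate(text, max);
}

const plain = "Use a < b and c > d in TypeScript";
console.log(htmlToText(plain));  // the comparison text between < and > is lost
console.log(summarize(plain));   // plain text passes through untouched
console.log(summarize("<p>Hello <b>world</b></p>"));
```

With these stand-ins, the unconditional path turns the plain-text sample into "Use a  d in TypeScript", while the conditional path preserves it and still strips tags from real HTML.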

@argyleink
Collaborator Author

Closing; the fix was done upstream.

@argyleink argyleink closed this Jan 19, 2026