refactor: optimize directory traversal in check-edit-links script #4126

Open
wants to merge 15 commits into
base: master

Conversation

NalinDalal
@NalinDalal NalinDalal commented May 25, 2025

Description
I needed to validate edit links for a directory of markdown documentation files.

Each .md file should have a corresponding "Edit on GitHub"-style URL. The goal is to:

  1. Traverse the markdown docs folder.
  2. Generate the appropriate editLink for each file based on configuration.
  3. Check if the editLink exists (i.e. doesn’t return a 404).
  4. Report all broken links (404s).
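
Steps 3 and 4 above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `PathEntry`, `HeadRequest`, `findBrokenLinks`, and the example.com URLs are hypothetical names chosen for this sketch, and the request function is injected so the example runs without network access.

```typescript
// Hypothetical shape for one documentation file; the real script's type may differ.
interface PathEntry {
  filePath: string; // path to the .md file on disk
  editLink: string; // generated "Edit on GitHub" URL
}

// Minimal response shape. A real caller might pass a function that sends
// an HTTP HEAD request and returns the response.
type HeadRequest = (url: string) => Promise<{ status: number }>;

// Steps 3 and 4: request every edit link and collect the ones that return 404.
async function findBrokenLinks(
  entries: PathEntry[],
  head: HeadRequest,
): Promise<PathEntry[]> {
  const broken: PathEntry[] = [];
  for (const entry of entries) {
    const response = await head(entry.editLink);
    if (response.status === 404) {
      broken.push(entry);
    }
  }
  return broken;
}

// Usage with a stubbed request function (placeholder URLs).
const stubHead: HeadRequest = async (url) => ({
  status: url.endsWith('missing.md') ? 404 : 200,
});

const sampleEntries: PathEntry[] = [
  { filePath: 'docs/intro.md', editLink: 'https://example.com/edit/intro.md' },
  { filePath: 'docs/missing.md', editLink: 'https://example.com/edit/missing.md' },
];

findBrokenLinks(sampleEntries, stubHead).then((broken) => {
  console.log(broken.map((entry) => entry.filePath));
});
```

Injecting the request function keeps the 404 check testable in isolation; the sequential loop is only for brevity and says nothing about how the real script schedules requests.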

Related issue(s)
Fixes #3586

I linted the code and built the application to check for type errors and linting issues; the app built successfully.
The updated file for this issue is scripts/markdown/check-edit-links.ts.

If anything comes up, such as a bug or incorrect code, please let me know.

Summary by CodeRabbit

  • Style
    • Reordered CSS utility classes in several components for improved consistency. No visual or functional changes.
  • Refactor
    • Streamlined internal script for checking markdown edit links to improve efficiency and readability. No impact on end-user experience.

Contributor

coderabbitai bot commented May 25, 2025

Walkthrough

The pull request updates the ordering of CSS utility classes in several React components and refactors the check-edit-links script to use an async generator for directory traversal. The script's directory traversal is modernized for improved memory efficiency and maintainability, with no changes to exported entities in the components.

Changes

File(s) | Change Summary
components/CaseStudyCard.tsx, components/campaigns/AnnouncementHero.tsx, pages/tools/generator.tsx | Reordered Tailwind CSS utility classes in className attributes; no logic or layout changes.
scripts/markdown/check-edit-links.ts | Replaced recursive directory traversal with async generator (walkDirectory), refactored generatePaths, improved code readability and memory efficiency.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant check-edit-links.ts
    participant FileSystem

    User->>check-edit-links.ts: Run script
    check-edit-links.ts->>FileSystem: walkDirectory (async generator)
    loop For each markdown file
        FileSystem-->>check-edit-links.ts: Yield file path
        check-edit-links.ts->>check-edit-links.ts: processBatch (check links)
    end
    check-edit-links.ts-->>User: Output results
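
One way the loop in the diagram could hand files to a batch step is to group the generator's output into fixed-size chunks. The sketch below assumes this; `inBatches` and `fakePaths` are hypothetical helpers, and the signature of the script's real processBatch is not shown in this conversation.

```typescript
// Group items from any async iterable into arrays of at most `size` items.
// `inBatches` is a hypothetical helper, not a function taken from the script.
async function* inBatches<T>(
  source: AsyncIterable<T>,
  size: number,
): AsyncGenerator<T[]> {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error(`Batch size must be a positive integer, got ${size}`);
  }
  let batch: T[] = [];
  for await (const item of source) {
    batch.push(item);
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  // Flush the final, possibly shorter, batch.
  if (batch.length > 0) {
    yield batch;
  }
}

// Stand-in for a directory walker: yields five fake file paths.
async function* fakePaths(): AsyncGenerator<string> {
  for (let i = 1; i <= 5; i++) {
    yield `docs/file-${i}.md`;
  }
}

// Drain the batches into an array so they can be inspected.
async function collectBatches(): Promise<string[][]> {
  const batches: string[][] = [];
  for await (const batch of inBatches(fakePaths(), 2)) {
    batches.push(batch);
  }
  return batches;
}
```

With this shape, only one batch of paths is held at a time, which is the property the walkthrough describes as memory efficiency.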

Assessment against linked issues

Objective Addressed Explanation
Use generator-based async directory traversal for memory efficiency (#3586)
Modernize directory traversal logic in check-edit-links script (#3586)

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes detected.

Possibly related PRs

Suggested labels

ready-to-merge

Suggested reviewers

  • anshgoyalevil
  • derberg
  • akshatnema
  • sambhavgupta0705

Poem

A rabbit hopped through code today,
Tidying classes along the way.
With async steps, it searched each file,
Making scripts more agile and agile.
Now memory’s light, the code is neat—
This modern hop can’t be beat!
🐇✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

npm warn EBADENGINE Unsupported engine {
npm warn EBADENGINE package: '@tisoap/[email protected]',
npm warn EBADENGINE required: { node: '>=16', npm: '^8.0.0' },
npm warn EBADENGINE current: { node: 'v24.1.0', npm: '11.3.0' }
npm warn EBADENGINE }
npm error code ERR_SSL_WRONG_VERSION_NUMBER
npm error errno ERR_SSL_WRONG_VERSION_NUMBER
npm error request to https://10.0.0.28:4873/@parcel/watcher-wasm/-/watcher-wasm-2.4.1.tgz failed, reason: C04C5860B77F0000:error:0A00010B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:354:
npm error
npm error A complete log of this run can be found in: /.npm/_logs/2025-06-07T17_41_54_718Z-debug-0.log


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f2c3261 and 2bbf7c7.

📒 Files selected for processing (1)
  • scripts/markdown/check-edit-links.ts (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/markdown/check-edit-links.ts
netlify bot commented May 25, 2025

Deploy Preview for asyncapi-website ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit 23cb94b
🔍 Latest deploy log https://app.netlify.com/projects/asyncapi-website/deploys/6866aaf5e3aa790008d4329d
😎 Deploy Preview https://deploy-preview-4126--asyncapi-website.netlify.app

@NalinDalal NalinDalal changed the title Optimize directory traversal in check-edit-links script Refactor: Optimize directory traversal in check-edit-links script May 25, 2025
@NalinDalal NalinDalal changed the title Refactor: Optimize directory traversal in check-edit-links script refactor: Optimize directory traversal in check-edit-links script May 25, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
scripts/markdown/check-edit-links.ts (1)

117-121: Simplify error handling in async function.

Since this is an async function, you can throw the error directly instead of using Promise.reject.

Apply this diff to simplify the error handling:

-      } catch (error) {
-        return Promise.reject(
-          new Error(`Error checking ${editLink}: ${error}`),
-        );
+      } catch (error) {
+        throw new Error(`Error checking ${editLink}: ${error}`);
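
To illustrate why the two forms in this diff behave the same to a caller, here is a small sketch. The function names and the always-failing `failingCheck` are made up for the comparison; in both variants the returned promise rejects with the same wrapped error.

```typescript
// Both functions are illustrative; `check` stands in for the real request logic.
async function withPromiseReject(check: () => Promise<void>): Promise<void> {
  try {
    await check();
  } catch (error) {
    return Promise.reject(new Error(`Error checking link: ${error}`));
  }
}

async function withThrow(check: () => Promise<void>): Promise<void> {
  try {
    await check();
  } catch (error) {
    // Inside an async function, a thrown error rejects the returned promise.
    throw new Error(`Error checking link: ${error}`);
  }
}

// A check that always fails, to exercise the catch branch.
const failingCheck = async (): Promise<void> => {
  throw new Error('timeout');
};

// Run both variants and capture the message each one rejects with.
async function compareVariants(): Promise<[string, string]> {
  const messages: string[] = [];
  for (const variant of [withPromiseReject, withThrow]) {
    try {
      await variant(failingCheck);
    } catch (error) {
      messages.push((error as Error).message);
    }
  }
  return [messages[0], messages[1]];
}
```

Since the caller cannot tell the two apart, the `throw` form is mostly a readability choice.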
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 027665d and 9b26d43.

📒 Files selected for processing (4)
  • components/CaseStudyCard.tsx (1 hunks)
  • components/campaigns/AnnouncementHero.tsx (1 hunks)
  • pages/tools/generator.tsx (1 hunks)
  • scripts/markdown/check-edit-links.ts (6 hunks)
🧰 Additional context used
🧠 Learnings (1)
scripts/markdown/check-edit-links.ts (1)
Learnt from: anshgoyalevil
PR: asyncapi/website#3557
File: scripts/markdown/check-editlinks.js:80-93
Timestamp: 2025-01-14T09:23:32.728Z
Learning: In the AsyncAPI website's edit link generation system, the `editOptions` array in `edit-page-config.json` includes a fallback entry with an empty string value (`''`) that matches any URL path, ensuring that `determineEditLink()` function always finds a target and never returns null.
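
The fallback behaviour described in this learning can be sketched as below. The `EditOption` field names, the example.com URLs, and the way the link is assembled are assumptions for illustration; only the idea of an empty-string entry that matches every path comes from the learning itself.

```typescript
// Hypothetical shape of one entry in edit-page-config.json; field names are assumptions.
interface EditOption {
  value: string; // substring to look for in the URL path ('' matches everything)
  href: string; // base URL used to build the edit link
}

// Hypothetical config with placeholder URLs. The last entry is the fallback.
const editOptions: EditOption[] = [
  { value: '/tools/cli', href: 'https://example.com/cli/edit/docs' },
  { value: '', href: 'https://example.com/website/edit/markdown/docs' },
];

// Return the edit link for a URL path, or null when no entry matches.
function determineEditLink(urlPath: string, options: EditOption[]): string | null {
  // 'anything'.includes('') is always true, so the '' entry matches any path.
  const target = options.find((option) => urlPath.includes(option.value));
  return target ? `${target.href}${urlPath}` : null;
}

console.log(determineEditLink('/concepts/intro.md', editOptions));
console.log(determineEditLink('/tools/cli/usage.md', editOptions));
```

Because the fallback sits last, more specific entries win first, and the null branch is only reachable if the fallback entry is removed from the config.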
🪛 GitHub Actions: PR testing - if Node project
scripts/markdown/check-edit-links.ts

[error] 84-84: Error walking directory /home/runner/work/website/website/markdown/docs: TypeError [ERR_INVALID_ARG_TYPE]: The "path" argument must be of type string. Received undefined

⏰ Context from checks skipped due to timeout of 180000ms (4)
  • GitHub Check: Redirect rules - asyncapi-website
  • GitHub Check: Header rules - asyncapi-website
  • GitHub Check: Pages changed - asyncapi-website
  • GitHub Check: Lighthouse CI
🔇 Additional comments (5)
components/campaigns/AnnouncementHero.tsx (1)

63-63: CSS class reordering looks fine, but inconsistent with PR objectives.

The Tailwind CSS class reordering doesn't affect functionality since CSS class order doesn't impact the styling output. However, there's a significant inconsistency between the PR objectives (optimizing directory traversal in check-edit-links script) and the actual changes (CSS class reordering in React components).

Likely an incorrect or invalid review comment.

components/CaseStudyCard.tsx (1)

22-22: CSS class reordering is functionally safe.

The reordering of justify-center and gap-3 classes maintains the same visual output since Tailwind CSS class order doesn't affect their application. This change continues the pattern of cosmetic class reordering across components.

pages/tools/generator.tsx (1)

56-56: CSS reordering is safe, but main PR objective missing.

✅ Verification successful

The class reordering from 'mx-auto h-auto lg:w-1/2 object-cover' to 'mx-auto h-auto object-cover lg:w-1/2' doesn't impact functionality. However, the primary focus of this PR was supposed to be optimizing directory traversal in scripts/markdown/check-edit-links.ts, which is not included in the files for review.


🏁 Script executed:

#!/bin/bash
# Description: Search for the check-edit-links script that should be the main focus of this PR
# Expected: Find the script file and verify it contains the directory traversal optimizations mentioned in PR objectives

# Search for check-edit-links script file
fd check-edit-links

# Search for any TypeScript files that might contain edit link checking logic
fd -e ts | xargs rg -l "edit.*link|editLink" | head -10

# Search for directory traversal or async generator patterns in TypeScript files
rg -A 5 "async.*generator|for await.*of.*readdir" --type ts

Length of output: 1829


🏁 Script executed:

#!/bin/bash
# Inspect directory traversal implementation in the check-edit-links script
rg -n "readdir" scripts/markdown/check-edit-links.ts || true
rg -n "async" scripts/markdown/check-edit-links.ts || true
rg -n "for await" scripts/markdown/check-edit-links.ts || true

Length of output: 605


🏁 Script executed:

#!/bin/bash
# Compare the check-edit-links script in this PR branch against main to verify if it's been updated
set -e

# Ensure we have the main branch to compare against
git fetch origin main

echo "---- Diff for scripts/markdown/check-edit-links.ts (origin/main → PR branch) ----"
git diff origin/main -- scripts/markdown/check-edit-links.ts || echo "[No changes detected]"

Length of output: 69


#!/bin/bash
# Determine the remote default branch and diff the check-edit-links script against it
set -e

# Fetch all remote refs
git fetch origin

# Identify the default branch
remote_head=$(git symbolic-ref refs/remotes/origin/HEAD)
default_branch=${remote_head#refs/remotes/origin/}
echo "Default branch: $default_branch"

# Show diff for the check-edit-links script
echo "---- Diff for scripts/markdown/check-edit-links.ts (origin/$default_branch → PR branch) ----"
git diff origin/"$default_branch" -- scripts/markdown/check-edit-links.ts || echo "[No changes detected]"

All PR objectives met—no missing script and CSS reordering is safe

  • The file scripts/markdown/check-edit-links.ts is present in this branch.
  • It implements directory traversal optimizations using an async generator (walkDirectory) with fs.readdir and for await…of, covering the PR’s stated goals.
  • The CSS class reorder in pages/tools/generator.tsx remains purely cosmetic and has no functional impact.

No further action required.

scripts/markdown/check-edit-links.ts (2)

36-57: Great use of async generators for memory-efficient directory traversal!

The refactoring from recursive Promise.all to async generators is a significant improvement. This approach:

  • Reduces memory usage by processing files one at a time
  • Improves error handling granularity
  • Makes the code more maintainable and easier to understand

Also applies to: 67-80
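
For readers who have not seen the pattern, a self-contained sketch of generator-based traversal follows. It mirrors the walkDirectory signature quoted elsewhere in this review, but the `.md` filter, the `WalkedFile` name, and the `demoWalk` helper are assumptions added for the example rather than the PR's actual code.

```typescript
import { promises as fs } from 'fs';
import * as os from 'os';
import * as path from 'path';

interface WalkedFile {
  filePath: string; // absolute path on disk
  relativeFilePath: string; // path relative to the walk root
}

// Yield markdown files one at a time instead of building the full list up front.
async function* walkDirectory(dir: string, relativePath = ''): AsyncGenerator<WalkedFile> {
  const entries = await fs.readdir(dir, { withFileTypes: true });
  for (const entry of entries) {
    const absPath = path.join(dir, entry.name);
    const relPath = path.join(relativePath, entry.name);
    if (entry.isDirectory()) {
      // Delegate to the nested generator so subdirectories are streamed too.
      yield* walkDirectory(absPath, relPath);
    } else if (entry.isFile() && entry.name.endsWith('.md')) {
      yield { filePath: absPath, relativeFilePath: relPath };
    }
  }
}

// Demo: build a throwaway tree in the OS temp directory and walk it.
async function demoWalk(): Promise<string[]> {
  const root = await fs.mkdtemp(path.join(os.tmpdir(), 'walk-demo-'));
  try {
    await fs.mkdir(path.join(root, 'guides'));
    await fs.writeFile(path.join(root, 'index.md'), '# Index');
    await fs.writeFile(path.join(root, 'guides', 'setup.md'), '# Setup');
    await fs.writeFile(path.join(root, 'guides', 'notes.txt'), 'not markdown');

    const found: string[] = [];
    for await (const file of walkDirectory(root)) {
      // Normalize separators so the result reads the same on Windows.
      found.push(file.relativeFilePath.split(path.sep).join('/'));
    }
    // readdir order is not guaranteed, so sort before returning.
    return found.sort();
  } finally {
    await fs.rm(root, { recursive: true, force: true });
  }
}
```

Note that `fs.readdir` still loads one directory's entries at a time; the saving is that the complete file list across all directories never has to exist in memory at once. For very large single directories, `fs.opendir` may be worth considering.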


41-56: Add validation to prevent undefined path errors.

❌ Incorrect review comment

The pipeline error suggests that a path argument is undefined. This could happen if entry.name is undefined in certain edge cases.

Add defensive checks to prevent undefined values:

 async function* walkDirectory(
   dir: string,
   relativePath = '',
 ): AsyncGenerator<{ filePath: string; relativeFilePath: string }> {
   const entries = await fs.readdir(dir, { withFileTypes: true });
 
   for (const entry of entries) {
+    if (!entry || !entry.name) {
+      console.warn(`Skipping invalid entry in directory: ${dir}`);
+      continue;
+    }
     const absPath = path.join(dir, entry.name);
     const relPath = path.join(relativePath, entry.name);

Let me verify if there are any edge cases in the directory structure:


🏁 Script executed:

#!/bin/bash
# Description: Check for any unusual files or directories that might cause issues

# Check if the docs directory exists
if [ -d "markdown/docs" ]; then
  echo "Checking markdown/docs directory structure..."
  
  # Look for symbolic links
  echo "Symbolic links:"
  find markdown/docs -type l -ls 2>/dev/null | head -20
  
  # Look for files with unusual names
  echo -e "\nFiles with special characters or spaces:"
  find markdown/docs -name "*[[:space:]]*" -o -name "*[^[:alnum:]._/-]*" 2>/dev/null | head -20
  
  # Check for empty directories
  echo -e "\nEmpty directories:"
  find markdown/docs -type d -empty 2>/dev/null | head -20
else
  echo "markdown/docs directory not found"
fi

Length of output: 493


Ignore entry.name validation — Dirent.name is always defined

Our directory‐structure check found no symlinks, special‐character filenames, or empty directories under markdown/docs, and Node’s Dirent objects always include a non-empty name string. Adding a guard for entry.name isn’t necessary and won’t address the pipeline error.

• Instead, review where walkDirectory is first invoked and ensure the dir argument (or any upstream path) can’t be undefined.
• Check any dynamic or recursive calls to confirm they pass a valid string.
• Consider adding a validation or assertion at the very start of walkDirectory to catch unexpected inputs early:

async function* walkDirectory(dir: string, relativePath = '') {
  if (typeof dir !== 'string' || !dir) {
    throw new Error(`Invalid directory argument passed to walkDirectory: ${dir}`);
  }
  // …rest of implementation…
}

Likely an incorrect or invalid review comment.

Comment on lines +83 to +85
} catch (err) {
throw new Error(`Error walking directory ${folderPath}: ${err}`);
}
Contributor

⚠️ Potential issue

Improve error handling to preserve original error details.

The current error handling masks the original error details. The pipeline failure indicates a TypeError about an undefined path argument, but this information is lost when using string interpolation with ${err}.

Apply this diff to properly handle and display error details:

-  } catch (err) {
-    throw new Error(`Error walking directory ${folderPath}: ${err}`);
+  } catch (err) {
+    const errorMessage = err instanceof Error ? err.message : String(err);
+    const errorStack = err instanceof Error ? err.stack : '';
+    throw new Error(`Error walking directory ${folderPath}: ${errorMessage}\n${errorStack}`);

This will help identify the root cause of the "path argument must be of type string. Received undefined" error.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
} catch (err) {
throw new Error(`Error walking directory ${folderPath}: ${err}`);
}
} catch (err) {
const errorMessage = err instanceof Error ? err.message : String(err);
const errorStack = err instanceof Error ? err.stack : '';
throw new Error(`Error walking directory ${folderPath}: ${errorMessage}\n${errorStack}`);
}
🤖 Prompt for AI Agents
In scripts/markdown/check-edit-links.ts around lines 83 to 85, the error
handling currently converts the caught error to a string using template
literals, which masks the original error details. To fix this, modify the catch
block to rethrow the original error or include the original error object
directly in the new Error, preserving its stack trace and message. This will
ensure the full error details, including the TypeError about the undefined path
argument, are visible for debugging.
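
As one possible reading of this prompt, the wrapper below keeps the original message and appends the original stack to the new error. `wrapError` is a hypothetical helper, not code from the PR. On runtimes that support ES2022, passing the caught error through the `cause` option of the `Error` constructor is another way to keep it attached.

```typescript
// Wrap a caught value in a new Error without losing the original details.
// `wrapError` is a hypothetical helper used only for this illustration.
function wrapError(context: string, err: unknown): Error {
  const detail = err instanceof Error ? err.message : String(err);
  const wrapped = new Error(`${context}: ${detail}`);
  if (err instanceof Error && err.stack) {
    // Append the original stack so the root cause stays visible in logs.
    wrapped.stack = `${wrapped.stack ?? wrapped.message}\nCaused by: ${err.stack}`;
  }
  return wrapped;
}

// Usage: simulate the failure reported by the CI run (placeholder directory).
try {
  throw new TypeError('The "path" argument must be of type string. Received undefined');
} catch (err) {
  const wrapped = wrapError('Error walking directory /docs', err);
  console.error(wrapped.stack);
}
```

Keeping the original stack points at the line where the undefined path was first used, which the interpolated `${err}` string alone does not show.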

@NalinDalal NalinDalal changed the title refactor: Optimize directory traversal in check-edit-links script refactor: optimize directory traversal in check-edit-links script May 25, 2025
Collaborator

@TRohit20 TRohit20 left a comment

Hi @NalinDalal, please follow the relevant commenting conventions and address CodeRabbit's comments. Thank you.

@NalinDalal
Author

@CodeRabbit can you check the code again? I have added code comments and documented it, as asked by @TRohit20.

Contributor

coderabbitai bot commented Jun 7, 2025

🧩 Analysis chain

@NalinDalal, I'll check the updated code to review the documentation and comments you've added.


🏁 Script executed:

#!/bin/bash
# Read the current state of the check-edit-links.ts file to review documentation
cat scripts/markdown/check-edit-links.ts

Length of output: 9043


@NalinDalal Great work on the documentation! I've reviewed the updated code and the documentation improvements are excellent. Here's what I noticed:

🎉 Excellent Documentation Quality

Comprehensive JSDoc Coverage:

  • All major functions (processBatch, generatePaths, checkUrls, determineEditLink, main) now have detailed JSDoc comments
  • Proper use of @param, @returns, and @throws annotations
  • Clear descriptions of function purposes and behaviors

Outstanding Design Documentation:

  • The async generator approach is well-explained with comments about memory efficiency benefits
  • The generatePaths function documentation clearly explains why the async generator pattern was chosen
  • Error handling and timeout mechanisms are properly documented

Professional Code Comments:

  • Strategic inline comments like // ✅ NEW: Async generator for efficient directory traversal
  • Clear explanation of complex logic flows
  • Good balance between code clarity and documentation

Key Strengths:

  1. Educational Value: The documentation explains not just what the code does, but WHY certain approaches were taken
  2. Maintainability: Future developers will easily understand the codebase
  3. Error Handling: Comprehensive documentation of error scenarios and timeouts
  4. Type Safety: Clear parameter and return type documentation

The code is now well-documented, maintainable, and follows best practices. This addresses @TRohit20's feedback perfectly! The documentation quality is professional-grade and will help future contributors understand both the implementation and the reasoning behind design decisions.

@NalinDalal NalinDalal requested a review from TRohit20 June 14, 2025 13:06