Add justhtml.html - Pyodide playground for HTML5 parsing#156

Merged
simonw merged 3 commits into main from
claude/justhtml-pyodide-playground-01NrQTCRnXmwhrSnRTnQoopx
Dec 13, 2025

@simonw
Owner

@simonw simonw commented Dec 13, 2025

A mobile-friendly playground for the justhtml Python library that:

  • Installs justhtml from PyPI using micropip
  • Supports pasting HTML or fetching from CORS-enabled URLs
  • Includes multiple playground modes:
    • CSS Selector query with example selectors
    • Pretty print HTML
    • Tree structure visualization
    • Stream events display
  • No React dependencies, pure vanilla JS

Claude Code for web prompt:

Clone https://github.com/EmilStenstrom/justhtml into /tmp

Examine existing tools in this repo that use Pyodide

Build justhtml.html as a playground for trying out this new HTML Python parsing library

It should install the library from PyPI using micropip

It should have an option to paste in HTML, and another option to add a URL to something that can be fetched via CORS

Include various playground options, including one for trying out the CSS selector implementation and one for pretty printing HTML

Mobile friendly, no react

@claude
Contributor

claude bot commented Dec 13, 2025

Claude finished @simonw's task


Code Review Complete

I've completed a comprehensive review of the justhtml.html Pyodide playground. Here's my detailed feedback:

Review Tasks

  • Read and analyze the justhtml.html file
  • Evaluate code quality and structure
  • Check for security concerns (XSS, CORS, input validation)
  • Assess performance considerations
  • Review error handling and user experience
  • Check for potential bugs or issues
  • Provide feedback on best practices
  • Comment on test coverage considerations

🎯 Overall Assessment

This is a well-crafted, standalone HTML playground that demonstrates excellent attention to detail and user experience. The implementation is clean, functional, and follows modern web development practices.

✅ Strengths

Code Quality & Structure:

  • Clean, well-organized code with logical separation of HTML, CSS, and JavaScript
  • Semantic HTML structure with proper accessibility attributes
  • Modern CSS with flexbox, good mobile responsiveness, and thoughtful design
  • Modular JavaScript with clear function separation and consistent naming

User Experience:

  • Excellent mobile-first responsive design
  • Intuitive tabbed interface for input methods and processing modes
  • Helpful example selectors and placeholder content
  • Clear visual feedback with status indicators and loading states
  • Good error messaging with user-friendly explanations

Technical Implementation:

  • Smart use of Pyodide for client-side Python execution
  • Proper async/await patterns for loading and execution
  • Good error handling with try-catch blocks throughout
  • Efficient DOM manipulation and event handling

⚠️ Security Concerns (Minor)

Input Sanitization (justhtml.html:666-667, 709, 733, 757):

const escapedHtml = html.replace(/\\/g, '\\\\').replace(/"""/g, '\\"\\"\\"');
  • The escaping is basic but adequate for this use case
  • Consider using a more robust escaping library for production use
  • Current implementation should prevent most injection issues
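
One way to avoid string escaping altogether is to hand the HTML to Python as a variable instead of splicing it into source text. The sketch below assumes the escaped value is currently interpolated into a triple-quoted Python literal (in which case input ending in a double quote could still close the literal early) and that a loaded `pyodide` instance is available. `runWithHtml` and `html_input` are hypothetical names, not taken from justhtml.html.

```javascript
// Sketch only: `runWithHtml` and the Python variable name `html_input`
// are illustrative and do not come from justhtml.html.
function runWithHtml(pyodide, pythonCode, html) {
  // Store the raw string in Python's global namespace. No escaping is needed
  // because the value never becomes part of the Python source text.
  pyodide.globals.set("html_input", html);
  // The Python snippet reads `html_input` instead of an inline literal.
  return pyodide.runPython(pythonCode);
}

// Example call (the Python body is a placeholder):
// runWithHtml(pyodide, "len(html_input)", '<p class="x">"""</p>"');
```

With this shape the same Python snippet can be reused for every mode, and quotes or backslashes in the pasted HTML no longer matter.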

CORS Dependency (justhtml.html:786-789):

const response = await fetch(url);
  • Relies on target sites having permissive CORS headers
  • This is correctly documented for users but limits functionality
  • Consider adding a warning about potential data exposure when fetching from untrusted URLs

🚀 Performance Considerations

Pyodide Loading (justhtml.html:482):

  • Large initial payload (~10MB) but cached by CDN
  • Loading indicator provides good UX during initialization
  • Consider lazy loading for better perceived performance

Memory Management:

  • No obvious memory leaks in JavaScript
  • Python objects are properly cleaned up by Pyodide's garbage collection
  • Long HTML documents could use more memory than expected

Text Processing (justhtml.html:586, 614):

if len(text) > 50:
    text = text[:47] + "..."
  • Smart text truncation prevents overwhelming output
  • Good performance optimization for large documents

🐛 Potential Issues

Error Handling Enhancement (justhtml.html:692-696):

if (errorMsg.includes('SelectorError')) {
    const match = errorMsg.match(/SelectorError[:\s]*([\s\S]*?)(?:\n\s*at |$)/);
    if (match) errorMsg = 'Selector Error: ' + match[1].trim();
}
  • Good error message parsing but could be more robust
  • Consider handling other common error types (syntax errors, network failures)
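
A rough sketch of broader error classification follows. It assumes Python errors reach JavaScript as a message containing a traceback whose last line reads `ExceptionName: detail`; the function name `friendlyError` and the wording of each label are hypothetical.

```javascript
// Sketch only: the category labels and wording are illustrative.
function friendlyError(err) {
  const message = String(err && err.message ? err.message : err);

  // fetch() rejects with a TypeError when the request fails at the network
  // level (which includes CORS blocks). Other JavaScript TypeErrors would
  // also land here, so this is a heuristic for use around fetch calls.
  if (err instanceof TypeError) {
    return "Network error: the URL could not be fetched (" + message + ")";
  }

  // Python tracebacks end with "ExceptionName: detail"; inspect the last line.
  const lines = message.trim().split("\n");
  const lastLine = lines[lines.length - 1].trim();
  const match = lastLine.match(/^(?:[\w.]+\.)?(\w+):\s*(.*)$/);
  if (!match) return message;

  const [, name, detail] = match;
  if (name === "SelectorError") return "Selector Error: " + detail;
  if (name === "SyntaxError") return "Python syntax error: " + detail;
  return name + ": " + detail;
}
```

Reading only the last traceback line keeps the helper independent of how many stack frames precede it.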

URL Validation (justhtml.html:776-780):

const url = urlInputEl.value.trim();
if (!url) {
    alert('Please enter a URL');
    return;
}
  • Basic validation but could benefit from URL format checking
  • alert() is functional but inconsistent with the rest of the error-handling UX
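
A minimal sketch of URL format checking, using the standard `URL` constructor. `validateUrl` is a hypothetical helper and the messages are placeholders; the result object is meant to be rendered in the page's own status area rather than through alert().

```javascript
// Sketch only: `validateUrl` is a hypothetical helper, not code from the PR.
function validateUrl(raw) {
  const text = raw.trim();
  if (!text) return { ok: false, message: "Please enter a URL" };

  let parsed;
  try {
    // The URL constructor throws a TypeError on input it cannot parse.
    parsed = new URL(text);
  } catch {
    return { ok: false, message: "That does not look like a valid URL" };
  }

  // Only http(s) targets are useful here; reject file:, javascript:, etc.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return { ok: false, message: "Only http:// and https:// URLs are supported" };
  }
  return { ok: true, url: parsed.href };
}
```

Returning a result object instead of calling alert() lets the caller decide how to display the message.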

📋 Best Practices Recommendations

  1. Replace alert() with consistent UI feedback (justhtml.html:778)
  2. Add URL format validation before fetch attempts
  3. Consider adding a loading spinner during Python operations
  4. Add keyboard navigation for better accessibility
  5. Consider CSP headers if serving this file (not applicable for standalone use)

🧪 Test Coverage Considerations

For this type of standalone HTML tool, traditional unit testing is challenging, but consider:

  • Manual testing checklist for different browser/device combinations
  • Integration tests for the Python function definitions
  • End-to-end testing with tools like Playwright for critical user flows
  • Performance testing with large HTML documents

📊 Code Metrics

  • Total lines: 828 (well-organized for a single-file app)
  • CSS: Clean, modern, mobile-responsive
  • JavaScript: ~350 lines of well-structured code
  • Python helpers: Robust utility functions for HTML processing

🎉 Conclusion

This is a high-quality implementation that successfully demonstrates the justhtml library's capabilities. The code is production-ready for its intended use case as a playground/demo tool. The minor security and performance considerations noted above are typical for this type of client-side application and don't detract from the overall excellent execution.

Recommendation: Approve - This PR adds valuable functionality with clean, well-structured code.


@cloudflare-workers-and-pages

cloudflare-workers-and-pages bot commented Dec 13, 2025

Deploying tools with Cloudflare Pages

Latest commit: 0ef6377
Status: ✅  Deploy successful!
Preview URL: https://831360bf.tools-b1q.pages.dev
Branch Preview URL: https://claude-justhtml-pyodide-play.tools-b1q.pages.dev

@claude
Contributor

claude bot commented Dec 13, 2025

Claude finished @simonw's task


Code Review Complete

I've completed a comprehensive review of the justhtml.html Pyodide playground. Here's my detailed feedback:

Review Tasks

  • Read and analyze the justhtml.html file
  • Evaluate code quality and structure
  • Check for security concerns (XSS, CORS, input validation)
  • Assess performance considerations
  • Review error handling and user experience
  • Check for potential bugs or issues
  • Provide feedback on best practices
  • Comment on test coverage considerations

🎯 Overall Assessment

This is a well-crafted, standalone HTML playground that demonstrates excellent attention to detail and user experience. The implementation is clean, functional, and follows modern web development practices.

✅ Strengths

Code Quality & Structure:

  • Clean, well-organized code with logical separation of HTML, CSS, and JavaScript
  • Semantic HTML structure with proper accessibility attributes
  • Modern CSS with flexbox, good mobile responsiveness, and thoughtful design
  • Modular JavaScript with clear function separation and consistent naming

User Experience:

  • Excellent mobile-first responsive design
  • Intuitive tabbed interface for input methods and processing modes
  • Helpful example selectors and placeholder content
  • Clear visual feedback with status indicators and loading states
  • Good error messaging with user-friendly explanations

Technical Implementation:

  • Smart use of Pyodide for client-side Python execution
  • Proper async/await patterns for loading and execution
  • Good error handling with try-catch blocks throughout
  • Efficient DOM manipulation and event handling

⚠️ Security Concerns (Minor)

Input Sanitization (justhtml.html:667, 710, 734, 758):

const escapedHtml = html.replace(/\\/g, '\\\\').replace(/"""/g, '\\"\\"\\"');
  • The escaping is basic but adequate for this use case
  • Consider using a more robust escaping library for production use
  • Current implementation should prevent most injection issues

CORS Dependency (justhtml.html:787-789):

const response = await fetch(url);
  • Relies on target sites having permissive CORS headers
  • This is correctly documented for users but limits functionality
  • Consider adding a warning about potential data exposure when fetching from untrusted URLs

🚀 Performance Considerations

Pyodide Loading (justhtml.html:531):

  • Large initial payload (~10MB) but cached by CDN
  • Loading indicator provides good UX during initialization
  • Consider lazy loading for better perceived performance

Memory Management:

  • No obvious memory leaks in JavaScript
  • Python objects are properly cleaned up by Pyodide's garbage collection
  • Long HTML documents could use more memory than expected

Text Processing (justhtml.html:587, 615):

if len(text) > 50:
    text = text[:47] + "..."
  • Smart text truncation prevents overwhelming output
  • Good performance optimization for large documents

🐛 Potential Issues

Error Handling Enhancement (justhtml.html:693-696):

if (errorMsg.includes('SelectorError')) {
    const match = errorMsg.match(/SelectorError[:\s]*([\s\S]*?)(?:\n\s*at |$)/);
    if (match) errorMsg = 'Selector Error: ' + match[1].trim();
}
  • Good error message parsing but could be more robust
  • Consider handling other common error types (syntax errors, network failures)

URL Validation (justhtml.html:777-780):

const url = urlInputEl.value.trim();
if (!url) {
    alert('Please enter a URL');
    return;
}
  • Basic validation but could benefit from URL format checking
  • alert() is functional but inconsistent with the rest of the error-handling UX

📋 Best Practices Recommendations

  1. Replace alert() with consistent UI feedback (justhtml.html:779)
  2. Add URL format validation before fetch attempts
  3. Consider adding a loading spinner during Python operations
  4. Add keyboard navigation for better accessibility
  5. Consider CSP headers if serving this file (not applicable for standalone use)

🧪 Test Coverage Considerations

For this type of standalone HTML tool, traditional unit testing is challenging, but consider:

  • Manual testing checklist for different browser/device combinations
  • Integration tests for the Python function definitions
  • End-to-end testing with tools like Playwright for critical user flows
  • Performance testing with large HTML documents

📊 Code Metrics

  • Total lines: 829 (well-organized for a single-file app)
  • CSS: Clean, modern, mobile-responsive
  • JavaScript: ~350 lines of well-structured code
  • Python helpers: Robust utility functions for HTML processing

🎉 Conclusion

This is a high-quality implementation that successfully demonstrates the justhtml library's capabilities. The code is production-ready for its intended use case as a playground/demo tool. The minor security and performance considerations noted above are typical for this type of client-side application and don't detract from the overall excellent execution.

Recommendation: Approve - This PR adds valuable functionality with clean, well-structured code.


@claude
Contributor

claude bot commented Dec 13, 2025

Claude finished @simonw's task


Code Review Complete

I've completed a comprehensive review of the justhtml.html Pyodide playground. Here's my detailed feedback:

Review Tasks

  • Read and analyze the justhtml.html file
  • Evaluate code quality and structure
  • Check for security concerns (XSS, CORS, input validation)
  • Assess performance considerations
  • Review error handling and user experience
  • Check for potential bugs or issues
  • Provide feedback on best practices
  • Comment on test coverage considerations

🎯 Overall Assessment

This is a well-crafted, standalone HTML playground that demonstrates excellent attention to detail and user experience. The implementation is clean, functional, and follows modern web development practices.

✅ Strengths

Code Quality & Structure:

  • Clean, well-organized code with logical separation of HTML, CSS, and JavaScript
  • Semantic HTML structure with proper accessibility attributes
  • Modern CSS with flexbox, good mobile responsiveness, and thoughtful design
  • Modular JavaScript with clear function separation and consistent naming

User Experience:

  • Excellent mobile-first responsive design
  • Intuitive tabbed interface for input methods and processing modes
  • Helpful example selectors and placeholder content
  • Clear visual feedback with status indicators and loading states
  • Good error messaging with user-friendly explanations

Technical Implementation:

  • Smart use of Pyodide for client-side Python execution
  • Proper async/await patterns for loading and execution
  • Good error handling with try-catch blocks throughout
  • Efficient DOM manipulation and event handling

⚠️ Security Concerns (Minor)

Input Sanitization (justhtml.html:668, 711, 735, 759):

const escapedHtml = html.replace(/\\/g, '\\\\').replace(/"""/g, '\\"\\"\\"');
  • The escaping is basic but adequate for this use case
  • Consider using a more robust escaping library for production use
  • Current implementation should prevent most injection issues

CORS Dependency (justhtml.html:788):

const response = await fetch(url);
  • Relies on target sites having permissive CORS headers
  • This is correctly documented for users but limits functionality
  • Consider adding a warning about potential data exposure when fetching from untrusted URLs

🚀 Performance Considerations

Pyodide Loading (justhtml.html:531):

  • Large initial payload (~10MB) but cached by CDN
  • Loading indicator provides good UX during initialization
  • Consider lazy loading for better perceived performance

Memory Management:

  • No obvious memory leaks in JavaScript
  • Python objects are properly cleaned up by Pyodide's garbage collection
  • Long HTML documents could use more memory than expected

Text Processing (justhtml.html:587, 615):

if len(text) > 50:
    text = text[:47] + "..."
  • Smart text truncation prevents overwhelming output
  • Good performance optimization for large documents

🐛 Potential Issues

Error Handling Enhancement (justhtml.html:694-698):

if (errorMsg.includes('SelectorError')) {
    const match = errorMsg.match(/SelectorError[:\s]*([\s\S]*?)(?:\n\s*at |$)/);
    if (match) errorMsg = 'Selector Error: ' + match[1].trim();
}
  • Good error message parsing but could be more robust
  • Consider handling other common error types (syntax errors, network failures)

URL Validation (justhtml.html:778-782):

const url = urlInputEl.value.trim();
if (!url) {
    alert('Please enter a URL');
    return;
}
  • Basic validation but could benefit from URL format checking
  • alert() is functional but inconsistent with the rest of the error-handling UX

Button State Management (justhtml.html:629-631):

document.querySelectorAll('button').forEach(btn => {
    btn.disabled = false;
});
  • Enables all buttons at once, which may not be desired behavior
  • More selective enabling based on current state would be better
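
Selective enabling could be sketched as a pure function that derives each button's disabled flag from the current state, plus a small applier. The state fields (`pyodideReady`, `busy`, `hasHtml`, `hasUrl`) and the button ids (`run-btn`, `fetch-btn`) are hypothetical; the real ids in justhtml.html are not shown in this review.

```javascript
// Sketch only: the state fields and button ids are hypothetical.
function computeDisabled({ pyodideReady, busy, hasHtml, hasUrl }) {
  return {
    // Running a playground mode needs Pyodide, some HTML, and no job in flight.
    "run-btn": !pyodideReady || busy || !hasHtml,
    // Fetching a URL does not need Pyodide, only a URL and no job in flight.
    "fetch-btn": busy || !hasUrl,
  };
}

function applyDisabled(doc, disabledById) {
  // `doc` is normally `document`; passing it in keeps the function testable.
  for (const [id, disabled] of Object.entries(disabledById)) {
    const button = doc.getElementById(id);
    if (button) button.disabled = disabled;
  }
}
```

In the page this might be called as `applyDisabled(document, computeDisabled(state))` whenever the state changes.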

📋 Best Practices Recommendations

  1. Replace alert() with consistent UI feedback (justhtml.html:780)
  2. Add URL format validation before fetch attempts
  3. Consider adding a loading spinner during Python operations
  4. Add keyboard navigation for better accessibility
  5. Consider CSP headers if serving this file (not applicable for standalone use)
  6. Add input length limits to prevent potential DoS via large inputs
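
For item 6, a size check before handing input to Python might look like the following sketch. The 2 MB ceiling and the name `checkInputSize` are arbitrary illustrative choices, not values from the PR.

```javascript
// Sketch only: the 2 MB ceiling is an arbitrary illustrative value.
const MAX_INPUT_BYTES = 2 * 1024 * 1024;

function checkInputSize(html, maxBytes = MAX_INPUT_BYTES) {
  // Measure UTF-8 bytes rather than UTF-16 code units so that
  // non-ASCII text is counted by its encoded size.
  const bytes = new TextEncoder().encode(html).length;
  if (bytes > maxBytes) {
    return {
      ok: false,
      message: "Input is " + bytes + " bytes; the limit is " + maxBytes + " bytes",
    };
  }
  return { ok: true, bytes };
}
```

The same check can be applied to pasted HTML and to the body of a fetched URL.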

🧪 Test Coverage Considerations

For this type of standalone HTML tool, traditional unit testing is challenging, but consider:

  • Manual testing checklist for different browser/device combinations
  • Integration tests for the Python function definitions
  • End-to-end testing with tools like Playwright for critical user flows
  • Performance testing with large HTML documents

📊 Code Metrics

  • Total lines: 830 (well-organized for a single-file app)
  • CSS: Clean, modern, mobile-responsive (~350 lines)
  • JavaScript: ~350 lines of well-structured code
  • Python helpers: Robust utility functions for HTML processing

🎉 Conclusion

This is a high-quality implementation that successfully demonstrates the justhtml library's capabilities. The code is production-ready for its intended use case as a playground/demo tool. The minor security and performance considerations noted above are typical for this type of client-side application and don't detract from the overall excellent execution.

Recommendation: Approve - This PR adds valuable functionality with clean, well-structured code.


@simonw
Owner Author

simonw commented Dec 13, 2025

Couple of bug fixes:

[Attached images: IMG_1001, IMG_1002]

@simonw simonw merged commit 92e4884 into main Dec 13, 2025
4 checks passed
2 participants