Scraperr – A Self Hosted Webscraper

Scraperr is a self-hosted web scraper that is drawing interest in the tech community. Commenters compare it with other scraping and crawling tools such as Xidel and Crawler Buddy, which some users prefer for specific features like link following or JSON output. The discussion also weighs Selenium against Playwright, with a notable lean toward Playwright for its ease of use and flexibility. Others raise concerns about browser fingerprinting and bot detection, noting that changing the user-agent string alone may not be enough against advanced detection strategies. Some users would like Scraperr to support Markdown output, which would make it easier to feed results into other tools, especially for embedding or LLM use cases.
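
To make the Markdown-output request concrete, here is a rough sketch of what converting scraped HTML to Markdown could look like. It is not Scraperr's code or API: the class MarkdownSketch and the function html_to_markdown are hypothetical names, it relies only on Python's standard-library html.parser, and it handles just headings, paragraphs, list items, and links. A real tool would probably need a fuller converter.

```python
from html.parser import HTMLParser
import re


class MarkdownSketch(HTMLParser):
    """Hypothetical helper: maps a small subset of HTML tags to Markdown."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self):
        super().__init__()
        self.parts = []   # output fragments, joined at the end
        self.href = None  # target of the link currently being read

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.parts.append(self.HEADINGS[tag])
        elif tag == "li":
            self.parts.append("- ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag == "a":
            self.parts.append(f"]({self.href or ''})")
            self.href = None
        elif tag in self.HEADINGS or tag in ("p", "li"):
            self.parts.append("\n")  # block elements end the current line

    def handle_data(self, data):
        # Collapse whitespace runs but keep single spaces around inline links.
        text = re.sub(r"\s+", " ", data)
        if text.strip():
            self.parts.append(text)


def html_to_markdown(html):
    """Return a Markdown approximation of the given HTML fragment."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    lines = [line.strip() for line in "".join(parser.parts).splitlines()]
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = (
        '<h1>Title</h1><p>See <a href="https://example.com">docs</a> now.</p>'
        "<ul><li>one</li><li>two</li></ul>"
    )
    print(html_to_markdown(sample))
```

Output in this shape is plain text with light structure, which is the kind of input the commenters had in mind for embedding or LLM pipelines.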