"Your own personal internet archive" (website archiving / crawler), a self-hosted website time machine
🎭 Playwright integration for Scrapy
A package acting as a wrapper around the headless mode of existing web browsers to generate images from URLs and from HTML+CSS strings or files.
#Web crawler# Run Selenium with Python via GitHub Actions using headless or non-headless browsers!
Example of username and password proxy authentication for use in Selenium
#Web crawler# Scrapfly Python SDK for headless browsers and proxy rotation
#Web crawler# Web crawler and scraper based on Scrapy and Playwright's headless browser.
An embeddable headless browser package for Python that provides a simplified interface for interacting with web pages using Selenium and Selenium Hub.
🛰️ Python-based scraper that automates postal code lookups on the official Correos de Chile website. It simulates the public search form with autocomplete logic and returns clean, structured JSON—rea...
Smart Scraper: An AI-powered web scraping framework that uses headless browsers, asynchronous programming, and adaptive parsing to extract data efficiently from diverse websites. Includes a user-frien...
Simple web scraping of a public website using a browser in headless mode.
Python script that sends a DM to all the users that follow your X (formerly Twitter) account. With headless browser option and detailed debugging logs.
Convert the links in follower.js from the X (formerly Twitter) data archive, which consist only of user IDs, into a list of usernames
Easy-to-use API to scrape Google Lens results with proxy support, visual search extraction, and a built-in web UI for testing and debugging.
Script_spotycai is a Python-based tool that automates the process of downloading MP3 songs from Spotycai using Selenium. It scrapes album pages for song URLs and saves them locally, supporting headles...
A Python script that checks whether a password has been compromised using the Have I Been Pwned service. The script automates the process of querying the website and retrieving the results for the giv...