GitHub Chinese Community

web-scraping

scrapy / scrapy

#crawler framework# A popular, efficient Python crawling framework with a rich ecosystem

Python, scraping, crawling, framework, crawler, Hacktoberfest, web-scraping, web-scraping-python
Python 57.07 k
1 day ago
dgtlmoon / changedetection.io

changedetection.io is a tool for monitoring changes to web page content, with notifications sent via API, email, messaging, and other channels

website-monitor, website-monitoring, change-detection, monitoring, self-hosted, change-alert, change-monitoring, website-change-monitor, url-monitor, changedetection, website-change-detector, website-change-detection, website-change-tracker, website-change-notification, notifications, web-scraping, restock-monitor, website-defacement-monitoring, back-in-stock, website-watcher
Python 24.45 k
4 days ago
ScrapeGraphAI / Scrapegraph-ai

#web crawler# Python scraper based on AI

scraping, scraping-python, automated-scraper, large language model, artificial intelligence, web-crawler, web-scraping, ai-scraping, crawler, html-to-markdown, Markdown, rag
Python 20 k
2 days ago
apify / crawlee

#web crawler# Crawlee - a web scraping and browser automation library for Node.js

web-scraping, web-crawling, npm, headless-chrome, Puppeteer, automation, apify, scraping, crawling, crawler, headless, scraper, web-crawler, JavaScript, Node.js, Playwright, TypeScript
TypeScript 17.92 k
2 days ago
Evil0ctal / Douyin_TikTok_Download_API

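#web crawler# 🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.

Python, pywebio, TikTok, douyin, API, scraper, FastAPI, no-watermark, online-parsing, async, douyin-tiktok-api, douyin-tiktok-download, crawler, spider, web-scraping, tiktok-scraper, douyin-scraper, douyin-api, tiktok-api, tiktok-signature
Python 13.03 k
3 months ago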
getmaxun / maxun

#web crawler# A visual scraping platform that collects data through point-and-click

automation, no-code, scraper, web-automation, web-scraper, web-scraping, API, browser, browser-automation, Playwright, self-hosted, website-to-api, robotic-process-automation, rpa, no-code-web-scraper, agents, web-agent, data-extraction, web-scraping-agent, webscraping
TypeScript 13.03 k
2 days ago
seleniumbase / SeleniumBase

SeleniumBase is a Python browser automation library for web automation, testing, and CAPTCHA bypass

Python, Selenium, webdriver, selenium-python, e2e-testing, seleniumbase, pytest-plugin, web-automation, pytest, WebKit, chromedriver, anti-detection, bot-detection, cloudflare-bypass, web-scraping-python, python-scraper, web-scraping, Test automation, cdp, behave
Python 11.11 k
3 days ago
mherrmann / helium

helium is a Python library for automating browsers such as Chrome and Firefox

Selenium, selenium-python, Python, webdriver, Chrome, Firefox, web-automation, web-scraping, helium
Python 7.88 k
2 months ago
lorien / awesome-web-scraping

#web crawler# List of libraries, tools and APIs for web scraping and data processing.

web-scraping, captcha-recaptcha, crawling, scraping, scraping-framework, scraping-python, scraping-tool, webscraping, crawler, spider
Makefile 7.04 k
6 months ago
alirezamika / autoscraper

#web crawler# A Smart, Automatic, Fast and Lightweight Web Scraper for Python

scraping, scraper, scrape, webscraping, crawler, web-scraping, artificial intelligence, Python, webautomation, automation, machine learning
Python 6.79 k
6 days ago
go-rod / rod

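#web crawler# Rod is a high-level driver built directly on the DevTools Protocol. It is designed for web automation and scraping, for both high-level and low-level use. Senior developers can use the low-level packages and functions to easily customize or build their own version of Rod; the high-level functions are just examples of how to build a default version of Rod.

cdp, chrome-headless, chrome-devtools, chrome-devtools-protocol, headless, web-scraping, automation, scraper, devtools, devtools-protocol, rod, Go, Testing, Web, gorod, crawling
Go 5.98 k
6 months ago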
apify / crawlee-python

#web crawler# Crawlee: A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...

apify, automation, beautifulsoup, crawler, crawling, headless, headless-chrome, pip, Playwright, Python, scraper, scraping, web-crawler, web-crawling, web-scraping, Hacktoberfest
Python 5.73 k
3 days ago
D4Vinci / Scrapling

#web crawler# 🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

crawler, crawling, Hacktoberfest, Playwright, Python, scraping, selectors, stealth-game, web-scraper, web-scraping, web-scraping-python, webscraping, xpath, automation, artificial intelligence, ai-scraping, data, data-extraction
Python 5.4 k
15 days ago
adbar / trafilatura

#web crawler# Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

web-scraping, text-extraction, natural language processing, text-mining, crawler, text-preprocessing, article-extractor, readability, scraping, html-to-markdown, corpus-tools, rss-feed, news-aggregator, rag, large language model
Python 4.36 k
16 days ago
lexiforest / curl_cffi

Python binding for the curl-impersonate fork via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP2 fingerprints.

cURL, http-client, HTTP, ja3, ja3-fingerprint, fingerprinting, web-scraping
Python 3.74 k
4 days ago
mendableai / firecrawl-mcp-server

Official Firecrawl MCP Server - Adds powerful web scraping to Cursor, Claude and any other LLM clients.

batch-processing, claude, content-extraction, data-collection, firecrawl, firecrawl-ai, llm-tools, mcp-server, model-context-protocol, search-api, web-crawler, web-scraping, javascript-rendering
JavaScript 3.44 k
11 days ago
snooppr / snoop

#web crawler# Snoop: an intelligence-gathering tool based on open data (OSINT world)

OSINT, Termux, username-search, username-checker, pentest, web-scraping, ctf, scanner, redteam, blueteam, Cybersecurity, security, nickname, ip, geo, police, Parser, scraping, geocoder, username
Python 3.37 k
10 days ago
jaypyles / Scraperr

#web crawler# Self-hosted web scraper.

Open Source, self-hosted, webscraper, Docker, helm, Kubernetes, Playwright, Python, scraping, web-scraper, web-scrapers, web-scraping, webscraping
TypeScript 3.34 k
6 days ago
php-curl-class / php-curl-class

PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs

PHP, cURL, class, API, client, framework, HTTP, http-client, http-proxy, JSON, php-curl, proxy, requests, restful, web-scraper, web-scraping, XML
PHP 3.29 k
5 days ago
x4nth055 / pythoncode-tutorials

#natural language processing# The Python Code Tutorials

Python, Scapy, ethical-hacking, network-programming, network-security, network-analysis, python-tutorials, tutorial, machine learning, text-classification, socket-programming, face-detection, computer vision, programming-tutorial, natural language processing, web-scraping
Jupyter Notebook 2.85 k
16 days ago