GitHub Chinese Community
#crawling

scrapy / scrapy

#Crawler framework# A popular, efficient Python crawling framework with a rich ecosystem

Python, scraping, crawling, framework, crawler, Hacktoberfest, web-scraping, web-scraping-python
Python 57.07 k
1 day ago
gocolly / colly

#Crawler framework# A fast and elegant crawling framework for Golang

Go, scraper, framework, crawler, scraping, crawling, spider
Go 24.32 k
5 days ago
apify / crawlee

#Web crawler# Crawlee: a web scraping and browser automation library for Node.js

web-scraping, web-crawling, npm, headless-chrome, Puppeteer, automation, apify, scraping, crawling, crawler, headless, scraper, web-crawler, JavaScript, Node.js, Playwright, TypeScript
TypeScript 17.92 k
2 days ago
codelucas / newspaper

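#Web crawler# A Python data collection framework that automatically extracts metadata from news and articles, such as title, keywords, authors, summary, and body text

Python, news, crawler, crawling, scraper, news-aggregator
HTML 14.61 k
3 months ago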
lorien / awesome-web-scraping

#Web crawler# List of libraries, tools and APIs for web scraping and data processing.

web-scraping, captcha-recaptcha, crawling, scraping, scraping-framework, scraping-python, scraping-tool, webscraping, crawler, spider
Makefile 7.04 k
6 months ago
go-rod / rod

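#Web crawler# Rod is a high-level driver built directly on the DevTools Protocol. It is designed for web automation and scraping, and suits both high-level and low-level use: senior developers can use the low-level packages and functions to easily customize or build their own version of Rod, while the high-level functions are just examples of how to build a default version of Rod.

cdp, chrome-headless, chrome-devtools, chrome-devtools-protocol, headless, web-scraping, automation, scraper, devtools, devtools-protocol, rod, Go, Testing, Web, gorod, crawling
Go 5.98 k
6 months ago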
MontFerret / ferret

#Web crawler# Declarative web scraping

Go, query-language, data-mining, scraping, scraping-websites, dsl, cdp, crawling, scraper, crawler, Chrome, CLI, tool, Library
Go 5.82 k
4 days ago
apify / crawlee-python

#Web crawler# Crawlee: A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...

apify, automation, beautifulsoup, crawler, crawling, headless, headless-chrome, pip, Playwright, Python, scraper, scraping, web-crawler, web-crawling, web-scraping, Hacktoberfest
Python 5.73 k
3 days ago
yujiosaka / headless-chrome-crawler

#Web crawler# Distributed crawler powered by Headless Chrome

headless-chrome, Puppeteer, jQuery, crawler, crawling, scraper, scraping, Chrome, Chromium, Promise
JavaScript 5.58 k
2 years ago
D4Vinci / Scrapling

#Web crawler# 🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

crawler, crawling, Hacktoberfest, Playwright, Python, scraping, selectors, stealth-game, web-scraper, web-scraping, web-scraping-python, webscraping, xpath, automation, AI, ai-scraping, data, data-extraction
Python 5.4 k
15 days ago
hakluke / hakrawler

#Web crawler# Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application

Bug Bounty, crawling, Hacking, OSINT, pentesting, Reconnaissance, reconnaissance
Go 4.76 k
6 months ago
hardkoded / puppeteer-sharp

#Web crawler# Headless Chrome .NET API

Puppeteer, Chrome, Chromium, automation, crawler, crawling, C#, e2e, e2e-testing, webautomation
C# 3.66 k
4 days ago
apache / nutch

#Web crawler# Apache Nutch is an extensible and scalable web crawler

Java, nutch, web-crawler, crawling, hadoop, apache
Java 3.03 k
3 months ago
ai-robots-txt / ai.robots.txt

#Web crawler# A list of AI agents and robots to block.

AI, crawlers, crawling, privacy
Python 2.75 k
5 days ago
transitive-bullshit / awesome-puppeteer

#Web crawler# A curated list of awesome puppeteer resources.

Puppeteer, headless-chrome, Awesome Lists, scraping, crawling, automation
2.48 k
1 year ago
lorien / grab

#Web crawler# Web Scraping Framework

web-scraping, http-client, framework, Python, pycurl, asynchronous, Network, urllib3, spider, crawler, crawling, scraping, python-library
Python 2.4 k
1 year ago
zorlan / skycaiji

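#Web crawler# SkyCaiji (蓝天采集器) is a free, open-source crawler system. Data can be collected simply by pointing and clicking to edit rules. It runs locally, on virtual hosts, or on cloud servers, can collect almost any type of web page, integrates seamlessly with various CMS site builders, publishes data in real time without login, and is fully automatic with no manual intervention. It is a fully cross-platform cloud crawler system for large-scale web data collection.

crawler, crawling, spider, webcrawler, PHP
PHP 2.01 k
12 days ago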
edoardottt / cariddi

#Web crawler# Take a list of domains, crawl URLs and scan for endpoints, secrets, API keys, file extensions, tokens and more

endpoints, endpoint-discovery, Bug Bounty, crawler, secret-keys, secrets-detection, Cybersecurity, reconnaissance, Reconnaissance, crawling, Go, pentesting, security, OSINT, penetration-testing, scraper, Hacktoberfest, redteam
Go 1.73 k
1 month ago
NateScarlet / holiday-cn

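#Web crawler# 📅🇨🇳 Chinese statutory holiday data, automatically scraped daily from State Council announcements

data, natural language processing, crawling, holiday, china
Python 1.51 k
2 days ago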
roach-php / core

#Web crawler# The complete web scraping toolkit for PHP.

PHP, web-scraping, crawling
PHP 1.41 k
13 days ago