#scraping

scrapy/scrapy
Python 58.26 k
3 days ago
https://static.github-zh.com/github_avatars/firecrawl?size=40

#web crawler# Firecrawl is an API service that crawls a URL and converts it into clean markdown or structured data.

TypeScript 58.03 k
7 hours ago
https://static.github-zh.com/github_avatars/feder-cr?size=40

#web crawler# AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in a tailored way.

Python 28.82 k
4 months ago
https://static.github-zh.com/github_avatars/gocolly?size=40
Go 24.66 k
1 month ago
soxoj/maigret

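#web crawler# Maigret is an OSINT username checker. Enter a target username and it collects information about that user from major social networking sites. Forked from the open-source sherlock project.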

Python 17.62 k
3 days ago
https://static.github-zh.com/github_avatars/ultrafunkamsterdam?size=40
Python 11.76 k
2 months ago
https://static.github-zh.com/github_avatars/code4craft?size=40

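#web crawler# webmagic is an open-source Java vertical crawler framework that aims to simplify crawler development so that developers can focus on their own logic. Its core is very simple yet covers the entire crawling workflow, which also makes it good material for learning crawler development.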

Java 11.63 k
1 month ago
D4Vinci/Scrapling
Python 7.31 k
5 hours ago
https://static.github-zh.com/github_avatars/tabulapdf?size=40

#web crawler# Tabula is a tool for liberating data tables trapped inside PDF files.

CSS 7.18 k
6 months ago
https://static.github-zh.com/github_avatars/apify?size=40

#web crawler# Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...

Python 6.31 k
14 hours ago
https://static.github-zh.com/github_avatars/autoscrape-labs?size=40
Python 5.25 k
21 days ago
https://static.github-zh.com/github_avatars/adbar?size=40
Python 4.68 k
6 days ago
https://static.github-zh.com/github_avatars/sparklemotion?size=40

#web crawler# Mechanize is a Ruby library that makes automated web interaction easy.

Ruby 4.42 k
2 months ago