
scraper

firecrawl / firecrawl

#Web crawler# Firecrawl is an API service that crawls a URL and converts it into clean Markdown or structured data.

AI crawler data Markdown scraper html-to-markdown LLM rag scraping web-crawler ai-scraping webscraping

TypeScript 58.03 k

7 hours ago

huginn / huginn

#Automation# Your agents, standing by. Huginn is a web platform for building automated tasks.

automation notifications scraper webscraping feedgenerator RSS agent monitoring feed twitter-streaming huginn X (Twitter)

Ruby 47.45 k

7 days ago

NaiboWang / EasySpider

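#Front-end development# A visual no-code/code-free web crawler/spider. EasySpider: visual software for browser automation testing, data collection, and crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.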

code-free crawler GUI layman spider parameters Web input-parameters front-end HTML batch-processing batch-script visual visualization visualprogramming scraper data-collection rpa Robotics

JavaScript 42.55 k

1 month ago

iawia002 / lux

#Web crawler# A command-line video download tool written in Go.

downloader Go crawler scraper Video Bilibili YouTube youku iqiyi tumblr qq download

Go 30.43 k

3 days ago

cheeriojs / cheerio

#Web crawler# A server-side implementation of jQuery for parsing and manipulating HTML and XML.

cheerio jQuery htmlparser2 Document Object Model (DOM) htmlparser selector scraper Parser HTML Hacktoberfest

TypeScript 29.74 k

2 hours ago

feder-cr / Jobs_Applier_AI_Agent_AIHawk

#Web crawler# AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it lets users apply for multiple jobs in a tailored way.

automation Bot ChatGPT gpt job jobsearch jobseeker opeai Python resume scraper scraping application-resume Selenium Chrome human-resources jobs agent AI

Python 28.82 k

4 months ago

gocolly / colly

#Crawler framework# A fast and elegant crawler framework for Golang.

Go scraper framework crawler scraping crawling spider

Go 24.66 k

1 month ago

apify / crawlee

#Web crawler# Crawlee - A web scraping and browser automation library for Node.js.

web-scraping web-crawling npm headless-chrome Puppeteer automation apify scraping crawling crawler headless scraper web-crawler JavaScript Node.js Playwright TypeScript

TypeScript 19.51 k

17 hours ago

codelucas / newspaper

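#Web crawler# A Python data collection framework that automatically extracts metadata from news and articles, such as the title, keywords, author, summary, and body text.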

Python news crawler crawling scraper news-aggregator

HTML 14.78 k

1 month ago

Evil0ctal / Douyin_TikTok_Download_API

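#Web crawler# 🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.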

Python 14.31 k

6 months ago

getmaxun / maxun

#Web crawler# A visual crawler platform that collects data through point-and-click.

automation no-code scraper web-automation web-scraper web-scraping API browser browser-automation Playwright self-hosted website-to-api robotic-process-automation rpa no-code-web-scraper agents data-extraction webscraping

TypeScript 13.62 k

2 days ago

pwxcoo / chinese-xinhua

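#Web crawler# 📙 Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.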

data scraper chinese-traditional Python Chinese chinese-characters chinese-nlp chinese-language chinese-simplified json-dataset JSON json-data

Python 11.34 k

2 years ago

guyueyingmu / avbook

#Web crawler# AV movie management system; crawler for avmoo, javbus, and javlibrary; online AV video library; AV magnet-link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

javbus avmoo javlibrary spider crawler Laravel scraper adult magnet-link magnet database adult-video guzzlehttp

PHP 9.81 k

1 year ago

TeamWiseFlow / wiseflow

#Web crawler# Use LLMs to dig out what you care about from massive amounts of information and a variety of sources daily.

crawler information-gathering LLM scraper

Python 7.8 k

4 days ago

arc298 / instagram-scraper

#Web crawler# instagram-scraper is an Instagram scraper written in Python for downloading Instagram users' images and photos.

Instagram instagram-scraper instagram-user-photos Python scraper instagram-client instagram-api

Python 6.96 k

3 years ago

BruceDone / awesome-crawler

#Web crawler# A collection of awesome web crawlers and spiders in different languages

web-crawler crawler web-scraper spider scraper Awesome Lists

6.95 k

1 year ago

alirezamika / autoscraper

#Web crawler# A Smart, Automatic, Fast and Lightweight Web Scraper for Python

scraping scraper scrape webscraping crawler web-scraping AI Python webautomation automation machine-learning

Python 6.93 k

3 months ago

apify / crawlee-python

#Web crawler# Crawlee - A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...