GitHub Chinese Community
#web-archiving

ArchiveBox / ArchiveBox

"Your own personal internet archive" (website archiving / crawler), a self-hosted time machine for websites

pocket, wget, browser-bookmarks, pinboard, Chromium, Firefox, backups, RSS, web-archiving, Python, wayback-machine, youtube-dl, self-hosted, headless-browser, digipres, warc
Python 24.05 k
1 month ago
webrecorder / pywb

Core Python Web Archiving Toolkit for replay and recording of web archives

Python, web-archiving
JavaScript 1.51 k
1 month ago
Rhizome-Conifer / conifer

Collect and revisit web pages.

web-archiving, archives, Python, Docker, warc
Python 1.5 k
5 months ago
webrecorder / archiveweb.page

A High-Fidelity Web Archiving Extension for Chrome and Chromium based browsers!

Chromium, extension, web-archiving, archiving, browser-extension, warc
TypeScript 1.02 k
15 days ago
gildas-lormeau / single-file-cli

#Web crawler# CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)

CLI, Node.js, single-file, web-archiving, web-scraper, web-scraping, archiving, scraping-websites, crawler, web-crawler, Deno, Dockerfile
JavaScript 849
13 days ago
Ray-D-Song / web-archive

Free web archiving and sharing service based on Cloudflare.

Cloudflare, cloudflare-pages, d1, free, Hono, self-hosted, Serverless, web-archive, web-archiving
TypeScript 831
9 days ago
webrecorder / browsertrix-crawler

#Web crawler# Run a high-fidelity browser-based web archiving crawler in a single Docker container

crawler, crawling, warc, web-archiving, web-crawler
TypeScript 802
4 days ago
webrecorder / replayweb.page

Serverless replay of web archives directly in the browser

web-archiving, web-archive, wayback-machine, warc, service-worker
TypeScript 801
12 days ago
bellingcat / auto-archiver

#Web crawler# Automatically archive links to videos, images, and social media content from Google Sheets (and more).

archive, Docker, open-source-research, Python, service, scraping, web-archiving
Python 709
4 days ago
oduwsdl / ipwb

InterPlanetary Wayback: A distributed and persistent archive replay system using IPFS

IPFS, warc, web-archiving, Python, service-worker, Docker
Python 638
1 month ago
akamhy / waybackpy

Wayback Machine API interface & a command-line tool

internet-archive, wayback-machine, web-archiving, OSINT
Python 532
1 year ago
harvard-lil / perma

Indelible links

web-archiving, Library
JavaScript 468
19 days ago
webrecorder / webrecorder-player

Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder)

warc, Electron, web-archiving
JavaScript 446
5 years ago
rahiel / archiveror

Archiveror will help you preserve the webpages you love. 💾

archiving, WebExtension, browser-extension, web-archiving, Firefox extension, Chrome extension, JavaScript, bookmark
JavaScript 443
6 years ago
oduwsdl / archivenow

A Tool To Push Web Resources Into Web Archives

web-archiving, internet-archive
Python 421
1 year ago
webrecorder / warcio

Streaming WARC/ARC library for fast web archive IO

web-archiving, warc, Python
Python 416
6 months ago
Florents-Tselai / WarcDB

#Web crawler# WarcDB: Web crawl data as SQLite databases.

crawling, SQLite, warc, CLI, database, web-archiving
Python 398
1 year ago
machawk1 / wail

🐋 Web Archiving Integration Layer: One-Click User Instigated Preservation

web-archiving, Python, GUI, warc, pyinstaller
Roff 374
3 months ago
ArchiveBox / archivebox-browser-extension

Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox.

Chrome extension, Firefox extension, Svelte, archiving, browser-extension, digipres, web-archiving
JavaScript 318
1 month ago
webrecorder / browsertrix

Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!

archiving, cloud, warc, web-archive, web-archiving, Kubernetes
TypeScript 280
4 days ago