robots-txt

PuerkitoBio / gocrawl

#Web crawler# Polite, slim and concurrent web crawler.

crawler, robots-txt
Go 2.05 k
4 years ago

eliasdabbas / advertools

advertools - online marketing productivity and analysis tools

marketing, advertising, Python, keywords, twitter-api, search engine optimization (SEO), social-media, YouTube, robots-txt, scrapy, Logging
Python 1.24 k
3 days ago

PuerkitoBio / fetchbot

#Web crawler# A simple and flexible web crawler that follows the robots.txt policies and crawl delays.

crawler, robots-txt
Go 790
4 years ago

nuxt-modules / robots

Tame the robots crawling and indexing your Nuxt site.

Nuxt.js, Vue.js, nuxt-module, robots-txt, ssr
TypeScript 478
6 days ago

thedaviddias / llms-txt-hub

🤖 The largest directory for AI-ready documentation and tools implementing the proposed llms.txt standard

directory, large language model (LLM), Next, robots-txt, Supabase, cursor, cursor-ai
TypeScript 321
2 months ago

temoto / robotstxt

The robots.txt exclusion protocol implementation for Go language

Go, golang-library, robots-txt, Web, production-ready, go-library
Go 274
3 years ago

TurnerSoftware / InfinityCrawler

#Web crawler# A simple but powerful web crawler library for .NET

crawler, web-crawler, web-crawling, robots-txt, spider
C# 252
2 years ago

crawler-commons / crawler-commons

A set of reusable Java components that implement functionality common to any web crawler

web-crawler, Java, robots-txt, Open Source, Library
Java 244
6 days ago

spatie / robots-txt

#Web crawler# Determine if a page may be crawled from robots.txt, robots meta tags and robot headers

PHP, robots-txt, crawler
PHP 239
24 days ago

GateNLP / ultimate-sitemap-parser

Ultimate Website Sitemap Parser

Python, sitemap, sitemap-xml, robots-txt, xml-sitemap
Python 219
7 days ago

alexjc / weboptout

Opt-Out tool to check Copyright reservations in a way that even machines can understand.

command-line-tool, robots-txt, webscraping, terms-of-service, DataOps, copyright
Python 194
1 year ago

beb7 / gflare-tk

#Web crawler# Open-Source Python Based SEO Web Crawler

search engine optimization (SEO), crawler, scraper, Python, tkinter, robots-txt
Python 173
2 years ago

samclarke / robots-parser

NodeJS robots.txt parser with support for wildcard (*) matching.

user-agent, JavaScript, Node.js, robots-txt
JavaScript 156
8 months ago

healsdata / ai-training-opt-out

Known tags and settings suggested to opt out of having your content used for AI training.

artificial intelligence, meta, robots-txt
HTML 148
1 year ago

alextim / astro-lib

Makes it easy to add robots.txt, sitemap and web app manifest during build to your Astro app.

Astro, search engine optimization (SEO), robots-txt, sitemap, sitemap-xml
TypeScript 119
2 years ago

seantomburke / sitemapper

#Web crawler# Parse through any sitemap in Node.js

sitemap, sitemap-xml, Parsing, JavaScript, crawler, crawling, indexing, robots-txt, search engine optimization (SEO), Web, XML
TypeScript 119
14 days ago

jimsmart / grobotstxt

grobotstxt is a native Go port of Google's robots.txt parser and matcher library.

Go, robots-txt
Go 110
3 years ago

mdreizin / gatsby-plugin-robots-txt

Gatsby plugin that automatically creates robots.txt for your site

gatsby, gatsby-plugin, robots-txt
JavaScript 106
1 year ago

samber / the-great-gpt-firewall

#Web crawler# 🤖 A curated list of websites that restrict access to AI Agents, AI crawlers and GPTs

agent, anthropic, blocklist, censorship, crawler, genai, generative-ai, gpt, gpt-4, large language model (LLM), openai, robots-txt, user-agent, firewall
Python 91
14 days ago

jonasjacek / robots.txt

#Search# Simple robots.txt template. Keep unwanted robots out (disallow). White lists (allow) legitimate user-agents. Useful for all websites.

robots-txt, user-agent, search engine optimization (SEO), search engine, whitelist, crawlers, web-crawling, crawling
85
4 months ago