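remove(instagram): remove the Instagram channel

Instagram's anti-scraping crackdown has broken every open-source tool
(instaloader and others); none of them work reliably, with or without cookies.

- Delete the instagram.py channel file
- Remove CLI commands such as search-instagram and configure instagram-cookies
- Remove the instaloader dependency check from setup/doctor
- Update README, docs, SKILL.md, and pyproject.toml

Upstream issues: instaloader#2585, instaloader#2648
Relates to: #13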
Panniantong 2026-02-26 07:20:13 +01:00
parent c3a9813b1c
commit f70711e75e
11 changed files with 21 additions and 370 deletions
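
A removal that touches 11 files is easy to leave incomplete. As a sanity check, a reviewer could scan the working tree for leftover mentions. The sketch below is a hypothetical helper: the name `find_leftovers`, the keyword list, and the extension set are assumptions made for illustration, not part of the repository. Hits in the changelog are expected, since this commit keeps a struck-through Instagram entry there.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

# Hypothetical defaults; adjust them to the project being checked.
KEYWORDS = ("instagram", "instaloader")
EXTENSIONS = {".py", ".md", ".toml"}


def find_leftovers(
    root: str, keywords: Iterable[str] = KEYWORDS
) -> List[Tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for each keyword hit."""
    needles = [k.lower() for k in keywords]
    base = Path(root)
    hits = []
    for path in sorted(base.rglob("*")):
        # Only look at regular files with one of the chosen extensions.
        if not path.is_file() or path.suffix not in EXTENSIONS:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            lowered = line.lower()
            if any(needle in lowered for needle in needles):
                hits.append((path.relative_to(base).as_posix(), lineno, line.strip()))
    return hits
```

Calling `find_leftovers(".")` from the repository root and printing each tuple gives a quick list of files that still mention the removed channel.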


@ -10,13 +10,10 @@ All notable changes to this project will be documented in this file.
### 🆕 New Channels / 新增渠道
#### 📷 Instagram
- Read public posts and profiles via [instaloader](https://github.com/instaloader/instaloader)
- Search via Exa (free, no API key)
- Optional cookie login for private content
- 通过 instaloader 读取公开帖子和 Profile
- 搜索通过 Exa(免费,无需 API Key)
- 可选 Cookie 登录解锁私密内容
#### ~~📷 Instagram~~ (removed — upstream blocked)
- ~~Read public posts and profiles via [instaloader](https://github.com/instaloader/instaloader)~~
- **Removed:** Instagram's aggressive anti-scraping measures broke all available open-source tools (instaloader, etc.). See [instaloader#2585](https://github.com/instaloader/instaloader/issues/2585). Will re-add when upstream recovers.
- **已移除:** Instagram 反爬封杀导致所有开源工具(instaloader 等)失效。上游恢复后会重新加回。
#### 💼 LinkedIn
- Read person profiles, company pages, and job details via [linkedin-scraper-mcp](https://github.com/stickerdaniel/linkedin-mcp-server)
@ -38,12 +35,12 @@ All notable changes to this project will be documented in this file.
- Channel count: 9 → 12
- `agent-reach doctor` now detects all 12 channels
- CLI: added `search-instagram`, `search-linkedin`, `search-bosszhipin` subcommands
- CLI: added `search-linkedin`, `search-bosszhipin` subcommands
- Updated install guide with setup instructions for new channels
- 渠道数量:9 → 12
- `agent-reach doctor` 现在检测全部 12 个渠道
- CLI:新增 `search-instagram`、`search-linkedin`、`search-bosszhipin` 子命令
- 安装指南新增三个渠道配置说明
- 渠道数量:9 → 11
- `agent-reach doctor` 现在检测全部 11 个渠道
- CLI:新增 `search-linkedin`、`search-bosszhipin` 子命令
- 安装指南新增渠道配置说明
---


@ -69,7 +69,6 @@ AI Agent 已经能帮你写代码、改文档、管项目——但你让它去
| 📺 **B站** | 本地:字幕提取 + 搜索 | 服务器也能用 | 告诉 Agent「帮我配代理」 |
| 📖 **Reddit** | 搜索(通过 Exa 免费) | 读帖子和评论 | 告诉 Agent「帮我配代理」 |
| 📕 **小红书** | — | 阅读、搜索、发帖、评论、点赞 | 告诉 Agent「帮我配小红书」 |
| 📷 **Instagram** | 搜索(通过 Exa 免费) | 读取帖子和 Profile | 告诉 Agent「帮我配 Instagram」 |
| 💼 **LinkedIn** | Jina Reader 读公开页面 | Profile 详情、公司页面、职位搜索 | 告诉 Agent「帮我配 LinkedIn」 |
| 🏢 **Boss直聘** | Jina Reader 读职位页 | 搜索职位、向 HR 打招呼 | 告诉 Agent「帮我配 Boss直聘」 |
@ -148,7 +147,6 @@ channels/
├── bilibili.py → yt-dlp ← 可以换成 bilibili-api……
├── reddit.py → JSON API + Exa ← 可以换成 PRAW、Pushshift……
├── xiaohongshu.py → mcporter MCP ← 可以换成其他 XHS 工具……
├── instagram.py → instaloader ← 可以换成 instagrapi、官方 API……
├── linkedin.py → linkedin-mcp ← 可以换成 LinkedIn API……
├── bosszhipin.py → mcp-bosszp ← 可以换成其他招聘工具……
├── rss.py → feedparser ← 可以换成 atoma……
@ -167,7 +165,6 @@ channels/
| GitHub | [gh CLI](https://cli.github.com) | 官方工具,认证后完整 API 能力 |
| 读 RSS | [feedparser](https://github.com/kurtmckee/feedparser) | Python 生态标准选择,2.3K Star |
| 小红书 | [xiaohongshu-mcp](https://github.com/xpzouying/xiaohongshu-mcp) | ⭐9K+,Go 语言,Docker 一键部署 |
| Instagram | [instaloader](https://github.com/instaloader/instaloader) | ⭐9.8K,Python CLI,Cookie 登录,免费 |
| LinkedIn | [linkedin-scraper-mcp](https://github.com/stickerdaniel/linkedin-mcp-server) | ⭐900+,MCP 服务,浏览器自动化 |
| Boss直聘 | [mcp-bosszp](https://github.com/mucsbr/mcp-bosszp) | MCP 服务,支持职位搜索和打招呼 |
@ -189,7 +186,7 @@ Agent Reach 在设计上重视安全:
### 🍪 Cookie 安全建议
需要 Cookie 的平台(Twitter、小红书、Instagram)建议使用**专用小号**,不要用主账号。Cookie 等同于完整登录权限,用小号可以在凭据泄露时限制影响范围。
需要 Cookie 的平台(Twitter、小红书)建议使用**专用小号**,不要用主账号。Cookie 等同于完整登录权限,用小号可以在凭据泄露时限制影响范围。
### 📦 安装方式
@ -268,14 +265,14 @@ Yes! Agent Reach is a standard CLI tool — any AI coding agent that can run she
<details>
<summary><strong>Is this free? Any API costs?</strong></summary>
100% free. All backends are open-source tools (bird CLI, yt-dlp, Jina Reader, instaloader, Exa, etc.) that don't require paid API keys. The only optional cost is a residential proxy (~$1/month) if you need Reddit/Bilibili access from a server.
100% free. All backends are open-source tools (bird CLI, yt-dlp, Jina Reader, Exa, etc.) that don't require paid API keys. The only optional cost is a residential proxy (~$1/month) if you need Reddit/Bilibili access from a server.
</details>
---
## 致谢
[Jina Reader](https://github.com/jina-ai/reader) · [yt-dlp](https://github.com/yt-dlp/yt-dlp) · [bird](https://www.npmjs.com/package/@steipete/bird) · [Exa](https://exa.ai) · [mcporter](https://github.com/steipete/mcporter) · [feedparser](https://github.com/kurtmckee/feedparser) · [xiaohongshu-mcp](https://github.com/xpzouying/xiaohongshu-mcp) · [instaloader](https://github.com/instaloader/instaloader) · [linkedin-scraper-mcp](https://github.com/stickerdaniel/linkedin-mcp-server) · [mcp-bosszp](https://github.com/mucsbr/mcp-bosszp)
[Jina Reader](https://github.com/jina-ai/reader) · [yt-dlp](https://github.com/yt-dlp/yt-dlp) · [bird](https://www.npmjs.com/package/@steipete/bird) · [Exa](https://exa.ai) · [mcporter](https://github.com/steipete/mcporter) · [feedparser](https://github.com/kurtmckee/feedparser) · [xiaohongshu-mcp](https://github.com/xpzouying/xiaohongshu-mcp) · [linkedin-scraper-mcp](https://github.com/stickerdaniel/linkedin-mcp-server) · [mcp-bosszp](https://github.com/mucsbr/mcp-bosszp)
## License


@ -20,7 +20,6 @@ from .rss import RSSChannel
from .bilibili import BilibiliChannel
from .exa_search import ExaSearchChannel
from .xiaohongshu import XiaoHongShuChannel
from .instagram import InstagramChannel
from .linkedin import LinkedInChannel
from .bosszhipin import BossZhipinChannel
@ -33,7 +32,6 @@ ALL_CHANNELS: List[Channel] = [
RedditChannel(),
BilibiliChannel(),
XiaoHongShuChannel(),
InstagramChannel(),
LinkedInChannel(),
BossZhipinChannel(),
RSSChannel(),


@ -1,248 +0,0 @@
# -*- coding: utf-8 -*-
"""Instagram — via instaloader (free, open source).

Backend: instaloader (9.8K stars, Python CLI + library)
Swap to: any Instagram access tool
"""

import re
import shutil
import subprocess
from pathlib import Path
from urllib.parse import urlparse
from .base import Channel, ReadResult, SearchResult
from typing import List


class InstagramChannel(Channel):
    name = "instagram"
    description = "Instagram 帖子和 Profile"
    backends = ["instaloader"]
    tier = 2  # Needs login for full access

    def can_handle(self, url: str) -> bool:
        domain = urlparse(url).netloc.lower()
        return "instagram.com" in domain or "instagr.am" in domain

    def check(self, config=None):
        # Check both CLI and Python module
        has_cli = shutil.which("instaloader")
        has_module = False
        try:
            import instaloader
            has_module = True
        except ImportError:
            pass

        if not has_cli and not has_module:
            return "off", (
                "需要安装 instaloader:pip install instaloader\n"
                " 安装后可读取 Instagram 帖子和 Profile\n"
                " 登录: agent-reach configure instagram-cookies \"sessionid=xxx; csrftoken=yyy; ...\""
            )

        # Check if cookies are configured
        cookie_file = Path.home() / ".agent-reach" / "instagram-cookies.txt"
        if cookie_file.exists():
            return "ok", "已登录,可读取 Instagram 帖子和 Profile"
        return "ok", "可读取公开帖子和 Profile。登录可访问更多内容:\n agent-reach configure instagram-cookies \"sessionid=xxx; csrftoken=yyy; ...\""

    async def read(self, url: str, config=None) -> ReadResult:
        # Try instaloader (module or CLI)
        try:
            import instaloader
            return await self._read_instaloader(url, config)
        except ImportError:
            pass

        # Fallback: Jina Reader
        return await self._read_jina(url)

    async def _read_instaloader(self, url: str, config=None) -> ReadResult:
        """Read Instagram content using instaloader Python API."""
        import asyncio
        import concurrent.futures

        def _sync_read():
            import instaloader
            L = instaloader.Instaloader(
                download_pictures=False,
                download_videos=False,
                download_video_thumbnails=False,
                download_geotags=False,
                download_comments=False,
                save_metadata=False,
                compress_json=False,
                max_connection_attempts=1,  # Don't retry on rate limit
            )

            # Try to load session: cookie file > saved session
            cookie_file = Path.home() / ".agent-reach" / "instagram-cookies.txt"
            if cookie_file.exists():
                try:
                    cookie_str = cookie_file.read_text().strip()
                    cookies = {}
                    for part in cookie_str.split(";"):
                        part = part.strip()
                        if "=" in part:
                            k, v = part.split("=", 1)
                            cookies[k.strip()] = v.strip()
                    if "sessionid" in cookies and "csrftoken" in cookies:
                        # Extract username from ds_user_id or use generic
                        username = cookies.get("ds_user_id", "user")
                        L.context.load_session(username, cookies)
                except Exception:
                    pass
            elif config and config.get("instagram_username"):
                try:
                    L.load_session_from_file(config.get("instagram_username"))
                except Exception:
                    pass

            path = urlparse(url).path.strip("/")
            if "/p/" in url or "/reel/" in url:
                return self._read_post_sync(L, url, path)
            else:
                return self._read_profile_sync(L, url, path)

        try:
            # Run with 15s timeout to avoid instaloader's 30-min retry
            loop = asyncio.get_event_loop()
            with concurrent.futures.ThreadPoolExecutor() as pool:
                result = await asyncio.wait_for(
                    loop.run_in_executor(pool, _sync_read),
                    timeout=15,
                )
            return result
        except (asyncio.TimeoutError, Exception):
            # Any error or timeout → Jina fallback
            return await self._read_jina(url)

    def _read_post_sync(self, L, url: str, path: str) -> ReadResult:
        """Read a single Instagram post (sync, runs in executor)."""
        import instaloader

        # Extract shortcode from URL
        match = re.search(r"/(?:p|reel)/([A-Za-z0-9_-]+)", url)
        if not match:
            raise ValueError("Cannot extract shortcode from URL")
        shortcode = match.group(1)

        try:
            post = instaloader.Post.from_shortcode(L.context, shortcode)
            lines = []
            if post.caption:
                lines.append(post.caption)
                lines.append("")
            lines.append(f"👤 @{post.owner_username}")
            lines.append(f"❤️ {post.likes} likes")
            if post.comments:
                lines.append(f"💬 {post.comments} comments")
            lines.append(f"📅 {post.date_utc.strftime('%Y-%m-%d %H:%M')}")
            if post.location:
                lines.append(f"📍 {post.location}")
            if post.hashtags:
                lines.append(f"#️⃣ {' '.join('#' + h for h in post.hashtags)}")

            return ReadResult(
                title=f"@{post.owner_username}: {(post.caption or '')[:80]}",
                content="\n".join(lines),
                url=url,
                author=f"@{post.owner_username}",
                date=post.date_utc.strftime("%Y-%m-%d"),
                platform="instagram",
                extra={"likes": post.likes, "comments": post.comments},
            )
        except Exception:
            raise  # Let executor timeout handle fallback

    def _read_profile_sync(self, L, url: str, path: str) -> ReadResult:
        """Read an Instagram profile (sync, runs in executor)."""
        import instaloader

        # Extract username from path
        username = path.split("/")[0] if path else ""
        if not username or username in ("p", "reel", "stories", "explore"):
            raise ValueError("Cannot extract username from URL")

        try:
            profile = instaloader.Profile.from_username(L.context, username)
            lines = []
            lines.append(f"👤 {profile.full_name} (@{profile.username})")
            if profile.biography:
                lines.append(f"📝 {profile.biography}")
            if profile.external_url:
                lines.append(f"🔗 {profile.external_url}")
            lines.append("")
            lines.append(f"📊 {profile.mediacount} posts · "
                         f"{profile.followers} followers · "
                         f"{profile.followees} following")
            if profile.is_verified:
                lines.append("✅ Verified")
            if profile.is_business_account and profile.business_category_name:
                lines.append(f"🏢 {profile.business_category_name}")

            # Get recent posts (up to 5)
            lines.append("")
            lines.append("📸 Recent posts:")
            count = 0
            for post in profile.get_posts():
                if count >= 5:
                    break
                caption = (post.caption or "")[:100].replace("\n", " ")
                lines.append(f" • ❤️{post.likes} | {post.date_utc.strftime('%m-%d')} | {caption}")
                count += 1

            return ReadResult(
                title=f"{profile.full_name} (@{profile.username}) - Instagram",
                content="\n".join(lines),
                url=url,
                author=f"@{profile.username}",
                platform="instagram",
                extra={
                    "followers": profile.followers,
                    "posts": profile.mediacount,
                },
            )
        except Exception:
            raise  # Let executor timeout handle fallback

    async def _read_jina(self, url: str) -> ReadResult:
        """Fallback: use Jina Reader."""
        import requests

        try:
            resp = requests.get(
                f"https://r.jina.ai/{url}",
                headers={"Accept": "text/markdown"},
                timeout=15,
            )
            resp.raise_for_status()
            text = resp.text
            return ReadResult(
                title=text[:100] if text else url,
                content=text,
                url=url,
                platform="instagram",
            )
        except Exception:
            return ReadResult(
                title="Instagram",
                content=(
                    f"⚠️ 无法读取此 Instagram 内容: {url}\n\n"
                    "提示:\n"
                    "- 确保 URL 正确\n"
                    "- 安装 instaloader: pip install instaloader\n"
                    "- 登录以访问更多内容: instaloader --login YOUR_USERNAME"
                ),
                url=url,
                platform="instagram",
            )

    async def search(self, query: str, config=None, **kwargs) -> List[SearchResult]:
        """Search Instagram via Exa."""
        limit = kwargs.get("limit", 10)
        from agent_reach.channels.exa_search import ExaSearchChannel
        exa = ExaSearchChannel()
        return await exa.search(f"site:instagram.com {query}", config=config, limit=limit)
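
The deleted `_read_instaloader` combined two ideas: run a blocking library call off the event loop with a timeout, and fall back to another reader on any failure. One detail worth noting for anyone reusing the pattern: leaving a `with concurrent.futures.ThreadPoolExecutor()` block calls `shutdown(wait=True)`, so after a timeout the coroutine would still wait for the worker thread to finish. The sketch below shows one way to avoid that, assuming Python 3.9+ for `asyncio.to_thread`. The names `run_with_fallback`, `slow_primary`, and `cheap_fallback` are hypothetical and do not exist in the project.

```python
import asyncio
import time
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def run_with_fallback(
    blocking_call: Callable[[], T],
    fallback: Callable[[], Awaitable[T]],
    timeout: float,
) -> T:
    """Run a blocking call in a worker thread; on timeout or error, await the fallback."""
    try:
        # asyncio.to_thread uses the loop's default executor, so no `with`
        # block is exited here and nothing waits for a stuck worker.
        return await asyncio.wait_for(asyncio.to_thread(blocking_call), timeout=timeout)
    except Exception:
        # asyncio.TimeoutError is a subclass of Exception, so one clause covers both.
        return await fallback()


def slow_primary() -> str:
    """Stand-in for a backend that hangs or retries for a long time."""
    time.sleep(0.3)
    return "primary"


async def cheap_fallback() -> str:
    """Stand-in for a lighter reader such as an HTTP-based fallback."""
    return "fallback"
```

The worker thread is not killed on timeout; it keeps running in the background until the blocking call returns, so the backend itself should still be configured with a bounded retry count, as the removed code did with `max_connection_attempts=1`.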


@ -89,11 +89,6 @@ def main():
p_sx.add_argument("query", nargs="+", help="Search query")
p_sx.add_argument("-n", "--num", type=int, default=10, help="Number of results")
# ── search-instagram ──
p_si = sub.add_parser("search-instagram", help="Search Instagram")
p_si.add_argument("query", nargs="+", help="Search query")
p_si.add_argument("-n", "--num", type=int, default=10, help="Number of results")
# ── search-linkedin ──
p_sl = sub.add_parser("search-linkedin", help="Search LinkedIn")
p_sl.add_argument("query", nargs="+", help="Search query")
@ -122,8 +117,7 @@ def main():
p_conf = sub.add_parser("configure", help="Set a config value or auto-extract from browser")
p_conf.add_argument("key", nargs="?", default=None,
choices=["proxy", "github-token", "groq-key",
"twitter-cookies", "youtube-cookies",
"instagram-cookies"],
"twitter-cookies", "youtube-cookies"],
help="What to configure (omit if using --from-browser)")
p_conf.add_argument("value", nargs="*", help="The value(s) to set")
p_conf.add_argument("--from-browser", metavar="BROWSER",
@ -436,23 +430,6 @@ def _install_system_deps():
except Exception:
print(" ⬜ undici install failed (optional — bird may not work behind proxies)")
    # ── instaloader (for Instagram) ──
    if shutil.which("instaloader"):
        print(" ✅ instaloader already installed")
    else:
        print(" 📥 Installing instaloader...")
        try:
            subprocess.run(
                [sys.executable, "-m", "pip", "install", "instaloader"],
                capture_output=True, text=True, timeout=120,
            )
            if shutil.which("instaloader"):
                print(" ✅ instaloader installed (Instagram reading)")
            else:
                print(" ⬜ instaloader install failed (optional — try: pip install instaloader)")
        except Exception:
            print(" ⬜ instaloader install failed (optional — try: pip install instaloader)")
def _install_system_deps_safe():
"""Safe mode: check what's installed, print instructions for what's missing."""
@ -464,7 +441,6 @@ def _install_system_deps_safe():
("gh", ["gh"], "GitHub CLI", "https://cli.github.com — or: apt install gh / brew install gh"),
("node", ["node", "npm"], "Node.js", "https://nodejs.org — or: apt install nodejs npm"),
("bird", ["bird", "birdx"], "bird CLI (Twitter)", "npm install -g @steipete/bird"),
("instaloader", ["instaloader"], "instaloader (Instagram)", "pip install instaloader"),
]
missing = []
@ -495,7 +471,6 @@ def _install_system_deps_dryrun():
("gh CLI", ["gh"], "apt install gh / brew install gh"),
("Node.js", ["node"], "curl NodeSource setup | bash + apt install nodejs"),
("bird CLI", ["bird", "birdx"], "npm install -g @steipete/bird"),
("instaloader", ["instaloader"], "pip install instaloader"),
]
for label, binaries, method in checks:
@ -764,9 +739,6 @@ def _cmd_configure(args):
config.set("groq_api_key", value)
print(f"✅ Groq key configured!")
elif args.key == "instagram-cookies":
_configure_instagram_cookies(value)
def _cmd_doctor():
from agent_reach.config import Config
@ -787,30 +759,6 @@ def _parse_cookie_header(cookie_str: str) -> dict:
return cookies
def _configure_instagram_cookies(value: str):
    """Save Instagram cookies from Cookie-Editor Header String."""
    from pathlib import Path

    cookies = _parse_cookie_header(value)
    if "sessionid" not in cookies:
        print("❌ Cookie 里缺少 sessionid。")
        print(" 确保你已登录 Instagram,然后用 Cookie-Editor 导出 Header String。")
        print(' 格式: agent-reach configure instagram-cookies "sessionid=xxx; csrftoken=yyy; ..."')
        return

    cookie_dir = Path.home() / ".agent-reach"
    cookie_dir.mkdir(parents=True, exist_ok=True)
    cookie_file = cookie_dir / "instagram-cookies.txt"
    cookie_file.write_text(value.strip())
    cookie_file.chmod(0o600)

    print(f"✅ Instagram cookies 已保存!")
    print(f" sessionid: {cookies['sessionid'][:8]}...")
    if "csrftoken" in cookies:
        print(f" csrftoken: ✅")
    if "ds_user_id" in cookies:
        print(f" ds_user_id: {cookies['ds_user_id']}")
    print(f" 文件: {cookie_file}")
def _cmd_setup():
@ -952,8 +900,6 @@ async def _cmd_search(args):
results = await eyes.search_bilibili(query, limit=num)
elif args.command == "search-xhs":
results = await eyes.search_xhs(query, limit=num)
elif args.command == "search-instagram":
results = await eyes.search_instagram(query, limit=num)
elif args.command == "search-linkedin":
results = await eyes.search_linkedin(query, limit=num)
elif args.command == "search-bosszhipin":
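
The cookie handling removed from this file follows a pattern that the remaining cookie-based channels may still need: split a Cookie-Editor header string into key/value pairs, check for a required key, then write the raw string to an owner-only file. A minimal sketch of that flow follows. `parse_cookie_header` and `save_cookie_header` are stand-ins written for illustration; the project's own `_parse_cookie_header` is not shown in this diff and may differ.

```python
from pathlib import Path
from typing import Dict


def parse_cookie_header(cookie_str: str) -> Dict[str, str]:
    """Split 'a=1; b=2' into {'a': '1', 'b': '2'}, skipping parts without '='."""
    cookies: Dict[str, str] = {}
    for part in cookie_str.split(";"):
        part = part.strip()
        if "=" in part:
            # Split on the first '=' only, since cookie values may contain '='.
            key, value = part.split("=", 1)
            cookies[key.strip()] = value.strip()
    return cookies


def save_cookie_header(cookie_str: str, path: Path, required: str = "sessionid") -> Path:
    """Validate the header string, then store it in a file readable only by the owner."""
    cookies = parse_cookie_header(cookie_str)
    if required not in cookies:
        raise ValueError(f"cookie header is missing {required!r}")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(cookie_str.strip(), encoding="utf-8")
    path.chmod(0o600)  # Owner read/write only; has limited effect on Windows.
    return path
```

Raising on a missing key, instead of printing and returning as the removed function did, is a design choice made for this sketch so that callers can decide how to report the problem.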


@ -101,12 +101,6 @@ class AgentReach:
results = await ch.search(query, config=self.config, limit=limit)
return [r.to_dict() for r in results]
    async def search_instagram(self, query: str, limit: int = 10) -> List[Dict[str, Any]]:
        """Search Instagram via Exa."""
        ch = get_channel("instagram")
        results = await ch.search(query, config=self.config, limit=limit)
        return [r.to_dict() for r in results]
async def search_linkedin(self, query: str, limit: int = 10) -> List[Dict[str, Any]]:
"""Search LinkedIn via MCP or Exa."""
ch = get_channel("linkedin")


@ -2,11 +2,11 @@
name: agent-reach
description: >
Give your AI agent eyes to see the entire internet. Read and search across
Twitter/X, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu, Instagram, LinkedIn,
Twitter/X, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu, LinkedIn,
Boss直聘, RSS, and any web page — all from a single CLI.
Use when: (1) reading content from URLs (tweets, Reddit posts, articles, videos),
(2) searching across platforms (web, Twitter, Reddit, GitHub, YouTube, Bilibili,
XiaoHongShu, Instagram, LinkedIn, Boss直聘),
XiaoHongShu, LinkedIn, Boss直聘),
(3) user asks to configure/enable a platform channel,
(4) checking channel health or updating Agent Reach.
Triggers: "search Twitter/Reddit/YouTube", "read this URL", "find posts about",
@ -31,7 +31,7 @@ pip install https://github.com/Panniantong/agent-reach/archive/main.zip
agent-reach install --env=auto
```
`install` auto-detects your environment and installs core dependencies (Node.js, mcporter, bird CLI, gh CLI, instaloader). Read the output and run `agent-reach doctor` to see what's active.
`install` auto-detects your environment and installs core dependencies (Node.js, mcporter, bird CLI, gh CLI). Read the output and run `agent-reach doctor` to see what's active.
## Commands
@ -40,7 +40,7 @@ agent-reach install --env=auto
agent-reach read <url>
agent-reach read <url> --json # structured output
```
Handles: tweets, Reddit posts, articles, YouTube/Bilibili (transcripts), GitHub repos, Instagram posts, LinkedIn profiles, Boss直聘 jobs, XiaoHongShu notes, RSS feeds, and any web page.
Handles: tweets, Reddit posts, articles, YouTube/Bilibili (transcripts), GitHub repos, LinkedIn profiles, Boss直聘 jobs, XiaoHongShu notes, RSS feeds, and any web page.
### Search
@ -52,7 +52,6 @@ agent-reach search-github "query" # GitHub (--lang <language>)
agent-reach search-youtube "query" # YouTube
agent-reach search-bilibili "query" # Bilibili (B站)
agent-reach search-xhs "query" # XiaoHongShu (小红书)
agent-reach search-instagram "query" # Instagram
agent-reach search-linkedin "query" # LinkedIn
agent-reach search-bosszhipin "query" # Boss直聘
```
@ -71,7 +70,6 @@ agent-reach check-update # check for new versions
```bash
agent-reach configure twitter-cookies "auth_token=xxx; ct0=yyy"
agent-reach configure instagram-cookies "sessionid=xxx; csrftoken=yyy; ..."
agent-reach configure proxy http://user:pass@ip:port
agent-reach configure --from-browser chrome # auto-extract cookies from local browser
```


@ -58,7 +58,6 @@ Copy that to your Agent. A few minutes later, it can read tweets, search Reddit,
| 🌐 **Web** | Read | Zero config | Any URL → clean Markdown ([Jina Reader](https://github.com/jina-ai/reader) ⭐9.8K) |
| 🐦 **Twitter/X** | Read · Search | Zero config / Cookie | Single tweets readable out of the box. Cookie unlocks search, timeline, posting ([bird](https://github.com/steipete/bird)) |
| 📕 **XiaoHongShu** | Read · Search · **Post · Comment · Like** | mcporter | Via [xiaohongshu-mcp](https://github.com/user/xiaohongshu-mcp) internal API, install and go |
| 📷 **Instagram** | Search (via Exa) | Read posts and profiles | Tell your Agent "help me set up Instagram" |
| 💼 **LinkedIn** | Jina Reader (public pages) | Full profiles, companies, job search | Tell your Agent "help me set up LinkedIn" |
| 🏢 **Boss直聘** | Jina Reader (job pages) | Job search, greet recruiters | Tell your Agent "help me set up Boss直聘" |
| 🔍 **Web Search** | Search | Auto-configured | Auto-configured during install, free, no API key ([Exa](https://exa.ai) via [mcporter](https://github.com/nicepkg/mcporter)) |
@ -185,7 +184,6 @@ channels/
├── bilibili.py → yt-dlp ← swap to bilibili-api…
├── reddit.py → JSON API + Exa ← swap to PRAW, Pushshift…
├── xiaohongshu.py → mcporter MCP ← swap to other XHS tools…
├── instagram.py → instaloader ← swap to instagrapi, official API…
├── linkedin.py → linkedin-mcp ← swap to LinkedIn API…
├── bosszhipin.py → mcp-bosszp ← swap to other job tools…
├── rss.py → feedparser ← swap to atoma…
@ -204,7 +202,6 @@ channels/
| GitHub | [gh CLI](https://cli.github.com) | Official tool, full API after auth |
| Read RSS | [feedparser](https://github.com/kurtmckee/feedparser) | Python ecosystem standard, 2.3K stars |
| XiaoHongShu | [xiaohongshu-mcp](https://github.com/user/xiaohongshu-mcp) | Internal API, bypasses anti-bot |
| Instagram | [instaloader](https://github.com/instaloader/instaloader) | 9.8K stars, Python CLI, cookie auth, free |
| LinkedIn | [linkedin-scraper-mcp](https://github.com/stickerdaniel/linkedin-mcp-server) | 900+ stars, MCP server, browser automation |
| Boss直聘 | [mcp-bosszp](https://github.com/mucsbr/mcp-bosszp) | MCP server, job search + recruiter greeting |
@ -253,7 +250,7 @@ Yes! Agent Reach is a standard CLI tool. Any AI coding agent that can execute sh
<details>
<summary><strong>Is Agent Reach free? Any API costs?</strong></summary>
100% free and open source. All backends (bird CLI, yt-dlp, Jina Reader, instaloader, Exa) are free tools that don't require paid API keys. The only optional cost is a residential proxy (~$1/month) if you need Reddit/Bilibili access from a server.
100% free and open source. All backends (bird CLI, yt-dlp, Jina Reader, Exa) are free tools that don't require paid API keys. The only optional cost is a residential proxy (~$1/month) if you need Reddit/Bilibili access from a server.
</details>
<details>
@ -272,7 +269,7 @@ Agent Reach integrates with xiaohongshu-mcp (runs in Docker). After setup, use `
## Credits
[Jina Reader](https://github.com/jina-ai/reader) · [yt-dlp](https://github.com/yt-dlp/yt-dlp) · [bird](https://github.com/steipete/bird) · [Exa](https://exa.ai) · [feedparser](https://github.com/kurtmckee/feedparser) · [instaloader](https://github.com/instaloader/instaloader) · [linkedin-scraper-mcp](https://github.com/stickerdaniel/linkedin-mcp-server) · [mcp-bosszp](https://github.com/mucsbr/mcp-bosszp)
[Jina Reader](https://github.com/jina-ai/reader) · [yt-dlp](https://github.com/yt-dlp/yt-dlp) · [bird](https://github.com/steipete/bird) · [Exa](https://exa.ai) · [feedparser](https://github.com/kurtmckee/feedparser) · [linkedin-scraper-mcp](https://github.com/stickerdaniel/linkedin-mcp-server) · [mcp-bosszp](https://github.com/mucsbr/mcp-bosszp)
## License


@ -80,7 +80,7 @@ Only ask the user when you genuinely need their input (credentials, permissions,
Some channels need credentials only the user can provide. Based on the doctor output, ask for what's missing:
> 🔒 **Security tip:** For platforms that need cookies (Twitter, XiaoHongShu, Instagram), we recommend using a **dedicated/secondary account** rather than your main account. Cookie-based auth grants full account access — using a separate account limits the blast radius if credentials are ever compromised.
> 🔒 **Security tip:** For platforms that need cookies (Twitter, XiaoHongShu), we recommend using a **dedicated/secondary account** rather than your main account. Cookie-based auth grants full account access — using a separate account limits the blast radius if credentials are ever compromised.
**Twitter search & posting (server users):**
> "To unlock Twitter search, I need your Twitter cookies. Install the Cookie-Editor Chrome extension, go to x.com/twitter.com, click the extension → Export → Header String, and paste it to me."
@ -125,20 +125,6 @@ mcporter config add xiaohongshu http://localhost:18060/mcp
> - **本地电脑(有浏览器):** 打开 http://localhost:18060 扫码登录即可。
> - **服务器(无 UI 界面):** 服务器上通常没有浏览器,无法直接扫码。最方便的方式是在自己的电脑上用浏览器登录小红书,然后用 [Cookie-Editor](https://chromewebstore.google.com/detail/cookie-editor/hlkenndednhfkekhgcdicdfddnkalmdm) 插件导出 Cookie(Header String 格式),发给 Agent 即可完成配置。详见 [Cookie 导出指南](cookie-export.md)。
**Instagram (需要 instaloader):**
> "Instagram 需要 instaloader。我来帮你安装。"
```bash
pip install instaloader
```
> **登录方式(解锁私密内容):**
> - **方法 1(推荐):Cookie-Editor 导入:** 在浏览器登录 Instagram → 用 [Cookie-Editor](https://chromewebstore.google.com/detail/cookie-editor/hlkenndednhfkekhgcdicdfddnkalmdm) 导出 Header String → 粘贴:
> ```bash
> agent-reach configure instagram-cookies "sessionid=xxx; csrftoken=yyy; ..."
> ```
> - **方法 2:instaloader 命令行登录:** `instaloader --login YOUR_USERNAME`(需要输密码,有 2FA 的话还要输验证码)
**LinkedIn (可选 — linkedin-scraper-mcp):**
> "LinkedIn 基本内容可通过 Jina Reader 读取。完整功能(Profile 详情、职位搜索)需要 linkedin-scraper-mcp。"
@ -250,6 +236,5 @@ If the user wants a different agent to handle it, let them choose.
| `agent-reach search-youtube "query"` | Search YouTube |
| `agent-reach search-bilibili "query"` | Search Bilibili |
| `agent-reach search-xhs "query"` | Search XiaoHongShu |
| `agent-reach search-instagram "query"` | Search Instagram |
| `agent-reach search-linkedin "query"` | Search LinkedIn |
| `agent-reach search-bosszhipin "query"` | Search Boss直聘 |


@ -68,16 +68,3 @@ bird search "test"
**解决方案:**
- **本地电脑:** 正常使用,一般不会被拦
- **服务器:** 使用 Jina Reader 读取职位页面 + Exa 搜索职位信息作为替代
---
## Instagram: Checkpoint / 安全验证
**症状:** `instaloader --login` 触发 Instagram 安全验证
**原因:** Instagram 检测到从未见过的设备/位置登录。
**解决方案:**
1. 在自己的浏览器登录 Instagram
2. 用 Cookie-Editor 导出 Cookie
3. 配置:`agent-reach configure instagram-cookies "sessionid=xxx; csrftoken=yyy; ..."`


@ -10,7 +10,7 @@ keywords = [
"ai-agent", "llm-tools", "agent-infrastructure", "mcp",
"web-reader", "web-scraper", "search",
"twitter-scraper", "reddit-scraper", "youtube-transcript",
"bilibili", "xiaohongshu", "instagram",
"bilibili", "xiaohongshu",
"ai-search", "cli", "automation",
"claude-code", "cursor", "openai",
"free-api", "no-api-key",