Add Harvard test example to README
- Add detailed test results table with script paths
- Include Harvard test example with commands and sample output
- List covered Harvard schools

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
README.md (51 changes)
@@ -118,11 +118,52 @@ uv run university-agent generate \

 ## Tested universities

-| University | Status | Notes |
-|------|------|------|
-| Harvard | ✅ | Found 277 links |
-| RWTH Aachen | ✅ | Found 108 links |
-| KAUST | ✅ | Requires Firefox; the site is slow |
+| University | Status | Results | Generated script |
+|------|------|------|-----------|
+| Harvard | ✅ | 277 links (8 programs, 269 faculty, 265 profile pages) | `artifacts/harvard_faculty_scraper.py` |
+| RWTH Aachen | ✅ | 108 links (103 programs, 5 faculty) | `artifacts/rwth_aachen_playwright_scraper.py` |
+| KAUST | ✅ | 9 links (requires Firefox) | `artifacts/kaust_faculty_scraper.py` |
+
+### Harvard test example
+
+**Generate the scraper script:**
+```bash
+uv run python generate_scraper.py --url "https://www.harvard.edu/" --name "Harvard"
+```
+
+**Run the scraper:**
+```bash
+cd artifacts
+uv run python harvard_faculty_scraper.py --max-pages 30 --no-verify
+```
+
+**Output** (`artifacts/university-scraper_results.json`):
+```json
+{
+  "statistics": {
+    "total_links": 277,
+    "program_links": 8,
+    "faculty_links": 269,
+    "profile_pages": 265
+  },
+  "program_links": [
+    {"url": "https://www.harvard.edu/programs/?degree_levels=graduate", "text": "Graduate Programs"},
+    ...
+  ],
+  "faculty_links": [
+    {"url": "https://www.gse.harvard.edu/directory/faculty", "text": "Faculty Directory"},
+    {"url": "https://faculty.harvard.edu", "text": "Harvard Faculty"},
+    ...
+  ]
+}
+```
+
+The crawl covered several Harvard schools:
+- Graduate School of Design (GSD)
+- Graduate School of Education (GSE)
+- Faculty of Arts and Sciences (FAS)
+- Graduate School of Arts and Sciences (GSAS)
+- Harvard Divinity School (HDS)
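
A rough sketch of how the results file from the Harvard example might be inspected after a run. It assumes the real file is valid JSON with the same keys as the sample output (the `...` entries in the sample are elisions, not literal content). The `summarize` helper is hypothetical and not part of the repository. Grouping faculty links by host name gives a quick view of which school sites the crawl reached.

```python
import json
from collections import Counter
from pathlib import Path
from urllib.parse import urlparse


def summarize(results: dict) -> dict:
    """Summarize a results dict shaped like the README sample (assumed schema)."""
    stats = results.get("statistics", {})
    programs = results.get("program_links", [])
    faculty = results.get("faculty_links", [])
    # Count faculty links per host name, e.g. www.gse.harvard.edu.
    hosts = Counter(urlparse(link["url"]).netloc for link in faculty)
    return {
        "total_links": stats.get("total_links"),
        "listed_program_links": len(programs),
        "listed_faculty_links": len(faculty),
        "faculty_hosts": dict(hosts),
    }


if __name__ == "__main__":
    # Path copied from the README example; adjust it if your run writes elsewhere.
    path = Path("artifacts/university-scraper_results.json")
    if path.exists():
        data = json.loads(path.read_text(encoding="utf-8"))
        print(json.dumps(summarize(data), indent=2))
    else:
        print(f"{path} not found; run the scraper first")
```

Comparing `listed_faculty_links` with the reported `statistics.faculty_links` is one way to spot a truncated or partially written results file.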

 ## Troubleshooting
