Add Harvard test example to README
- Add detailed test results table with script paths
- Include Harvard test example with commands and sample output
- List covered Harvard schools

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
README.md | 51 lines changed
@@ -118,11 +118,52 @@ uv run university-agent generate \

 ## Tested universities

-| University | Status | Notes |
-|------|------|------|
-| Harvard | ✅ | Found 277 links |
-| RWTH Aachen | ✅ | Found 108 links |
-| KAUST | ✅ | Requires Firefox; the site is slow |
+| University | Status | Results | Generated script |
+|------|------|------|-----------|
+| Harvard | ✅ | 277 links (8 programs, 269 faculty, 265 profile pages) | `artifacts/harvard_faculty_scraper.py` |
+| RWTH Aachen | ✅ | 108 links (103 programs, 5 faculty) | `artifacts/rwth_aachen_playwright_scraper.py` |
+| KAUST | ✅ | 9 links (requires Firefox) | `artifacts/kaust_faculty_scraper.py` |
+
+### Harvard test example
+
+**Generate the scraper script:**
+```bash
+uv run python generate_scraper.py --url "https://www.harvard.edu/" --name "Harvard"
+```
+
+**Run the scraper:**
+```bash
+cd artifacts
+uv run python harvard_faculty_scraper.py --max-pages 30 --no-verify
+```
+
+**Output** (`artifacts/university-scraper_results.json`):
+```json
+{
+  "statistics": {
+    "total_links": 277,
+    "program_links": 8,
+    "faculty_links": 269,
+    "profile_pages": 265
+  },
+  "program_links": [
+    {"url": "https://www.harvard.edu/programs/?degree_levels=graduate", "text": "Graduate Programs"},
+    ...
+  ],
+  "faculty_links": [
+    {"url": "https://www.gse.harvard.edu/directory/faculty", "text": "Faculty Directory"},
+    {"url": "https://faculty.harvard.edu", "text": "Harvard Faculty"},
+    ...
+  ]
+}
+```
+
+The crawl covered several Harvard schools:
+- Graduate School of Design (GSD)
+- Graduate School of Education (GSE)
+- Faculty of Arts and Sciences (FAS)
+- Graduate School of Arts and Sciences (GSAS)
+- Harvard Divinity School (HDS)

 ## Troubleshooting

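
For anyone checking the sample output in the README diff above, the following is a minimal sketch of how the results file could be read and sanity-checked. It assumes the JSON layout matches the sample exactly; `summarize_results` is a hypothetical helper, not part of the repository, and the inline `SAMPLE` string is a trimmed copy of the README sample (the real `university-scraper_results.json` would contain the full link lists).

```python
import json

# Trimmed copy of the sample output from the README diff. The real results
# file is assumed to have the same keys but complete link lists.
SAMPLE = """
{
  "statistics": {
    "total_links": 277,
    "program_links": 8,
    "faculty_links": 269,
    "profile_pages": 265
  },
  "program_links": [
    {"url": "https://www.harvard.edu/programs/?degree_levels=graduate", "text": "Graduate Programs"}
  ],
  "faculty_links": [
    {"url": "https://www.gse.harvard.edu/directory/faculty", "text": "Faculty Directory"},
    {"url": "https://faculty.harvard.edu", "text": "Harvard Faculty"}
  ]
}
"""


def summarize_results(data: dict) -> dict:
    """Hypothetical helper: extract the counters from a results dict."""
    stats = data["statistics"]
    return {
        "total": stats["total_links"],
        "programs": stats["program_links"],
        "faculty": stats["faculty_links"],
        "profiles": stats["profile_pages"],
        # In the sample, program and faculty links sum to the total. This is
        # an observation about the sample, not a documented guarantee.
        "counts_add_up": stats["program_links"] + stats["faculty_links"]
        == stats["total_links"],
    }


summary = summarize_results(json.loads(SAMPLE))
print(summary)
```

To run the same check against a real scrape, replace `json.loads(SAMPLE)` with `json.load(open(path))` for the results path the README names.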