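import requests
from bs4 import BeautifulSoup


# Scrape the contents of the "Three Hundred Tang Poems" (唐诗三百首) page
def scrape_data():
    url = "https://so.gushiwen.cn/gushi/tangshi.aspx"
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36"
    }
    # A timeout keeps the script from hanging if the server does not respond
    response = requests.get(url, headers=headers, timeout=10)
    if response.status_code == 200:
        soup = BeautifulSoup(response.text, "html.parser")
        # NOTE: this argument is cut off in the source ('c' ... 'ontentdiv")');
        # class_="contentdiv" is an assumed reconstruction and may not match
        # the page's actual markup or letter case.
        content_divs = soup.find_all('div', class_="contentdiv")
        poems = []
        for div in content_divs:
            title = div.find('p', class_="cont")
            author = div.find('p', class_="source")
            poem = div.find('div', class_="contson")
            # Keep only entries where all three elements were found
            if title and author and poem:
                poems.append({
                    "title": title.text.strip(),
                    "author": author.text.strip(),
                    "content": poem.text.strip()
                })
        return poems
    else:
        # Request failed; the caller must handle None
        return None


# Print the scraped results
poems = scrape_data()
# scrape_data() returns None on a failed request, so fall back to an empty list
for poem in poems or []:
    print(poem["title"])
    print(poem["author"])
    print(poem["content"])
    print()

[2024-06-24 01:16:00 | GPT-代码助手 | 443-character answer]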
