Commit 90f45e9

Scraping all the links in a web page.
1 parent f425068 commit 90f45e9

1 file changed: 23 additions, 0 deletions

links_all.py

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+import requests
+import bs4
+link_list=[]
+res = requests.get('https://www.apple.com/')
+soup = bs4.BeautifulSoup(res.text, 'lxml')
+
+for link in soup.find_all('a', href=True):
+    '''link_list.append(link)
+    # if link[0]=='#':
+        # link_list.remove(link)
+    if link[0]=='/':
+        link_list.remove(link)
+
+    link_list
+    '''
+    # if link['href']!="#":
+        # print(link["href"])
+    if not link['href'] or link['href'].startswith('#'):  # empty href or in-page anchor; href[0] would raise IndexError on ''
+        pass
+    elif link['href'].startswith('/'):  # site-relative path, skipped
+        pass
+    else:
+        print(link['href'])
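
The loop above prints only absolute links and drops site-relative ones, while the quoted-out block suggests the links were meant to end up in a list. A minimal sketch of that variant follows, assuming relative paths should be kept and resolved against the page URL. The helper name `collect_links` and the sample hrefs are hypothetical and not part of this commit; `urljoin` comes from the standard library.

```python
from urllib.parse import urljoin


def collect_links(hrefs, base_url):
    """Return absolute URLs for the given href values, in order, without duplicates."""
    seen = set()
    links = []
    for href in hrefs:
        href = href.strip()
        # Skip empty values and in-page anchors such as '#top'.
        if not href or href.startswith('#'):
            continue
        # Resolve site-relative paths like '/mac/' against the page URL.
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


# Hypothetical sample input, so the sketch runs without a network request.
sample = ['#top', '/mac/', 'https://support.apple.com/', '', '/mac/']
print(collect_links(sample, 'https://www.apple.com/'))
```

With the parsed page from links_all.py, the same helper could be fed as `collect_links((a['href'] for a in soup.find_all('a', href=True)), 'https://www.apple.com/')`.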