google cannot parse "Tallest mountain in the world" #149

Open
fanzhuyifan opened this issue Jul 4, 2021 · 7 comments
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@fanzhuyifan

Description
The Google engine cannot parse the results returned for the query "Tallest mountain in the world".

To Reproduce
Steps to reproduce the behavior:

from search_engine_parser.core.engines.google import Search
searcher = Search()
results = searcher.search("Tallest mountain in the world")

Expected behavior
Correctly parsed results

Screenshots

Traceback (most recent call last):
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 240, in get_results
    search_results = self.parse_result(results, **kwargs)
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 151, in parse_result
    rdict = self.parse_single_result(each, **kwargs)
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/engines/google.py", line 74, in parse_single_result
    title = r_elem.find('div', class_='BNeawe').text
AttributeError: 'NoneType' object has no attribute 'text'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "XXXXX/temp.py", line 4, in <module>
    results = searcher.search("Tallest mountain in the world")
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 270, in search
    return self.get_results(soup, **kwargs)
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 243, in get_results
    raise NoResultsOrTrafficError(
search_engine_parser.core.exceptions.NoResultsOrTrafficError: The returned results could not be parsed. This might be due to site updates or server errors. Drop an issue at https://github.com/bisoncorps/search-engine-parser if this persists

Desktop (please complete the following information):

  • OS: [Linux]
  • Python Version [3.9.5]
  • Search-engine-parser version [0.6.2] (latest)

Additional context
The result that cannot be parsed:

<div class="ZINbbc xpd O9g5cc uUPGi"><div><div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&amp;sa=U&amp;ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQCw&amp;usg=AOvVaw1pflhmM0gRBSRK5KlKcTT6"><span></span></a></div><div class="CgE3Ac I9mEQ"><table class="LnMnt"><thead><tr><td class="IxZjcf sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe uEec3 AP7Wnd">Rank</div></div></td><td class="IxZjcf sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe uEec3 AP7Wnd">Mountain</div></div></td><td class="IxZjcf sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe uEec3 AP7Wnd">Country</div></div></td></tr></thead><tbody><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">1.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Everest</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Nepal/Tibet</div></div></td></tr><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">2.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">K2 (Mount Godwin Austen)</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Pakistan/China</div></div></td></tr><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">3.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Kangchenjunga</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">India/Nepal</div></div></td></tr><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">4.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Lhotse</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd 
AP7Wnd">Nepal/Tibet</div></div></td></tr></tbody></table></div><div class="hwc"><div class="Q0HXG"></div><div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&amp;sa=U&amp;ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQDA&amp;usg=AOvVaw39wAm-G8SzoUzVMu-r2DX6"><div><span><div class="BNeawe vvjwJb AP7Wnd">The Top Ten: The World's Highest Mountains - Infoplease</div></span><span><div class="BNeawe UPmit AP7Wnd">www.infoplease.com &gt; world &gt; geography &gt; top-ten-worlds-highest-mount...</div></span></div></a></div></div></div></div>

The value produced for this result by the line at https://github.com/bisoncorps/search-engine-parser/blob/0418867b3529980d5a4eb71899dec37092fe7df1/search_engine_parser/core/engines/google.py#L66:

[<div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&amp;sa=U&amp;ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQCw&amp;usg=AOvVaw1pflhmM0gRBSRK5KlKcTT6"><span></span></a></div>,
 <div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&amp;sa=U&amp;ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQDA&amp;usg=AOvVaw39wAm-G8SzoUzVMu-r2DX6"><div><span><div class="BNeawe vvjwJb AP7Wnd">The Top Ten: The World's Highest Mountains - Infoplease</div></span><span><div class="BNeawe UPmit AP7Wnd">www.infoplease.com &gt; world &gt; geography &gt; top-ten-worlds-highest-mount...</div></span></div></a></div>]

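The first div contains only an empty <span> and no title, so the title lookup shown in the traceback (find('div', class_='BNeawe')) appears to return None for it, and accessing .text then raises the AttributeError.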
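
One possible workaround, sketched under assumptions: instead of reading the title from the first `kCrYT` div, scan all of them and take the first one that actually contains a `BNeawe` div. The helper name `first_title` and the trimmed HTML below are illustrative only; they are not part of search-engine-parser, and this is not necessarily how the library resolves the problem.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the failing result block from this issue: the first
# "kCrYT" div holds only an empty <span>, the second holds the title.
SNIPPET = """
<div class="ZINbbc xpd O9g5cc uUPGi">
  <div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains"><span></span></a></div>
  <div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains">
    <div class="BNeawe vvjwJb AP7Wnd">The Top Ten: The World's Highest Mountains - Infoplease</div>
  </a></div>
</div>
"""


def first_title(result):
    """Hypothetical helper: return the text of the first 'BNeawe' div found
    inside any 'kCrYT' div, or None if no such div exists."""
    for block in result.find_all("div", class_="kCrYT"):
        title_div = block.find("div", class_="BNeawe")
        if title_div is not None:
            return title_div.get_text(strip=True)
    return None


soup = BeautifulSoup(SNIPPET, "html.parser")
result = soup.find("div", class_="ZINbbc")
blocks = result.find_all("div", class_="kCrYT")

# The lookup from the traceback finds nothing in the first block, which
# is why calling .text on its return value fails.
print(blocks[0].find("div", class_="BNeawe"))

# Scanning every block skips the empty one and reaches the title.
print(first_title(result))
```

Returning None when no title is found leaves it to the caller to decide whether to skip that result or raise, rather than failing on a single unusual block.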

@fanzhuyifan fanzhuyifan added the bug Something isn't working label Jul 4, 2021
@deven96 deven96 added the help wanted Extra attention is needed label Sep 18, 2021
@MeNsaaH
Member

MeNsaaH commented Sep 20, 2021

Are you running this on Heroku?

@KennBro

KennBro commented Dec 20, 2021

I have version 0.6.6 installed and I get the same error. I am not running on Heroku.

@GuyKh

GuyKh commented Feb 8, 2022

Same error on various search queries

@MeNsaaH
Member

MeNsaaH commented Feb 8, 2022

Is this on Heroku?

@icc-sundar

I am getting the same error on various search queries. I also tried running this locally and not on Heroku, but it is still not working.

@GigglePocket

I am also receiving the same exceptions for all but a few of the simplest single-word search terms.

Specs

  • OS: Windows 10 Pro
    • Version: 21H2
    • Build: 19044.1645
  • Parser Version: 0.6.6

Other

  • Not running on Heroku

@bentsi
Contributor

bentsi commented Jul 13, 2022

#168 should fix it

8 participants