#93 Change xpath based on 'changed tvtropes' #94

miaekim · 2015-03-07T01:16:09Z

I changed how the HTML is parsed, because tvtropes.org has changed a lot.

dahlia · 2015-03-10T04:51:16Z

cliche/services/tvtropes/crawler.py

-    process_redirections(session, url, final_url, namespace, name)
-    return True, tree, namespace, name, final_url
+    else:
+        (namespace, name) = map(lambda x: x.strip(), name.split(':'))


It seems pretty complex.

This code does exactly the same thing as the code below.

namespace = name.split(':')[0].strip()
name = name.split(':')[1].strip()

or

(namespace, name) = name.split(':')
namespace = namespace.strip()
name = name.strip()

or

(namespace, name) = name.replace(r'\s', '').split(':')

Do you think that it is better?
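A minimal sketch for checking whether these forms really behave the same. The function names and the sample string are hypothetical and not taken from the crawler. One thing it suggests: `str.replace` treats `\s` as two literal characters rather than as a regular expression, so the `replace` form leaves the whitespace in place.

```python
def split_with_map(title):
    # The form used in the diff: strip each piece produced by split(':').
    namespace, name = map(lambda x: x.strip(), title.split(':'))
    return namespace, name


def split_then_strip(title):
    # The step-by-step form: split once, then strip each part.
    namespace, name = title.split(':')
    return namespace.strip(), name.strip()


def split_after_replace(title):
    # str.replace() looks for the literal characters backslash + "s";
    # it does not interpret the argument as a regular expression.
    namespace, name = title.replace(r'\s', '').split(':')
    return namespace, name


sample = ' Main : Home Page '  # hypothetical title text
print(split_with_map(sample))       # ('Main', 'Home Page')
print(split_then_strip(sample))     # ('Main', 'Home Page')
print(split_after_replace(sample))  # (' Main ', ' Home Page ')
```

All three forms unpack into exactly two names, so a title with no colon, or with more than one, would raise `ValueError`.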

I could add a variable at line 137:

tvtropes_title = tree.xpath(name_path)[0].text.strip()

Then I can write the code like this:

namespace = name.split(':')[0].strip()
name = name.split(':')[1].strip()

How about this?

namespace, name = re.match(r'^\s*([^:]+?)\s*:\s*(.+?)\s*$', name).groups()

I think it is good. But before applying it, you should compile the pattern first.

name_pattern = re.compile(r'^\s*([^:]+?)\s*:\s*(.+?)\s*$')

Both ideas are workable. I will try them.
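A sketch of how the compiled pattern might be used, assuming the title is a plain string of the form `Namespace: Name`. The constant name `NAME_PATTERN` and the helper `parse_title` are hypothetical, and returning `None` on a non-matching title is an assumption, not what the crawler does.

```python
import re

# Same pattern as in the comment above, compiled once at import time.
NAME_PATTERN = re.compile(r'^\s*([^:]+?)\s*:\s*(.+?)\s*$')


def parse_title(title):
    """Return (namespace, name), or None when there is no 'ns: name' form."""
    match = NAME_PATTERN.match(title)
    if match is None:
        # No colon-separated pair found; let the caller decide what to do.
        return None
    return match.groups()


print(parse_title(' Main : Home Page '))  # ('Main', 'Home Page')
print(parse_title('Main: A: B'))          # ('Main', 'A: B')
print(parse_title('Home Page'))           # None
```

Unlike the `split(':')` forms, this splits only at the first colon, so a name that itself contains a colon stays intact.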

miaekim · 2015-04-22T00:58:03Z

If you merge this PR, issue #93 will be resolved.

coveralls · 2015-04-22T01:17:10Z

Coverage decreased (-0.63%) to 72.96% when pulling b064a84 on miaekim:dummies into 5452cee on clicheio:master.

coveralls · 2015-04-22T01:56:16Z

Coverage decreased (-0.63%) to 72.96% when pulling b064a84 on miaekim:dummies into 5452cee on clicheio:master.

coveralls · 2015-04-22T01:56:17Z

Coverage decreased (-0.63%) to 72.96% when pulling b064a84 on miaekim:dummies into 5452cee on clicheio:master.

coveralls · 2015-04-22T11:31:20Z

Coverage decreased (-0.63%) to 72.96% when pulling ba291d2 on miaekim:dummies into 5452cee on clicheio:master.

dahlia · 2015-04-26T05:06:11Z

It should be reviewed by @tkiapril.

tkiapril · 2015-05-09T05:31:01Z

cliche/services/tvtropes/crawler.py

@@ -133,23 +133,20 @@ def fetch_link(url, session, *, log_prefix=''):
        return False, None, None, None, final_url
    tree = document_fromstring(r.text)
    try:
-        namespace = tree.xpath('//div[@class="pagetitle"]')[0] \
-            .text.strip()[:-1]
+        name = (tree.find_class('article_title')[0]).text_content()


Does this work? I inspected tvtropes, and the structure seems to look like this:

<div class="pagetitle">
  <div class="article_title"><h1><span>Home Page</span></h1></div>
</div>
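One way to check this against the structure above without touching the network. This sketch uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml, on the assumption that joining `itertext()` matches what lxml's `text_content()` returns (the text of the element and all its descendants). The variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The structure quoted above, as a well-formed fragment.
snippet = (
    '<div class="pagetitle">'
    '<div class="article_title"><h1><span>Home Page</span></h1></div>'
    '</div>'
)

root = ET.fromstring(snippet)
title_div = root.find(".//div[@class='article_title']")

# .text only covers text directly inside the element, before its first child.
direct_text = title_div.text
# Joining itertext() gathers the text of every descendant as well.
full_text = ''.join(title_div.itertext())

print(repr(direct_text))  # None
print(repr(full_text))    # 'Home Page'
```

With this nesting, `.text` on the `article_title` div is empty because the text sits inside the inner `span`, so a call that collects descendant text is needed to get the title.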

tkiapril · 2015-05-09T05:41:49Z

Sorry for the late review; I was out of service for a few days after my trip.

coveralls · 2015-05-09T14:29:23Z

Coverage decreased (-0.63%) to 72.98% when pulling 33389f6 on miaekim:dummies into fab6743 on clicheio:master.

dahlia reviewed Mar 10, 2015

miaekim force-pushed the dummies branch from f6f5f4c to 4e6747f Compare April 22, 2015 00:30

miaekim added 5 commits April 27, 2015 19:13

clicheio#93 Change xpath based on 'changed tvtropes'

5e3dc92

adjust new tvtropes.org

167d7c4

test tvtropes crawler fetch_link

b5284a8

pep8 and fix raven report

5a3b94b

remove unused args

2fa821a

tkiapril reviewed May 9, 2015

change log and remove unused parentheses

33389f6

miaekim force-pushed the dummies branch from ba291d2 to 33389f6 Compare May 9, 2015 14:19

tkiapril added a commit that referenced this pull request May 9, 2015

Merge pull request #94 from miaekim/dummies

dcfa34b

#93 Change xpath based on 'changed tvtropes'

tkiapril merged commit dcfa34b into clicheio:master May 9, 2015

miaekim commented Mar 7, 2015

dahlia Mar 10, 2015

miaekim Mar 10, 2015

miaekim Mar 10, 2015

dahlia Mar 10, 2015

item4 Mar 15, 2015

miaekim Mar 16, 2015

miaekim commented Apr 22, 2015

coveralls commented Apr 22, 2015

coveralls commented Apr 22, 2015

coveralls commented Apr 22, 2015

coveralls commented Apr 22, 2015

dahlia commented Apr 26, 2015

tkiapril May 9, 2015

miaekim May 9, 2015

tkiapril commented May 9, 2015

coveralls commented May 9, 2015