Cannot run a Spark job with python #39
Comments
@xiayank Can you confirm whether the path below is no longer valid?
|
Yes, my version is indeed not that one. Solved. |
My version is py4j-0.10.4-src.zip. After changing the version in the path, it runs. Thanks, TA! |
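The fix above amounts to pointing Python at the py4j archive that actually ships with the local Spark install. A minimal sketch of one way to avoid hard-coding the version, assuming the archive sits under `<SPARK_HOME>/python/lib/`; the helper name `add_pyspark_to_path` is hypothetical and not part of the course code:

```python
import glob
import os
import sys


def add_pyspark_to_path(spark_home):
    """Put Spark's Python sources and its bundled py4j archive on sys.path.

    Assumes the layout <spark_home>/python/lib/py4j-*-src.zip.
    Returns the path of the py4j archive that was added.
    """
    lib_dir = os.path.join(spark_home, "python", "lib")
    # Match any py4j version instead of hard-coding one such as 0.10.4.
    matches = sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip")))
    if not matches:
        raise FileNotFoundError("no py4j-*-src.zip found in " + lib_dir)
    # If several archives are present, the last one in sorted order is used.
    py4j_zip = matches[-1]
    for path in (os.path.join(spark_home, "python"), py4j_zip):
        if path not in sys.path:
            sys.path.insert(0, path)
    return py4j_zip
```

Calling `add_pyspark_to_path(os.environ["SPARK_HOME"])` at the top of a script, before `import pyspark`, should let plain `python script.py` find py4j, provided `SPARK_HOME` is set.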
@hackjutsu Installing nltk with pip gave me this output:
Collecting six (from nltk)
Downloading six-1.10.0-py2.py3-none-any.whl
Installing collected packages: six, nltk
Found existing installation: six 1.4.1
DEPRECATION: Uninstalling a distutils installed project (six) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
Uninstalling six-1.4.1:
Exception:
Traceback (most recent call last):
File "/Library/Python/2.7/site-packages/pip-9.0.1-py2.7.egg/pip/basecommand.py", line 215, in main
status = self.run(options, args)
File "/Library/Python/2.7/site-packages/pip-9.0.1-py2.7.egg/pip/commands/install.py", line 342, in run
prefix=options.prefix_path,
File "/Library/Python/2.7/site-packages/pip-9.0.1-py2.7.egg/pip/req/req_set.py", line 778, in install
requirement.uninstall(auto_confirm=True)
File "/Library/Python/2.7/site-packages/pip-9.0.1-py2.7.egg/pip/req/req_install.py", line 754, in uninstall
paths_to_remove.remove(auto_confirm)
File "/Library/Python/2.7/site-packages/pip-9.0.1-py2.7.egg/pip/req/req_uninstall.py", line 115, in remove
renames(path, new_path)
File "/Library/Python/2.7/site-packages/pip-9.0.1-py2.7.egg/pip/utils/__init__.py", line 267, in renames
shutil.move(old, new)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 302, in move
copy2(src, real_dst)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 131, in copy2
copystat(src, dst)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 103, in copystat
os.chflags(dst, st.st_flags)
OSError: [Errno 1] Operation not permitted: '/tmp/pip-xXGrka-uninstall/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/six-1.4.1-py2.7.egg-info'
I tried to use |
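I recommend running Python inside a virtualenv. The package being installed conflicts with a Python package that ships with macOS. @xiayank Could you share the code you are testing with? Please simplify it so that others can quickly reproduce the problem you ran into. |
@xiayank I'll cover how to use a Python virtual environment at Thursday's CodeLab. |
@hackjutsu OK, thanks, TA. I'll look into it on my own first. |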
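Before installing packages, it can help to confirm that the active interpreter belongs to a virtual environment and not to the macOS system Python. A minimal sketch using only the standard library; the function names `in_virtualenv` and `describe_interpreter` are hypothetical:

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # The stdlib venv module (Python 3.3+) keeps the base install in
    # sys.base_prefix; the older virtualenv tool sets sys.real_prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


def describe_interpreter():
    """Summarize which kind of interpreter pip installs would affect."""
    kind = "virtualenv" if in_virtualenv() else "system/global"
    return "{} interpreter at {}".format(kind, sys.executable)
```

If `describe_interpreter()` reports a system/global interpreter, creating an environment first (for example with `python3 -m venv ENV` followed by `source ENV/bin/activate`) keeps pip away from the packages under `/System/Library`.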
@hackjutsu
TypeError Traceback (most recent call last)
/Users/NIC/Documents/504_BankEnd/DemoCode/week7_codelab1/generate_word2vec_training_data.py in <module>()
30 title = entry["title"].lower().encode('utf-8')
31 query = entry["query"].lower().encode('utf-8')
---> 32 query_tokens = cleanData(query)
33
34
/Users/NIC/Documents/504_BankEnd/DemoCode/week7_codelab1/generate_word2vec_training_data.py in cleanData(input)
15 def cleanData(input) :
16 #remove stop words
---> 17 list_of_tokens = [i.lower() for i in wordpunct_tokenize(input) if i.lower() not in stop_words ]
18 return list_of_tokens
19
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nltk/tokenize/regexp.py in tokenize(self, text)
127 # If our regexp matches tokens, use re.findall:
128 else:
--> 129 return self._regexp.findall(text)
130
131 def span_tokenize(self, text):
TypeError: cannot use a string pattern on a bytes-like object
|
@hackjutsu
(ENV) NIC@Yan-Mac ~/Documents/504_BankEnd/DemoCode/week7_codelab1 spark-submit --master "local[4]" generate_word2vec_training_data.py ads_0502.txt traning_data_0502.txt
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/IPython/core/interactiveshell.py:706: UserWarning: Attempting to work in a virtualenv. If you encounter problems, please install IPython inside the virtualenv.
warn("Attempting to work in a virtualenv. If you encounter problems, please "
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/Users/NIC/Documents/504_BankEnd/DemoCode/week7_codelab1/generate_word2vec_training_data.py in <module>()
30 title = entry["title"].lower().encode('utf-8')
31 query = entry["query"].lower().encode('utf-8')
---> 32 query_tokens = cleanData(query)
33
34
/Users/NIC/Documents/504_BankEnd/DemoCode/week7_codelab1/generate_word2vec_training_data.py in cleanData(input)
15 def cleanData(input) :
16 #remove stop words
---> 17 list_of_tokens = [i.lower() for i in wordpunct_tokenize(input) if i.lower() not in stop_words ]
18 return list_of_tokens
19
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nltk/tokenize/regexp.py in tokenize(self, text)
127 # If our regexp matches tokens, use re.findall:
128 else:
--> 129 return self._regexp.findall(text)
130
131 def span_tokenize(self, text):
TypeError: cannot use a string pattern on a bytes-like object |
Because -- Update -- |
I can run the Spark job with spark-submit, but running it directly with python fails with:
ModuleNotFoundError: No module named 'py4j'
Here is the log:
Here are my environment variables.