faiss_learn
- step 1: Install the required Python packages first
pip install scikit-learn
pip install faiss-cpu
pip install jieba
#pip install smart_open
- step 2: Then package the environment as a zip
cd /Users/liuning11/.conda/envs/myfaiss
zip -r myfaiss.zip *
- step 3: Make the Python packages available to pyspark
pyspark --archives /Users/liuning11/.conda/envs/myfaiss/myfaiss.zip --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./myfaiss.zip/bin/python
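Notes:
--archives <path to the environment archive; preferably keep it on the HDFS cluster so it does not have to be uploaded on every run> --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./XX.zip/XX/bin/python
The inner XX is the top-level directory inside the archive. The zip from step 2 is built from inside the environment directory, so it has no such directory and the interpreter path is ./myfaiss.zip/bin/python.
In yarn-cluster mode, it is best to also set the following parameters:
spark.yarn.appMasterEnv.PYSPARK_PYTHON = ./XX.zip/XX/bin/python
spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON = ./XX.zip/XX/bin/python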
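
As a rough sketch of the yarn-cluster case, the two settings can be passed to spark-submit as below. The script name my_job.py is hypothetical, and the interpreter path assumes the zip from step 2 (contents at the archive root, no alias after the archive name). The script only prints the command so it can be checked before running.

```shell
#!/bin/sh
# Dry run: build the spark-submit command and print it instead of executing it.

# Archive built in step 2; an HDFS path can be used here instead.
ARCHIVE=/Users/liuning11/.conda/envs/myfaiss/myfaiss.zip

# Without an alias, YARN unpacks the archive into a directory named after the
# file, so the interpreter path is relative to the container working directory.
PY="./$(basename "$ARCHIVE")/bin/python"

# Hypothetical job script, not part of the original notes.
JOB=my_job.py

CMD="spark-submit --master yarn --deploy-mode cluster \
--archives $ARCHIVE \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=$PY \
--conf spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON=$PY \
$JOB"

# Inspect the command first; run it with: eval "$CMD"
echo "$CMD"
```

If the archive has a top-level directory, as in the ./XX.zip/XX/bin/python form above, PY needs that extra path component.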