Google Summer of Code 2019
Owner: Abhishek Kumar
Mentor: Philippe Ombredanne
# Problem: Python 2.7 will retire in a few months and will no longer be maintained.
# Solution: ScanCode needs to be ported to Python 3, and the whole test suite must pass on both versions of Python. The main advantage of Python 3 over Python 2.x is its greatly improved unicode support. This is also useful for ScanCode, which has users in more than 100 languages, because unicode strings are easy to translate to other languages.
# Objective: To make scancode-toolkit installable on Python 3.6.8 just as it installs on Python 2.7.
- I started in development (editable) mode and then moved to working in virtual environments.
- I worked module by module, following the dependency hierarchy.
For example, all modules depend on commoncode, so it had to be ported first. This gives the following porting order:
- commoncode
- plugincode
- typecode
- extractcode
- textcode
- scancode basics (some tests are integration tests and will have to wait to be ported)
- formattedcode, starting with JSON (some tests are integration tests and will have to wait to be ported)
- cluecode
- licensedcode
- packagedcode (depends on licensedcode)
- summarycode
- fixup the remaining bits and tests
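A porting order like the one above can be derived from a dependency map with a topological sort. The sketch below is illustrative only: the `DEPENDS_ON` map is an assumption that encodes just the two dependencies stated here (every module depends on commoncode, and packagedcode depends on licensedcode), and `porting_order` is a hypothetical helper, not part of scancode-toolkit.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

MODULES = [
    "commoncode", "plugincode", "typecode", "extractcode", "textcode",
    "scancode", "formattedcode", "cluecode", "licensedcode",
    "packagedcode", "summarycode",
]

# Assumed dependency map: module name -> set of modules it depends on.
# Only the dependencies mentioned in this report are encoded.
DEPENDS_ON = {name: {"commoncode"} for name in MODULES if name != "commoncode"}
DEPENDS_ON["commoncode"] = set()
DEPENDS_ON["packagedcode"].add("licensedcode")


def porting_order(depends_on):
    """Return the modules so that each one comes after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


order = porting_order(DEPENDS_ON)
print(order)
```

Several valid orders exist for the same map; the only guarantees are that commoncode comes first and licensedcode comes before packagedcode.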
After porting, I marked these modules as ported with scanpy3, with the help of the conftest plugin (created by @pombredanne). The conftest plugin is the heart of this project; without it, the port would have been very difficult.
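As a rough illustration of the idea, a pytest `conftest.py` could skip every test that is not yet marked as ported when running on Python 3. This is a minimal sketch and not the actual plugin: it assumes `scanpy3` is a pytest marker applied to ported test modules, and `should_skip` is a hypothetical helper.

```python
import sys


def should_skip(marker_names, python_major=sys.version_info[0]):
    """Return True when running on Python 3 and the test is not marked scanpy3."""
    return python_major >= 3 and "scanpy3" not in marker_names


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "scanpy3: test module is ported to Python 3")


def pytest_collection_modifyitems(config, items):
    # Imported here so the helper above can be used without pytest installed.
    import pytest

    skip_unported = pytest.mark.skip(reason="not yet ported to Python 3")
    for item in items:
        names = {marker.name for marker in item.iter_markers()}
        if should_skip(names):
            item.add_marker(skip_unported)
```

Under this assumption, a ported test module would opt in with a module-level `pytestmark = pytest.mark.scanpy3`, so the Python 3 test run grows one module at a time.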
Dependencies were fixed as each module was ported.
Handling paths across operating systems is very difficult, and the issue spans macOS, Windows, and Linux. macOS and Windows handle unicode paths well on both Python 2 and 3, though not completely on macOS Mojave, whose filesystem is APFS. Linux paths are bytes, and os.listdir is broken on Python 2, so Linux paths can only be handled sanely as bytes on Python 2. On Python 3 this appears to be fixed, and unicode paths work on Linux.
For more details, see https://vstinner.github.io/painful-history-python-filesystem-encoding.html and jaraco/path.py#130
We came up with several solutions:
- Use pathlib, which generally handles paths correctly across platforms, with pathlib2 as the backport. This solution fails because pathlib2 does not work as expected with respect to unicode vs. bytes, and os.listdir also does not work properly.
- Use path.py, which handles paths across all platforms, even on macOS Mojave.
- Use `bytes` on Linux with Python 2 and `unicode` everywhere else.
We chose the third solution because it is the most fundamental, simple, and easy to use.
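The chosen approach can be sketched as a small helper that picks the native path type once and converts paths to it. This is a minimal sketch under one stated assumption, consistent with the path discussion above: bytes are needed only on Linux with Python 2, and unicode is used everywhere else. The names `native_path_type` and `to_native_path` are hypothetical and not taken from scancode-toolkit.

```python
import sys

TEXT = type(u"")   # unicode on Python 2, str on Python 3
BYTES = type(b"")  # str on Python 2, bytes on Python 3


def native_path_type(platform=sys.platform, py_major=sys.version_info[0]):
    """Return the path type to use: bytes on Linux with Python 2, else unicode."""
    on_linux = platform.startswith("linux")
    return BYTES if (on_linux and py_major == 2) else TEXT


def to_native_path(path, path_type=None, encoding=None):
    """Convert a bytes or unicode path to the requested (or native) path type."""
    path_type = path_type or native_path_type()
    encoding = encoding or sys.getfilesystemencoding() or "utf-8"
    if isinstance(path, path_type):
        return path
    if path_type is BYTES:
        return path.encode(encoding)
    return path.decode(encoding)
```

Deciding the type in one place keeps the rest of the code free of per-platform checks: callers pass whatever they have through `to_native_path` before touching the filesystem.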
The project was tracked in ticket nexB/scancode-toolkit#295
Project link: Port Scancode to Python 3
My contribution: Commits
Note: Please give your feedback here
We now have liftoff on Python 3: basic scans run without errors on the develop branch. You can check this by running `scancode -clipeu samples/ --json-pp - -n4`.
Finally, I would like to thank my mentor, @pombredanne (Philippe Ombredanne). He helped a lot in completing this project and was very supportive and responsive. I learned a lot from him. With his encouragement and motivation, I improved day by day, built and developed my skills, and completed the project within the GSoC timeline.
See http://nexb.com for more.