<v0.2.5> create a new version.
Tishacy committed Aug 2, 2019
1 parent 2817de9 commit 64288b0
Showing 12 changed files with 276 additions and 159 deletions.
84 changes: 52 additions & 32 deletions README.md
@@ -9,67 +9,83 @@ Download pdfs from Scihub via DOI.

## Install
```bash
pip3 install scidownl
$ pip3 install -U scidownl
```

## Usage
### Command line
```bash
$ scidownl -h
usage: Command line tool to download pdf via DOI from Scihub.
[-h] [-D DOI] [-o OUTPUT] [-u]
[-h] [-c CHOOSE] [-D DOI] [-o OUTPUT] [-u] [-l]

optional arguments:
-h, --help show this help message and exit
-c CHOOSE, --choose CHOOSE
choose scihub url by index
-D DOI, --DOI DOI the DOI number of the paper
-o OUTPUT, --output OUTPUT
directory to download the pdf
-u, --update update available Scihub links
-l, --list list current saved sichub urls.
-l, --list list current saved sichub urls
```
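The option list above corresponds to an `argparse` parser. As a minimal sketch that mirrors the flags shown in the help text (the `build_parser` helper name is an assumption for illustration, not something `scidownl` exports), parsing might look like this:

```python
import argparse


def build_parser():
    """Build a parser with the same flags as the help text above (illustrative helper)."""
    parser = argparse.ArgumentParser("Command line tool to download pdf via DOI from Scihub.")
    parser.add_argument('-c', '--choose', help="choose scihub url by index")
    parser.add_argument('-D', '--DOI', help="the DOI number of the paper")
    parser.add_argument('-o', '--output', help="directory to download the pdf")
    parser.add_argument('-u', '--update', action='store_true', help="update available Scihub links")
    parser.add_argument('-l', '--list', action='store_true', help="list current saved sichub urls")
    return parser


# -c takes a value and arrives as a string; -u and -l are boolean switches.
args = build_parser().parse_args(['-c', '5'])
print(args.choose, args.update, args.list)
```

Because the value of `-c` is parsed as a string, it has to be converted with `int()` before it can be used as a list index.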
#### Examples
```bash
# download to the current directory
$ scidownl -D 10.1021/ol9910114
$ scidownl -D 10.1021/ol9910114 -o .

# download to the specified directory
$ scidownl -D 10.1021/ol9910114 -o paper

# update available links of Scihub
# Update available links of Scihub
$ scidownl -u
[INFO] Updating links ...
[INFO] http://sci-hub.ren
[INFO] https://sci-hub.ren
[INFO] http://sci-hub.tw
[INFO] https://sci-hub.run
[INFO] http://sci-hub.ren
[INFO] http://sci-hub.red
[INFO] http://sci-hub.se
[INFO] https://sci-hub.tw
[INFO] https://sci-hub.se
[INFO] http://sci-hub.tw

# If a 'PermissionError' is raised when updating, rerun with sudo.
$ sudo scidownl -u
# Choose scihub url by the index.
$ scidownl -c 5
Current scihub url: http://sci-hub.tw

# list available links of Scihub
# List available links of Scihub. The asterisk marks the current scihub url (index 5).
$ scidownl -l
[0] http://sci-hub.ren
[1] https://sci-hub.ren
[2] http://sci-hub.tw
[3] https://sci-hub.run
[4] http://sci-hub.se
[5] https://sci-hub.tw
[6] https://sci-hub.se
[0] https://sci-hub.ren
[1] http://sci-hub.ren
[2] http://sci-hub.red
[3] http://sci-hub.se
[4] https://sci-hub.se
* [5] http://sci-hub.tw

# Download to the current directory
$ scidownl -D 10.1021/ol9910114
$ scidownl -D 10.1021/ol9910114 -o .

# Download to the specified directory, e.g. '-o paper' downloads to the 'paper' directory.
$ scidownl -D 10.1021/ol9910114 -o paper

# If a 'PermissionError' is raised, rerun the command with sudo, e.g.:
$ sudo scidownl -u
```

### Module
Download a paper via DOI.

If you have a list of DOIs, the recommended approach is to call `scidownl` from a Python script to download all of the papers.

Download a single paper via DOI.
```python
from scidownl.scihub import *

DOI = "10.1021/ol9910114"
out = 'paper'
sci = SciHub(DOI, out)
sci.download()
sci = SciHub(DOI, out).download(choose_scihub_url_index=3)
```

Download a list of DOIs with a simple for loop.
```python
from scidownl.scihub import *

DOIs = [...]  # fill in your DOI strings
out = 'paper'
for doi in DOIs:
    SciHub(doi, out).download(choose_scihub_url_index=3)
```
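When looping over many DOIs, a single failed download would otherwise stop the whole run. The following is a rough sketch rather than part of `scidownl`: `download_all`, `scihub_download` and the `download_one` parameter are hypothetical names, the default callback simply wraps the `SciHub(doi, out).download(...)` call shown above, and it assumes that failures surface as Python exceptions.

```python
def scihub_download(doi, out, url_index=3):
    """Default callback: wraps the SciHub call from the README example."""
    # Imported lazily so download_all can be used with another callback
    # even when scidownl is not installed.
    from scidownl.scihub import SciHub
    SciHub(doi, out).download(choose_scihub_url_index=url_index)


def download_all(dois, out, download_one=scihub_download):
    """Try every DOI and return a list of (doi, error) pairs for the failures."""
    failed = []
    for doi in dois:
        try:
            download_one(doi, out)
        except Exception as exc:  # keep going; failures are reported at the end
            failed.append((doi, exc))
    return failed
```

Passing the download step in as a callback keeps the loop easy to try out with a stand-in function before pointing it at a real list of DOIs.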

Update available Scihub links.
@@ -94,11 +110,15 @@ update_link(mod='b')
- Add new source website.
- Add `-l/--list` argument in command line tool.
- v0.2.3:
- Fix bugs of empty filename and wrong scihub urls.
- Fix bugs in the brute-force method of updating scihub urls.
- Fix bugs of empty filename and wrong scihub urls.
- Fix bugs in the brute-force method of updating scihub urls.
- V0.2.4:
- Fix #2.
- Fix bugs of error: file name too long.
- Fix #2.
- Fix bugs of error: file name too long.
- V0.2.5:
- Reconstruct code.
- Fix 'no content-length' error.
- Add `-c/--choose` argument for manually choosing scihub url used.

## LICENSE

1 change: 1 addition & 0 deletions build/lib/scidownl/cur_scihub_index.txt
@@ -0,0 +1 @@
+5
13 changes: 6 additions & 7 deletions build/lib/scidownl/link.txt
@@ -1,7 +1,6 @@
-https://sci-hub.ren
-http://sci-hub.ren
-https://sci-hub.se
-https://sci-hub.run
-http://sci-hub.se
-http://sci-hub.tw
-https://sci-hub.tw
+https://sci-hub.ren
+http://sci-hub.ren
+http://sci-hub.red
+http://sci-hub.se
+https://sci-hub.se
+http://sci-hub.tw
17 changes: 14 additions & 3 deletions build/lib/scidownl/scidownl.py
@@ -4,32 +4,43 @@
 from .scihub import *
 from .update_link import *

+
 def main():
     """Command line tool to download pdfs via DOI from Scihub.
     """
     parser = argparse.ArgumentParser("Command line tool to download pdf via DOI from Scihub.")
     # parser.add_argument('DOI', help="the DOI number of the paper")
+    parser.add_argument('-c', '--choose', help="choose scihub url by index")
     parser.add_argument('-D', '--DOI', help="the DOI number of the paper")
     parser.add_argument('-o', '--output', help="directory to download the pdf")
     parser.add_argument('-u', '--update', action='store_true', help="update available Scihub links")
-    parser.add_argument('-l', '--list', action='store_true', help="list current saved sichub urls.")
+    parser.add_argument('-l', '--list', action='store_true', help="list current saved sichub urls")
     args = parser.parse_args()

     if args.DOI:
+        SCIHUB_URL_INDEX = int(open(get_resource_path('cur_scihub_index.txt'), 'r').read())
         if not args.output:
             sci = SciHub(args.DOI)
         else:
             sci = SciHub(args.DOI, args.output)
-        sci.download()
+        sci.download(choose_scihub_url_index=SCIHUB_URL_INDEX)
     elif args.update:
         update_link()
     elif args.list:
         link_file_path = get_resource_path('link.txt')
+        cur_scihub_index = int(open(get_resource_path('cur_scihub_index.txt'), 'r').read())
         if not os.path.isfile(link_file_path):
             open(link_file_path, 'w')
         with open(link_file_path, 'r') as link_file:
             for i, link in enumerate(link_file.readlines()):
-                print('[{0}] {1}'.format(i, link[:-1]))
+                if i == cur_scihub_index:
+                    print('* [{0}] {1}'.format(i, link[:-1]))
+                else:
+                    print(' [{0}] {1}'.format(i, link[:-1]))
+    elif args.choose:
+        open(get_resource_path('cur_scihub_index.txt'), 'w').write(args.choose)
+        cur_scihub_url = open(get_resource_path('link.txt'), 'r').readlines()[int(args.choose)].replace('\n', '')
+        print("Current scihub url: %s" %(cur_scihub_url))
     else:
         print("Command line tool to download pdfs via DOI from Scihub.")
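The `--choose` and `--list` branches above amount to storing an integer index in a text file and marking the matching line when the links are printed. A self-contained sketch of that idea follows. The function names `save_index`, `load_index` and `format_links` are illustrative assumptions, not names from the package, and the sketch adds a bounds check and `with` blocks that the code above does not have.

```python
import os
import tempfile


def save_index(path, index, links):
    """Persist the chosen index after checking that it is in range."""
    index = int(index)
    if not 0 <= index < len(links):
        raise IndexError("index %d out of range (0-%d)" % (index, len(links) - 1))
    with open(path, 'w') as f:  # 'with' closes the file even if the write fails
        f.write(str(index))
    return links[index]


def load_index(path):
    """Read the stored index back as an int."""
    with open(path, 'r') as f:
        return int(f.read())


def format_links(links, current):
    """Prefix the current link with '*' and every other link with a space."""
    return ['{0} [{1}] {2}'.format('*' if i == current else ' ', i, link)
            for i, link in enumerate(links)]


links = ['https://sci-hub.ren', 'http://sci-hub.ren', 'http://sci-hub.tw']
index_file = os.path.join(tempfile.mkdtemp(), 'cur_scihub_index.txt')
print("Current scihub url: %s" % save_index(index_file, '2', links))
print('\n'.join(format_links(links, load_index(index_file))))
```

Validating the index before writing it means a mistyped `-c` value fails immediately instead of leaving an unusable index in the file for the next download.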
