[Sankaku] API URL change #7155

taskhawk · 2025-03-11T15:08:16Z

Looks like Sankaku changed their API URL.

I started getting:

[sankaku][error] Unable to download data:  JSONDecodeError: Expecting value: line 1 column 1 (char 0)

And checking in the browser showed a 403 Forbidden message. Their main site (https://www.sankakucomplex.com) was working fine so I checked what they were using and replaced the existing API URL in the sankaku.py extractor:

https://capi-v2.sankakucomplex.com

for the one I found:

https://sankakuapi.com/v2

and at least for pagination and posts it seems to be working fine again.
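
For context on the error message: a JSONDecodeError at "line 1 column 1 (char 0)" is what Python's json module raises when the body is empty or is not JSON at all, which fits a 403 error page coming back instead of API data. A minimal sketch, independent of gallery-dl's actual code (the parse_api_body helper and the sample HTML body are made up for illustration):

```python
import json


def parse_api_body(status: int, body: str):
    """Parse an API response body, reporting the HTTP status when it is not JSON."""
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        # An error page is HTML (or empty), so the parser fails on the first character.
        raise ValueError(f"HTTP {status}: response is not JSON ({exc})") from exc


try:
    parse_api_body(403, "<html><body>403 Forbidden</body></html>")
except ValueError as err:
    print(err)
```

Checking the status code before decoding makes this kind of failure easier to tell apart from a genuinely malformed API reply.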

taskhawk · 2025-03-11T16:00:10Z

During a couple of long test runs I kept getting the same error at different places for different tags. I had a hunch that posts with notes were the cause, and confirmed that they were.

For whatever reason the API URL for notes is just https://sankakuapi.com, without the v2 part.

Don't know if pools may have the same issue, I don't download those.
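
To make the difference concrete, here is a rough sketch of building the two kinds of URL. This is not the extractor's code: the api_url helper is hypothetical, and the notes path below is a placeholder, since only the base URLs are confirmed above.

```python
API_ROOT = "https://sankakuapi.com"


def api_url(endpoint: str, versioned: bool = True) -> str:
    """Build a full API URL; `versioned` decides whether '/v2' is inserted."""
    base = f"{API_ROOT}/v2" if versioned else API_ROOT
    return base + endpoint


# Posts and pagination worked with the /v2 prefix.
print(api_url("/posts/keyset"))
# Notes needed the bare root. The path itself is an assumed placeholder.
print(api_url("/posts/POST_ID/notes", versioned=False))
```

Whether pools need the versioned or the bare root is untested here.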

mikf · 2025-03-11T17:52:38Z

Everything should hopefully be fixed for the time being (1254c4e), but I suspect there are going to be more site-related changes in the coming days, especially for pools.

Note: extended / categorized tags now require an extra API request.

edit: I am not sure if the extended tags information from just one API request is always complete. Posts with more than 50/100/? tags might require even more API requests to truly fetch all extended tag information. I've only tested this with a 40 tag post.

edit2: The /tags endpoint does support a limit parameter, but it only returns a max of 100 tags even when limit is set to a number greater than that.
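
To illustrate what that cap means for posts with many tags, here is a minimal paging sketch. It is not the code from the commit: fetch_all_tags and fake_fetch_page are hypothetical names, the fake function stands in for the real HTTP request, and the loop assumes the server returns full pages whenever more tags exist.

```python
from typing import Callable, Dict, List

MAX_PAGE_SIZE = 100  # cap observed for the /tags endpoint in this thread

Page = List[Dict[str, object]]


def fetch_all_tags(fetch_page: Callable[[int, int], Page]) -> Page:
    """Request page after page until a page comes back shorter than the cap."""
    tags: Page = []
    page = 1
    while True:
        batch = fetch_page(page, MAX_PAGE_SIZE)
        tags.extend(batch)
        if len(batch) < MAX_PAGE_SIZE:  # short or empty page: nothing left
            return tags
        page += 1


# Stand-in for the HTTP request: serves slices of a made-up 240-tag post.
FAKE_TAGS: Page = [{"name": f"tag_{i}", "type": i % 5} for i in range(240)]


def fake_fetch_page(page: int, limit: int) -> Page:
    limit = min(limit, MAX_PAGE_SIZE)  # mimic the server ignoring larger limits
    start = (page - 1) * limit
    return FAKE_TAGS[start:start + limit]


all_tags = fetch_all_tags(fake_fetch_page)
print(len(all_tags))
```

With a 240-tag post this makes three requests (100, 100 and 40 tags).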

taskhawk · 2025-03-11T21:26:43Z

> Note: extended / categorized tags now require an extra API request.
>
> edit: I am not sure if the extended tags information from just one API request is always complete. Posts with more than 50/100/? tags might require even more API requests to truly fetch all extended tag information. I've only tested this with a 40 tag post.
>
> edit2: The /tags endpoint does support a limit parameter, but it only returns a max of 100 tags even when limit is set to a number greater than that.

Yeah, it definitely needs additional API requests for posts with more than 40 tags.

Ugh, almost every single change those guys do ends up in a worse experience. At this point I'm just glad the old Chan site is still available, the new site is terrible.

Is it practical to scrape the Chan version when a post has more than 80 tags to save on API requests? All the tags are present there from the start, so just one more request. Though their API has a good rate limit, so maybe it's not that costly?

Couple of examples I have in my archive with the most tags (SFW):

https://chan.sankakucomplex.com/en/posts/XbayBEEg1rG
https://chan.sankakucomplex.com/en/posts/XEa1WoVNERq

mikf · 2025-03-11T22:03:03Z

> Is it practical to scrape the Chan version when a post has more than 80 tags to save on API requests?

That's a very good suggestion.

https://chan.sankakucomplex.com/en/posts/XbayBEEg1rG with its 1460 tags would need 15 requests just to fetch tag category data, which takes ~3 seconds on my end.
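
The request count follows from the 100-tag cap per call. A quick sketch of the arithmetic (tag_requests_needed is a hypothetical helper, not part of gallery-dl):

```python
import math


def tag_requests_needed(total_tags: int, page_size: int = 100) -> int:
    """Number of /tags requests needed to cover every tag of a post."""
    return math.ceil(total_tags / page_size)


for total in (40, 100, 101, 1460):
    print(total, tag_requests_needed(total))
```

For the 1460-tag post this gives 15, and any post with 100 tags or fewer needs a single extra request.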

Then again, opening this post in a browser takes a lot longer than that. It also redirects to https://chan.sankakucomplex.com/en/posts/show_empty when not logged in and the auth tokens for the API don't appear to work.

taskhawk · 2025-03-12T07:08:05Z

Yeah but it's not like we need to render the page and load its assets, we just need the HTML code so it should be faster than loading in the browser.

Is handling the session cookies difficult? Feel like it's already done for a bunch of other sites, but it's an additional hurdle, yeah.

Maybe it's not worth it, though. Those two examples are extremes. Got curious and took a look in my archive. I have downloaded 1,853,920 posts in total from Sankaku, and on average each of my tag files has 100.19 lines (I'm still calculating the median). Each tag file is a text file with the following structure:

CATEGORY
    tag
    tag

CATEGORY
    tag
    tag

There's a bit of overhead with the lines for each category (5 of them) and the empty lines as category separators.
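
For reference, this is roughly how tags can be counted in that layout, and how the overhead translates line counts into tag counts. It is only a sketch: the category names in the sample are placeholders, and the overhead estimate assumes all 5 categories are present, with one blank separator line between each pair.

```python
def count_tags(tag_file_text: str) -> int:
    """Count tag lines: indented lines are tags, unindented ones are category headers."""
    return sum(
        1
        for line in tag_file_text.splitlines()
        if line.strip() and line[0].isspace()
    )


def estimate_tags(line_count: float, categories: int = 5) -> float:
    """Subtract category headers and the blank separators between them."""
    overhead = categories + (categories - 1)
    return line_count - overhead


sample = "CATEGORY_A\n    tag_one\n    tag_two\n\nCATEGORY_B\n    tag_three\n"
print(count_tags(sample))
print(f"{estimate_tags(100.19):.2f}")
```

With 9 lines of overhead, an average of 100.19 lines works out to about 91 tags per post.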

So let's say the average is 90 tags per post (higher than I thought, tbh). That means on average 3 API calls are needed to the tags endpoint to get all the available data.

The most balanced way to go about it for the average post, I think, would be to make 1 API call to figure out the total amount of tags, if it's 80 or less we stick with the API (1 or 2 API calls total), but if it's more than 80 we scrape the Chan page (1 API call and 1 page request), saving 1 API call per post on average, at the cost of increased processing time.

Is that enough justification to implement it? Probably not if we consider that their rate limit is relatively permissive, at least so far.

As things are now, I think it's fine to stick to the API to get all the tags data, but that's like my opinion, man.

Edit: the median ended up being 47 lines per tag file. With the same overhead, that is as low as 38 tags per post. That's more in line with what I expected, and it reinforces my conclusion that sticking to the API is fine for now.

mikf · 2025-03-12T07:55:23Z

> So let's say the average is 90 tags per post (higher than I thought, tbh). That means on average 3 API calls are needed to the tags endpoint to get all the available data.

It is possible to fetch 100 tags per API call, so it would need only one extra.

> The most balanced way to go about it for the average post, I think, would be to make 1 API call to figure out the total amount of tags

The total number of tags as well as the tag names themselves are known before making any extra API calls. It is only tag categories that are missing.

I've tested loading the Chan page in gallery-dl and it took ~24 seconds for the 1460 tag post, so API it is.

The code to fetch all tag information has already been written and committed locally, by the way, and will be available with the next git push

taskhawk · 2025-03-12T08:14:46Z

> It is possible to fetch 100 tags per API call, so it would need only one extra.

Somehow I failed to register that the limit was 100 and not 40.

Went on a bit of a tangent there, didn't I? Sorry about that.

> The code to fetch all tag information has already been written and committed locally, by the way, and will be available with the next git push

Thanks, mikf!

ImVantexHD · 2025-03-12T10:27:31Z

> Note: extended / categorized tags now require an extra API request.

is there an option that i need to add to the command for extended tags?

chazz1560 · 2025-03-12T14:06:18Z

Question for y'all: I've been using a simple command for chan.sankakucomplex.com, namely gallery-dl -u "username" -p "password" "URL".
Obviously this no longer works, so I'm wondering if you could point me in the right direction. I'm not exactly an expert when it comes to Python (it took me a while to even get gallery-dl running). Will this issue be fixed in a future update, or will I need to manually install the new API?
And if so, how would I do that?

Ikkoru · 2025-03-12T14:07:59Z

If you don't want to dabble with building the program yourself, just wait for the update. This program gets updated quite frequently, so you shouldn't have to wait for too long (mb a week or two?).

taskhawk · 2025-03-12T14:30:08Z

> is there an option that i need to add to the command for extended tags?

Directly in the command you can enable it by using the --option argument:

gallery-dl --option tags=true ...

Or the short version:

gallery-dl -o tags=true

chazz1560 · 2025-03-12T14:37:14Z

> If you don't want to dabble with building the program yourself, just wait for the update. This program gets updated quite frequently, so you shouldn't have to wait for too long (mb a week or two?).

So in other words, since I'm illiterate at Python, I should just wait for the update and then my command will work again? 😅

Ikkoru · 2025-03-12T14:37:51Z

Yes.

ImVantexHD · 2025-03-12T15:53:17Z

> > is there an option that i need to add to the command for extended tags?
>
> Directly in the command you can enable it by using the --option argument:
> gallery-dl -o tags=true

Thanks, but it doesn't seem to do what I thought it would: the latest version of gallery-dl won't grab all the tags from a post.

using the latest version of gallery-dl as of today:
gallery-dl --write-tags -o tags=true --no-download https://chan.sankakucomplex.com/en/posts/zjrmmWK7BrD

old version of gallery-dl (version 1.27.0):
gallery-dl --write-tags -o tags=true --no-download https://chan.sankakucomplex.com/en/posts/zjrmmWK7BrD

ImVantexHD · 2025-03-12T16:01:40Z

> the latest version of gallery-dl won't grab all the tags from a post

And I can confirm that's not because of the API changes, since I can still use the old version to extract the tags just fine (after replacing the old API URL with the new one so it works again).

mikf · 2025-03-12T16:16:49Z

@ImVantexHD regular tags are fixed in 898a09b
@taskhawk tag categories are fixed in 94bbbbb

ImVantexHD · 2025-03-12T16:18:29Z

> @ImVantexHD regular tags are fixed in 898a09b
> @taskhawk tag categories are fixed in 94bbbbb

well that was fast, thank you @mikf

taskhawk · 2025-03-14T00:39:40Z

Found two instances where the current code enters an infinite loop retrieving the tags data. Using --verbose shows it keeps making requests non-stop with an increasing page parameter value.

These are the specific posts (NSFW):

https://www.sankakucomplex.com/posts/8yrxO7WOvaE
https://www.sankakucomplex.com/posts/26MPkn6JRKx

Edit:
Found another one (SFW-ish?):
https://www.sankakucomplex.com/posts/8JaGEODYeRL
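
I don't know the actual cause, but one way a loop like this can arise (purely an assumption on my part) is when the code keeps paging until it has collected as many tags as the post lists, while the endpoint returns data for fewer of them. A minimal sketch of a loop that stays finite in that case; collect_tags and fake_fetch_page are hypothetical names and the fake endpoint stands in for the real request:

```python
from typing import Callable, Dict, List

Page = List[Dict[str, str]]


def collect_tags(fetch_page: Callable[[int], Page], expected: int,
                 page_size: int = 100) -> Page:
    """Gather tag data until `expected` tags are read, with exits that keep
    the loop finite even when the endpoint returns fewer tags than expected."""
    max_pages = -(-expected // page_size) + 1  # ceil(expected / page_size), plus one spare
    tags: Page = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:  # empty page: the server has nothing more to give
            break
        tags.extend(batch)
        if len(tags) >= expected:
            break
    return tags


# Fake endpoint: the post lists 150 tag names, but only 120 have tag data.
AVAILABLE: Page = [{"name": f"tag_{i}"} for i in range(120)]
requested_pages: List[int] = []


def fake_fetch_page(page: int) -> Page:
    requested_pages.append(page)
    start = (page - 1) * 100
    return AVAILABLE[start:start + 100]


result = collect_tags(fake_fetch_page, expected=150)
print(len(result), requested_pages)
```

Here the loop stops after the third request returns an empty page, instead of incrementing the page parameter forever.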

ForxBase · 2025-03-15T00:53:04Z

I updated to the latest version but still get

[sankaku][error] Unable to download data: JSONDecodeError: Expecting value: line 1 column 1 (char 0)

taskhawk · 2025-03-15T05:45:40Z

The latest version (1.29.1) was released before the issue started, that's why you still get the error.

Try installing the latest dev version with Pip: https://github.com/mikf/gallery-dl?tab=readme-ov-file#pip

FoxieP · 2025-03-15T08:32:42Z

I get an error on version 1.29.2-dev when using the sankaku extractor to download. Disk X exists and is accessible. Other extractors like e621 and kemono work fine with the same base directory and archive paths.

Extractor Config:

    "sankaku":
    {
        "username": "*******",
        "password": "*******",
        "base-directory": "X:/Sankaku",
        "archive": ["X:/Sankaku", "{search_tags}", "archive.db"],
        "metadata": false,
        "directory": ["{search_tags}"],
        "filename": "{md5}.{extension}",
    },

Error Message:

C:\WINDOWS\system32>gallery-dl -v "https://chan.sankakucomplex.com/?tags=naytlayt"
gallery-dl: Version 1.29.2-dev
gallery-dl: Python 3.12.1 - Windows-10-10.0.19045-SP0
gallery-dl: requests 2.31.0 - urllib3 2.1.0
gallery-dl: Configuration Files ['%APPDATA%\gallery-dl\config.json']
gallery-dl: Starting DownloadJob for 'https://chan.sankakucomplex.com/?tags=naytlayt'
sankaku: Using SankakuTagExtractor for 'https://chan.sankakucomplex.com/?tags=naytlayt'
urllib3.connectionpool: Starting new HTTPS connection (1): sankakuapi.com:443
urllib3.connectionpool: https://sankakuapi.com:443 "GET /v2/posts/keyset?tags=naytlayt&lang=en&limit=100 HTTP/1.1" 200 None
urllib3.connectionpool: https://sankakuapi.com:443 "GET /posts/6ea4l7wp8a3/tags?lang=en&page=1&limit=100 HTTP/1.1" 200 None
sankaku: Failed to open download archive at '['X:/Sankaku', '{search_tags}', 'archive.db']' (FileNotFoundError: [WinError 3] System cannot find specified path: 'X:/')
urllib3.connectionpool: Starting new HTTPS connection (1): s.sankakucomplex.com:443
urllib3.connectionpool: https://s.sankakucomplex.com:443 "GET /data/d3/c3/d3c37909d4f42e1de3effddae08402be.png?e=1742029185&expires=1742029185&m=oYQammQGmRahnmZ6F8S1pg&token=c9OkPqj2uinHoq2YjbRqVc2a8HwwrAYiIP2Gx3eLl7c HTTP/1.1" 416 592
sankaku: Unable to download data: FileNotFoundError: [WinError 3] System cannot find specified path: '\\?\X:\'
sankaku:
Traceback (most recent call last):
File "C:\Python312\Lib\site-packages\gallery_dl\path.py", line 343, in finalize
os.replace(self.temppath, self.realpath)
FileNotFoundError: [WinError 3] System cannot find specified path: '/tmp/.download/1740768478 d3c37909d4f42e1de3effddae08402be.png.part' -> '\\?\X:\Sankaku\naytlayt\1740768478 d3c37909d4f42e1de3effddae08402be.png'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Python312\Lib\site-packages\gallery_dl\job.py", line 153, in run
self.dispatch(msg)
File "C:\Python312\Lib\site-packages\gallery_dl\job.py", line 197, in dispatch
self.handle_url(url, kwdict)
File "C:\Python312\Lib\site-packages\gallery_dl\job.py", line 368, in handle_url
pathfmt.finalize()
File "C:\Python312\Lib\site-packages\gallery_dl\path.py", line 347, in finalize
os.makedirs(self.realdirectory)
File "", line 215, in makedirs
File "", line 215, in makedirs
File "", line 225, in makedirs
FileNotFoundError: [WinError 3] System cannot find specified path: '\\?\X:\'

mikf · 2025-03-15T09:13:44Z

@FoxieP Your error is completely unrelated to this issue. You should open a new one instead of posting it here.

mikf added the site:change label Mar 11, 2025

mikf pinned this issue Mar 11, 2025

mikf added a commit that referenced this issue Mar 11, 2025

[sankaku] update API URLs (#7154 #7155)

1254c4e

and fix errors due to other changes

mikf added the fixed label Mar 11, 2025

mikf marked this as a duplicate of #7163 Mar 12, 2025

mikf added a commit that referenced this issue Mar 12, 2025

[sankaku] fix categorized tags for posts with >100 tags (#7155)

94bbbbb

mikf added a commit that referenced this issue Mar 12, 2025

[sankaku] fix 'tags' metadata (#7155)

898a09b

rename 'tag_names' to 'tags'

mikf added a commit that referenced this issue Mar 14, 2025

[sankaku] fix potential infinite loop (#7155)

f395a3e

#7155 (comment)

mikf closed this as completed Mar 15, 2025

mikf unpinned this issue Mar 23, 2025
