WikiArt scraper only scraping <3000 images #27

fk798 · 2020-12-07T03:39:52Z

Hi! When scraping and downloading images to train the DCGAN on, the scraper is unable to access the full dataset. For example, when I try downloading images using the command python art.py --genre=landscape --num_pages=250 --output_dir=landscape_scraped I am only able to download around 2400 images before the program ends. However, the WikiArt website shows that there are around 22000 landscape images available.

Here's what I think the issue is: when you go to the landscape page, the webpage shows that there are a total of 3600 images you can see. I tried scrolling all the way down to see if there were other pages I could access with different images, but it doesn't show any buttons to go to other pages (if there are any). It looks like WikiArt has set up their website so that you can only view those 3600 images instead of the entire dataset, which is a problem since we have less data to train the network on. I might be wrong since I don't really know how WikiArt works, but how can I obtain more than just the 2400 images?
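One way to check this hypothesis would be to count how many distinct image URLs the listing actually yields before pages stop returning anything new. The sketch below is only an illustration under stated assumptions: `fetch_page` is a hypothetical stand-in for whatever genre-scraper.py does to load one listing page, and `fake_fetch_page` simulates a site capped at 3600 items with an assumed 60 items per page. Neither reflects WikiArt's real endpoints or page size.

```python
from typing import Callable, Iterable, List, Set, Tuple


def collect_until_exhausted(
    fetch_page: Callable[[int], Iterable[str]],
    num_pages: int,
) -> Tuple[List[str], int]:
    """Collect unique image URLs page by page.

    Returns the unique URLs in first-seen order and the number of the last
    page that contributed at least one new URL (0 if no page did).
    """
    seen: Set[str] = set()
    ordered: List[str] = []
    last_useful_page = 0
    for page in range(1, num_pages + 1):
        added = 0
        for url in fetch_page(page):
            if url not in seen:
                seen.add(url)
                ordered.append(url)
                added += 1
        if added == 0:
            # Nothing new on this page: the listing is exhausted or capped.
            break
        last_useful_page = page
    return ordered, last_useful_page


def fake_fetch_page(page: int, per_page: int = 60, cap: int = 3600) -> List[str]:
    """Hypothetical listing: `cap` items in total, `per_page` per page (both assumed)."""
    start = (page - 1) * per_page
    end = min(start + per_page, cap)
    return [f"https://example.invalid/img/{i}.jpg" for i in range(start, end)]


if __name__ == "__main__":
    urls, last_page = collect_until_exhausted(fake_fetch_page, num_pages=250)
    print(len(urls), "unique URLs; last page with new items:", last_page)
```

If the real scraper showed the same pattern (the unique count levelling off well before `--num_pages` is reached), that would suggest the limit is on the listing itself and not in the download step.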

Thanks in advance!


sebamacchia · 2021-05-13T14:02:29Z

Hi! Are you using the genre-scraper.py file?

rosefeller · 2021-05-13T15:36:23Z

Faisal Karim, can I have your email address? I have some ideas. My email: ***@***.*** Many thanks.

fk798 · 2021-05-13T17:47:55Z

Ah, my bad. @sebamacchia yeah, I meant the genre-scraper.py file. I just renamed it to art.py, but it's the same thing.

@rosefeller sure, my email address is [email protected]. If it doesn't show (for some reason your email is starred out with asterisks), it's just my GitHub handle at nyu.edu.

rosefeller · 2021-05-14T10:13:08Z

I can't see your email address; the system does not show it. Can you send me a message? I'm on Facebook and Instagram as Rosefellerart; you can Google me, please.

