Remote download not working #85

Open
JacquesOrbiton opened this issue Mar 5, 2025 · 10 comments

Comments

@JacquesOrbiton

Hello,

The bug I am reporting is not caused by your code but by the anti-bot protection on the Thorlabs website. They use Imperva Incapsula, which denies requests made with the requests library.

I tried to bypass the blocker by sending the headers from my web browser, and I also tried cloudscraper, but neither worked.

One way to solve the problem would be to use Selenium, or a similar library, to create a session that looks like a real user. This is a bit awkward to implement, and simply downloading the .zmx files from the website by hand may be faster for users than setting up such a session on their own PC.

This is my first bug report, sorry for the bad formatting...

@SeckinBerkay
Collaborator

SeckinBerkay commented Mar 5, 2025

Hi,

Thanks for opening the issue. Just to make sure I understand: were you following example 9b, and did it raise the error ValueError: Failed to download Zemax file. for the code below?

import numpy as np
import matplotlib.pyplot as plt
from optiland.fileio import load_zemax_file
from optiland import analysis

# link to the .zmx file on Thorlabs website
url = 'https://www.thorlabs.com/_sd.cfm?fileName=20565-S03.zmx&partNumber=MAP051950-A'

lens = load_zemax_file(url)

@JacquesOrbiton
Author

JacquesOrbiton commented Mar 5, 2025

Yes, I used your example:

import numpy as np
import matplotlib.pyplot as plt
from optiland.fileio import load_zemax_file
from optiland import analysis
from optiland.samples.simple import AsphericSinglet

# link to the .zmx file on Thorlabs website
url = 'https://www.thorlabs.com/_sd.cfm?fileName=20565-S03.zmx&partNumber=MAP051950-A'

lens = load_zemax_file(url)
lens.draw()

Output: ValueError: Failed to download Zemax file.

I actually found a fix for that, but I don't know if it is really maintainable.

Under your existing imports you can add this:

import tempfile  # likely already among the existing imports

import requests  # likely already among the existing imports
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager

# Set up Chrome options
chrome_options = Options()
chrome_options.add_argument("--headless")  # Run headless (remove this if you want to see the browser)
chrome_options.add_argument("--disable-blink-features=AutomationControlled")  # Helps evade detection

# Use your real User-Agent
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
chrome_options.add_argument(f"user-agent={user_agent}")

# Existing code ...

class ZemaxFileReader:
    [...]

    def _configure_source_input(self):
        """
        Checks if the source is a URL and writes to a temporary file if so.
        Otherwise, sets the source to the filename.
        """
        if self._is_url(self.source):
            url = self.source

            # Start WebDriver
            service = Service(ChromeDriverManager().install())
            driver = webdriver.Chrome(service=service, options=chrome_options)
            try:
                # Open the target URL so the site sets its cookies in the browser
                driver.get(url)
                # Extract cookies from Selenium
                cookies = driver.get_cookies()
            finally:
                # Close the browser even if loading the page fails
                driver.quit()

            # Copy the browser cookies into a requests session
            session = requests.Session()
            for cookie in cookies:
                session.cookies.set(cookie['name'], cookie['value'])

            # Download the file with requests, reusing the browser's cookies
            response = session.get(url, headers={"User-Agent": user_agent})

            if response.status_code == 200:
                with tempfile.NamedTemporaryFile(delete=False) as file:
                    file.write(response.content)
                    self.filename = file.name
            else:
                # Same error message as reported above
                raise ValueError("Failed to download Zemax file.")
        else:
            # Local file: use the path as-is, as the docstring describes
            self.filename = self.source

This is quite basic, and there may be more efficient ways to do it, but it actually works!
I hope this helps.
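
For readers who want to see what the cookie hand-off in the workaround amounts to, here is a minimal standard-library sketch. It assumes cookie dicts shaped like the ones `driver.get_cookies()` returns (with `name` and `value` keys); the helper name `cookie_header` and the sample cookie names are hypothetical, and the real cookies set by the site will differ.

```python
from typing import Iterable, Mapping


def cookie_header(cookies: Iterable[Mapping[str, object]]) -> str:
    """Join Selenium-style cookie dicts into a single Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


# Hypothetical cookies, in the shape returned by driver.get_cookies()
sample_cookies = [
    {"name": "session_id", "value": "abc", "domain": ".example.com", "path": "/"},
    {"name": "token", "value": "xyz", "domain": ".example.com", "path": "/"},
]

# Headers that a follow-up HTTP request could send alongside the User-Agent
headers = {"Cookie": cookie_header(sample_cookies)}
print(headers["Cookie"])
```

Like `session.cookies.set(name, value)` in the workaround, this ignores the cookie's domain and path, so it is only appropriate when the follow-up request goes to the same site the browser visited.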

@SeckinBerkay
Collaborator

SeckinBerkay commented Mar 6, 2025

Thanks for providing a solution, nice code. I reckon using Selenium needs careful attention: in the past there were additional steps to download the Chrome driver. I don't know if that has changed, but I guess your code works without installing it manually. On the other hand, we don't know whether Thorlabs (or other vendors) will eventually block the Selenium approach too, and, as you rightly pointed out, we don't know whether the same code will keep working in the near or far future.

However, this is the nature of open source, and the community is there to fix the issues, but we should consider the optimal solution here. @HarrisonKramer

@HarrisonKramer
Owner

Thanks for opening this, @JacquesOrbiton. I don't use this functionality too often, so wouldn't have caught it for a while. Also, thanks for providing a workaround!

I am inclined to simply update the error handling to include a more descriptive error when this occurs. I don't think it's critical that we can programmatically download these files. I also don't want to add another dependency (selenium) only for a workaround. Also, if they really prefer that files are not downloaded in this way, then so be it.

My proposal would be:

  • Update the error handling, possibly detecting when the request is rejected for this reason, then raising a meaningful error. This could include a link to the docs outlining this issue.
  • Update docs (in code and probably on readthedocs too) explaining the issue. We can even link directly to this issue, so people may choose to use the nice workaround above.
  • Update the example to use a downloaded version of the file.
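
As a rough illustration of the first bullet, the sketch below shows one way a rejected download could be detected and reported. It is not Optiland's implementation: the function names, the status codes checked, and the marker strings are all assumptions, and a real block page may look different.

```python
# Assumed markers; block pages served by the anti-bot layer may differ
BLOCK_MARKERS = ("incapsula", "imperva")


def looks_blocked(status_code: int, content_type: str, body: bytes) -> bool:
    """Heuristically decide whether a response is an anti-bot rejection."""
    if status_code in (403, 429):
        return True
    text = body[:4096].decode("utf-8", errors="ignore").lower()
    if any(marker in text for marker in BLOCK_MARKERS):
        return True
    # A .zmx download is not expected to come back as an HTML page
    return "text/html" in content_type.lower()


def check_download(status_code: int, content_type: str, body: bytes, url: str) -> bytes:
    """Return the body on success, otherwise raise a descriptive ValueError."""
    if looks_blocked(status_code, content_type, body):
        raise ValueError(
            f"Failed to download Zemax file from {url}: the server appears to "
            "have rejected the automated request (possibly anti-bot protection). "
            "Download the file manually in a browser and pass the local path instead."
        )
    if status_code != 200:
        raise ValueError("Failed to download Zemax file.")
    return body
```

With the requests library, the three inputs would come from `response.status_code`, `response.headers.get("Content-Type", "")`, and `response.content`. The error text could also link to the docs page that describes this issue.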

@SeckinBerkay - what do you think? If you agree, do you want to look into this one? I can also add this to the to-do list and we can get to it later on.

Regards,
Kramer

@mabl

mabl commented Apr 1, 2025

Thorlabs also provides a full optics catalog at https://www.thorlabs.com/software_pages/ViewSoftwarePage.cfm?Code=Zemax. Maybe it would be more beneficial to provide/ship a way to load that?

@SeckinBerkay
Collaborator

@HarrisonKramer Sorry, I didn't see your reply; I must have missed the notification, and @mabl's comment reminded me of this thread. I see that Tutorial 9a uses the downloaded file, while Tutorial 9b still uses the direct URL, so together they show users both the local-file and the URL use cases.
@mabl Thanks for providing an alternative and the link. I tried it in the past and couldn't get it working, but that may just be me, so we may be able to work it out. We may need to add a parser for another Zemax file extension to get it working.
I am currently occupied with a lot of things and haven't been able to give much attention to developing Optiland, but I can work on this issue and more in a couple of weeks.

@HarrisonKramer
Owner

Hi @mabl,

Thanks for providing this. This would definitely be a nice workaround. I think we can add new functionality to parse these catalogs into something usable by Optiland. We can add that to the roadmap.

Kramer

@SeckinBerkay
Collaborator

SeckinBerkay commented Apr 12, 2025

I was able to extract some fields from the ZMF files: lens_name, focal_length, diameter, material_id, and extra. I guess they are stacked ZMX files, as described on pages 247-248 of the Zemax User's Manual (July 8, 2011). But decoding is tough.
Two examples:
{'lens_name': "â\x10\x1d\x9dÎïW7'¦", 'focal_length': -2.511197962575537e+306, 'diameter': 3.916145334697509e-266, 'material_id': 246412726, 'extra': b'j\xb2\xe4h\x93\xa5\\\xbc\r\x9c5@\x99\xb5\x1ae\xa2\xbes]\x98vH,\\\xddq\x8c\xd7B\x99\x99\xffE'}
{'lens_name': 'P\x19ϰ¢\n3cc', 'focal_length': -2.1822850138460976e-30, 'diameter': 1.2392362606688896e+255, 'material_id': -900866346, 'extra': b"\xaa6\x13'5[%W\xe4\x88T\xb8\x05\x03\x970\xb2\xadDPLJ1277L1-A\x00\x00\x00\x00"}

Any idea how to decode those strings? I tried ANSI, ASCII, UTF-8, etc.
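
One generic way to probe an unknown binary record is to scan it for runs of printable ASCII, which can hint at where text fields actually sit. The sketch below applies that to the `extra` bytes of the second example above. The helper name `printable_runs` and the minimum run length of 6 are arbitrary choices, and nothing here relies on knowledge of the real ZMF layout.

```python
import re

# 'extra' field of the second example record, copied from the comment above
extra = b"\xaa6\x13'5[%W\xe4\x88T\xb8\x05\x03\x970\xb2\xadDPLJ1277L1-A\x00\x00\x00\x00"


def printable_runs(data: bytes, min_len: int = 6) -> list[bytes]:
    """Return runs of at least min_len consecutive printable ASCII bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return re.findall(pattern, data)


print(printable_runs(extra))  # [b'DPLJ1277L1-A']
```

The only long readable run sits at the end of `extra` and is padded with null bytes, which might mean that a name field lives there rather than at the offset currently read as lens_name. That is only a guess; the implausible focal_length and diameter values would be consistent with misaligned field offsets, but also with the data being obfuscated.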

@HarrisonKramer
Owner

I would need to take a deeper look here. Not sure how to tackle it. Perhaps there are existing open source tools that can read these files.


Hi there! This issue has been automatically marked as stale due to inactivity for 30 days. If you believe this issue is still relevant, please comment with any updates or additional information. Otherwise, this issue will be closed in 14 days.

@github-actions github-actions bot added the stale label May 13, 2025