Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhance(zotero): better multi-profile experience #10430

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

e-zz
Copy link
Contributor

@e-zz e-zz commented Oct 29, 2023

This commit handles the error MissingPDFException, which bothers a lot of users when using Zotero multiple profiles. Usually file-path:: property will get polluted by another profile. And sometimes Logseq crashes out of this. This commit tries to avoid the error and makes Logseq more stable.
It should fix #9261 on the condition that config.edn contains all the profiles you use. Also, though not intended, this commit is helpful for issues like #6417.

Summary

A function handle-missing-pdf-error! is added as the exception handler.

Background

When using PDF highlighting in Logseq, a page with the format "hls_xxx" is generated. This page typically starts with two properties:

  • file:: [GitHub.-.Wikipedia_1698436816342_0.pdf](../assets/GitHub.-.Wikipedia_1698436816342_0.pdf)
  • file-path:: ../assets/GitHub.-.Wikipedia_1698436816342_0.pdf

It is crucial that the file-path:: accurately reflects the path to the PDF file. Failure to do so can result in a "MissingPDFException" error when attempting to click highlight blocks of the PDF, and in some cases, Logseq might even crash, leading to potential data loss (I experienced this once).

On a positive note, Logseq features a Zotero profile system, making Logseq's PDF functionality compatible with Zotero. The file-path:: is where these two distinct components can interact.

The only other location where this interaction occurs is on the "open" button on the Zotero entry page.

Moreover, if you use Logseq on multiple devices, each with a unique profile, Logseq ensures that the Zotero extension continues to work seamlessly after syncing your config.edn file across devices.

It's worth noting that the Zotero extension is well-designed to function across different devices. It securely stores the Zotero API key locally, along with the selected profile name, ensuring protection from interference by other devices.

Some users may have noticed that in the Zotero settings menu, the "default" profile (the first profile in the list more accurately) is always selected and may seem problematic. However, there's no need to worry, as this is just a minor bug during menu initialization. We don't need to do anything. The actual profile remains the one you selected on that specific machine during your last session.

However, complications arise when we start creating highlights and syncing pages. In general, the file-path:: stores the correct path for a specific machine, but not necessarily for all of them. This inconsistency leads to the recurring "MissingPDFException" error, especially when working with multiple profiles.

Solution

To address the issue of incorrect file-path:: causing "MissingPDFException" errors, the following solution is proposed:

  • Logseq will now check all profiles, allowing it to determine whether the file-path:: has been altered by another machine. If the issue is not associated with any known profile, the same error message will persist.

Possible alternatives (worked for me):

  • This alternative involves using a placeholder for the PDF base directory in the file-path:: property. For example, you can refer to my experimental commit here. This approach has proven effective, but it does alter the style of the file-path:: property, which is something we should ideally avoid.
  • Or from the very begenning one should use symbolic links for PDF storage. It's a perfect solution but for only part of the users.

So the solution I suggested is the best compromise in my mind.

User data possibly modified

:file-path property of a PDF highlight page (the "hls_xxx.md" page) will be replaced by the absolute path on the device. This will happen only after MissingPDFException error is encountered with Zotero integration enabled.

Possible issue

Notice that usually it works out of the box, but not if config.edn doesn't contain the profiles that pollute file-path. The key step here is, by adding profiles into your config.edn, we tell Logseq what are those Zotfile base directories on other devices.

PS: In the documentation, there is only limited information available on how to use profiles. Instructions and detailed explanations will be really helpful.

This commit handles  the `MissingPDFException` frequently encountered in using multiple profiles.

add: `function handle-missing-pdf-error!`

`:file-path` property of a PDF highlight page (the "hls_xxx.md" page) will now be  updated based on the profile selected.
@e-zz e-zz marked this pull request as ready for review October 29, 2023 20:16
@andelf andelf requested a review from xyhp915 October 31, 2023 18:19
@e-zz
Copy link
Contributor Author

e-zz commented Nov 6, 2023

If one uses only windows, the trick in the commit is good enough. Unfortunately, on Linux the problem perists.

The relevant error information was mentioned also here #7260 a long time ago.
And I think this is what collapses Logseq again and again on Linux when working with multiple profiles.

Here is a short summary of what I found after few hours' hard work.

PDFjs is called in get-doc$

(-> (get-doc$ (clj->js opts))
(p/then (fn [doc]

But PDFjs didn't even try to open the pdf

After a checking into PDFjs, I found the error is raised from
https://github.com/mozilla/pdf.js/blob/72338ce05df18f77746ab256efb71c310db9f45e/src/display/api.js#L517-L529

_util.isNodeJS is false on linux and thus prevents PDFjs from retrieving the document. That's why MissingPDFException is not triggered.

function getUrlProp(val) {
  console.log("getUrlProp", val, typeof val === "string", "\n\t\t", _util.isNodeJS);
  if (val instanceof URL) {
    return val.href;
  }
  try {
    return new URL(val, window.location).href;
  } catch {
    if (_util.isNodeJS && typeof val === "string") {
    // if ( typeof val === "string") {           // this will give a `MissingPDFException` error instead
      return val;
    }
  }
  throw new Error("Invalid PDF url data: " + "either string or URL-object is expected in the url property.");
}

A workaround is to remove the file:// prefix

So I test a bit, and finally realize

  • if the file:// link passed to PDFjs is wrong, PDFjs will throw the error above
    i.e., this will give an "invalid PDF url data" error if PDF.pdf doesn't exist.
file-path:: file://C:User/Document/PDF.pdf
  • the same link without a file:// prefix will lead to a MissingPDFException error, which can be caught by Logseq and handled by handle-missing-pdf-error! function.
    ie, this will give a MissingPDFException error
file-path:: C:User/Document/PDF.pdf

After seeing the difference above, it's easier to understand why Logseq sometimes crashes due to a PDF path problem while sometimes doesn't.

In the end, one could say that the irritating bug of Logseq on Linux is actually from the inconsistency of error messages thrown by PDFjs.

How to fix?

So to completely solve the problem, to my knowledge, we have three options:

  • Wait until an upstream fix in PDFjs that correct the exception message (throw MissingPDFException)
    (or maybe by letting _util.isNodeJS=true on Linux. Afterall, platform shouldn't make a difference. Instead, it should only be dependent on NodeJs?)
  • Stop using "file://" prefix. In hls_xxx pages, file-path:: is added by a function named ensure-ref-page! in the pdf extension during annotation and thus easy to change (I showed how it's done in #dfbcbcf. It's not the best solution but works perfectly for me till now.)
  • Or, trim any "file://" prefix before passing it to PDFjs so it won't break anything.

With either of these methods, we will be able to fix #7260 or similar problems where Logseq collapses.

PS: Sorry if it's too lengthy to read. Missing PDF handling is a long standing problem. And I wrote all these in a verbose way just to make all details I know clear enough.

e-zz pushed a commit to e-zz/logseq that referenced this pull request Nov 6, 2023
- handle a false pdf path in `file-path::` property that starts with a "file://" prefix

This (or other similar method) should fix logseq#7260
e-zz pushed a commit to e-zz/logseq that referenced this pull request Dec 10, 2023
e-zz pushed a commit to e-zz/logseq that referenced this pull request Dec 17, 2023
e-zz pushed a commit to e-zz/logseq that referenced this pull request Dec 22, 2023
e-zz pushed a commit to e-zz/logseq that referenced this pull request Jan 2, 2024
e-zz pushed a commit to e-zz/logseq that referenced this pull request Jan 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Zotero file location in config.edn does not work between platforms
1 participant