-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
select.lua: select from the watch history with g-h #15655
base: master
Are you sure you want to change the base?
Conversation
Download the artifacts for this pull request: |
DOCS/man/mpv.rst
Outdated
@@ -328,6 +328,12 @@ g-l | |||
g-d | |||
Select an audio device. | |||
|
|||
g-h | |||
Select a file from the watch history. Requires | |||
``--script-opt=select-save_watch_history=yes``. The history path is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be native option. The fact that it is implemented in select.lua shouldn't force use to use --script-opt=select-
which is little bit verbose. Also I consider script-opts to something that should be used to already specific script behavior. Here we have whole separate feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean only save_watch_history
should be an option but watch_history_path
and any future config should be a script-opt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I mean all options should be an option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also on that note, I think the history saving logic should be moved to own script, select should be left for selection only imho.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The saving part is just 17 lines out of 85 though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I felt that it didn't make sense to separate them when select.lua has to be perfectly aware of which format it was saved in right above
player/lua/select.lua
Outdated
@@ -18,6 +18,13 @@ License along with mpv. If not, see <http://www.gnu.org/licenses/>. | |||
local utils = require "mp.utils" | |||
local input = require "mp.input" | |||
|
|||
local options = { | |||
save_watch_history = false, | |||
watch_history_path = "~~state/watch_history.txt", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For symmetry with watch_later
, maybe don't use .txt
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
watch_later
is a directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh you meant the files inside watch_later
. Dunno I think those are a weird case. I think the txt extension makes it easy to open it in a text editor by default e.g. in Windows or with zsh suffix aliases which are what I use. I also used console_history.txt
in the other PR so that would have to be changed too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, maybe use .json
or .jsonl
in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well as I wrote in the commit message I didn't use jsonl only because I can't figure out whether ndjson or jsonl is preferable. Will switch to jsonl.
player/lua/select.lua
Outdated
local path = mp.command_native({"normalize-path", mp.get_property("path")}) | ||
local title = mp.get_property("playlist/" .. mp.get_property("playlist-pos") .. "/title") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move those after if not history_file
check.
fd03b7c
to
d80ee92
Compare
This works well. Comparing it to memo.lua which is a history script. It appends at end of file. When items hit around 400 history entries or so it won't open in one or two seconds. Sometimes it takes 5 seconds or more. So I trim the history file just keeping the last 333 entries.
It doesn't show duplicates which is fine. However, a suggestion I have if it doesn't create a problem with long text that won't fit on the osd that is. Not sure if it's wrapped or what. Is to add the current chapter within the file appending it after a # symbol. In the bookmarker-menu.lua I modified it to save bookmarks with the current chapter. This will help differentiate rather than just timestamps or time position to indicate where one is within the file.
input.conf
mpv.conf
|
This has a clear history entry and is usable with tens of thousands of lines, in fact I was testing the performance of searching in 150k lines yesterday and it was still usable if we switch to the C version of fzy, though This doesn't trim duplicates but it doesn't really have duplicates as files are logged with different timestamps every time which we can be useful to check. Text longer than the OSD is cropped. This appends to history when files are opened so there is no chapter or time position information at all. We have watch later files for resuming from the last position, so I assume you were talking about the watch later selection? That also doesn't show any time position, let alone chapters. I don't know if showing the time position before resuming is needed, I have never felt the need for that as it automatically resumes from that position. If you mean to use it for bookmarks of scenes watch later is not suitable for that since config files are deleted on resume, EDL is. |
423c3ba
to
30e0c38
Compare
sounds great performance wise. "Clear history" sounds better than "Clear the history" Oh wasn't aware time it appends upon open so yeah that's a no-go like you say. So watch-later probably wouldn't work since path is there already and text would be too long. Forgot also on memo.lua I add chapters to the history entries
|
If opening is slow, we should consider not using json. It brings no value. Simple separated values in each line will work the same and avoid parsing json.
I'd prefer for it to be de-duplicated in select, else searching for a file, with list this file with multiple dates, which makes searching history more troublesome. |
I don't have a real stake in this, but since it already came up before, IMO the range of input sizes should be discussed, and agreed upon, because this can affect the solution. Personally I don't care about 150K lines of anything. I don't use playlists with more than 20 items, I don't have thousands of tracks I need to choose from, etc. I'm using my own console with my own history implementation (deduplicatd, recent-sorted), and I know that it just doesn't keep growing beyond orders of magnitude less than 150K, despite me using it very frequently for many years now, without cleaning up the history. But maybe there could be cases of many many thousands of entries, like playlists, etc, that people agree should be handled well, and in that case the target sizes should be agreed. If there's an agreement that such sizes are required to be handled well, then IMO more appropriate solutions should also be considered, like sqlite, or some performant key-value store etc. Maybe specifically 150K can still be handled carefully without external tools/libraries, also on the RPI, but at some stage, if the size being mentioned keeps growing - as it seems to be happening all the time, it can becomes too big to keep handling it internally in some naive fashion IMO. |
How can loading 150k lines not be slow JSON or not? CSV will not work or is not simple when paths or titles contain newlines, and a new parser may be slower if not written in C like |
Well, there are databases that processes billions of rows fast. It's all matter of perspective. You brought up that it is slow. So I'm offering a different view on this. Whether we want to keep it slow is another question.
There is nothing wrong with C. Whole mpv is written in C. watch-later is also C and I believe it correctly escapes filenames, no? EDIT: Also loading a full file on every search is slow, which could be easily avoided. Again the question is how much slowness is acceptable. |
Databases can search through billions of rows quickly with indexes but if you actually retrieve all of them at once to use them somewhere else that is slow. watch later indeed escapes incorrectly since it converts newlines to _, such files don't work in the watch later selector. There was also an issue about cookie options with newlines not working in watch later files. Added deduplication. |
Should this be fixed? |
player/lua/select.lua
Outdated
for i, seen_path in ipairs(paths) do | ||
if seen_path == path then | ||
table.remove(items, i) | ||
table.remove(paths, i) | ||
break | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this n^2 search will be slow. How about we keep a map for such that map[path] = i
and get i
if exist, and instead of removing, just update the items[i]
with the new date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did that at first but it doesn't work because i is updated as you prepend items. Also I tried doing this with the console history and it was actually much slower than looping with huge histories, maybe huge tables with long string keys cause issues. I must have hallucinated this, maps are faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested again with 100k+ lines and performance is almost the same between looping and keeping a map. Also updating items[i]
as you said doesn't work because the new watched entry should be at the beginning. So while with the loop so we can just prepend items, with the map we have to decrement every i
in the map after the removed one and revert tables at the end, that's probably slower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless, I would still like it to be gone from reading loop.
How about we add after for line in history_file:lines()
local seen = {}
local _items = {}
local _paths = {}
for i, path in ipairs(paths) do
if not seen[path] then
seen[path] = true
table.insert(_items, items[i])
table.insert(_paths, path)
end
end
EDIT: wait, I'm still thinking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should really use a BST to store history in this case since you want the history to be always sorted by date with frequent search/insert/delete. You can keep a map such that map[path] = date
and then finding the entry with such date and moving it to the front would be only O(log n)
instead of O(n)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested that this is only marginally faster with 100k unique lines and 50% faster by repeatedly concating 600 history lines until you get a 100k lines file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind I was still prepending to the tables. It's 3 faster even with unique lines by just replacing prepending with appending.
Ignore my wrong measurements, this is over 40 times faster than before with unique lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I like simple "append only" file, because we don't need to modify the rest of it.
Isn't modification needed if you want history deduplication at runtime without reloading the history file? That's where BST is useful here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently it is reloaded on each search.
Do you think added complexity is worth it? After first file load it would be cached by the kernel anyway, so consecutive loads should be fast, except maybe next time after it is evicted. Keeping local state of history is valid, but not sure it is worth the added complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends whether if the current performance is good enough. I only mentioned the "big O" because there seemed to be some performance problems with a very large history file.
It seems the author already had a fix in #13016, I didn't see that before. Though paths in comments would have to be fixed separately from options, select.lua doesn't have access to the config file parser. At least with JSON we get it working for free. |
DOCS/man/options.rst
Outdated
|
||
``--save-watch-history`` | ||
Whether to save which files are played (default: no). These can be then | ||
selected with the default ``g-h`` key binding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
documenting it like this might make people think it's an option built into the mpv core, so it should say that it belongs to select.lua
(or at least that it requires Lua)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well it could have been kept as a script-opt if that's an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it an issue? It's in implementation detail how it's implemented imho.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I think it would be nice to know what exactly you lose when you disable Lua, but it's not of high importance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should remain a script-opt. No other core options do nothing, like this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as many other options that do nothing if mpv is compiled without some library dependency.
See #14624: the direction is that core options should either warn or don't exist at all when compiled without dependency. And there will be even more complexity when dealing with this history feature which can be unloaded at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to add a 20 line script just to append to the history? Since the writing code is 20 lines vs 90 lines for the selection code, the selection code needs to know exactly how the writing code above formats the JSON, and they both needed the same watch_history_path
script-opt, I thought they might as well be together. And if not using global options it would have been very confusing if users have to configure script-opts related to saving the history in history.lua and script-opts related to reading and formatting the history in select.lua, and that 1 of the 2 scripts has to call read_options
with the name of the other script to get shared script-opts like watch_history_path
. Not sane.
select.lua doesn't really handle selection, it gathers and formats data to be selected in console. I don't see how gathering and formatting history is different from any of its other selectors.
I don't understand what not overriding OSD styles has to do with options vs script-opts for these history ones. I made scripts respect existing OSD options when feasible to not make users configure the same thing multiple times. Actually this does exactly the same thing with --osd-playlist-entry
. But IMO it makes sense that options that only have effect within a script are its script-opts, it makes it clear what script they affect, what script you need to edit if you want to alter the behavior, and that you lose that functionality by disabling Lua. It also avoids bloating global options if this gains several script-opts in the future, imagine if every OSC script-opt was a global option. Maybe there is value in having easier to type shortcuts like we already have in --ytdl-format
and --ytdl-raw-options
but dunno if that's worth losing clarity and consistency. At that point if we want to make it more usable a better way would be some GUI button that appends the option to enable history in mpv.conf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe only the writing part can be converted to C since that's small, after mp_delete_watch_later_conf()
. That provides a reason to use global options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really need to put an end to this ongoing debate about Lua vs. C implementation.
This issue came up in the other PR as well. Initially, it was implemented in C, then it was requested to be reimplemented in Lua. Afterward, a major concern arose: "What if Lua isn't enabled?"... can we not troll ourselves again?
We need to make a clear decision: either we allow "core" functionalities to be implemented in Lua, or we don't. We can't keep standing in the middle of the road, indecisive, every time this comes up.
In my personal opinion which may not reflect opinion of others. I think implementing features like history in Lua is fine and many of mpv features could be implemented as such. It is also valid to implement it in C, in this cases, there is not really a difference, as the code to save/load watch-history is trivial in both cases (it's calling our json parser anyway). I'm fine with having it in Lua. But at the same time, I think it should be considered "core" option, meaning options should be top-level, not script-opt
, as the fact it is implemented in Lua shouldn't dictate the user interface to configure such feature. And if this is a problem, we should stop implementing features in Lua.
I'm fine with the current state of this PR. I don't see technical issue with the current implementation. It's small, works. But if others don't agree on this, we will need to think of other implementation. But please make it clear that we don't want to use Lua, so we don't make this mistake again in the future.
The only concern about Lua implementation is that, each new script spawns new mpv_client
with own event loop, which may not be a big deal, but the constant overhead is there. I agree though that putting everything in select.lua is bit out of place. But in the same time, spawning clients for 30 lines scripts is little bit meh.
EDIT: Also lua code will be bigger than compiled C code in final binary. So there is that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe only the writing part can be converted to C since that's small
Yeah, maybe. I think in this case small save function in C will resolve all concerns about options and need for separate history.lua and select.lua will have only selector, same as for watch-later. The save code in C will be less than it is in Lua actually...
02ef821
to
6afdfe2
Compare
382af52
to
d0e0679
Compare
history file may expose privacy-sensitive information, request to add a blacklist option to prevent some files from being stored in the history file, the syntax can be consistent with ytdl_hook-exclude . |
player/lua/select.lua
Outdated
local time, path, title = unpack(entry) | ||
|
||
if not path then | ||
mp.msg.warn(line .. " in " .. history_file_path .. " has invalid data.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this in else branch of if entry then
in main loop. And instead printing line
, which will leak history in log and also if it is malformed it may not be printable, I would log i
, number of the line that fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The else branch also runs if the entry is well formed but an already seen path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reorganized it again so what you suggested here can be done and it also no longer keeps the file lines in memory.
player/lua/select.lua
Outdated
for i, entry in pairs(entries) do | ||
local item = entry.path | ||
if entry.title and osd_playlist_entry == "title" then | ||
item = entry.title | ||
elseif entry.title and osd_playlist_entry == "both" then | ||
item = entry.title .. " (" .. entry.path .. ")" | ||
end | ||
items[i] = entry.date .. item | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this back it the main loop, can be inside if entry then
branch. I will see about making it more generic in another PR.
player/loadfile.c
Outdated
} | ||
|
||
char *title = (char *)find_non_filename_media_title(mpctx); | ||
mpv_node items[3] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mpv_node items[3] = { | |
mpv_node items[] = { |
return; | ||
|
||
void *ctx = talloc_new(NULL); | ||
char *history_path = mp_get_user_path(ctx, mpctx->global, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be null when no-config
is used. Or empty string, either way it has to be checked, before fopen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is watch_history.jsonl
with --no-config --save-watch-history
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what happens it saves history in cwd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep.
player/lua/select.lua
Outdated
elseif entry.title and osd_playlist_entry == "both" then | ||
item = entry.title .. " (" .. entry.path .. ")" | ||
end | ||
items[i] = entry.date .. item |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
items[i] = entry.date .. item | |
local status, date = pcall(os.date, "(%Y-%m-%d %H:%M) ", time) | |
items[i] = (status and date or entry.time) .. item |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is showing UNIX timestamps to end users useful? It would only cause confusion.
player/lua/select.lua
Outdated
return { | ||
["date"] = date, | ||
["path"] = path, | ||
["title"] = title, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return { | |
["date"] = date, | |
["path"] = path, | |
["title"] = title, | |
} | |
return { | |
time = entry[1], | |
path = entry[2], | |
title = entry[3], | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or whatever, keep it as simple as possible.
782f5db
to
30134d9
Compare
Watch History | ||
------------- | ||
|
||
``--save-watch-history`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the file format should be documented.
player/loadfile.c
Outdated
mpv_node items[] = { | ||
{.format = MPV_FORMAT_INT64, .u = {.int64 = time(NULL)}}, | ||
{.format = MPV_FORMAT_STRING, .u = {.string = mp_normalize_path(ctx, mpctx->filename)}}, | ||
{.format = MPV_FORMAT_STRING, .u = {.string = title}}, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments on the file format:
- Since JSON is used per line, each line should be an object with meaningful key names rather than an array. This way we can add new fields in any position without breaking existing parsers.
- There should be a header indicating the format version, in case we need to make incompatible changes later.
- Time stamp has only second precision, I would like to at least use milliseconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since JSON is used per line, each line should be an object with meaningful key names rather than an array. This way we can add new fields in any position without breaking existing parsers.
I was thinking about this, but this would mean we save a lot of redundant characters, just for the sake of saving them. We use json export mostly to escape paths and filenames.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also thought of saving keys but didn't know if it's worth bloating the file. A version header is overkill, this isn't an API or a database of critical data, users can just clear the history in the unlikely event the format changes. I don't know why you would need milliseconds for watch history, they don't even work with os.date
which errors with decimal values and currently not even the seconds are printed since they are noise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A version header is overkill, this isn't an API or a database of critical data, users can just clear the history in the unlikely event the format changes.
Header version would be good idea. But I think it will be easy to preserve compatibility with old version. Just add new fields to existing structure, don't change existing ones (even if they become unused). It should be fine.
Also having forward compatible structure allows to mix old and new in one file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about this, but this would mean we save a lot of redundant characters, just for the sake of saving them.
A version header is overkill, this isn't an API or a database of critical data
Being a core functionality, the file can be expected to be parsed by third party scripts, and is part of the player interface. Other custom file formats currently used by mpv are written in a higher standard than this, such as EDL which is extensively documented and has a version header.
users can just clear the history in the unlikely event the format changes.
In the current implementation, mpv core isn't the one parsing the history. mpv can change format but it must be easily detectable so external programs can reliably know how to parse them. Users don't need to clear history if the external program supports all past and future formats.
they don't even work with
os.date
which errors with decimal values
You can just separate the integer part.
and currently not even the seconds are printed since they are noise.
Just because select.lua doesn't need it doesn't mean other programs aren't interested in it. I'm OK if you don't want to do it, but this also brings back to my file format point. If someone wants a high precision later and changes the timestamp unit to milliseconds, it should be clearly indicated in the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being a core functionality, the file can be expected to be parsed by third party scripts, and is part of the player interface.
Even more reason to use extensible structure and only extend it, not break compatibility with existing tools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think EDL is comparable since users have to generate EDL playlists by themselves so of course they need to know how to format them, I have my own published script generating EDL files to save bookmarks. Question is do we want to use this file to just show recent files or is it supposed to be some kind of documented database format? It's probably convenient to just open the jsonl file and check how each line is formatted and reuse it e.g. to make a script to show recent files in a different menu, but the Lua code to generate was literally 20 lines so it's not like it buys you much compared to creating your own history file. If someone wants to save arbitrary data like milliseconds for arbitrary use cases they're probably better off doing their own saving in file-loaded
so they can save whatever they want in any format they want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is still good to document the new format like @na-na-hi requested and once documented with assurances about the stability etc. we can discuss about the details. I think for it being very simple structure there is no need much, just description of the field and how possibly it can be extended/changed in the future.
player/loadfile.c
Outdated
json_write(&dst, &node); | ||
dst = talloc_strdup_append(dst, "\n"); | ||
|
||
bool fail = fwrite(dst, strlen(dst), 1, history_file) != 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A short write here could leave garbage data behind and cause parsing error even when future writes succeed.
It should try to recover (truncate) the file back to its original state if that happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you do that? ftell
before write and truncate on errors? Generally fwrite is buffered so this won't fail on writes unless we are out of space and even then I think it will fail on fclose
likely. To make sure it is written we would need to call fflush
check that too and try to restore. But it may be likely not possible if storage is gone (in case of network) or remounted as RO. Realistically we could recover out of space error.
Generally speaking we can make it robust, but I don't think the impact is that big given that we will skip broken entries without crashing. IIRC zsh history has similar problem, because I could break it when my filesystem remouts to RO before sync...
EDIT: Generally to ensure consistency we shouldn't update file in-place. We should copy the file, update in the temporary file, close and rename
when fully written. Rename is atomic and will replace a file with fully working one or not replace it at all if we had errors. Though this changes the simple "append to file" to read, append, replace.
EDIT2: Of course we should also call sync
to ensure the file is actually written to the storage device.
EDIT3: Do you think this is a blocker for this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally to ensure consistency we shouldn't update file in-place. We should copy the file, update in the temporary file, close and rename when fully written.
When the history file is large, copying it every time a new file is played will result in a lot of disk IO, which is not cost-effective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you do that?
ftell
before write and truncate on errors? Generally fwrite is buffered so this won't fail on writes unless we are out of space and even then I think it will fail on fclose likely.
ftruncate(2)
. The file size before modification is already known. fwrite is buffered, but we don't know what's the configured buffer size, and lines can be larger than buffer size.
Generally speaking we can make it robust, but I don't think the impact is that big given that we will skip broken entries without crashing.
This works only because lines are independent in JSONL format. It won't work if some other formats are used, like a regular JSON array for entries. And even in this case it still would be very annoying to see broken line warnings every time the history file is read.
It's better to at least make the warning inform the user to delete the bad lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ftruncate(2). The file size before modification is already known. fwrite is buffered, but we don't know what's the configured buffer size, and lines can be larger than buffer size.
That's what I meant, but like I said before this will not fix any error except no disk space in practice. ftruncate
will fail the same way if filesystem becomes unwritable. Also needs wrapper for Windows.
This works only because lines are independent in JSONL format. It won't work if some other formats are used, like a regular JSON array for entries.
Did anyone propose different format?
It's better to at least make the warning inform the user to delete the bad lines.
There is a warning during writting and reading, where else do you want a warning?
a9b847a
to
26fe1df
Compare
DOCS/man/options.rst
Outdated
|
||
This file contains one JSON array per line. The first field is the UNIX | ||
timestamp when the file was opened, the second field is its normalized path, | ||
and the third file is its title, or missing if no title was present. Any new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
title should be empty string if not available. Or follow nanahi suggestion and use proper key-value json.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also show how it looks like [timestamp, "file_path", "title"]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated it before reading your edit. I saved null
as the 3rd column since that's more correct for unavailable values and easier to check for thrutiness in Lua, also I've had actual videos with an empty string as title. But yeah we can just use keys if predictable column order is an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But yeah we can just use keys if predictable column order is an issue.
I think we should try that, it will be easier to read and update this format. (the idea of adding null in every entry is iffy to me)
player/lua/select.lua
Outdated
|
||
seen[path] = true | ||
|
||
local status, date = pcall(os.date, "(%Y-%m-%d %H:%M) ", time) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This date format should be configurable and should include seconds by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a script-opt but there's nowhere to document it until I add a select.lua section to the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also added hide_history_duplicates
script-opt that's true by default since I prefer to show them.
DOCS/man/options.rst
Outdated
|
||
This file contains one JSON array per line. The first field is the UNIX | ||
timestamp when the file was opened, the second field is its normalized path, | ||
and the third file is its title, or missing if no title was present. Any new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also show how it looks like [timestamp, "file_path", "title"]
783f7fa
to
f92f186
Compare
player/lua/select.lua
Outdated
return entry.title .. " (" .. entry.path .. ")" | ||
end | ||
|
||
return entry.path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the media-title is not available, it will display only the filename.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I wrote in the commit message how are you supposed to understand the history with filenames like 1.jpg?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Act like both
always if that an issue. Titles are also often completely non informative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both
doesn't show paths either?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is both not showing path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both
show filenames, wasn't your quote referring to the fact that this shows paths instead of filenames?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return entry.title .. " (" .. entry.path .. ")"
shows title and path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean in show-text and select-playlist both
shows filenames, while here I made it show the path because you usually have files from completely different directories unlike in those case so the full path is needed to understand the history.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you say that a full path is required to understand the history it should always be shown.
<title|filename> (path)
DOCS/man/options.rst
Outdated
|
||
This file contains one JSON array per line. The first field is the UNIX | ||
timestamp when the file was opened, the second field is its normalized path, | ||
and the third file is its title, or missing if no title was present. Any new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But yeah we can just use keys if predictable column order is an issue.
I think we should try that, it will be easier to read and update this format. (the idea of adding null in every entry is iffy to me)
c653be6
to
e7fbb73
Compare
The history could be formatted as CSV, but this requires escaping the separator in the fields and doesn't work with paths and titles with newlines, or as JSON, but it is inefficient to reread and rewrite the whole history on each new file, and doing so overwrites the history with an empty file when writing without disk space left. So this uses a hybrid of one JSON object per line to get the best of both worlds. This is called NDJSON or JSONL.
55dff1a
to
9f8f9b1
Compare
Implement selection of the entries in the watch history. The last entry in the selector deletes the history file.
Implement saving watched paths and selecting them.
--osd-playlist-entry determines whether titles and/or filenames are shown. But unlike in show-text ${playlist} and select-playlist, "file" and "both" print full paths because history is much more likely to have files from completely different directories, so showing the directory conveys where files are located. This is particularly helpful for filenames like 1.jpg.
The last entry in the selector deletes the history file, as requested by Samillion.
The history could be formatted as CSV, but this requires escaping the separator in the fields and doesn't work with paths and titles with newlines, or as JSON, but it is inefficient to reread and rewrite the whole history on each new file, and doing so overwrites the history with an empty file when writing without disk space left. I went with an hybrid of one JSON array per line to get the best of both worlds. And I discovered afterwards that this was an existing thing called NDJSON or JSONL. Since there are these 2 competing standards it is not clear if the file extension should be ndjson or jsonl, so I just used txt.
watch_history_path is awkwardly documented along with the key binding because I don't think it's worth adding a select.lua section to the manual just for this. I will add it and move it there if I add more script-opts in the future.