Function to analyze history branch/data #50

llrs · 2022-02-11T15:30:33Z

I know cransays is not really meant to deliver code, but I have some code that merges all the CSV files of the history branch, and I think it would be helpful to others (and to me) if it were documented here.
The code efficiently merges files that have different headers (previous iterations took 30 minutes to run; now it takes about 1).
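To illustrate the idea (this is not the R code referred to above, only a minimal Python sketch of the same general technique), one way to merge CSV files with differing headers is to read each file once, collect the union of column names, and fill in missing columns at the end instead of growing a table file by file. The function name `merge_csv_files` is hypothetical.

```python
import csv


def merge_csv_files(paths):
    """Merge CSV files whose headers may differ (hypothetical helper).

    Returns (columns, rows): the union of all column names in first-seen
    order, and one dict per row with missing columns filled with "".
    """
    columns = []  # union of column names, in the order they first appear
    rows = []     # raw rows from every file, collected in a single pass
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            for name in reader.fieldnames or []:
                if name not in columns:
                    columns.append(name)
            rows.extend(reader)
    # Align every row to the full set of columns only once, at the end.
    aligned = [{name: row.get(name, "") for name in columns} for row in rows]
    return columns, aligned
```

Each file is opened and parsed exactly once, and the alignment to a common header happens in a single step after all files are read.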

I think it doesn't have dependencies and wouldn't need to be run or tested but it could help others if they want to analyze the data.

Let me know if it would be helpful/appropriate and I would create a PR with the code.

Bisaloo · 2022-02-11T18:11:30Z

I think it's definitely interesting to have this code somewhere. But I'm not 100% sure where.

We have been talking with @maelle about changing the location of the historical data. Possibly to a separate repo. If this happens, then I think your code would be better there than in the main repo(?). Not sure... What do you think?

llrs · 2022-02-11T23:35:16Z

Currently it lives in a file among the code for some of my presentations, so it is public (but probably hard to find :)

Yes, she mentioned something on #36 (comment). With the current code I haven't found a problem dealing with this many files; the code I used previously was highly inefficient. Ultimately, too many files might become a problem, but I'm not sure of the OS or R limits on this.
I think it would be better to have the data in a database somewhere: a branch with a single SQLite database, or a server somewhere? I think that comes down to maintenance costs and what usage of the data you want to promote.

maelle · 2022-02-14T07:31:44Z

A package to consult historical data could be called {cransaid} 😁

If the data were in a separate repo shouldn't the package be in a third repo?

llrs · 2022-02-14T09:29:02Z

If the data is in a different repo, is there really a need for a new package? Would we split the functionality between recording data {cransays}, storing data {cranwas} and analyzing data {cransaid}?

Bisaloo · 2022-02-14T11:16:07Z

I thought this over again and I think it actually makes sense to have the function to load the historical data inside cransays.

I think having a short analysis of historical data on the cransays website would be useful to give users of the dashboard an idea of a typical path and what they can expect for their submission.

In particular, we could partially address #29 and #40 by dynamically generating a flow diagram with igraph based on historical data.
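As a rough sketch of what the data preparation for such a diagram could look like (in Python for illustration, not igraph itself), the snippet below counts how often packages move from one folder to another between consecutive snapshots. The field names `package`, `folder` and `snapshot_time`, the ISO 8601 timestamp format, and the function name `count_transitions` are all assumptions, not something confirmed in this thread.

```python
from collections import Counter, defaultdict


def count_transitions(records):
    """Count folder-to-folder moves per package (hypothetical helper).

    `records` is an iterable of dicts with the assumed keys "package",
    "folder" and "snapshot_time" (ISO 8601 strings, which sort correctly
    as plain text).
    """
    by_package = defaultdict(list)
    for rec in records:
        by_package[rec["package"]].append((rec["snapshot_time"], rec["folder"]))

    edges = Counter()
    for snapshots in by_package.values():
        snapshots.sort()  # chronological order within each package
        folders = [folder for _, folder in snapshots]
        for src, dst in zip(folders, folders[1:]):
            if src != dst:  # ignore snapshots where the package did not move
                edges[(src, dst)] += 1
    return edges
```

The resulting `(from, to)` pairs with their counts form a weighted edge list, which is the kind of input a graph library needs to draw a flow diagram.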

llrs · 2022-02-14T11:37:27Z

Note that #40 (the archive directory not showing up) was not solved.

Updates of packages already on CRAN are sometimes very fast (<15 minutes), so they aren't captured by the dashboard.
Only new packages that take time to be processed could actually be represented.
However, I think such a report should be careful with time estimates for the process: it could encourage negativity towards the CRAN reviewers if packages don't go through within expectations, and I don't think this would be productive for either party.

I will create a PR with the code I used to gather all the files (I may need to modify it so it can parse the recently added column). There is also a .R file on the history branch: https://github.com/r-hub/cransays/blob/history/analysis.R

Bisaloo linked a pull request Apr 24, 2022 that will close this issue

Function to download history data #56

Merged

Bisaloo closed this as completed in #56 Apr 24, 2022
