Descriptions could be better #3

Open
francisdb opened this issue Dec 19, 2014 · 17 comments

Comments

@francisdb
Contributor

0 0 0 1 1 ?
at 00:00 at 1 day at January month

0/1 * * * * ?
every seconds

https://github.com/RedHogs/cron-parser seems to be doing a way better job...

@jmrozanec
Owner

Yes, we need to improve this. More examples would be useful, so that we can use them for tests and improve current descriptions. Thanks!

@francisdb
Contributor Author

Their tests contain all needed examples:
https://github.com/RedHogs/cron-parser/blob/master/src/test/java/net/redhogs/cronparser/CronExpressionDescriptorTest.java (also other languages available)

@Naxos84
Contributor

Naxos84 commented Dec 6, 2017

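import com.cronutils.descriptor.CronDescriptor;
import com.cronutils.model.Cron;
import com.cronutils.model.CronType;
import com.cronutils.model.definition.CronDefinition;
import com.cronutils.model.definition.CronDefinitionBuilder;
import com.cronutils.parser.CronParser;
import org.junit.Test;

public class TestDescriptor {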

    // SoM MoH HoD DoM MoY  DoW Year
    // 3/4 5/6 7/8 9/2 10/2 ?   2017/2

    //expected result (separated in lines) (taken from https://www.freeformatter.com/cron-expression-generator-quartz.html)
    //Every 4 seconds starting at second 03,
    //every 6 minutes starting at minute :05,
    //every 8 hours starting at 07am,
    //every 2 days starting on the 9th,
    //every 2 months starting in October,
    //every 2 years starting in 2017
    @Test
    public void testFull() {
        final CronDefinition cronDefinition = CronDefinitionBuilder.instanceDefinitionFor(CronType.QUARTZ);
        final CronParser parser = new CronParser(cronDefinition);
        final Cron cron = parser.parse("3/4 5/6 7/8 9/2 10/2 ? 2017/2");
        System.out.println(CronDescriptor.instance().describe(cron));
    }

}

After fixing a runtime formatting issue (see #300), this gives: every 4 seconds every 6 minutes every 8 hours every 2 days every February months every 2 year

I'm therefore working on a large set of tests to cover many description examples.

Naxos84 pushed a commit to Naxos84/cron-utils that referenced this issue Dec 6, 2017
All added tests are currently ignored (so that the build won't break).
Naxos84 pushed a commit to Naxos84/cron-utils that referenced this issue Dec 6, 2017
Naxos84 pushed a commit to Naxos84/cron-utils that referenced this issue Dec 7, 2017
@jmrozanec
Owner

jmrozanec commented Mar 24, 2018

@francisdb @Naxos84 we are getting back to this issue. This time we would like to explore using machine learning to achieve the goal. We would take the current English descriptions as the source text and have a deep learning model translate them into better human-readable text, in English or another language. For this we should:

  • develop a dataset of current descriptions and desired matches. The best way to do this would be to take a diverse range of patterns and randomly alter the descriptions, to cover a wide range of variations.
  • develop a deep learning model: as a start, we can develop it with Keras and load it with dl4j
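A minimal sketch of how rows for such a dataset could be generated, assuming one row is a triple of cron pattern, current description, and desired description. The class, the `Sample` type, and the `currentDescriber` parameter are hypothetical names introduced for illustration; the desired wording follows the freeformatter-style phrasing quoted earlier in this thread, and only patterns of the form `start/step * * * * ?` are covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Random;
import java.util.function.UnaryOperator;

final class DescriptionDatasetSketch {

    /** One training row: the pattern, the description produced today, and the wording we want. */
    static final class Sample {
        final String pattern;
        final String current;
        final String desired;

        Sample(String pattern, String current, String desired) {
            this.pattern = pattern;
            this.current = current;
            this.desired = desired;
        }
    }

    /**
     * Builds count rows for Quartz patterns of the form "start/step * * * * ?".
     * currentDescriber is a stand-in (assumption) for whatever produces today's description.
     */
    static List<Sample> generate(int count, long seed, UnaryOperator<String> currentDescriber) {
        Random random = new Random(seed); // fixed seed keeps the dataset reproducible
        List<Sample> samples = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int start = random.nextInt(60);    // 0..59
            int step = 2 + random.nextInt(29); // 2..30, so the plural wording always applies
            String pattern = start + "/" + step + " * * * * ?";
            String desired = String.format(Locale.ROOT,
                    "every %d seconds starting at second %02d", step, start);
            samples.add(new Sample(pattern, currentDescriber.apply(pattern), desired));
        }
        return samples;
    }

    public static void main(String[] args) {
        // Placeholder describer; a real run would call the library here instead.
        for (Sample s : generate(3, 42L, p -> "<current description of " + p + ">")) {
            System.out.println(s.pattern + "\t" + s.current + "\t" + s.desired);
        }
    }
}
```

Passing the describer in as a function keeps the generator independent of any particular library version, so the same rows can be regenerated whenever the current descriptions change.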

Experiments would be run for English translations first. If we succeed, we will continue with the same approach for other languages.
We would like to tackle the problem with machine learning because it is difficult to provide good human-readable descriptions across languages with just a properties-based pattern: sentence structure and terms vary from language to language.

Ideas and suggestions are always welcome!

jmrozanec self-assigned this Mar 26, 2018
@Naxos84
Contributor

Naxos84 commented Mar 26, 2018

@jmrozanec I don't have any experience in machine learning, but this sounds interesting, so we should give it a try. :)

@jmrozanec
Owner

@Naxos84 Great! We have been experimenting with Neural Machine Translation, with good results so far. We will soon post the code we used to generate training samples and will ask for help generating new patterns so that we cover a wider range of cases. Thanks!

@jmrozanec
Owner

We would like to invite everyone to contribute to this repo with expression generators for different languages so that we can later create datasets and train models which would provide accurate descriptions for each case.

@francisdb
Contributor Author

Can we not just do something like what these guys do? The NL/FR/EN results are great according to their tests:
https://github.com/grahamar/cron-parser/tree/master/cron-parser-core/src/test/java/net/redhogs/cronparser
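For readers unfamiliar with the template approach being suggested, here is a minimal sketch using only the JDK's `MessageFormat`. The class name, the map, and the NL/FR/EN wording are assumptions made for illustration; none of it is copied from cron-parser or cron-utils.

```java
import java.text.MessageFormat;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class TemplateDescriptionSketch {

    // Hypothetical per-language templates, keyed by ISO language code.
    private static final Map<String, String> EVERY_N_MINUTES = new LinkedHashMap<>();
    static {
        EVERY_N_MINUTES.put("en", "every {0} minutes");
        EVERY_N_MINUTES.put("fr", "toutes les {0} minutes");
        EVERY_N_MINUTES.put("nl", "elke {0} minuten");
    }

    /** Fills the template for the locale's language, falling back to English. */
    static String everyNMinutes(Locale locale, int n) {
        String template = EVERY_N_MINUTES.getOrDefault(
                locale.getLanguage(), EVERY_N_MINUTES.get("en"));
        return new MessageFormat(template, locale).format(new Object[] {n});
    }

    public static void main(String[] args) {
        System.out.println(everyNMinutes(Locale.ENGLISH, 6));
        System.out.println(everyNMinutes(Locale.FRENCH, 6));
        System.out.println(everyNMinutes(Locale.forLanguageTag("nl"), 6));
    }
}
```

A fixed template per language works when word order and inflection do not depend on the inserted value. Languages where they do would need more than one template per field.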

@jmrozanec
Owner

jmrozanec commented Jul 3, 2018

@francisdb Thank you for pointing that out! I think they do a very good job when it comes to NL/FR/EN, but I am not so sure we can fit good descriptions for all languages to those templates. By using a neural translation approach we may ensure very good quality regardless of the complexities of any language.

@Naxos84
Contributor

Naxos84 commented Aug 2, 2018

I know that the machine learning project has started.
I also tried to have a look into it.
But I'm not sure how to correctly contribute to it.
Could you please (again) provide info on how to do this?

@jmrozanec
Owner

@Naxos84 thank you for reaching out! We are tackling the issue from several angles:

  • build datasets that provide patterns and desired translations. We created a repo to add dataset generators for supported languages. We aim to translate from the current English descriptions provided by cron-utils to the ideal ones. So far we have only created a generator for English expressions.
  • train models that perform the translations. We created a repo to track the algorithms we use and the models we build. The model that worked best is a char-by-char neural translation model, which we tested for English-English. It was built in Python and is serialized to .h5. We are still figuring out how to serialize from Python and deserialize in Java 😄
  • create a satellite repo that would provide the means to download and load the models we trained, so that they provide better descriptions when added to the core.

The neural translation model worked very well for English-English, and we would like to expand this to other languages as well. The Python code will most likely remain the same. The hard work still to be done is to cover cases exhaustively so that we can provide translations into several languages, and then to work out how to:

  • store and version the models we train so that they are accessible on the web
  • load them within the satellite repo so that they can be used from the JVM.

Please let us know if anything else should be clarified! 😄 Thanks!

@toughpheeckouse
Contributor

If I understood correctly, this is about plural forms and translation.
GWT implements this very cleverly.

Please take a look at PluralRule
and its implementations for many locales.
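As a rough illustration of plural-aware descriptions, the sketch below uses the JDK's `MessageFormat` choice syntax rather than GWT's `PluralRule`, which is a swapped-in technique and not GWT's API. The class name and the pattern string are assumptions; the choice format only distinguishes numeric ranges, so it would not cover languages with more than two plural forms.

```java
import java.text.MessageFormat;
import java.util.Locale;

final class PluralDescriptionSketch {

    // "1#" selects the singular form for n <= 1, "1<" selects the plural form for n > 1.
    private static final String EVERY_SECONDS_EN =
            "{0,choice,1#every second|1<every {0} seconds}";

    /** Describes a seconds interval in English, picking singular or plural wording. */
    static String everySeconds(int n) {
        return new MessageFormat(EVERY_SECONDS_EN, Locale.ENGLISH).format(new Object[] {n});
    }

    public static void main(String[] args) {
        System.out.println(everySeconds(1)); // every second
        System.out.println(everySeconds(4)); // every 4 seconds
    }
}
```

With a rule like this, the `0/1 * * * * ?` case from the top of the thread would read "every second" instead of "every seconds".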

@jmrozanec
Owner

@toughpheeckouse thanks! We will take a look at it!

@natrajmolala

One addition: the description is wrong when an asterisk is used in the seconds field.
Example: * 0 9-23 * * ?
Description: every hour between 9 and 23

Expected description: Every second, at minute :00 every hour between 09am and 23pm, of every day
This is taken from https://www.freeformatter.com/cron-expression-generator-quartz.html, which gives the correct description.
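The point of the report is that a `*` in the seconds field must still contribute to the description. A minimal sketch of that behaviour, assuming a hypothetical helper that looks only at the first field of a Quartz expression (this is not cron-utils code, and it handles only `*`, `start/step`, and a single number):

```java
import java.util.Locale;

final class SecondsFieldSketch {

    /** Describes only the seconds field (the first field) of a Quartz expression. */
    static String describeSeconds(String quartzExpression) {
        String seconds = quartzExpression.trim().split("\\s+")[0];
        if (seconds.equals("*")) {
            return "every second"; // must not be silently dropped
        }
        if (seconds.contains("/")) {
            String[] parts = seconds.split("/");
            int start = parts[0].equals("*") ? 0 : Integer.parseInt(parts[0]);
            int step = Integer.parseInt(parts[1]);
            return String.format(Locale.ROOT,
                    "every %d seconds starting at second %02d", step, start);
        }
        // Lists and ranges are out of scope for this sketch and would fail to parse here.
        return String.format(Locale.ROOT, "at second %02d", Integer.parseInt(seconds));
    }

    public static void main(String[] args) {
        System.out.println(describeSeconds("* 0 9-23 * * ?"));
    }
}
```

A regression test for the reported case could then check that the full description of `* 0 9-23 * * ?` contains the seconds part.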

@jmrozanec
Owner

@natrajmolala thank you for reporting this. You are welcome to send a PR with a test for this. Best! 😄

@natrajmolala

@jmrozanec The test is in place in a local branch. I think you need to add me to the contributors list so I can raise a PR?

@jmrozanec
Owner

@natrajmolala not necessary - anyone can open a PR, regardless of their project status :)

jmrozanec pushed a commit that referenced this issue Oct 6, 2021