Lda v1 #311
base: lda_output_fix_final
Commits on Feb 3, 2018
SVM: Add minibatch as a new solver
This work is based on the original work by Xiaocheng Tang in madlib#75. This PR adds two main features:
- A minibatch solver that takes a batch of data as input
- SVM code that takes advantage of the minibatch solver
Closes madlib#229
Co-authored-by: Nikhil Kak
Co-authored-by: Xiaocheng Tang
SHA: a8bbe08
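As a rough illustration of what a minibatch solver for a linear SVM does, the sketch below runs subgradient descent on the hinge loss with an L2 penalty, updating the weights once per batch of rows. This is not MADlib's implementation: the function names `minibatch_svm` and `predict`, the parameters, and their defaults are all hypothetical and chosen only for this example.

```python
import random


def minibatch_svm(X, y, batch_size=4, epochs=200, lr=0.1, lam=0.01, seed=0):
    """Train a linear SVM with minibatch subgradient descent (illustrative only).

    X is a list of feature lists and y a list of labels in {-1, +1}.
    Returns a weight list whose last entry is the bias term.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    w = [0.0] * (n_features + 1)
    indices = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # The L2 penalty applies to the feature weights, not the bias.
            grad = [lam * wj for wj in w[:-1]] + [0.0]
            for i in batch:
                score = sum(wj * xj for wj, xj in zip(w, X[i])) + w[-1]
                # Only rows inside the margin contribute a hinge subgradient.
                if y[i] * score < 1.0:
                    for j, xj in enumerate(X[i]):
                        grad[j] -= y[i] * xj / len(batch)
                    grad[-1] -= y[i] / len(batch)
            # One weight update per batch rather than per row.
            w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Return +1 or -1 for a single feature list x."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1 if score >= 0 else -1
```

Averaging the subgradient over a batch gives one update per batch instead of one per row, which is the general reason a solver is fed batches of data.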
Commits on Feb 7, 2018
Fix lda output inconsistency bug and add install check test
JIRA: MADLIB-1201 Fixed an issue where the outputs of lda_train and lda_get_word_topic_count did not match each other. Also added an install check test that validates that the two outputs are consistent. See the JIRA for more details and an example.
Jingyi Mei and Nikhil Kak authored and Jingyi Mei committed Feb 7, 2018
SHA: a99883d
LDA: Add helper function to map wordid and topicid
JIRA: MADLIB-1160 This commit adds a helper function that maps each wordid to the corresponding topicid assigned in the output table. Duplicate lines are removed from the final result. It also adds a workaround for svec in GPDB 4.3: there, madlib.svec cannot be called directly on text input, so the text must first be converted with madlib.svec_from_string. With this fix, the new helper function madlib.lda_get_word_topic_mapping works on both GPDB 5 and GPDB 4.
Jingyi Mei committed Feb 7, 2018
SHA: f066423
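The mapping described in this commit can be pictured with a small sketch. It assumes, purely for illustration, that each document is represented as two parallel lists (word ids and the topic id assigned to each word occurrence); the function name `word_topic_mapping` and this input layout are hypothetical and do not reflect how madlib.lda_get_word_topic_mapping reads the output table.

```python
def word_topic_mapping(documents):
    """Collect the distinct (wordid, topicid) pairs across all documents.

    documents is an iterable of (word_ids, topic_ids) tuples, where the two
    lists are parallel: topic_ids[i] is the topic assigned to word_ids[i].
    """
    pairs = set()
    for word_ids, topic_ids in documents:
        if len(word_ids) != len(topic_ids):
            raise ValueError("word_ids and topic_ids must have the same length")
        # A set drops duplicate pairs, within and across documents.
        pairs.update(zip(word_ids, topic_ids))
    return sorted(pairs)


docs = [
    ([0, 1, 1, 2], [0, 1, 1, 0]),
    ([1, 2], [1, 1]),
]
mapping = word_topic_mapping(docs)
```

Using a set is one simple way to get the "duplicate lines are removed" behavior; a SQL implementation would more likely rely on DISTINCT or GROUP BY.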
Address LDA topicid index inconsistency issue
JIRA: MADLIB-1160 This commit fixes the topicid inconsistency between madlib.lda_train and madlib.lda_get_topic_desc, where the former used a 0-based index and the latter a 1-based index. Both now start at 0.
Jingyi Mei committed Feb 7, 2018
SHA: a062acb
Fix LDA lda_get_topic_desc getting wrong top_k words issue
JIRA: MADLIB-1160 Previously, madlib.lda_get_topic_desc returned only the top k - 1 words in the result table. This commit fixes it to return the top k.
Jingyi Mei committed Feb 7, 2018
SHA: 7569049
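The two indexing fixes above (0-based topic ids and exactly k words per topic) are both boundary issues, which a short sketch can make concrete. It assumes a hypothetical word-by-topic count matrix as input and ranks words by raw count; the function name `topic_descriptions` and the returned row layout are invented for this example and are not the output format of madlib.lda_get_topic_desc.

```python
def topic_descriptions(word_topic_counts, top_k):
    """Return (topicid, wordid, count) rows for the top_k words of each topic.

    word_topic_counts[w][t] is the number of times word w was assigned to
    topic t. Topic ids and word ids are both 0-based.
    """
    n_words = len(word_topic_counts)
    n_topics = len(word_topic_counts[0])
    rows = []
    for topic_id in range(n_topics):  # starts at 0, matching the training output
        # Rank by descending count; break ties by word id for stable output.
        ranked = sorted(
            range(n_words),
            key=lambda w: (-word_topic_counts[w][topic_id], w),
        )
        # The slice [:top_k] yields exactly top_k words, not top_k - 1.
        for word_id in ranked[:top_k]:
            rows.append((topic_id, word_id, word_topic_counts[word_id][topic_id]))
    return rows


counts = [
    [5, 0],
    [3, 2],
    [1, 7],
    [0, 4],
]
rows = topic_descriptions(counts, top_k=2)
```

A check like `len(rows) == n_topics * top_k` is the kind of assertion that would catch both an off-by-one in the slice and a shifted topic index.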
Commits on Feb 9, 2018
User doc updates for LDA, term freq, PageRank, matrix ops, test-train
Frank McQuillan committed Feb 9, 2018
SHA: e9a51fc