Release 3.3.3 #26

tovbinm · 2018-06-22T19:35:01Z

Change List

Convert some more stages tests to use OP stages specs (Pivot with max cardinality percentage #241)
Changed error to occur only when all labels are removed (Outputting Raw Feature Filter information: Part 1 #237)
Fixes for writing/reading stages in OpPipelineStageSpec tests (Logo for TransmogrifAI #235)
Add files via upload (Tweaks to OpBinScoreEvaluator #233)
workflow description figures
Stop words changes to text analyzers and bug fixes (Cleanup Helloworld examples #230)
Update README.md (Fix rawPrediction of OpXGBoostClassifcationModel for binary classification #229)
Remove null leakage checks for text features from sanity checker (Unable to create features dynamically. #228)
Update sanity checker shortcut with protectTextSharedHash param (build failed #234)
Remove JSD check for date + datetime features in RFF (Reverse geocoder with Lucene #227)
Introduced FeatureBuilder.fromDataFrame function allowing materializing features from a DataFrame (Cleanup helloworld example + decrease logging verbosity to ERROR #226)
Get rid of ClassTags in OP models (Make decision tree numeric bucketizer tests less flaky #225)
Test if transformer transforms the data correctly after being loaded (Scaler and descaler transformers #223)
Update to BSD-3 license (Upgrade to Gradle 5.2 #218)
Some more licenses (Make tests a little less flaky #221)
Changed from extending to wrapping spark models.
wrapped spark model classes using reflection (part 1 of 2) (Fix indices in LOCO for record-level insights and add more robust tests #216)
wrapped spark estimators so that they return op wrapped models with prediction return type (part 2a of 2) (Release 0.5.1 #222)
wrapped spark estimators for new models added (part 2b of 2) (Random param builder for random hyperparameter search in model selectors #238)
Moved code out of spark ml workspace and added comments - no code changes after tickets (TransmogrifAI on Apache Zeppelin #239)
Change ootb transformers to use OPTransformerSpec for tests (Scaler and descaler transformers #215)
Move base stages to features sub project + test classes and specs (Cl/rff metrics #214)
Better clues when asserting stages (Fix sorting in Prediction type for multiclass classification and add stronger tests #213)
Implement multi-class threshold metrics (Integrate helloworld project with Travis CI #212)
NameEntityRecognizer (NER) transformer (Is it possible to prevent fields from being used as features but keep them as output fields? #209)
Allow customizing feature type equality in op test transformer/estimator specs (test cases for RichListFeature #207)
Threshold metrics bug fix (Use class.getName & update splitter meta parsing #204)
use prediction rather than raw prediction
Added an extra OpEstimatorBaseSpec base class with loosened model type bounds to allow testing Spark wrapped estimators (Add package which gives ability compile check and execute code provided in documentation #203)
Fix package access level on OpEstimatorBaseSpec (Error in transmogrifai gen when field has an underscore #205)
internal OP test base class
Fast materializer method FeatureTypeSparkConverter by full feature type name (Correct some syntax/compilation errors in Titanic Binary Classification Docs Example #202)
Added UID.reset() before tests so that all workflows will generate the same feature names (Syntax/Compilation errors in Titanic Binary Classification Docs Example #201)
Added add/subtract operations for Spark ml Vector types (Regression error = 0.0 - looking for suggestions #200)
workflow cleanup (Export model selector defaults + metadata fixes #199)
Fix TextMapNullEstimator to count a null when text entirely removed by tokenizer (The problem of Xgboost #198)
Fixes the issue where certain text strings can be entirely removed by our tokenizers, while the null tracking step for text map vectorizers only checks for the presence of a key
Workflow CV Fixes (Possible solution for issue #154 (Geolocation to Country transformer) #196)
Fix a deadlock in OpCrossValidation.findBestModel that occurred because the threads processing splits in parallel would try to access Spark stage params on the same stages.
Update ternary, quaternary and sequence transformer/estimator bases tests (Adds options for tracking text length in text vectorizers #195)
Enabling null-label leakage detection in RawFeatureFilter (Error: Could not find or load main class com.salesforce.op.cli.CLI #191, Illegal character in path at index 2: ..\test-data\PassengerData.avro #192, Use OS specific path separator #193)
Feature Type values docs (Add transformer / estimator for text length calculation #190)
Bump up lucene version and add lucene-opennlp package (Allow convertion from Date and Timestamp Spark types to Date and DateTime TransmogrifAI types #188)
Minor README cleanup (Add length of the text as default features for text fields #187, TransmogirfAI build issues #189)
Test specs for OP stages (Release 0.5.0 #186)
Adding pr_curve, roc_curve metrics (Upgrade Apache Spark to 2.4 #184)
Create hash space strategy param (Can't use the cloned project #182)
Make new Cross Validation (XGBoost error code 255 #181)
Avoid resetting UID in every test; do it only when necessary (Upgrade XGBoost to 0.81 #180)
Upgrade to gradle 4.7 (Integrate Streaming Histogram into RawFeatureFilter. #179)
Added OpTransformer.transformKeyValue to allow transforming Map and any other key/value types, in preparation for sparkless scoring (Evaluators check for empty data #178)
Adding autoBucketize to transmogrify for numerics & numeric maps + pass in optional label (Replace assert with require #159)
Autobucketizing for numeric maps should not fail if the map is empty; instead, an empty column is generated for the empty numeric map (jupyter notebooks for transmogrify samples #231)
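
The TextMapNullEstimator entry in the list above describes a null check that looked only at key presence, even though a tokenizer can strip a value down to nothing. The sketch below illustrates that idea in plain Scala. It is not the TransmogrifAI implementation: the object name, the tokenizer, and the stop-word list are hypothetical stand-ins chosen for illustration.

```scala
object NullTrackingSketch {
  // Hypothetical stop-word list; the real analyzers use their own.
  val stopWords: Set[String] = Set("a", "an", "the", "of")

  // Stand-in tokenizer: lowercase, split on anything that is not a letter
  // or digit, then drop empty tokens and stop words.
  def tokenize(text: String): Seq[String] =
    text.toLowerCase
      .split("[^\\p{L}\\p{N}]+")
      .toSeq
      .filter(t => t.nonEmpty && !stopWords.contains(t))

  // Key-presence-only check: a value counts as null only when the key is missing.
  def isNullByKeyOnly(m: Map[String, String], key: String): Boolean =
    !m.contains(key)

  // Check that also counts a null when the tokenizer removes the whole text.
  def isNullAfterTokenizing(m: Map[String, String], key: String): Boolean =
    m.get(key).forall(text => tokenize(text).isEmpty)
}
```

With a row such as `Map("subject" -> "The", "body" -> "Quarterly report")`, the key-only check treats `subject` as present, while the tokenizing check counts it as null because the only token is a stop word.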

Migration Guide

OpLogisticRegression() is still in progress (the evaluator needs updates).
You may use BinaryClassificationModelSelector() instead.
Add .setProbabilityCol($probCol) to the evaluator in the workflow definition so that the evaluator reads the correct probability column for its calculation.

sxd929

LGTM

Initial patch

44bcbec

tovbinm requested review from leahmcguire and kinfaikan June 22, 2018 19:35

tovbinm added 2 commits June 22, 2018 12:41

artifactory revert

da8681c

cleanup

bc6f8f3

tovbinm requested a review from sxd929 June 22, 2018 19:46

sxd929 approved these changes Jun 22, 2018

tovbinm merged commit b49d81c into master Jun 22, 2018

tovbinm deleted the mt/3.3.3-release branch June 22, 2018 20:37

ericwayman pushed a commit that referenced this pull request Feb 8, 2019

Release 3.3.3 (#26)

f27602a

tovbinm added the release label Jul 11, 2019

emitc2h pushed a commit that referenced this pull request Feb 24, 2022

Revert "@W-8863563 Add Integer feature type (#23)" (#26)

c794e18
