Improve test coverage for transformers & estimators #278

Open
tovbinm opened this issue Apr 11, 2019 · 4 comments
Comments

@tovbinm
Collaborator

tovbinm commented Apr 11, 2019

Problem
Some of our transformers & estimators are not thoroughly tested or not tested at all.

Solution
Use OpTransformerSpec and OpEstimatorSpec base test specs to provide tests for all existing transformers & estimators.

@crupley
Contributor

crupley commented Apr 15, 2019

After a quick survey of com.salesforce.op.stages.impl.feature, I found the following:

These classes don't appear to have associated tests, or their tests exist under a name that doesn't match the class:

  • FilterMap
  • OpIndexToString (just a wrapper for Spark IndexToString, though)
  • OpLDA (has test, OpLdaTest, but names don't match)
  • OpOneHotVectorizer
  • OpScalarStandardScaler (OpStandardScalerTest exists but names don't match)
  • RealNNVectorizer
  • TextMapPivotVectorizer (tested in both OPMapVectorizerTest and TextMapVectorizerTest but names don't match)
  • Transmogrifier (TransmogrifyTest exists but names don't match)
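
For anyone repeating this survey, the name check can be sketched in a few lines of Python. This is only an illustration of the `<ClassName>Test` convention implied above, not the script used for the list; the function name `unmatched_classes` is hypothetical, and the sample names are taken from this thread (the `NGramSimilarity` class is assumed from `NGramSimilarityTest`).

```python
def unmatched_classes(class_names, test_names):
    """Return the classes that have no test named exactly '<ClassName>Test'."""
    tests = set(test_names)
    return [name for name in class_names if f"{name}Test" not in tests]


# Sample names from this thread; three of the four tests use a non-matching name.
classes = ["OpLDA", "OpScalarStandardScaler", "Transmogrifier", "NGramSimilarity"]
tests = ["OpLdaTest", "OpStandardScalerTest", "TransmogrifyTest", "NGramSimilarityTest"]

print(unmatched_classes(classes, tests))
# prints ['OpLDA', 'OpScalarStandardScaler', 'Transmogrifier']
```

The comparison is deliberately exact and case-sensitive, which is why OpLDA is flagged even though OpLdaTest exists.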

These tests exist but do not extend OpTransformerSpec or OpEstimatorSpec:

  • Base64VectorizerTest
  • DateMapVectorizerTest
  • DateTimeVectorizerTest
  • DateVectorizerTest
  • EmailParserTest
  • EmailVectorizerTest
  • FillMissingWithMeanTest
  • GeolocationVectorizerTest
  • HashingTFTest
  • IntegralVectorizerTest
  • IsotonicRegressionCalibratorTest
  • LinearScalerTest
  • MultiPickListMapVectorizerTest
  • NGramSimilarityTest
  • NGramTest
  • NumericBucketizerTest
  • NumericVectorizerTest
  • OPCollectionHashingVectorizerTest
  • OPCollectionTransformerTest
  • OpCountVectorizerTest
  • OpIndexToStringNoFilterTest
  • OpLdaTest
  • OPMapVectorizerTest
  • OpSetVectorizerTest
  • OpStandardScalerTest
  • OpStringIndexerNoFilterTest
  • OpStringIndexerTest
  • OpWord2VecTest
  • PercentileCalibratorTest
  • PhoneNumberParserTest
  • RealVectorizerTest
  • ScalerMetadataTest
  • ScalerTest
  • SmartTextMapVectorizerTest
  • TextMapVectorizerTest
  • TextTokenizerTest
  • TextTransmogrifyTest
  • TextVectorizerTest
  • ToOccurTransformerTest
  • TransmogrifyTest
  • UniqueCountTest
  • URLVectorizerTest

@Sammyalhashe

Hi guys, I would like to help contribute, and this seems like a good place to start. Any tips or places to look to help me get up and running with implementing some of these tests?

@crupley
Contributor

crupley commented Sep 28, 2019

Hi @Sammyalhashe! There are still some tests in the list above that need to be updated. I would go through it and find the tests that don't extend OpTransformerSpec or OpEstimatorSpec. For examples of what to do, you can look at the tests that do extend them and at the PRs that reference this issue.

@Sammyalhashe

Hey @crupley! Thanks for the reply, I'll start looking into some
