Autokeras slow to start for large dataset? #1298

Open
@gautambak

Hi there,

I'm playing with autokeras and tried to apply the tutorial to my own dataset, but training never seems to start when the dataset is large.

The shape of my initial sample is (100000, 112).

Then I run the block of code from the tutorial (changing 'price' to 'value'):

import autokeras as ak

# Initialize the structured data regressor.
reg = ak.StructuredDataRegressor(
    overwrite=True,
    max_trials=300) # It tries up to 300 different models.
# Feed the structured data regressor with training data.
reg.fit(
    # The path to the train.csv file.
    train_file_path,
    # The name of the label column.
    'value',
    epochs=10)
# Predict with the best model.
predicted_y = reg.predict(test_file_path)
# Evaluate the best model with testing data.
print(reg.evaluate(test_file_path, 'value'))

I've waited over an hour and nothing has happened. When I try this on the tutorial dataset it runs fairly quickly. I've also tried CPU, GPU and TPU, to no avail.

Is there anything I can do? This dataset is only a sample; my actual dataframe has over 1M rows and 250 columns.
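
To help narrow this down, here is a minimal sketch of how smaller samples could be drawn from the CSV to see at what size the slowdown appears. The helper name `sample_csv` and its parameters are hypothetical, and it uses only the Python standard library rather than anything from AutoKeras. It assumes the file has a single header row.

```python
import csv
import random


def sample_csv(src_path, dst_path, n_rows, seed=0):
    """Write a uniform random sample of up to n_rows data rows to dst_path.

    Hypothetical helper, not part of AutoKeras. It uses reservoir sampling,
    so the source file is streamed once and never loaded fully into memory.
    The header row is always copied. Returns the number of data rows written.
    """
    rng = random.Random(seed)
    reservoir = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)  # assumes the first row is the header
        for i, row in enumerate(reader):
            if i < n_rows:
                # Fill the reservoir with the first n_rows rows.
                reservoir.append(row)
            else:
                # Keep this row with probability n_rows / (i + 1).
                j = rng.randint(0, i)
                if j < n_rows:
                    reservoir[j] = row
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(reservoir)
    return len(reservoir)
```

Running the same `reg.fit` call on samples of a few different sizes, with a small `max_trials`, would show whether the start-up time grows with the number of rows or whether something else is stalling.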
