Specific Requirements for Statistica Software

Hello everyone,

I'm new to the Statistica platform, and I need to show my company whether the software can meet the following requirements.

Could you help me?

- Read the file homesite.train.csv (note: this file was provided in one of Kaggle's competitions).

- Convert the columns with categorical data (strings) to numeric values.

- Save converted data to Hadoop HDFS.

- From Spark, read the HDFS data and load it into an RDD. Cache it and display it on the Spark console.

- Randomly split the RDD into training and validation sets (70%/30%).

- With the training data, train a model using Spark's RandomForestClassifier.

- Validate the trained model with the validation data.

- Calculate the AUC-ROC score in Spark and display it in the solution interface.
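
To make the steps above concrete, here is a rough sketch of what they could look like in plain PySpark, outside of Statistica. It is only an illustration under several assumptions: the label column name (QuoteConversion_Flag), the ID column name (QuoteNumber), the HDFS path, the helper names, and the choice to fill missing values with -1 are all mine and not confirmed anywhere in this thread. Because RandomForestClassifier works on DataFrames, the sketch keeps the data in a DataFrame and only uses its underlying RDD for the console display. The converted data goes through CSV on HDFS, so the indexed categories are treated as plain numbers when read back.

```python
from typing import Iterable, List, Tuple

NUMERIC_TYPES = {"tinyint", "smallint", "int", "bigint", "float", "double"}


def columns_of_type(dtypes: List[Tuple[str, str]], wanted: Iterable[str],
                    exclude: Iterable[str] = ()) -> List[str]:
    """Pick column names by Spark type from DataFrame.dtypes-style (name, type) pairs."""
    wanted, exclude = set(wanted), set(exclude)
    return [name for name, dtype in dtypes if dtype in wanted and name not in exclude]


def run(csv_path: str, hdfs_path: str,
        label_col: str = "QuoteConversion_Flag",   # assumed label column
        id_col: str = "QuoteNumber") -> float:      # assumed ID column
    # Spark is imported here so the helper above stays usable without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import StringIndexer, VectorAssembler
    from pyspark.ml.classification import RandomForestClassifier
    from pyspark.ml.evaluation import BinaryClassificationEvaluator

    spark = SparkSession.builder.appName("homesite-rf-sketch").getOrCreate()

    # Read the CSV file.
    raw = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Convert string (categorical) columns to numeric indices.
    cat_cols = columns_of_type(raw.dtypes, {"string"}, exclude=[label_col])
    indexers = [StringIndexer(inputCol=c, outputCol=c + "_idx", handleInvalid="keep")
                for c in cat_cols]
    indexed = Pipeline(stages=indexers).fit(raw).transform(raw).drop(*cat_cols)

    # Save the converted data to HDFS.
    indexed.write.mode("overwrite").csv(hdfs_path, header=True)

    # Read it back, cache it, and show a few rows of the underlying RDD.
    data = spark.read.csv(hdfs_path, header=True, inferSchema=True).cache()
    print(data.rdd.take(3))

    # Use every numeric column except the ID and the label as a feature.
    feature_cols = columns_of_type(data.dtypes, NUMERIC_TYPES, exclude=[label_col, id_col])
    data = data.na.fill(-1, subset=feature_cols)  # assumption: -1 marks missing values

    # Random 70% / 30% split into training and validation sets.
    train, valid = data.randomSplit([0.7, 0.3], seed=42)

    # Train the random forest on the training set.
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    forest = RandomForestClassifier(labelCol=label_col, featuresCol="features", numTrees=50)
    model = Pipeline(stages=[assembler, forest]).fit(train)

    # Validate on the held-out set and compute the area under the ROC curve.
    evaluator = BinaryClassificationEvaluator(labelCol=label_col,
                                              rawPredictionCol="rawPrediction",
                                              metricName="areaUnderROC")
    auc = evaluator.evaluate(model.transform(valid))
    print(f"AUC-ROC on validation data: {auc:.4f}")
    return auc
```

It could be called as run("homesite.train.csv", "hdfs:///tmp/homesite_indexed"), where the HDFS path is just a placeholder. Whether Statistica itself can drive each of these steps is still the open question.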
