Split training & test according to a temporal attribute


Hello

I have a time series and I want to split it into a test set (the most recent values) and a training set (all the others). I saw that the Split Data task lets you assign the first portion to training and the last portion to test by setting “Data shuffle option” to “no shuffle” and setting the starting pattern for training and test, and it works!

I wonder if it is possible to assign the test set based on the value of a column (for example all records with a date greater than a certain threshold).

Thank you!


Hi!

You can do it using the Datamanager task:

  • use the query manager to apply a filter that selects your desired test set
  • right-click on “test set” in the “modelling set” section (top right)
  • click on “assign the displayed row to test set”
  • clear the query manager
  • save
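
For anyone who wants to check the logic outside the tool, the same threshold-based split can be sketched in plain Python. This is only an illustration of the idea, not the tool's API: the function name `split_by_date`, the field names and the sample rows are all hypothetical.

```python
from datetime import date

def split_by_date(rows, date_key, threshold):
    """Split rows into (train, test).

    Rows whose date is strictly greater than `threshold` go to the test set;
    all other rows go to the training set. Row order is preserved.
    """
    train, test = [], []
    for row in rows:
        if row[date_key] > threshold:
            test.append(row)
        else:
            train.append(row)
    return train, test

# Hypothetical sample data: one record per month.
rows = [
    {"date": date(2023, 1, 1), "value": 10},
    {"date": date(2023, 2, 1), "value": 12},
    {"date": date(2023, 3, 1), "value": 11},
    {"date": date(2023, 4, 1), "value": 15},
]

# Everything after 15 February 2023 becomes the test set.
train, test = split_by_date(rows, "date", date(2023, 2, 15))
```

Unlike a positional split with no shuffle, this does not require the rows to be sorted by date, because each record is assigned according to its own date value.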