Federico Posted October 22, 2024 Hello, is there a task that quickly computes the correlation and/or covariance matrix of a dataset? Thank you for your support.
Enrico Posted October 23, 2024 Hi Federico, are you referring to this? This option is available under Sheets inside the DM task. Take a look at the documentation as well (Evaluating statistics), and let us know if you have any further questions.
Federico (Author) Posted October 24, 2024 Hi Enrico, yes, but in a way that automatically compares every pair of variables and ideally outputs the result in matrix form, similar to what e.g. pandas.DataFrame.corr does (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.corr.html). Additionally outputting it as a heatmap would be very useful for estimating correlations across the whole dataset at a glance (see picture below). Thank you for your support!
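For reference, the pandas behaviour mentioned in this post can be sketched as follows. This is only an illustration of what `DataFrame.corr` and `DataFrame.cov` return, not a task available in the platform; the toy dataset, the column names `x`, `y`, `z`, and the helper `plot_corr_heatmap` are assumptions made up for the example, and the plotting helper assumes matplotlib is installed.

```python
import pandas as pd

# Hypothetical toy dataset: y is a multiple of x, z decreases as x increases.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],
    "z": [5, 4, 3, 2, 1],
})

# Pairwise Pearson correlation between every pair of numeric columns.
corr = df.corr()

# Pairwise sample covariance (normalised by N - 1).
cov = df.cov()


def plot_corr_heatmap(corr_matrix):
    """Draw a correlation matrix as a heatmap (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    labels = list(corr_matrix.columns)
    fig, ax = plt.subplots()
    # Fix the colour scale to [-1, 1], the range of a correlation coefficient.
    im = ax.imshow(corr_matrix.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels, rotation=90)
    ax.set_yticks(range(len(labels)))
    ax.set_yticklabels(labels)
    fig.colorbar(im, ax=ax, label="correlation")
    return fig
```

Calling `plot_corr_heatmap(corr)` returns a matplotlib figure with one cell per pair of variables, which is the "at a glance" view described above.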
Silvia Posted November 4, 2024 Hi Federico, In the Data tab of the Data Manager task, given a dataset containing the attributes to correlate, attribute1 and attribute2, and supposing both attributes are nominal, you can create a third attribute to hold the correlation results and define a statistical function for it in the formula bar. Once you have these results, switch to the Plots tab and create a heatmap: click the Plot Type icon and select Heatmap, then drag attribute1 (nominal) onto the X icon, attribute2 (nominal) onto the Y icon, and the correlation attribute (continuous) onto the Color target icon. As we don't have the exact dataset you're working on, we cannot give you more detailed instructions on how to replicate the heatmap you sent us. If you need any further help, we're here for you.
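The heatmap steps above expect a table with two nominal attributes and one continuous correlation attribute, one row per pair of variables. As a rough sketch of that layout, here is one way such a table could be prepared outside the platform with pandas. The toy data and the column names `attribute1`, `attribute2`, and `correlation` are assumptions chosen to mirror the post, not something the platform produces or requires.

```python
import pandas as pd

# Hypothetical toy dataset, used only to illustrate the table layout.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],
    "z": [5, 4, 3, 2, 1],
})

# Square correlation matrix: one row and one column per variable.
corr = df.corr()

# Reshape to long form: one row per (attribute1, attribute2) pair.
long_corr = (
    corr.rename_axis("attribute1")   # name the index so it becomes a column
        .reset_index()
        .melt(
            id_vars="attribute1",
            var_name="attribute2",
            value_name="correlation",
        )
)
```

In this layout `attribute1` and `attribute2` hold the variable names (nominal) and `correlation` holds the coefficient (continuous), matching the X, Y, and Color targets described for the heatmap.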