Wei Keong Tan Posted January 26, 2023

I'm new to R. I'm trying to modify a TIBCO-provided TERR data function from https://community.tibco.com/modules/cdf-data-function-tibco-spotfirer. The script works very well for one dataset. However, when I add the line group = as.factor(rep(c("group1", "group2"), each = N/2)) to the script, a "group" column is created as a factor with two levels, "group1" and "group2", each repeated N/2 times. The resulting CDF plot is shown in Picture 2. It doesn't separate the data into different groups. How can I create a separate CDF for each group, as shown in Picture 1?

Picture 1

--R script--
library(ggplot2)
x = sort(analysisColumn, decreasing=FALSE, na.last=NA) # Sort increasing, removing missing values.
N = length(x)
cdfTable = data.frame(
  value = x,
  prob = ((1:N)-1)/(N-1),
  group = as.factor(rep(c("group1", "group2"), each = N/2)))
ggplot(cdfTable, aes(x=value, y=prob, color = group)) + geom_line() + ggtitle("Multiple Cumulative Distribution Curves")

Picture 2
Picture 3
Gaia Paolini Posted January 27, 2023

Could you elaborate on what your goal is? You have divided the data into two halves in the same loop as calculating the CDF, so you are creating group1 for the first N/2 rows and group2 for the rest, which is what you are getting. Did you mean to use a separate column to divide the data into two groups, and then calculate the CDF for each group? If so, what is the name of the grouping column in your example?
Wei Keong Tan Posted January 27, 2023 (Author)

Thanks for your reply. Yes, you're right. I would like to use a separate column to divide the data into two or more groups and get the CDF shown in Picture 1. The grouping column is named "group". Is my existing approach correct for this? Or should I use the facet_wrap() function to create a panel for each group instead of dividing the data into two halves in the same loop?
Gaia Paolini Posted January 27, 2023

You have uploaded a sample dataset, so if you give me the name of the grouping column I can try. I understand you have a column of data and another column that divides that data into separate groups.

If you mean facets as used by ggplot2, there are no facets involved, as Spotfire will take care of the plot.
Wei Keong Tan Posted January 27, 2023 (Author)

Yes, I mean the facets from the ggplot2 library. Do you mean Spotfire can handle the multiple panels without the library?

Sorry, I thought you were referring to the grouping column name highlighted in yellow in the picture below. The grouping column names for which I would like to create a panel each are "group1", "group2", "group3", "group4" and "group5".
Gaia Paolini Posted January 27, 2023

I meant the name of the actual column containing the values group1, etc. You have 105 columns in your dataset. Which one should be used for grouping the values of the first column (x)?
Wei Keong Tan Posted January 28, 2023 (Author)

Got it. I uploaded the dataset as a .csv file (with the data separated into five groups: "group1", "group2", "group3", "group4", "group5") and also the .dxp file. The values of the first column (x) are grouped as follows:

group 1: 0.0996 to 0.2736
group 2: 0.2736 to 0.2832
group 3: 0.2832 to 0.2964
group 4: 0.2964 to 0.3192
group 5: 0.3192 to 0.3516

Could the duplicated values in my dataset be the reason I can't get a separate CDF for each group, as shown in Picture 1? Or do I need to change the formula group = as.factor(rep(c("group1", "group2"), each = N/2)) to create a CDF plot for each group?
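If the grouping column has to be derived from the value ranges listed above rather than taken from an existing column, base R's cut() is one way to do it. The following is only a sketch: the sample values in x are made up for illustration, and the post does not say which group a value that sits exactly on a boundary (for example 0.2736) belongs to, so assigning it to the lower group is an assumption.

```r
# Hypothetical sample values; the real data would come from the first column (x).
x <- c(0.0996, 0.25, 0.2736, 0.28, 0.2832, 0.29, 0.2964, 0.31, 0.3192, 0.35, 0.3516)

# Range boundaries as listed in the post.
breaks <- c(0.0996, 0.2736, 0.2832, 0.2964, 0.3192, 0.3516)

# right = TRUE puts a boundary value into the lower group (an assumption);
# include.lowest = TRUE keeps the minimum, 0.0996, inside group1.
group <- cut(x, breaks = breaks, labels = paste0("group", 1:5),
             right = TRUE, include.lowest = TRUE)

# Count how many values fall into each group.
print(table(group))
```

Values outside the overall range 0.0996 to 0.3516 would come back as NA and would need separate handling.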
Wei Keong Tan Posted January 28, 2023 (Author)

Attached .csv file
Gaia Paolini Posted January 30, 2023

I still don't quite understand how you wish to group your data. Anyway, I have modified the original dxp that was on TIBCO Exchange by adding a grouping column. Note that the inputs to the data function are now different.

I used the original example and added a grouping column (the only potential candidate I could use for that dataset was RAD).

The data function works by creating a separate CDF for each group, then concatenating them into the final table.

The resulting table can be displayed via a line chart separated by group.

You don't need ggplot2, as that is the R library for data visualization, which is taken care of by Spotfire.
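The approach described above (a separate CDF for each group, then concatenated into one table) could look roughly like this in base R. This is a sketch and not the data function from the attached dxp: analysisColumn comes from the original script, while groupColumn, cdfForGroup and the sample values are hypothetical names and data chosen for illustration. In Spotfire the two vectors would be input parameters of the data function and cdfTable its output.

```r
# Hypothetical sample inputs standing in for the data function parameters.
analysisColumn <- c(0.10, 0.25, 0.27, 0.28, 0.281, 0.29, 0.295, 0.30, 0.31, 0.33)
groupColumn <- c("group1", "group1", "group1", "group2", "group2",
                 "group3", "group3", "group4", "group4", "group5")

# Build the CDF table for one group with the same formula as the original script.
cdfForGroup <- function(values, groupName) {
  x <- sort(values, decreasing = FALSE, na.last = NA)  # Sort increasing, dropping NAs.
  N <- length(x)
  # A group with a single value gets probability 1, which avoids dividing by zero.
  prob <- if (N > 1) (seq_len(N) - 1) / (N - 1) else rep(1, N)
  data.frame(value = x, prob = prob, group = rep(groupName, N),
             stringsAsFactors = FALSE)
}

# Split the values by group, compute each CDF, then stack the pieces.
byGroup <- split(analysisColumn, groupColumn)
pieces <- lapply(names(byGroup), function(g) cdfForGroup(byGroup[[g]], g))
cdfTable <- do.call(rbind, pieces)
rownames(cdfTable) <- NULL

print(cdfTable)
```

The result has one row per non-missing value with the columns value, prob and group, so a Spotfire line chart with value on the X axis, prob on the Y axis and group as the color or line-by column would draw one curve per group. Duplicated values are not a problem for this formula; they simply appear as repeated x positions on the curve.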
Wei Keong Tan Posted January 31, 2023 (Author)

I would like to group my data the way you plotted it. This is great! My question is now solved, thank you so much! I only realized I didn't need ggplot2 once you pointed it out. I modified your data function, and it now creates a separate CDF for each group in my data. I learnt something today. Thanks a lot!
Rich Lake Posted February 26

On 1/30/2023 at 4:53 AM, Gaia Paolini said:

"I still don't quite understand how you wish to group your data. Anyway I have modified the original dxp that was on TIBCO Exchange by adding a grouping column. Note that the inputs to the data function are now different. I used the original example and I added a grouping column (the only potential candidate I could use for that dataset was RAD). The data function works by creating a separate CDF for each group, then concatenating them into the final table. The resulting table can be displayed via a line chart separated by group. You don't need ggplot2 as that is the R library for data visualization, which is taken care of by Spotfire."

Can you share the R script for this or provide the dxp as an attachment? I'm not sure why there is no file in this thread.