Suggestions for improvement of future versions of Statistica

Dear StatSoft,

I am not quite sure if this is the most visible place to deliver some feedback on Statistica (feature requests / suggestions), but hopefully someone will at least forward these to the developers.

I hope these suggestions are helpful, and I would really appreciate seeing them implemented in a future version!

Many thanks in advance!


Most of these suggestions are about adding interface/GUI features to Statistica that are already common to many types of software, ranging from Word and Excel to Photoshop and Sibelius, and which users of Windows software expect to find in any "serious" application.


1) When defining a new variable using Add Variable, it would be very good to have an Autocomplete feature in the Formula field for variable names, i.e. for suggestions of existing variable names to come up when you start typing a few letters, much like function names are suggested in Excel when you start typing a formula into a cell.
This would be really useful when you have long names for existing variables. I know you can use the call number of a variable (e.g. v4) in the Formula field, but I'd rather use the full name of the variable, so as to better keep track of how each new variable was defined as a function of existing variables, and also to prevent cases in which the call number of a variable changes as a result of adding new variables or swapping existing variables among themselves.

Entering variable names manually is, aside from being terribly inconvenient, also error-prone. For example, nowhere is it documented that a variable's name cannot contain a colon, and yet if you then use such a name in the Formula for another variable, it is not recognised, and you have to replace the colon with something more "DOS safe", such as an underscore. With an Autocomplete function, this would not be a problem, because Statistica would know which variable you are trying to point to: you are selecting it from a list, so there is no ambiguity and no risk of mistyping a long variable name yourself.

I guess an easy solution for that would be to have, for the Formula field, the same functionality for the F2 key that this key has anywhere else in Statistica, that is - the functionality of opening up a Variables window from which you can select whatever (and however many) variables you want.
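To illustrate the Autocomplete idea from suggestion 1, here is a rough sketch in Python (not Statistica code; the function name `suggest_variables` and the sample variable names are made up for the example). It simply offers the existing variable names that start with whatever has been typed so far, ignoring case:

```python
def suggest_variables(prefix, variable_names, limit=10):
    """Return existing variable names that start with the typed prefix.

    Matching is case-insensitive; results are sorted alphabetically
    and capped at `limit` entries.
    """
    needle = prefix.casefold()
    matches = [name for name in variable_names
               if name.casefold().startswith(needle)]
    return sorted(matches, key=str.casefold)[:limit]


# Hypothetical variable names, just for demonstration.
names = ["Reaction_time_pre", "Reaction_time_post", "Age", "Group"]
print(suggest_variables("rea", names))
```

Because the user picks a name from the returned list instead of typing it in full, a long or unusual name cannot be mistyped.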

2) It would be very useful to have a zoom in/out feature that allows you to vary - using Ctrl + mouse wheel scroll - how much of the spreadsheet is visible on the screen, so as to be able to get a broader view on large datasets. This feature is available in most word processors/spreadsheet applications/graphic editors/etc and is a very handy feature indeed.

3) It would be awesome if the status bar could be made to display some basic stats (e.g. Average, Cell count, Min, Max) about the current selection of cells, as is the case in Excel. This would replace the several steps that are currently needed in order to get to those stats (make selection; right-click; Statistics of Block Data; etc).

4) An easier way of sorting variables according to (for example) name (i.e. alphabetically) would really make life much easier. Currently this is only possible for cases.

5) Having multiple Undo and Redo (i.e. being able to undo/redo several operations at once) would be a very powerful feature to add. In Word and Excel for example, the Undo and Redo buttons have an arrow next to them which opens up a list where you can not only see the names of all previous operations, but can also undo/redo any number of them.

6) It would be good to have the option of displaying all open documents (spreadsheets/workbooks) as separate items on the Windows taskbar, rather than all of them being bundled into a single Statistica window on the taskbar and having to use the Switch Windows button inside Statistica to switch between those.

7) It would be great to be able to do a Find&Replace in the All Variable Specs window! This would be particularly useful if you need to do a quick text transformation for several variables in a row (e.g. replace all instances of "zero" with "0"). It would also be very useful to have a function such as "Add text before", to add a certain prefix to a set of variables.

8) Again modelled on Excel, it would be good to have a way to AutoSet the widths of several selected columns by double-clicking on the vertical line that separates them, in such a way that the name of the variable fits optimally (on one or two rows).

9) If you make a cell selection and do a Copy, the values copied to clipboard will be of the decimal precision specified by the cell format, whereas you often just want to have the values that are displayed in the spreadsheet (which is typically a different number of decimals). Therefore, I suggest a "WYSIWYG" type of Copy (alongside Copy and Copy With Headers), which copies values exactly as they are currently displayed in the spreadsheet.

10) A very useful feature in Excel is the ability to "Freeze panes", which allows you to always keep on the screen the first row or column (or the first few rows or columns). Admittedly, this is most useful in Excel where normally you use the first row for headers (descriptions of variables etc), whereas in Statistica the header row is always visible, however there are still many situations in which I would like to, for example, always see the first few columns, because I want to be able to compare, for a given case, the values in those variables with the value of a variable that is farther off to the right, without needing to scroll. Without this feature, one either has to constantly scroll left/right, or has to rearrange the order of the variables in order to keep the two variables that need to be compared next to each other

11) It would be good to have tooltips when mousing over icons in the Ribbon (e.g. have the tooltip for the Copy command read "Copy (Ctrl+C)" instead of just "Copy"). Again, a simple but effective idea for becoming accustomed to the keyboard shortcuts, borrowed from other software. Specifically, it would be good to have a shortcut for the ">>" button that allows you to advance to the next variable in the Specs window, and, again, for that shortcut to be displayed upon mousing over that button.

More detailed tooltips would also be useful when mousing above buttons (such as One Variable List) in Analyses windows

12) I am looking at the results of a Correlation analysis in which the top-most white part of the result spreadsheet mentions "Marked correlations are significant at p < .05000". On the other hand, the table gives a p of 0.0025. It would be useful if the "less than" value were actually the lowest (round) number for which the inequality still holds, in this case p < 0.005, or at least p < 0.01.
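As a rough sketch of what I mean by "lowest round number" in suggestion 12 (Python, not Statistica code): I am assuming here that "round" means a value of the form 1 or 5 times a power of ten (0.05, 0.01, 0.005, 0.001, ...). That definition and the name `round_threshold` are my own choices for the illustration.

```python
from decimal import Decimal


def round_threshold(p):
    """Smallest value of the form 1e-k or 5e-k that is strictly greater than p.

    Assumes 0 < p < 1. Decimal is used so that thresholds such as 0.005
    are represented exactly rather than as binary floats.
    """
    p = Decimal(str(p))
    if not (Decimal(0) < p < Decimal(1)):
        raise ValueError("p must lie strictly between 0 and 1")
    candidate = Decimal(1)
    exponent = 0
    while True:
        exponent -= 1
        # Walk downwards through 0.5, 0.1, 0.05, 0.01, 0.005, ...
        for mantissa in (5, 1):
            nxt = Decimal(mantissa).scaleb(exponent)
            if nxt <= p:
                # The next step down no longer exceeds p, so stop here.
                return candidate
            candidate = nxt


print("p <", round_threshold(0.0025))
```

For the p of 0.0025 in my example this yields a threshold of 0.005, rather than the fixed .05 that is currently shown.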

13) Text pasted from other programs retains formatting, so time is wasted clearing the formatting (or having to use "Paste Special|Unformatted Text" every time). Would it be possible to have a Paste option to just keep the plain text, or adapt the font/colour/size etc to that of the default style? You could then set a default paste option (which most people would probably keep to "Keep text only"). This is similar to the Paste menu that appears in Office 2010 upon pasting.

Also, it would be very useful to have a Clear Formatting command (the shortcut for which usually is: Ctrl+Space) which strips any formatting from the selected cells and reduces them to the default style.

13') As a follow-up to the previous suggestion, Paste Unformatted Text should be available as a quick choice in the Paste menu, alongside Paste With Variable/Case Names.

13'') A second follow-up: The Formula field in the Variable Specs also remembers the font that a pasted selection comes with, which is really inconvenient because, since there is no default pasting of plaintext values, you constantly need to clear the formatting in other ways, in order not to have a "font frenzy" inside your Formula fields

14) This one might be a bit harder to implement, but having MS-Word-style Styles for text formatting, which encompass any number of format settings (font, colour, borders, etc) would be a much easier and cleaner way to keep track of formatting throughout a large document than just being able to set these different formattings individually.

15) Some keyboard shortcuts that are standard in most word-processors/spreadsheet software (e.g. Ctrl+E for centering text, or Ctrl+PgUp for scrolling between open sheets) are not defined in Statistica. It would be nice to be able to customise keyboard shortcuts.

16) Bug report: when undoing a Paste operation that pasted formatted data, some of the formats (e.g. cell borders) are still maintained, even though the numbers themselves disappear.

17) The "Would you like to set this style as the default for the selected variables/cases?" dialog that comes up after making any sort of formatting change could use a "Remember my answer" tick box.

18) I suggest that you re-think the logic behind the "Long name (label or formula with functions)" field, in the Variable Specs window. Any variable should have BOTH a formula field, so that it can be defined as a function of existing variables, AND a long name field, where you can enter additional text that gives more details about that variable. The Formula of each cell would then hopefully be visible in a section of the Statistica window just above the spreadsheet (just like the Formula bar in Excel), whereas the Long Name would just be something that you can see if you go to Variable Specs.

I know that you can add comments after inputting a Formula, however I hope you agree it's still a bit sloppy, and that having them separated would be a lot clearer and cleaner.

19) It would be *very* good if the different results in the output workbook were all timestamped, so that you can know at what date/time that analysis was made. I cannot count the number of times I wished I could have known that. Also, right now there isn't always a way of telling all the details under which an analysis was done (including, importantly, the variables used), so it would be good if each section of the output report had a "Metadata" section that keeps a log of the details of how that analysis has been run. For example, you could have a "View the data that this analysis has been run on" option in the right-click menu of each section of the output workbook.

20) The warning message that comes up after defining a Formula for a variable ("Note that values which cannot be calculated, such as sqrt(..)...") could do with a "Do not display this message again" tickbox, since it doesn't really need to be displayed each and every single time the user defines a formula. Better yet, have the formula parsed and only display this message if it's relevant to the formula (i.e. if it contains square roots)
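A minimal sketch of the "only warn when relevant" idea in suggestion 20 (Python, not Statistica code). The set of "risky" functions is my assumption: only sqrt is named in the actual message, and log and log10 are added purely as plausible examples. The formula is treated as a plain string.

```python
import re

# Functions whose result can be undefined for some inputs.
# Assumption: only sqrt is confirmed by the warning text; the rest are guesses.
RISKY_FUNCTIONS = {"sqrt", "log", "log10"}


def needs_warning(formula):
    """Return True if the formula calls any function from RISKY_FUNCTIONS."""
    # A "call" is an identifier immediately followed by an opening parenthesis.
    called = {name.lower()
              for name in re.findall(r"([A-Za-z_][A-Za-z0-9_]*)\s*\(", formula)}
    return bool(called & RISKY_FUNCTIONS)


print(needs_warning("=sqrt(v1) + 2"))
print(needs_warning("=v1 + v2"))
```

With a check like this, the message would come up for the first formula but not for the second, which contains nothing that can fail to evaluate.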

21) For different types of analyses (e.g. "Independent t-test for groups" and "Discriminant Analysis"), the Select Variables window has the grouping variable either on the left- or on the right-hand side. The same goes for the "dependent variable" and the "independent variable": you currently can't rely on them always being in the same part of the window, either to the left or to the right, which tends to be very confusing, especially if you do repeated analyses. It would be intuitive, for example, if the grouping variable (if there is such a variable for the current analysis) always came first, i.e. was the first to the left.

22) Sometimes, the figures that are the result of a certain analysis also have in them (usually just above the graph) a few relevant statistics. For example, in a correlation graph you will find the R value above the graph. However, this behaviour is not constant in Statistica - for instance, if you ask for a box-whisker plot when doing a t-test, the t-value and p-value are *not* displayed above the graph, as you would expect, and as would be useful to have (because then you get both a qualitative information - from the plot - AND a quantitative one, from the p-value).

Also, for the t-test, having an asterisk above the two quantities compared, in the case the difference between them is significant (as is normally done in plots that appear in papers), would make the graph much more informative, as you can then tell from just looking at the graph where significance was reached, rather than also opening a Summary window to just see what the p-value is. Some people might not like this, so maybe have "Display asterisk between quantities that are significantly different" as an option somewhere in Preferences.

22') As a follow-up: although the full output of an analysis is almost always displayed in spreadsheets in the output workbook, sometimes relevant results are displayed in the Analysis window, where they are harder to read and to copy to clipboard. This is the case, for example, when doing Discriminant Analysis: a few random results (Wilks' Lambda, F, p) are displayed in the top part of the Discriminant Analysis window, and the rest of the results are displayed in spreadsheets, as expected, upon pressing the relevant button in the analysis window. This seems a bit sloppy and random, and I think it would be better for these statistics to all be displayed in relevant places of the output workbook, and leave the analysis window to be just what it is: a GUI, and not a place where results are displayed.

23) Bug report: Statistica reliably crashes when doing a Cut (either using Ctrl+X or clicking on the Cut button) on any variable in the All Specs window. If you cannot replicate this please let me know and I will send you the spreadsheet I was working on when I noticed this bug.

24) In the Correlation Matrices analysis, when you click on One Variable List and select two variables that you want to check for correlations, you are given no clue as to which one of them will be plotted as the "IV" variable (on the x axis) and which will be plotted as the "DV" variable (on the y axis). I noticed that the first one that gets selected is the IV and the second one the DV, and if you want to reverse this, you have to select them in the inverse order in the One Variable List window. It would be better for this to be explicitly mentioned rather than the user having to rely on this observation (unless there is something I'm not getting here about how Statistica does correlations...)

25) For the analyses that have the option of either a "One Variable List" or a "Two Lists", there is no explanation (no tooltips in the analysis window, and nothing in the relevant Help section) that explains when each one should be used.

26) Minor bug: the tooltip that appears when mousing over a data point in an output graph flickers randomly, specifically it disappears and reappears with every small movement of the mouse, which makes it quite hard to read unless you are keeping the mouse perfectly still.

27) When selecting several (non-adjacent) variables in the All Specs window and then doing a Copy, the message "That command cannot be used on multiple selections" appears. It is quite a basic and necessary function to be able to Copy non-contiguous (non-adjacent) cells.

27') One also cannot Delete several variables at a time, unless they are adjacent. Statistica should really allow operations on non-contiguous cells/variables/cases.

28) Bug report: there is sometimes (I'm not sure under which circumstances it can be reproduced) a bug whereby case names are not visible unless the width of the left-most column (which contains case names) is large enough, which often means that it takes more space than it should (for example, in order for case names "1" to "20" to be visible, it has to be about 6 times the width necessary to display a two-digit number). Screenshot here shows how it looks when the column is just a bit narrower than it would need to be in order for it to fully display the case names (without truncating the first digit).

29) (Here I'm assuming that the "Auto-recalculate spreadsheet formulas when data change" option is enabled.) Currently, if you change the name of a variable that appears in the formula by which another variable is defined, you only get an error message saying "Error in formula: unrecognised name". What I would have expected is that the name is updated automatically in any formula in which it appears. Otherwise, you have to manually inspect each variable, see if it contains that variable's name, and, if it does, manually edit it to reflect the new name of the variable.

I know you can use absolute referencing in formulas as well (e.g. v2), but that's actually even worse, because then inserting or deleting a single variable invalidates ALL the formulas that exist for variables in your spreadsheet.
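To show what I would expect to happen on a rename (suggestion 29), here is a minimal sketch in Python (not Statistica code). It assumes formulas are stored as plain strings and that variable names consist of letters, digits and underscores; `rename_in_formulas` and the sample names are made up for the example.

```python
import re


def rename_in_formulas(formulas, old_name, new_name):
    """Replace whole-name occurrences of old_name in every formula string.

    `formulas` maps a variable name to its formula text. Names that merely
    contain old_name as a substring (e.g. Score_AB vs Score_A) are left alone.
    """
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(old_name) + r"(?![A-Za-z0-9_])"
    )
    # A function is used as the replacement so new_name is inserted literally.
    return {var: pattern.sub(lambda _match: new_name, text)
            for var, text in formulas.items()}


formulas = {"Total": "=Score_A + Score_B", "Ratio": "=Score_A / Score_AB"}
updated = rename_in_formulas(formulas, "Score_A", "Baseline")
for var, text in updated.items():
    print(var, text)
```

The lookarounds are the important part: a plain text replace would also corrupt Score_AB, which is exactly the kind of silent error that makes doing this by hand risky.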

30) Under Customize Quick Access Toolbar, having an "All commands" category in the drop-down list and being able to change the order of icons in the toolbar would be nice.

31) There really needs to be a way of disabling SharePoint. Personally I don't use that at all, but every time I'm not connected to the Internet, Statistica takes a long time to save/open files, and even to start the program upon double-clicking on a .sta file, because it keeps trying to connect to SharePoint, and currently there is no option to disable this.

32) When a Replace All is done (Ctrl+H), it would be useful (and expected!) that you are told how many replacements have been made altogether.

33) It would be good to have an option for cloaked variables to not show up in the Variables list (upon pressing F2) as well, not just to not show up in the spreadsheet. That's because if you don't want to see a variable in the spreadsheet, you probably don't want to see it in the Variables list either.

34) It would be good if the Capture Rectangle command had a separate function to *save* the selection as an image file, in addition to the existing function of copying it to the clipboard.

35) The Save Graph command should really include a tickable list of objects that the user might or might not want to include in the saved image, objects such as: error bars, fit lines, confidence bands, axes titles, histograms to the right and on top of the graph, etc. This ability to customise the contents of a graph object-by-object would significantly reduce the time required to edit these saved images afterwards.

35') Similarly, graphs produced in the output workbook should be easily edited with regards to their constituent objects, for example you should be able to remove the confidence bands by clicking on them and pressing Delete, and ideally there would be a list of all objects in the graph, which can be ticked or unticked.

36) When scrolling the spreadsheet using the arrow keys and reaching the end of the spreadsheet, the cursor will go round the other end, for example if you reach the bottom limit and keep pressing the Down arrow, the cursor will come up at the top of the spreadsheet. That's a bit confusing, especially seeing that in Excel this does not happen - it would be good to have an option to stop this from happening, i.e. when the end of the dataset is reached, the cursor should just stop there.

37) Say you have two groups and you want to see how two measures are correlated, both overall (all cases) and separately for the two groups. If you want the data points in the correlation graph to be colour-coded as a function of groups (e.g. red dots for group A and blue squares for group B), to my knowledge this is only possible via Graphs/Scatterplot/Categorized, but not via Basic Statistics/Correlation, which is a bit inconvenient, because you have to do a Scatterplot just to see this separation. I suggest that a Categorized tab be added to Correlation analysis windows as well.

38) Ctrl+W should act as "Close File", just like Ctrl+F4, as it does in most Windows software.

39) The Find function (Ctrl+F) does not allow you to search within the LongName/Function field of a variable. I strongly suggest you add this possibility alongside the existing ones, "Case names", "Variable names", "Selected rows" and "Selected columns"

40) Once some variables are selected in a Variables window that has two lists of variables (e.g. X variable and Y variable), it would be good to have a "Switch" button that makes variables that were selected in the X list now be selected in the Y list, in other words reverses the roles. Reversing this selection manually takes a lot of time.

41) When wanting to edit the title of an axis, in the Graph Options window, the text cannot be edited but instead is treated as a block, i.e. you can only delete it as a whole, and not just portions of it

42) I posted a question regarding how a set of formatting changes (done via Graph Options) can be repeated for multiple graphs, however I see that replies on this forum are really slow, and I'm guessing this feature is actually not available in Statistica anyway (I looked for it for a while!), and so I'd like to suggest this as a feature for a next version.


43) When clicking in the Inside/Outside Background Colour options under Graph Options | Window, the options appear briefly and disappear, and only at the second click they will stay open


44) It would be really necessary for the Variable type windows (as well as for the All Specs window) to be maximisable, i.e. for the user to be able to double click the upper bar for the window to maximise itself, rather than having to manually drag the window corner.

45) When going to All Specs, it would be good for the currently selected variable to also be selected in the list of variables, so that you don't have to search for it again when you've already found it (and selected it).

46) The All Specs window should allow some more advanced editing of variable names such as Find&Replace, or adding a fixed or numbered (counter) suffix/prefix to selected variables.

47) The restrictions to variable names (i.e. only alphanumeric characters and underscores are acceptable if one wishes to reference the variable name in another variable's formula) make it really difficult to be organised with variable names when one has a very large spreadsheet where having well organised variable names is the only way of hoping to keep track of all variables. It would be good if this restriction were at least partly eliminated (e.g. allow spaces, dashes and backslashes)

48) The Cut command cannot be used to relocate variables in the All Specs window, as it just cuts the variable Name, and not the whole row! Also there is no separate Cut Variable command in the right click menu.


49) When defining a mixed-design ANOVA (i.e. both between and within factors) using the GLM analysis type, it's very difficult to be sure which of the variables you define correspond to the different combinations of the levels of the within factors defined. The only way is to make sure you define factors in the order in which the enumeration of their levels contributes to the variable names, however this is very much prone to error. I suggest implementing a simple graphical way of making this identification in an intuitive way.


50) The "Move Variables" feature could be much more intuitive and easy to use than it currently is: rather than selecting a "From" and a "To" range of variables and then an "Insert after" variable, it'd make much more sense to have a drag&drop-enabled list of all variables (perhaps the All Specs window), where one could easily select any range of variables, and then drag them into the "space" between two variables, to indicate where the whole selection should be moved to.

51) It would be good to be able to re-order variables according to their names, i.e. select a range of variables and then have a Sort Alphabetically command that changes their order according to this criterion

52) A feature to detect (and delete) duplicate variables would be VERY useful. One often has hundreds of variables that come from different sources, and it would be very useful to be able to quickly identify which of these are duplicates (either according to their names or according to their values)
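A rough sketch of the duplicate check in suggestion 52 (Python, not Statistica code). Variables are represented as hypothetical (name, values) pairs; duplicates "by name" ignore case and surrounding spaces, which is my assumption about what would be useful, and missing values get no special treatment.

```python
from collections import defaultdict


def duplicate_groups(variables, key):
    """Group variable positions that share the same key.

    `variables` is a list of (name, values) pairs; `key` maps such a pair
    to something hashable. Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for index, (name, values) in enumerate(variables):
        groups[key(name, values)].append(index)
    return [indices for indices in groups.values() if len(indices) > 1]


def by_name(name, values):
    # Assumption: names differing only in case or padding count as duplicates.
    return name.strip().casefold()


def by_values(name, values):
    return tuple(values)


variables = [("Age", [31, 45, 28]), ("age ", [1, 0, 1]), ("Years", [31, 45, 28])]
print(duplicate_groups(variables, by_name))
print(duplicate_groups(variables, by_values))
```

Here the first two variables are flagged as duplicates by name and the first and third as duplicates by values; the user could then review each group before deleting anything.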

53) Currently p-values are displayed as a less than value instead of the actual value (i.e. p<0.05 instead of p=0.000123) in some specific tests such as the Kolmogorov-Smirnov Test, but not in others (t-test). The user should be able to choose whether he prefers one way or the other, for all tests

54) When variables are moved, the column width remains that of the previous variable that was there, instead of being that of the variable that is being moved, and therefore one has to manually adjust it every time!

55) If after doing an ANOVA (via the GLM item in the menu), one clicks the Modify button, the old window still remains open, and if one does this several times (as you try different analyses), then you have lots of GLM windows open that really aren't of any use since you Modified the analysis many times since then

56) Statistica seems to crash quite often, and unfortunately right now the error reports are useless, as every time I get a crash and want to submit the error report, I get (another!) error message: "The file D:\Outlook Files\Outlook.PST cannot be found". I don't use Outlook, and I'm not even sure it's installed, but submitting these error reports should not depend on that! I saved the contents of the report as a ZIP file and can email it to your devs on request.

57) The Graphics Styles feature is completely unreliable and severely bugged! After saving a set of graph properties as a style and then attempting to apply that style to another graph, in most cases the settings loaded are not the correct ones, sometimes the graph becomes of the wrong type (e.g. a bar chart although it was previously a line chart), and even the Undo no longer works. This is a very important feature (e.g. for formatting several graphs to one single style, such as that required by a publication) and should be thoroughly tested!

58) The Spreadsheet Case Selection Conditions window (F8) should really allow inserting variable names into the Expression field from the Review Variables window, which presently only lists the variables, without an option to insert a variable's name into the Expression field, or at least to copy the variable's name to the clipboard.
