


VizRank's Graphical Interface

VizRank is implemented inside Orange and can be used with any visualization method that projects high-dimensional data to points in two-dimensional space. In Orange, we implemented two such visualization methods: scatterplot and radviz. To use them, start Orange Canvas (if you installed Orange, the "Orange Canvas" icon should be on your desktop), select the Visualize tab and click one of the buttons encircled in the figure below. For introductory information on how to use Orange and Orange Canvas, please see the Orange and Orange Widgets white paper. More details can also be found on the Orange web site.

Each of these visualization methods has a button on the lower left edge of its dialog, named "VizRank optimization dialog" (see the scatterplot example below). Clicking this button opens the VizRank dialog.

 

VizRank Dialog

The VizRank dialog consists of three tabs:

  • the Main tab, where you can start the evaluation of projections. The tab shows a list of the projections evaluated so far, sorted by the scores assigned by VizRank; the most interesting projections are listed first,
  • the Settings tab, where you can set the parameters of the VizRank projection evaluation method, and
  • the Manage & Save tab, where you can modify, save and load the list of evaluated projections.

Following is a detailed description of each of these tabs.

Main Tab

 




Projections with exactly/maximum n attributes: this setting is only shown with the radviz visualization method, since radviz can concurrently visualize an arbitrary number of attributes. Increasing the number of attributes n also increases the time needed to assess projection usefulness, since VizRank will test all possible placements of the n visualized attributes.
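To give a feel for how quickly this cost grows, here is a minimal sketch that counts radviz placements. It assumes that placements differing only by a rotation or a mirror image of the attribute circle count as the same projection, which this page does not state explicitly; the function names are hypothetical and not part of Orange.

```python
from math import comb, factorial


def radviz_placements(n):
    """Distinct circular orderings of n attributes.

    Assumption: rotations and mirror images of the circle are treated
    as the same placement, which gives (n - 1)! / 2 orderings.
    """
    if n < 3:
        raise ValueError("radviz needs at least 3 attributes")
    return factorial(n - 1) // 2


def radviz_projection_count(total_attributes, n):
    """Number of attribute subsets of size n times placements per subset."""
    return comb(total_attributes, n) * radviz_placements(n)


if __name__ == "__main__":
    # Growth of the search space for a data set with 10 attributes.
    for n in range(3, 8):
        print(n, radviz_projection_count(10, n))
```

Under these assumptions the count grows factorially with n, which is why the heuristic attribute ordering and the option to stop the evaluation at any time matter in practice.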

Start evaluating projections: pressing this button starts projection assessment. You can stop the assessment at any time by pressing the button again.

Projection List: this list box contains the already assessed projections, ordered from the most to the least interesting. Each item includes the projection's VizRank score and the attributes that participate in the projection. Clicking an item in the list visualizes the selected group of attributes.

Rank, Predicted Accuracy, # Instances: check boxes that determine which details are shown for each projection in the Projection List. Rank is the index of the projection in the Projection List, Predicted Accuracy is the estimate of projection usefulness as determined by VizRank, and # Instances is the number of data instances in the projection (due to missing values, different attribute subsets can have different numbers of valid instances).

 

Settings Tab

Number of neighbors (k): the number of neighbors used in the k-nearest neighbor (k-NN) algorithm.

Percent of data used in evaluation: since projection assessment can take a long time on large data sets (more than a few thousand examples), you can tell VizRank to use only a subset of the data to make the assessment faster.
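The idea of evaluating on a subset can be sketched as follows. This is an illustration only: simple random sampling without replacement is an assumption, since this page does not say how VizRank selects the subset, and `sample_percent` is a hypothetical helper name.

```python
import random


def sample_percent(examples, percent, seed=0):
    """Return a random subset holding roughly `percent` % of the examples.

    Assumption: plain random sampling without replacement; the sampling
    scheme VizRank actually uses is not described here.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    size = max(1, round(len(examples) * percent / 100))
    # A seeded generator keeps the subset reproducible between runs.
    return random.Random(seed).sample(examples, size)


if __name__ == "__main__":
    data = list(range(1000))
    subset = sample_percent(data, 25)
    print(len(subset))
```

Because k-NN evaluation compares each example against the others, shrinking the data set reduces the work per projection considerably, at the price of a noisier score.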

Measure of classification success: a measure that evaluates the prediction success of the k-NN classifier on the given projection. Possible values are:
  • Classification accuracy
  • Average probability assigned to the correct class
  • Brier score

For details on these measures, see the "Evaluating Usefulness of a Projection" section in VizRank method details.

Testing Method: an evaluation scheme for testing the performance of a classifier. Possible values are:
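As a rough illustration of how the three measures differ, the sketch below computes each of them from predicted class probabilities. It uses common textbook definitions, which may differ in detail from those in the method description; the function names and the dictionary-based data layout are assumptions made for this example.

```python
def classification_accuracy(probs, actual):
    """Fraction of examples whose most probable class is the true class."""
    hits = sum(1 for p, y in zip(probs, actual) if max(p, key=p.get) == y)
    return hits / len(actual)


def average_probability(probs, actual):
    """Mean probability that the classifier assigned to the true class."""
    return sum(p.get(y, 0.0) for p, y in zip(probs, actual)) / len(actual)


def brier_score(probs, actual):
    """Mean squared difference between predicted probabilities and the
    0/1 indicator of the true class, summed over classes (lower is better)."""
    total = 0.0
    for p, y in zip(probs, actual):
        classes = set(p) | {y}
        total += sum((p.get(c, 0.0) - (1.0 if c == y else 0.0)) ** 2
                     for c in classes)
    return total / len(actual)


if __name__ == "__main__":
    # Two examples with predicted probabilities for classes "a" and "b".
    probs = [{"a": 0.8, "b": 0.2}, {"a": 0.6, "b": 0.4}]
    actual = ["a", "b"]
    print(classification_accuracy(probs, actual))
    print(average_probability(probs, actual))
    print(brier_score(probs, actual))
```

Classification accuracy only looks at which class wins, while the other two measures also reward confident correct predictions, so they can separate projections that accuracy alone scores equally.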

  • Leave one out: this method is the slowest, but the most accurate. When predicting the class value for an example, all other examples can participate in the prediction.
  • 10 fold cross validation: the data set is split into 10 folds. When predicting class values for the examples in one fold, examples from all other folds can participate in the prediction.
  • Test on learning set: this method is the fastest, but it is methodologically incorrect. When predicting the class value for an example, the whole data set (including the predicted example) can participate in the prediction.

Heuristics for Attribute Ordering: a heuristic that enables VizRank to first evaluate the attribute groups that are most likely to be interesting. For details see the "Heuristic" section in VizRank method details.
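The difference between leave-one-out and testing on the learning set can be sketched with a tiny k-NN classifier on projected 2D points. This is a simplified illustration, not VizRank's implementation: it assumes Euclidean distance and a plain majority vote, and all names are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train, point, k):
    """Majority class among the k training points nearest to `point`."""
    nearest = sorted(train, key=lambda ex: dist(ex[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Each example is predicted from all the other examples."""
    hits = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, point, k) == label
    return hits / len(data)


def learning_set_accuracy(data, k):
    """Each example is predicted from the whole data set, itself included."""
    hits = sum(knn_predict(data, point, k) == label for point, label in data)
    return hits / len(data)


# Projected points in which every example's closest neighbor
# belongs to the other class.
data = [((0.0, 0.0), "a"), ((0.1, 0.0), "b"),
        ((1.0, 1.0), "a"), ((1.1, 1.0), "b")]

if __name__ == "__main__":
    print(learning_set_accuracy(data, 1))
    print(leave_one_out_accuracy(data, 1))
```

With k = 1, testing on the learning set lets every example vote for itself, so the classes look perfectly separated even though they are fully mixed; leave-one-out exposes this.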

 

Manage & Save Tab

 

Number of concurrently visualized attributes: items in this list box represent the sizes of the attribute groups of the projections in the Projection List. For the scatterplot method the only value in the list box will be "2" (since scatterplot can concurrently visualize only 2 attributes), while for the radviz method it can contain any value between 3 and the number of attributes in the data set. Deselecting any of these items will remove projections with that number of attributes from the Projection List.

Reevaluate shown projections: Reevaluate projections from the Projection List with VizRank's current settings.

Load: Load the list of assessed projections from a file.

Save: Save the current list of assessed projections from Projection List to a file.

Evaluate projection: Assess usefulness of current data projection with VizRank's current settings.

kNN correct: show the k-NN classifier's performance on the current projection. Correctly classified examples are shown as dark points and wrongly classified examples as white points.

kNN wrong: the same as kNN correct, but with the colors inverted.

Original: show the projection with points colored according to the class attribute.