Enhancing Interactive Visual Data Analysis by Statistical Functionality

Implemented Functionality in the Statistics Library

Theoretical distributions

  • Normal distribution
  • Log-normal distribution
  • Uniform distribution
  • Exponential distribution
  • Chi-squared distribution


Kolmogorov-Smirnov Tests

  • for Normal distribution
  • for Log-normal distribution
  • for Uniform distribution
  • for Exponential distribution
  • for Chi-squared distribution
  • for the same distribution of two samples
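
The two-sample test in the last item compares the empirical distribution functions of both samples. A minimal sketch of the test statistic only (no p-value), assuming plain Python lists as input; the function name is hypothetical and is not taken from the library:

```python
from bisect import bisect_right


def ks_two_sample_statistic(xs, ys):
    """Return D = max |F_x(t) - F_y(t)| over all observed values t."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    sx, sy = sorted(xs), sorted(ys)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is attained at one of the observed values.
    for t in sx + sy:
        fx = bisect_right(sx, t) / len(sx)  # fraction of xs <= t
        fy = bisect_right(sy, t) / len(sy)  # fraction of ys <= t
        d = max(d, abs(fx - fy))
    return d
```

A value of D near 0 is consistent with both samples coming from the same distribution; a value near 1 means the samples barely overlap.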


Multivariate Linear Regression


Correlation

  • Pearson correlation (classic)
  • Kendall correlation (robust)
  • Spearman correlation (robust)
  • Calculation of correlation matrix
  • Hierarchical clustering on correlation matrix (creates groups of dimensions)
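
Spearman correlation is the Pearson correlation computed on ranks, which is what makes it less sensitive to outliers. A minimal sketch under that definition, assuming ties receive the mean of their ranks; all function names are hypothetical and not the library's interface:

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Classic product-moment correlation (fails on constant input)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def spearman(xs, ys):
    """Pearson correlation of the rank-transformed samples."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

Any strictly increasing relationship, linear or not, yields a Spearman coefficient of 1.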


Covariance matrix

  • Classic covariance matrix
  • MCD covariance matrix (robust)


Applications using the covariance matrix

  • Mahalanobis distance / Robust distance
  • Principal Component Analysis (classic / robust)
  • Multivariate outlier detection
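
The Mahalanobis distance of a point x from a center m with covariance matrix S is sqrt((x - m)^T S^-1 (x - m)); the robust distance uses the same formula with robust estimates (for example from MCD) in place of the classic ones. A minimal sketch restricted to two dimensions so that the matrix inverse can be written out by hand; the function name and argument layout are assumptions for illustration:

```python
from math import sqrt


def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance in 2D; cov is ((a, b), (c, d))."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Inverse of [[a, b], [c, d]] is 1/det * [[d, -b], [-c, a]].
    qx = (d * dx - b * dy) / det
    qy = (-c * dx + a * dy) / det
    return sqrt(dx * qx + dy * qy)
```

With the identity matrix as covariance the result equals the Euclidean distance; a larger variance along one axis shrinks distances along that axis.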


Clustering

  • k-means clustering
  • Fuzzy k-means clustering


Transformations

  • Absolute transformation
  • Square root transformation
  • Logarithmic transformation (ln, log10)
  • Z standardization (classic, robust, α-robust)
  • Linear scaling to the [0, 1] interval
  • Linear scaling (zero-preserving)
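
The scaling and standardization items can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: "zero-preserving" is read here as division by the largest absolute value, and the robust z standardization is taken to use the median and the MAD scaled by 1.4826 (the usual consistency factor for normal data). All function names are hypothetical.

```python
from statistics import median


def scale_to_unit_interval(xs):
    """Map min(xs) to 0 and max(xs) to 1 linearly."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("cannot scale a constant sample")
    return [(x - lo) / (hi - lo) for x in xs]


def scale_zero_preserving(xs):
    """Divide by the largest absolute value, so 0 stays 0 (assumed meaning)."""
    m = max(abs(x) for x in xs)
    if m == 0:
        raise ValueError("cannot scale an all-zero sample")
    return [x / m for x in xs]


def robust_z(xs):
    """Center by the median, scale by 1.4826 * MAD (assumed variant)."""
    med = median(xs)
    mad = median([abs(x - med) for x in xs])
    if mad == 0:
        raise ValueError("MAD is zero; robust scale undefined")
    return [(x - med) / (1.4826 * mad) for x in xs]
```

In the robust variant a single extreme value changes neither the center nor the scale much, so it stands out with a large score instead of masking itself.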


Moments

  • Arithmetic mean, α-trimmed mean, median
  • Standard deviation, variance, MAD, α-trimmed standard deviation, mean of absolute deviations
  • Skewness, kurtosis
  • Quantiles, quartiles, interquartile range (IQR)
  • 1D outlier detection
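
Two of the items above, sketched with the Python standard library. Both conventions are assumptions, since the list does not state them: α is taken as the fraction removed from each end of the sorted sample, and 1D outliers are flagged with the common rule of 1.5 IQR beyond the quartiles. Function names are hypothetical.

```python
from statistics import quantiles


def trimmed_mean(xs, alpha):
    """Mean after dropping floor(alpha * n) values from each end."""
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must be in [0, 0.5)")
    data = sorted(xs)
    k = int(alpha * len(data))
    kept = data[k:len(data) - k]
    return sum(kept) / len(kept)


def iqr_outliers(xs, factor=1.5):
    """Values outside [Q1 - factor * IQR, Q3 + factor * IQR]."""
    # method="inclusive" interpolates between the observed values.
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [x for x in xs if x < low or x > high]
```

Quartile definitions differ between packages, so the flagged values can vary slightly with the interpolation method.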


Utilities

  • Ranks
  • Number of unique values
  • Unique values
  • k-th smallest element (expected running time O(n))
  • Manhattan / Euclidean distances (with dimension weighting)
  • Matrix inversion
  • Determinant of a matrix
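
Finding the k-th smallest element in expected linear time is typically done with quickselect, which partitions around a random pivot and continues only in the part that contains the answer. Whether the library uses exactly this scheme is not stated; the following is a minimal sketch with a hypothetical function name:

```python
import random


def kth_smallest(values, k):
    """Return the k-th smallest element (k = 1 is the minimum)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    items = list(values)
    while True:
        pivot = random.choice(items)
        smaller = [v for v in items if v < pivot]
        equal = [v for v in items if v == pivot]
        if k <= len(smaller):
            # The answer lies among the values below the pivot.
            items = smaller
        elif k <= len(smaller) + len(equal):
            # The answer is the pivot itself (or a duplicate of it).
            return pivot
        else:
            # Skip everything up to the pivot and search above it.
            k -= len(smaller) + len(equal)
            items = [v for v in items if v > pivot]
```

Unlike a full sort, each round discards part of the data, which is what brings the expected cost down from O(n log n) to O(n).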