Programming language: Ruby
License: BSD 3-clause "New" or "Revised" License
Tags: Scientific     SciRuby    
Latest version: v2.1.0




Homepage :: https://github.com/sciruby/statsample


You should have a recent version of GSL and R (with the irr and Rserve libraries) installed. On Ubuntu:

$ sudo apt-get install libgsl0-dev r-base r-base-dev
$ sudo Rscript -e "install.packages(c('Rserve', 'irr'))"

With these libraries in place, just install from rubygems:

$ [sudo] gem install statsample

On *nix, you should also install statsample-optimization, which pulls in the gsl and statistics2 gems and a C extension that speeds up some methods.

$ [sudo] gem install statsample-optimization

If you need to work on Structural Equation Modeling, see +statsample-sem+. It requires R with the +sem+ or +OpenMx+ [http://openmx.psyc.virginia.edu/] libraries installed.

$ [sudo] gem install statsample-sem


See CONTRIBUTING for information on testing and contributing to statsample.


You can see the latest documentation on rubydoc.info.



You can see some iruby notebooks here:



Working with DataFrame and Vector


See the /examples directory for some use cases. The notebooks listed above contain mostly the same examples and are better presented, so you might want to look at those first.


A suite for basic and advanced statistics on Ruby. Tested on CRuby 2.0.0, 2.1.1, 2.2 and 2.3.0. See .travis.yml for more information.


  • Descriptive statistics: frequencies, median, mean, standard error, skew, kurtosis (and many others).
  • Correlations: Pearson's r, Spearman's rank correlation (rho), point biserial, tau a, tau b and gamma. Tetrachoric and polychoric correlations are provided by the +statsample-bivariate-extension+ gem.
  • Intra-class correlation
  • Anova: generic and vector-based One-way ANOVA and Two-way ANOVA, with contrasts for One-way ANOVA.
  • Tests: F, T, Levene, Mann-Whitney U.
  • Regression: Simple, Multiple (OLS)
  • Factorial Analysis: Extraction (PCA and Principal Axis), Rotation (Varimax, Equimax, Quartimax), and Parallel Analysis and Velicer's MAP test for estimating the number of factors.
  • Reliability analysis for simple scales, plus a DSL to analyze multiple scales using factor analysis and correlations.
  • Basic time series support
  • Dominance Analysis, with multivariate dependent variables and bootstrap (Azen & Budescu)
  • Sample calculation related formulas
  • Structural Equation Modeling (SEM), using R libraries +sem+ and +OpenMx+
  • Reports in text, HTML and RTF formats, using the ReportBuilder gem
  • Graphics: Histogram, Boxplot and Scatterplot
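
To make the correlation entries above concrete, here is a plain-Ruby sketch of what Pearson's r and Spearman's rho compute. It deliberately does not call statsample: the helper names `pearson_r`, `average_ranks` and `spearman_rho` are hypothetical and exist only for illustration, and the library's own implementation may differ in detail.

```ruby
# Plain-Ruby illustration; these helpers are hypothetical and not part of statsample.

# Pearson's r: sum of cross-products of deviations, divided by the
# square root of the product of the two sums of squared deviations.
def pearson_r(x, y)
  raise ArgumentError, 'vectors must have the same length' unless x.size == y.size

  n = x.size.to_f
  mean_x = x.sum / n
  mean_y = y.sum / n
  dx = x.map { |v| v - mean_x }
  dy = y.map { |v| v - mean_y }
  cross = dx.zip(dy).sum { |a, b| a * b }
  cross / Math.sqrt(dx.sum { |a| a * a } * dy.sum { |b| b * b })
end

# Average ranks: tied values share the mean of the positions they occupy.
def average_ranks(values)
  sorted = values.each_with_index.sort_by { |v, _| v }
  ranks = Array.new(values.size)
  i = 0
  while i < sorted.size
    j = i
    j += 1 while j + 1 < sorted.size && sorted[j + 1][0] == sorted[i][0]
    rank = (i + j) / 2.0 + 1
    (i..j).each { |k| ranks[sorted[k][1]] = rank }
    i = j + 1
  end
  ranks
end

# Spearman's rho is Pearson's r applied to the ranks.
def spearman_rho(x, y)
  pearson_r(average_ranks(x), average_ranks(y))
end

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
puts pearson_r(x, y)    # below 1: the relation is monotone but not linear
puts spearman_rho(x, y) # the ranks agree perfectly
```

The two values differ because Pearson's r measures linear association while Spearman's rho only looks at rank order.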


  • Software Design:
    • One module/class for each type of analysis
    • Options can be set as a hash on initialize() or through setter methods
    • Clean API for interactive sessions
    • summary() returns all necessary information for interactive sessions
    • All statistical data available through methods on objects
    • All (important) methods should be tested. Better with random data.
  • Statistical Design
    • Results are tested against text results, SPSS and R outputs.
    • Go beyond Null Hypothesis Testing, using confidence intervals and effect sizes when possible
    • (When possible) All references for methods are documented, providing sensible information in the documentation
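
The convention that options can be passed as a hash to initialize() or set afterwards through setters can be sketched as follows. `ExampleMeanAnalysis` and its `name` and `digits` options are hypothetical, invented here to show the pattern; they are not classes or options taken from statsample.

```ruby
# Hypothetical class, written only to illustrate the convention; not statsample code.
class ExampleMeanAnalysis
  attr_accessor :name, :digits

  def initialize(data, opts = {})
    @data = data
    @name = 'Mean analysis'
    @digits = 3
    # Every option key is routed through the matching setter,
    # so the hash form and the setter form behave identically.
    opts.each do |key, value|
      setter = "#{key}="
      raise ArgumentError, "unknown option: #{key}" unless respond_to?(setter)

      public_send(setter, value)
    end
  end

  def mean
    @data.sum.to_f / @data.size
  end

  # Standard error of the mean, using the n - 1 sample variance.
  def standard_error
    variance = @data.sum { |v| (v - mean)**2 } / (@data.size - 1)
    Math.sqrt(variance / @data.size)
  end

  # One-line text summary intended for interactive sessions.
  def summary
    format("%s: n=%d, mean=%.#{digits}f, se=%.#{digits}f", name, @data.size, mean, standard_error)
  end
end

data = [2, 4, 4, 4, 5, 5, 7, 9]

with_hash = ExampleMeanAnalysis.new(data, name: 'Scores', digits: 2)

with_setters = ExampleMeanAnalysis.new(data)
with_setters.name = 'Scores'
with_setters.digits = 2

puts with_hash.summary
puts with_setters.summary
```

Routing the hash through the setters keeps a single code path for validation, which is one plausible way to get the behaviour the list describes.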


  • Classes for manipulation and storage of data:
    • Uses daru for storing data and basic statistics.
    • Statsample::Multiset: multiple datasets with same fields and type of vectors
  • Anova module provides generic Statsample::Anova::OneWay and vector-based Statsample::Anova::OneWayWithVectors. You can also create contrasts using Statsample::Anova::Contrast
  • Module Statsample::Bivariate provides covariance and pearson, spearman, point biserial, tau a, tau b, gamma, tetrachoric (see Bivariate::Tetrachoric) and polychoric (see Bivariate::Polychoric) correlations. It also includes methods to create correlation and covariance matrices
  • Multiple types of regression.
    • Simple Regression : Statsample::Regression::Simple
    • Multiple Regression: Statsample::Regression::Multiple
  • Factorial Analysis algorithms on Statsample::Factor module.
    • Classes for Extraction of factors:
      • Statsample::Factor::PCA
      • Statsample::Factor::PrincipalAxis
    • Classes for Rotation of factors:
      • Statsample::Factor::Varimax
      • Statsample::Factor::Equimax
      • Statsample::Factor::Quartimax
    • Classes for calculation of factors to retain:
      • Statsample::Factor::ParallelAnalysis performs Horn's 'parallel analysis' on a principal components analysis to adjust for sample bias in the retention of components.
      • Statsample::Factor::MAP performs Velicer's Minimum Average Partial (MAP) test, which retains components as long as the variance in the correlation matrix represents systematic variance.
  • Dominance Analysis. Based on the Budescu and Azen papers, dominance analysis is a method to analyze the importance of one predictor relative to another in multiple regression
    • Statsample::DominanceAnalysis class can report dominance analysis for a sample, using uni or multivariate dependent variables
    • Statsample::DominanceAnalysis::Bootstrap can execute bootstrap analysis to determine dominance stability, as recommended by Azen & Budescu (2003) [http://psycnet.apa.org/journals/met/8/2/129/].
  • Module Statsample::Codification, to help codify open questions
  • Converters to export data:
    • Statsample::Mx : Write Mx Files
    • Statsample::GGobi : Write Ggobi files
  • Module Statsample::Crosstab provides functions to create crosstabs for categorical data
  • Module Statsample::Reliability provides functions to analyze scales with psychometric methods.
    • Class Statsample::Reliability::ScaleAnalysis provides statistics for a scale, such as mean, standard deviation, Cronbach's alpha and standardized Cronbach's alpha, and for each item: mean, correlation with total scale, mean if deleted, Cronbach's alpha if deleted.
    • Class Statsample::Reliability::MultiScaleAnalysis provides a DSL to easily analyze reliability of multiple scales and retrieve correlation matrix and factor analysis of them.
    • Class Statsample::Reliability::ICC provides intra-class correlation, using Shrout & Fleiss(1979) and McGraw & Wong (1996) formulations.
  • Module Statsample::SRS (Simple Random Sampling) provides many functions to estimate standard error for several types of samples
  • Module Statsample::Test provides several methods and classes to perform inferential statistics
    • Statsample::Test::BartlettSphericity
    • Statsample::Test::ChiSquare
    • Statsample::Test::F
    • Statsample::Test::KolmogorovSmirnov (only D value)
    • Statsample::Test::Levene
    • Statsample::Test::UMannWhitney
    • Statsample::Test::T
    • Statsample::Test::WilcoxonSignedRank
  • Module Graph provides several classes to create beautiful graphs using rubyvis
    • Statsample::Graph::Boxplot
    • Statsample::Graph::Histogram
    • Statsample::Graph::Scatterplot
  • Gem bio-statsample-timeseries provides module Statsample::TimeSeries with support for time series, including ARIMA estimation using a Kalman filter.
  • Gem statsample-sem provides a DSL to R libraries +sem+ and +OpenMx+
  • Gem statsample-glm provides GLM methods to work with Logistic, Poisson and Gaussian regression, using ML or IRWLS.
  • Close integration with gem reportbuilder, to easily create reports in text, html and rtf formats.
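
As an illustration of the reliability statistics listed for Statsample::Reliability::ScaleAnalysis, the following plain-Ruby sketch computes Cronbach's alpha and alpha-if-item-deleted with the textbook formula alpha = k/(k-1) * (1 - sum of item variances / variance of total score). It does not use statsample; `sample_variance`, `cronbach_alpha` and `alpha_if_deleted` are hypothetical helpers, the scores are made-up illustration data, and the library's own calculation may differ in detail.

```ruby
# Plain-Ruby illustration; these helpers are hypothetical and not part of statsample.

# Sample variance with the n - 1 denominator.
def sample_variance(values)
  mean = values.sum.to_f / values.size
  values.sum { |v| (v - mean)**2 } / (values.size - 1)
end

# items: one array per item, each holding one score per respondent.
def cronbach_alpha(items)
  k = items.size
  raise ArgumentError, 'need at least two items' if k < 2

  totals = items.transpose.map(&:sum) # total scale score per respondent
  item_variance_sum = items.sum { |scores| sample_variance(scores) }
  (k / (k - 1.0)) * (1 - item_variance_sum / sample_variance(totals))
end

# Alpha recomputed with each item left out in turn (needs at least three items).
def alpha_if_deleted(items)
  items.each_index.map do |i|
    cronbach_alpha(items.reject.with_index { |_, j| j == i })
  end
end

# Made-up scores: three items answered by five respondents.
items = [
  [2, 3, 4, 4, 5],
  [3, 3, 4, 5, 5],
  [1, 3, 3, 4, 5]
]

puts cronbach_alpha(items).round(3)
puts alpha_if_deleted(items).map { |a| a.round(3) }.inspect
```

An item whose removal raises alpha noticeably is a candidate for review, which is why the per-item "alpha if deleted" figure is reported alongside the scale-level value.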



BSD-3 (See LICENSE.txt)

The license could change between versions without prior warning. If you need a specific license, choose the version that carries it.
