daru v0.1.0 Release Notes

Release Date: 2015-06-13
    • 🛠 Fixes
      • Updated the documentation and made fixes in other places.
      • Fixed Vector#sum_of_squares and #ranked.
      • Fixed some tests that were giving RSpec warnings.
      • Fixed a bug where nyaplot not being present would raise a warning.
      • Fixed a bug in DataFrame row assignment.
    • ✨ Enhancements
      • Wrote a proper .travis.yml
      • Added optional GSL dependency gsl-nmatrix
      • Added marshalling and unmarshalling capabilities to Vector, Index and DataFrame.
      • Added new method Daru::IO.load for loading data from files by marshalling.
      • Lots of documentation and new notebooks.
      • Added data loading and writing from and to CSV, Excel, plain text and SQL databases.
      • Daru::DataFrame and Vector have now completely replaced Statsample::Dataset and Vector.
      • Vector
        • #center
        • #standardize
        • #vector_percentile
        • Added a new wrapper class Daru::Accessors::GSLWrapper for wrapping around GSL::Vector, which works similarly to NMatrixWrapper or ArrayWrapper.
        • Added a host of statistical methods to GSLWrapper in Daru::Accessors::GSLStatistics that call the relevant GSL::Vector functions for super-fast C level computations.
        • More stats functions - #vector_standardized_compute, #vector_centered_compute, #sample_with_replacement, #sample_without_replacement
        • #only_valid for creating a Vector with only non-nil data.
        • #only_missing for creating a Vector of only missing data.
        • #only_numeric to create Vector of only numerical data.
        • Ported many Statsample::Vector stat methods to Daru::Vector. These are: #percentile, #factors, etc.
        • Added .new_with_size for creating vectors by specifying a size for the vector and a block for generating values.
        • Added Vector#verify, #recode! and #recode.
        • Added #save, #jackknife and #bootstrap.
        • Added #missing_values= that will allow setting values for treating data as 'missing'.
        • Added #split_by_separator, #split_by_separator_freq and #splitted.
        • Added #reset_index!
        • Added #any? and #all?
        • Added #db_type for guessing the SQL type of the data contained in the vector.
        • Added and tested plotting support for histogram and box plot.
      • DataFrame
        • #dup_only_valid
        • #clone, #clone_only_valid, #clone_structure
        • #[]= does not clone the vector if it has the same index as the DataFrame.
        • Added a :clone option to initialize that will not clone Daru::Vectors passed into the constructor.
        • Added #save.
        • Added #only_numerics.
        • Added better iterators and changed some behaviour of previous ones to make them more Ruby-like. New iterators are #map, #map!, #each, #recode and #collect.
        • Added #vector_sum and #vector_mean.
        • Added #to_gsl to convert to GSL::Matrix.
        • Added #has_missing_data? and #missing_values_rows.
        • Added #compute and #verify.
        • Added .crosstab_by_assignation to generate data frame from row, column and value vectors.
        • Added #filter_vector.
        • Added #standardize and added argument option to #dup.
        • Added #any? and #all? for vector and row axis.
        • Better creation of empty data frames.
        • Added #merge, #one_to_many, #add_vectors_by_split_recode
        • Added constant SPLIT_TOKEN and methods #add_vectors_by_split, .[], #summary.
        • Added #bootstrap.
        • Added a #filter method to wrap around #filter_vectors and #filter_rows.
        • Greatly improved plotting function.
      • Added a lazy update feature that will allow users to delay updating the missing positions index until the last possible moment.
      • Added interoperability with rserve client, which makes it possible to convert daru data to R data and perform computation there.
    • 🔄 Changes
      • Changed Vector#nil_positions to Vector#missing_positions so that future changes for accommodating different values for missing data can be made easily.
      • Changed History.txt to History.md
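
The marshalling entries above (#save on Vector and DataFrame, and Daru::IO.load) describe saving an object to a file and loading it back. As a rough illustration of that general technique, here is a minimal sketch using only Ruby's standard Marshal module. It does not call daru itself: TinyVector, save_object and load_object are hypothetical names invented for this example, and daru's actual implementation may differ.

```ruby
require "tempfile"

# Hypothetical stand-in for a labelled vector; NOT part of daru.
TinyVector = Struct.new(:name, :index, :data)

# Serialize any marshal-able object to a file, in binary mode.
def save_object(obj, path)
  File.open(path, "wb") { |f| Marshal.dump(obj, f) }
end

# Read a previously marshalled object back from a file.
def load_object(path)
  File.open(path, "rb") { |f| Marshal.load(f) }
end

original = TinyVector.new("ages", [:a, :b, :c], [21, nil, 35])

restored = nil
Tempfile.create("tiny_vector") do |tmp|
  tmp.close                          # we only need the path
  save_object(original, tmp.path)
  restored = load_object(tmp.path)   # a new, equal object
end

puts restored == original
```

Marshal preserves nil entries and symbols, so missing values survive the round trip. As with any use of Marshal.load, only files from a trusted source should be loaded.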