Description
Creek is a Ruby gem that provides a fast, simple and efficient method of parsing large Excel (xlsx and xlsm) files.
Creek alternatives and similar gems
Based on the "Spreadsheets and Documents" category.
AXLSX
xlsx generation with charts, images, automated column width, customizable styles and full schema validation. Axlsx excels at helping you generate beautiful Office Open XML Spreadsheet documents without having to understand the entire ECMA specification. Check out the README for some examples of how easy it is. Best of all, you can validate your xlsx file before serialization so you know for sure that anything generated is going to load on your client's machine.
Spreadsheet Architect
Spreadsheet Architect is a library that allows you to create XLSX, ODS, or CSV spreadsheets super easily from ActiveRecord relations, plain Ruby objects, or tabular data.
Xsv .xlsx reader
High performance, lightweight .xlsx parser for Ruby that provides nothing a CSV parser wouldn't.
README
Creek - Stream parser for large Excel (xlsx and xlsm) files.
Installation
Creek can be used from the command line or as part of a Ruby web framework. To install the gem from a terminal, run the following command:
gem install creek
To use it in Rails, add this line to your Gemfile:
gem 'creek'
Basic Usage
Creek can simply parse an Excel file by looping through the rows enumerator:
require 'creek'
creek = Creek::Book.new 'spec/fixtures/sample.xlsx'
sheet = creek.sheets[0]

sheet.rows.each do |row|
  puts row # => {"A1"=>"Content 1", "B1"=>nil, "C1"=>nil, "D1"=>"Content 3"}
end

sheet.simple_rows.each do |row|
  puts row # => {"A"=>"Content 1", "B"=>nil, "C"=>nil, "D"=>"Content 3"}
end

sheet.rows_with_meta_data.each do |row|
  puts row # => {"collapsed"=>"false", "customFormat"=>"false", "customHeight"=>"true", "hidden"=>"false", "ht"=>"12.1", "outlineLevel"=>"0", "r"=>"1", "cells"=>{"A1"=>"Content 1", "B1"=>nil, "C1"=>nil, "D1"=>"Content 3"}}
end

sheet.simple_rows_with_meta_data.each do |row|
  puts row # => {"collapsed"=>"false", "customFormat"=>"false", "customHeight"=>"true", "hidden"=>"false", "ht"=>"12.1", "outlineLevel"=>"0", "r"=>"1", "cells"=>{"A"=>"Content 1", "B"=>nil, "C"=>nil, "D"=>"Content 3"}}
end

sheet.state # => 'visible'
sheet.name # => 'Sheet1'
sheet.rid # => 'rId2'
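Because rows keys each cell by its full reference, it can be useful to flatten a row into values ordered by column. The following is a minimal sketch in plain Ruby, assuming a row hash shaped like the sample output above. Both column_index and row_values are hypothetical helpers for illustration and are not part of Creek.

```ruby
# Hypothetical helpers (not part of Creek).

# Converts the column letters of a cell reference ("A1", "AA10") to a
# 1-based column number.
def column_index(ref)
  letters = ref[/\A[A-Z]+/]
  letters.each_char.reduce(0) { |acc, ch| acc * 26 + (ch.ord - 'A'.ord + 1) }
end

# Returns the values of a row hash ordered by column.
def row_values(row)
  row.sort_by { |ref, _value| column_index(ref) }.map { |_ref, value| value }
end

# Sample row shaped like the sheet.rows output, deliberately out of order.
row = { "D1" => "Content 3", "A1" => "Content 1", "B1" => nil, "C1" => nil }
values = row_values(row)
p values # => ["Content 1", nil, nil, "Content 3"]
```

Sorting by a numeric column index rather than by the raw key keeps columns past Z (AA, AB, and so on) in the expected order.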
Filename considerations
By default, Creek will ensure that the file extension is either *.xlsx or *.xlsm, but this check can be circumvented as needed:
path = 'sample-as-zip.zip'
Creek::Book.new path, :check_file_extension => false
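One way to avoid disabling the check everywhere is to turn it off only for paths that do not end in a known extension. The sketch below is plain Ruby and does not call Creek; creek_options_for is a hypothetical helper, and the case-insensitive comparison is an assumption rather than a statement about how Creek itself compares extensions.

```ruby
# Hypothetical helper (not part of Creek): builds an options hash that could
# be passed as the second argument to Creek::Book.new. The extension check is
# disabled only when the path does not end in .xlsx or .xlsm.
def creek_options_for(path)
  excel_extensions = %w[.xlsx .xlsm]
  # Assumption: extensions are compared case-insensitively.
  known = excel_extensions.include?(File.extname(path).downcase)
  known ? {} : { check_file_extension: false }
end

p creek_options_for('spec/fixtures/sample.xlsx') # empty hash, check stays on
p creek_options_for('sample-as-zip.zip')         # extension check disabled
```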
By default, the Rails file_field_tag uploads to a temporary location and stores the original filename with the StringIO object. (See this section of the Rails Guides for more information.)
Creek can parse this directly, without the need for file upload gems such as Carrierwave or Paperclip, by passing the path of the uploaded temporary file and disabling the extension check:
# Import endpoint in Rails controller
def import
  file = params[:file]
  Creek::Book.new file.path, check_file_extension: false
end
Parsing images
Creek does not parse images by default. If you want to parse images, call the with_images method before iterating over rows to preload image information. If you don't call this method, Creek will not return images anywhere.
A cell with images will be an array of Pathname objects. If an image is spread across multiple cells, the same Pathname object will be returned for each cell.
sheet.with_images.rows.each do |row|
  puts row # => {"A1"=>[#<Pathname:/var/folders/ck/l64nmm3d4k75pvxr03ndk1tm0000gn/T/creek__drawing20161101-53599-274q0vimage1.jpeg>], "B2"=>"Fluffy"}
end
Images for a specific cell can be obtained with the images_at method:
puts sheet.images_at('A1') # => [#<Pathname:/var/folders/ck/l64nmm3d4k75pvxr03ndk1tm0000gn/T/creek__drawing20161101-53599-274q0vimage1.jpeg>]
# no images in a cell
puts sheet.images_at('C1') # => nil
If a row contains no text cells, Creek will most likely return nil for a cell with images in that row. In that case, use the images_at method to retrieve the images in that cell.
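When rows mix text cells and image cells, it may help to separate the two. The following is a minimal sketch in plain Ruby, assuming rows shaped like the with_images output above. image_cells is a hypothetical helper, not part of Creek, and the sample path is made up.

```ruby
require 'pathname'

# Hypothetical helper (not part of Creek): keeps only the cells whose value
# is a non-empty array of Pathname objects, i.e. the cells that carry images.
def image_cells(row)
  row.select do |_ref, value|
    value.is_a?(Array) && !value.empty? && value.all? { |v| v.is_a?(Pathname) }
  end
end

# Sample row shaped like the with_images output; the path is illustrative.
row = { "A1" => [Pathname.new("/tmp/creek__drawing-image1.jpeg")], "B2" => "Fluffy" }
images = image_cells(row)
p images.keys # => ["A1"]
```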
Remote files
remote_url = 'http://dev-builds.libreoffice.org/tmp/test.xlsx'
Creek::Book.new remote_url, remote: true
Mapping cells with header names
By default, Creek keys cells by letter and number (A1, B3, etc.). To get cell values by header column name instead, pass with_headers when creating the book. This option works only with the #simple_rows method. Note: the headers are taken from the first row of the sheet.
creek = Creek::Book.new file.path, with_headers: true
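To make the idea of header mapping concrete, here is a plain-Ruby illustration that re-keys simple_rows-style hashes by the values of the first row. It is a sketch of the concept only, not Creek's implementation. map_rows_to_headers is a hypothetical helper and the sample data is made up.

```ruby
# Plain-Ruby illustration of header mapping (not Creek's implementation):
# treat the first row as the header row and re-key the remaining rows by
# header name. Columns without a header keep their letter as the key.
def map_rows_to_headers(rows)
  headers, *data = rows.to_a
  data.map do |row|
    row.each_with_object({}) do |(column, value), mapped|
      mapped[headers[column] || column] = value
    end
  end
end

# Made-up rows shaped like the simple_rows output (keys are column letters).
rows = [
  { "A" => "Name",   "B" => "Species" },
  { "A" => "Fluffy", "B" => "Cat" }
]
mapped = map_rows_to_headers(rows)
p mapped.first["Name"] # => "Fluffy"
```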
Contributing
Contributions are welcome. Fork the repository, add your changes to a branch in your fork, make sure all existing unit tests pass, add new unit tests that cover your changes, and then create a pull request.
After forking and cloning the repository locally, install Bundler and use it to install the development gem dependencies:
gem install bundler
bundle install
Once this is complete, you should be able to run the test suite:
rake
Some remote tests are excluded by default. To run them, use:
bundle exec rspec --tag remote
Bug Reporting
Please use the Issues page to report bugs or suggest new enhancements.
License
Creek is published under the MIT License.