Description
Ruby gem that generates image thumbnails from a given URL. It ranks them and gives you back an object containing the images and website information. It works like the Facebook link previewer.
Demo Application is here!
The source code of the Demo Application is hosted here!
README
LinkThumbnailer
Features
- Dead simple.
- Supports the OpenGraph protocol.
- Finds and sorts the images that best represent what the page is about.
- Finds and rates the description that best represents what the page is about.
- Allows a custom class so you can sort the website descriptions yourself.
- Supports blacklisting of image URLs (advertisements).
- Works with and without Rails.
- Fully customizable.
- Fully tested.
Installation
Add this line to your application's Gemfile:
gem 'link_thumbnailer'
And then execute:
$ bundle
Or install it yourself as:
$ gem install link_thumbnailer
If you are using Rails, you can generate the configuration file with:
$ rails g link_thumbnailer:install
This will add link_thumbnailer.rb to config/initializers/.
Usage
Run irb and require the gem:
require 'link_thumbnailer'
The gem handles regular websites as well as websites that use the OpenGraph protocol.
object = LinkThumbnailer.generate('http://stackoverflow.com')
=> #<LinkThumbnailer::Models::Website:...>
object.title
=> "Stack Overflow"
object.favicon
=> "//cdn.sstatic.net/stackoverflow/img/favicon.ico?v=038622610830"
object.description
=> "Q&A for professional and enthusiast programmers"
object.images.first.src.to_s
=> "http://cdn.sstatic.net/stackoverflow/img/[email protected]?v=fde65a5a78c6"
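Note that the favicon in the session above is a protocol-relative URL (it starts with `//`). If you need an absolute URL, one option is to resolve it against the page URL with Ruby's standard `URI.join`. This is a minimal sketch; the `absolute_url` helper is a hypothetical name and not part of LinkThumbnailer.

```ruby
require 'uri'

# Hypothetical helper (not part of LinkThumbnailer): resolves a relative or
# protocol-relative URL against the URL of the page it was found on.
def absolute_url(page_url, candidate)
  URI.join(page_url, candidate).to_s
end

favicon = '//cdn.sstatic.net/stackoverflow/img/favicon.ico?v=038622610830'
puts absolute_url('http://stackoverflow.com', favicon)
# prints "http://cdn.sstatic.net/stackoverflow/img/favicon.ico?v=038622610830"
```

The resolved URL inherits the scheme of the page URL, so an `https` page would yield an `https` favicon URL.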
The LinkThumbnailer.generate method returns an instance of LinkThumbnailer::Models::Website that responds to to_json and as_json as you would expect:
object.to_json
=> "{\"url\":\"http://stackoverflow.com\",\"title\":\"Stack Overflow\",\"description\":\"Q&A for professional and enthusiast programmers\",\"images\":[{\"src\":\"http://cdn.sstatic.net/stackoverflow/img/[email protected]?v=fde65a5a78c6\",\"size\":[316,316],\"type\":\"png\"}]}"
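Since the result is a plain JSON string, it can be consumed by anything that parses JSON. The sketch below uses only Ruby's standard `json` library. The payload is hand-written to follow the shape shown above (`url`, `title`, `description`, and `images` with `src`, `size`, `type`); the example.com URLs and the `largest_image` helper are assumptions made for illustration.

```ruby
require 'json'

# Hand-written sample shaped like the to_json output above; the
# example.com URLs are placeholders, not real LinkThumbnailer output.
payload = <<~JSON
  {
    "url": "http://example.com",
    "title": "Example",
    "description": "An example page",
    "images": [
      {"src": "http://example.com/small.png", "size": [64, 64], "type": "png"},
      {"src": "http://example.com/large.png", "size": [316, 316], "type": "png"}
    ]
  }
JSON

# Hypothetical helper: picks the image with the largest pixel area.
# Images without a size count as area 1, so they sort last.
def largest_image(preview)
  preview.fetch('images', []).max_by { |image| Array(image['size']).inject(1, :*) }
end

preview = JSON.parse(payload)
best = largest_image(preview)
puts "#{preview['title']}: #{best['src']}"
# prints "Example: http://example.com/large.png"
```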
Configuration
LinkThumbnailer comes with default configuration values. You can change the defaults by overriding them in a Rails initializer:
In config/initializers/link_thumbnailer.rb
LinkThumbnailer.configure do |config|
  # Number of redirects before raising an exception when trying to parse given url.
  #
  # config.redirect_limit = 3

  # Set user agent
  #
  # config.user_agent = 'link_thumbnailer'

  # Enable or disable SSL verification
  #
  # config.verify_ssl = true

  # The amount of time in seconds to wait for a connection to be opened.
  # If the HTTP object cannot open a connection in this many seconds,
  # it raises a Net::OpenTimeout exception.
  #
  # See http://www.ruby-doc.org/stdlib-2.1.1/libdoc/net/http/rdoc/Net/HTTP.html#open_timeout
  #
  # config.http_open_timeout = 5

  # List of blacklisted urls you want to skip when searching for images.
  #
  # config.blacklist_urls = [
  #   %r{^http://ad\.doubleclick\.net/},
  #   %r{^http://b\.scorecardresearch\.com/},
  #   %r{^http://pixel\.quantserve\.com/},
  #   %r{^http://s7\.addthis\.com/}
  # ]

  # List of attributes you want LinkThumbnailer to fetch on a website.
  #
  # config.attributes = [:title, :images, :description, :videos, :favicon]

  # List of procedures used to rate the website description. Add your custom class
  # here. See wiki for more details on how to build your own graders.
  #
  # config.graders = [
  #   ->(description) { ::LinkThumbnailer::Graders::Length.new(description) },
  #   ->(description) { ::LinkThumbnailer::Graders::HtmlAttribute.new(description, :class) },
  #   ->(description) { ::LinkThumbnailer::Graders::HtmlAttribute.new(description, :id) },
  #   ->(description) { ::LinkThumbnailer::Graders::Position.new(description, weight: 3) },
  #   ->(description) { ::LinkThumbnailer::Graders::LinkDensity.new(description) }
  # ]

  # Minimum description length for a website.
  #
  # config.description_min_length = 25

  # Regex of words considered positive to rate website description.
  #
  # config.positive_regex = /article|body|content|entry|hentry|main|page|pagination|post|text|blog|story/i

  # Regex of words considered negative to rate website description.
  #
  # config.negative_regex = /combx|comment|com-|contact|foot|footer|footnote|masthead|media|meta|outbrain|promo|related|scroll|shoutbox|sidebar|sponsor|shopping|tags|tool|widget|modal/i

  # Number of images to fetch. Fetching too many images will be slow.
  # Note that LinkThumbnailer will only sort fetched images between each other.
  # Meaning that there could be a "better" image on the page.
  #
  # config.image_limit = 5

  # Whether you want LinkThumbnailer to return image size and type or not.
  # Setting this value to false will increase performance since LinkThumbnailer
  # then does not have to fetch the size and type of each image.
  #
  # config.image_stats = true

  # Whether you want LinkThumbnailer to raise an exception if the Content-Type of the HTTP request
  # is not html or xml.
  #
  # config.raise_on_invalid_format = false

  # Sets the number of concurrent HTTP connections that can be opened to fetch image information such as size and type.
  #
  # config.max_concurrency = 20

  # Sets the default encoding.
  #
  # config.encoding = 'utf-8'
end
Or at runtime:
object = LinkThumbnailer.generate('http://stackoverflow.com', redirect_limit: 5, user_agent: 'foo')
Note that runtime options will override default global configuration.
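One way to picture this override rule is as a hash merge in which per-call options win over the global defaults. The following is a minimal sketch of that idea only and does not use the gem; `DEFAULTS` and `effective_options` are hypothetical names, and the default values are copied from the commented-out configuration above.

```ruby
# Defaults copied from the commented-out configuration above.
DEFAULTS = {
  redirect_limit: 3,
  user_agent: 'link_thumbnailer',
  verify_ssl: true
}.freeze

# Hypothetical helper: per-call options take precedence over the defaults,
# and the defaults themselves are left untouched.
def effective_options(overrides = {})
  DEFAULTS.merge(overrides)
end

options = effective_options(redirect_limit: 5, user_agent: 'foo')
puts options[:redirect_limit] # prints 5 (overridden for this call)
puts options[:verify_ssl]     # prints true (default kept)
```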
See Configuration Options Explained for more details on each configuration option.
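To make the blacklist_urls option more concrete, here is a sketch of how a list of regular expressions like the one in the configuration above can be used to filter candidate image URLs. It is plain Ruby that illustrates the idea, not the gem's internal implementation; the `blacklisted?` helper and the sample URLs are hypothetical.

```ruby
# Same patterns as the commented-out blacklist in the configuration above.
BLACKLIST = [
  %r{^http://ad\.doubleclick\.net/},
  %r{^http://b\.scorecardresearch\.com/},
  %r{^http://pixel\.quantserve\.com/},
  %r{^http://s7\.addthis\.com/}
].freeze

# Hypothetical helper: true when the URL matches any blacklist pattern.
def blacklisted?(url, patterns = BLACKLIST)
  patterns.any? { |pattern| pattern.match?(url) }
end

candidates = [
  'http://ad.doubleclick.net/banner.gif',
  'http://example.com/photo.jpg'
]

puts candidates.reject { |url| blacklisted?(url) }
# prints "http://example.com/photo.jpg"
```

These patterns are anchored on `http://`, so the same hosts served over `https` would not match; a pattern such as `%r{^https?://ad\.doubleclick\.net/}` would cover both schemes.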
Exceptions
LinkThumbnailer defines a list of custom exceptions you may want to rescue in your code. All the following exceptions inherit from LinkThumbnailer::Exceptions:
- RedirectLimit -- raised when redirection threshold defined in config is reached
- BadUriFormat -- raised when url given is not a valid HTTP url
- FormatNotSupported -- raised when the Content-Type of the HTTP request is not supported (not html)
You can rescue from any LinkThumbnailer exceptions using the following code:
begin
  LinkThumbnailer.generate('http://foo.com')
rescue LinkThumbnailer::Exceptions => e
  # do something
end
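Because the specific exceptions share a common parent, you can handle narrow cases first and fall back to the parent class for everything else. The sketch below demonstrates that rescue ordering with stand-in classes inside a hypothetical `Sketch` module, so it runs without the gem or network access; in real code you would rescue the gem's own exception classes instead. The `fetch_preview` wrapper and the lambdas are also hypothetical.

```ruby
# Stand-in hierarchy mirroring the description above. These are NOT the
# gem's classes, only placeholders so the sketch runs on its own.
module Sketch
  Exceptions    = Class.new(StandardError)
  RedirectLimit = Class.new(Exceptions)
  BadUriFormat  = Class.new(Exceptions)
end

# Hypothetical wrapper: runs the given fetcher and maps failures to symbols.
def fetch_preview(fetcher)
  fetcher.call
rescue Sketch::BadUriFormat
  :invalid_url          # the most specific handler comes first
rescue Sketch::Exceptions
  :preview_unavailable  # catches RedirectLimit and any other subclass
end

puts fetch_preview(-> { raise Sketch::BadUriFormat })  # prints "invalid_url"
puts fetch_preview(-> { raise Sketch::RedirectLimit }) # prints "preview_unavailable"
puts fetch_preview(-> { 'ok' })                        # prints "ok"
```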
Contributing
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Run the specs (bundle exec rspec spec)
- Commit your changes (git commit -am 'Added some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request