Changelog History
-
v1.13.9 Changes
October 18, 2022
🔒 Security
- ⚡️ [CRuby] Vendored libxml2 is updated to address CVE-2022-2309, CVE-2022-40304, and CVE-2022-40303. See GHSA-2qc6-mcvw-92cw for more information.
- ⚡️ [CRuby] Vendored zlib is updated to address CVE-2022-37434. Nokogiri was not affected by this vulnerability, but this version of zlib was being flagged by some vulnerability scanners; see #2626 for more information.
Dependencies
- 🚀 [CRuby] Vendored libxml2 is updated to v2.10.3 from v2.9.14.
- 🚀 [CRuby] Vendored libxslt is updated to v1.1.37 from v1.1.35.
- 🚀 [CRuby] Vendored zlib is updated from 1.2.12 to 1.2.13. (See LICENSE-DEPENDENCIES.md for details on which packages redistribute this library.)
🛠 Fixed
- ⚡️ [CRuby] Nokogiri::XML::Namespace objects, when compacted, update their internal struct's reference to the Ruby object wrapper. Previously, with GC compaction enabled, a segmentation fault was possible after compaction was triggered. [#2658] (Thanks, @eightbitraptor and @peterzhu2118!)
- 🚚 [CRuby] Document#remove_namespaces! now defers freeing the underlying xmlNs struct until the Document is GCed. Previously, maintaining a reference to a Namespace object that was removed in this way could lead to a segfault. [#2658]
-
v1.13.8 Changes
July 23, 2022
🗄 Deprecated
- 🚚 XML::Reader#attribute_nodes is deprecated due to incompatibility between libxml2's xmlReader memory semantics and Ruby's garbage collector. Although this method continues to exist for backwards compatibility, it is unsafe to call and may segfault. This method will be removed in a future version of Nokogiri, and callers should use #attribute_hash instead. [#2598]
👌 Improvements
- XML::Reader#attribute_hash is a new method to safely retrieve the attributes of a node from XML::Reader. [#2598, #2599]
🛠 Fixed
- 🚚
-
v1.13.7 Changes
July 12, 2022
🛠 Fixed
- ⚡️ XML::Node objects, when compacted, update their internal struct's reference to the Ruby object wrapper. Previously, with GC compaction enabled, a segmentation fault was possible after compaction was triggered. [#2578] (Thanks, @eightbitraptor!)
-
v1.13.6 Changes
May 08, 2022
🔒 Security
- 🔒 [CRuby] Address CVE-2022-29181, improper handling of unexpected data types, related to untrusted inputs to the SAX parsers. See GHSA-xh29-r2w5-wx8m for more information.
👌 Improvements
- 📜 {HTML4,XML}::SAX::{Parser,ParserContext} constructor methods now raise TypeError instead of segfaulting when an incorrect type is passed.
-
v1.13.5 Changes
May 04, 2022
🔒 Security
- ⚡️ [CRuby] Vendored libxml2 is updated to address CVE-2022-29824. See GHSA-cgx6-hpwq-fhv5 for more information.
Dependencies
- 🚀 [CRuby] Vendored libxml2 is updated from v2.9.13 to v2.9.14.
👌 Improvements
- 📜 [CRuby] The libxml2 HTML parser no longer exhibits quadratic behavior when recovering some broken markup related to start-of-tag and bare < characters.
🔄 Changed
- ✅ [CRuby] The libxml2 HTML parser in v2.9.14 recovers from some broken markup differently. Notably, the XML CDATA escape sequence <![CDATA[ and incorrectly-opened comments will result in HTML text nodes starting with <! instead of skipping the invalid tag. This behavior is a direct result of the quadratic-behavior fix noted above. The behavior of downstream sanitizers relying on this behavior will also change. Some tests describing the changed behavior are in test/html4/test_comments.rb.
-
v1.13.4 Changes
April 11, 2022
🔒 Security
- ➕ Address CVE-2022-24836, a regular expression denial-of-service vulnerability. See GHSA-crjr-9rc5-ghw8 for more information.
- ⚡️ [CRuby] Vendored zlib is updated to address CVE-2018-25032. See GHSA-v6gp-9mmm-c6p5 for more information.
- ⚡️ [JRuby] Vendored Xerces-J (xerces:xercesImpl) is updated to address CVE-2022-23437. See GHSA-xxx9-3xcr-gjj3 for more information.
- ⚡️ [JRuby] Vendored nekohtml (org.cyberneko.html) is updated to address CVE-2022-24839. See GHSA-gx8x-g87m-h5q6 for more information.
Dependencies
- 🚀 [CRuby] Vendored zlib is updated from 1.2.11 to 1.2.12. (See LICENSE-DEPENDENCIES.md for details on which packages redistribute this library.)
- ⚡️ [JRuby] Vendored Xerces-J (xerces:xercesImpl) is updated from 2.12.0 to 2.12.2.
- ⚡️ [JRuby] Vendored nekohtml (org.cyberneko.html) is updated from a fork of 1.9.21 to 1.9.22.noko2. This fork is now publicly developed at https://github.com/sparklemotion/nekohtml
-
v1.13.3 Changes
February 21, 2022
🛠 Fixed
- ⏪ [CRuby] Revert an HTML4 parser bug in libxml 2.9.13 (introduced in Nokogiri v1.13.2). The bug causes libxml2's HTML4 parser to fail to recover when encountering a bare < character in some contexts. This version of Nokogiri restores the earlier behavior, which is to recover from the parse error and treat the < as normal character data (which will be serialized as &lt; in a text node). The bug (and the fix) is only relevant when the RECOVER parse option is set, as it is by default. [#2461]
-
v1.13.2 Changes
February 21, 2022
🔒 Security
- ⚡️ [CRuby] Vendored libxml2 is updated from 2.9.12 to 2.9.13. This update addresses CVE-2022-23308.
- ⚡️ [CRuby] Vendored libxslt is updated from 1.1.34 to 1.1.35. This update addresses CVE-2021-30560.
🔒 Please see GHSA-fq42-c5rg-92c2 for more information about these CVEs.
Dependencies
- ⚡️ [CRuby] Vendored libxml2 is updated from 2.9.12 to 2.9.13. Full changelog is available at https://download.gnome.org/sources/libxml2/2.9/libxml2-2.9.13.news
- ⚡️ [CRuby] Vendored libxslt is updated from 1.1.34 to 1.1.35. Full changelog is available at https://download.gnome.org/sources/libxslt/1.1/libxslt-1.1.35.news
-
v1.13.1 Changes
January 13, 2022
🛠 Fixed
- 💅 Fix Nokogiri::XSLT.quote_params regression in v1.13.0 that raised an exception when non-string stylesheet parameters were passed. Non-string parameters (e.g., integers and symbols) are now explicitly supported and both keys and values will be stringified with #to_s. [#2418]
- 🛠 Fix CSS selector query regression in v1.13.0 that raised a Nokogiri::XML::XPath::SyntaxError when parsing XPath attributes mixed into the CSS query. Although this mash-up of XPath and CSS syntax previously worked unintentionally, it is now an officially supported feature and is documented as such. [#2419]
-
v1.13.0 Changes
January 06, 2022
Notes
💎 Ruby
🚀 This release introduces native gem support for Ruby 3.1. Please note that Windows users should use the x64-mingw-ucrt platform gem for Ruby 3.1, and x64-mingw32 for Ruby 2.6–3.0 (see RubyInstaller 3.1.0 release notes).
🚀 This release ends support for:
- 💎 Ruby 2.5, for which official support ended 2021-03-31.
- 🚀 JRuby 9.2, which is a Ruby 2.5-compatible release.
🐧 Faster, more reliable installation: Native Gem for ARM64 Linux
🐧 This version of Nokogiri ships experimental native gem support for the aarch64-linux platform, which should support AWS Graviton and other ARM Linux platforms. We don't yet have CI running for this platform, and so we're interested in hearing back from y'all whether this is working, and what problems you're seeing. Please send us feedback here: Feedback: Have you used the aarch64-linux native gem?
Publishing
💎 This version of Nokogiri opts in to the "MFA required to publish" setting on Rubygems.org. This and all future Nokogiri gem files must be published to Rubygems by an account with multi-factor authentication enabled. This should provide some additional protection against supply-chain attacks.
A related discussion about Trust exists at #2357 in which I invite you to participate if you have feelings or opinions on this topic.
Dependencies
- ⚡️ [CRuby] Vendored libiconv is updated from 1.15 to 1.16. (Note that libiconv is only redistributed in the native windows and native darwin gems, see LICENSE-DEPENDENCIES.md for more information.) [#2206]
- ⬆️ [CRuby] Upgrade mini_portile2 dependency from ~> 2.6.1 to ~> 2.7.0. ("ruby" platform gem only.)
👌 Improved
- 📜 {XML,HTML4}::DocumentFragment constructors all now take an optional parse options parameter or block (similar to Document constructors). [#1692] (Thanks, @JackMc!)
- Nokogiri::CSS.xpath_for allows an XPathVisitor to be injected, for finer-grained control over how CSS queries are translated into XPath.
- 📜 [CRuby] XML::Reader#encoding will return the encoding detected by the parser when it's not passed to the constructor. [#980]
- 💎 [CRuby] Handle abruptly-closed HTML comments as recommended by WHATWG. (Thanks to tehryanx for reporting!)
- [CRuby] Node#line is no longer capped at 65535. libxml v2.9.0 and later support a new parse option, exposed as Nokogiri::XML::ParseOptions::PARSE_BIG_LINES, which is turned on by default in ParseOptions::DEFAULT_{XML,XSLT,HTML,SCHEMA}. (Note that JRuby already supported large line numbers.) [#1764, #1493, #1617, #1505, #1003, #533]
- 💎 [CRuby] If a cycle is introduced when reparenting a node (i.e., the node becomes its own ancestor), a RuntimeError is raised. libxml2 does no checking for this, which means cycles would otherwise result in infinite loops on subsequent operations. (Note that JRuby already did this.) [#1912]
- 🏗 [CRuby] Source builds will download zlib and libiconv via HTTPS. ("ruby" platform gem only.) [#2391] (Thanks, @jmartin-r7!)
- [JRuby] Node#line behavior has been modified to return the line number of the node in the final DOM structure. This behavior is different from CRuby, which returns the node's position in the input string. Ideally the two implementations would be the same, but at least the behavior is now officially documented and tested. The real-world impact of this change is that the value returned in JRuby is greater by 1 to account for the XML prolog in the output. [#2380] (Thanks, @dabdine!)
🛠 Fixed
- CSS queries on HTML5 documents now correctly match foreign elements (SVG, MathML) when namespaces are not specified in the query. [#2376]
- 🏗 XML::Builder blocks restore context properly when exceptions are raised. [#2372] (Thanks, @ric2b and @rinthedev!)
- 🔧 The Nokogiri::CSS::Parser cache now uses the XPathVisitor configuration as part of the cache key, preventing incorrect cache results from being returned when multiple XPathVisitor options are being used.
- 📜 Error recovery from in-context parsing (e.g., Node#parse) now always uses the correct DocumentFragment class. Previously Nokogiri::HTML4::DocumentFragment was always used, even for XML documents. [#1158]
- DocumentFragment#> now works properly, matching a CSS selector against only the fragment roots. [#1857]
- 📜 XML::DocumentFragment#errors now correctly contains any parsing errors encountered. Previously this was always empty. (Note that HTML::DocumentFragment#errors already did this.)
- 💎 [CRuby] Fix memory leak in Document#canonicalize when inclusive namespaces are passed in. [#2345]
- 💎 [CRuby] Fix memory leak in Document#canonicalize when an argument type error is raised. [#2345]
- 💎 [CRuby] Fix memory leak in EncodingHandler where iconv handlers were not being cleaned up. [#2345]
- 💎 [CRuby] Fix memory leak in XPath custom handlers where string arguments were not being cleaned up. [#2345]
- 💎 [CRuby] Fix memory leak in Reader#base_uri where the string returned by libxml2 was not freed. [#2347]
- 0️⃣ [JRuby] Deleting a Namespace from a NodeSet no longer modifies the href to be the default namespace URL.
- 💎 [JRuby] Fix XHTML formatting of closing tags for non-container elements. [#2355]
🗄 Deprecated
- 🗄 Passing a Nokogiri::XML::Node as the second parameter to Node.new is deprecated and will generate a warning. This parameter should be a kind of Nokogiri::XML::Document. This will become an error in a future version of Nokogiri. [#975]
- 📜 Nokogiri::CSS::Parser, Nokogiri::CSS::Tokenizer, and Nokogiri::CSS::Node are now internal-only APIs that are no longer documented, and should not be considered stable. With the introduction of XPathVisitor injection into Nokogiri::CSS.xpath_for there should be no reason to rely on these internal APIs.
- 🚚 CSS-to-XPath utility classes Nokogiri::CSS::XPathVisitorAlwaysUseBuiltins and XPathVisitorOptimallyUseBuiltins are deprecated. Prefer Nokogiri::CSS::XPathVisitor with appropriate constructor arguments. These classes will be removed in a future version of Nokogiri.