Changelog History

  • v1.13.8 Changes

    July 23, 2022

    Deprecated

    • XML::Reader#attribute_nodes is deprecated due to incompatibility between libxml2's xmlReader memory semantics and Ruby's garbage collector. Although this method continues to exist for backwards compatibility, it is unsafe to call and may segfault. This method will be removed in a future version of Nokogiri, and callers should use #attribute_hash instead. [#2598]

    Improvements

    • XML::Reader#attribute_hash is a new method to safely retrieve the attributes of a node from XML::Reader. [#2598, #2599]

    Fixed

    • [CRuby] XML::Reader#attributes is now safe to call. In Nokogiri <= 1.13.7, this method may segfault. [#2598, #2599]
  • v1.13.7 Changes

    July 12, 2022

    Fixed

    • XML::Node objects, when compacted, update their internal struct's reference to the Ruby object wrapper. Previously, with GC compaction enabled, a segmentation fault was possible after compaction was triggered. [#2578] (Thanks, @eightbitraptor!)

  • v1.13.6 Changes

    May 08, 2022

    Security

    • [CRuby] Address CVE-2022-29181, improper handling of unexpected data types, related to untrusted inputs to the SAX parsers. See GHSA-xh29-r2w5-wx8m for more information.

    Improvements

    • {HTML4,XML}::SAX::{Parser,ParserContext} constructor methods now raise TypeError instead of segfaulting when an incorrect type is passed.
  • v1.13.5 Changes

    May 04, 2022

    Security

    Dependencies

    • [CRuby] Vendored libxml2 is updated from v2.9.13 to v2.9.14.

    Improvements

    • [CRuby] The libxml2 HTML parser no longer exhibits quadratic behavior when recovering some broken markup related to start-of-tag and bare < characters.

    Changed

    • [CRuby] The libxml2 HTML parser in v2.9.14 recovers from some broken markup differently. Notably, the XML CDATA escape sequence <![CDATA[ and incorrectly-opened comments will result in HTML text nodes starting with &lt;! instead of skipping the invalid tag. This behavior is a direct result of the quadratic-behavior fix noted above. The behavior of downstream sanitizers relying on this behavior will also change. Some tests describing the changed behavior are in test/html4/test_comments.rb.
  • v1.13.4 Changes

    April 11, 2022

    Security

    Dependencies

    • [CRuby] Vendored zlib is updated from 1.2.11 to 1.2.12. (See LICENSE-DEPENDENCIES.md for details on which packages redistribute this library.)
    • [JRuby] Vendored Xerces-J (xerces:xercesImpl) is updated from 2.12.0 to 2.12.2.
    • [JRuby] Vendored nekohtml (org.cyberneko.html) is updated from a fork of 1.9.21 to 1.9.22.noko2. This fork is now publicly developed at https://github.com/sparklemotion/nekohtml
  • v1.13.3 Changes

    February 21, 2022

    Fixed

    • [CRuby] Revert an HTML4 parser bug in libxml 2.9.13 (introduced in Nokogiri v1.13.2). The bug causes libxml2's HTML4 parser to fail to recover when encountering a bare < character in some contexts. This version of Nokogiri restores the earlier behavior, which is to recover from the parse error and treat the < as normal character data (which will be serialized as &lt; in a text node). The bug (and the fix) is only relevant when the RECOVER parse option is set, as it is by default. [#2461]
  • v1.13.2 Changes

    February 21, 2022

    Security

    • [CRuby] Vendored libxml2 is updated from 2.9.12 to 2.9.13. This update addresses CVE-2022-23308.
    • [CRuby] Vendored libxslt is updated from 1.1.34 to 1.1.35. This update addresses CVE-2021-30560.

    Please see GHSA-fq42-c5rg-92c2 for more information about these CVEs.

    Dependencies

  • v1.13.1 Changes

    January 13, 2022

    Fixed

    • Fix Nokogiri::XSLT.quote_params regression in v1.13.0 that raised an exception when non-string stylesheet parameters were passed. Non-string parameters (e.g., integers and symbols) are now explicitly supported and both keys and values will be stringified with #to_s. [#2418]
    • Fix CSS selector query regression in v1.13.0 that raised a Nokogiri::XML::XPath::SyntaxError when parsing XPath attributes mixed into the CSS query. Although this mash-up of XPath and CSS syntax previously worked unintentionally, it is now an officially supported feature and is documented as such. [#2419]
  • v1.13.0 Changes

    January 06, 2022

    Notes

    Ruby

    This release introduces native gem support for Ruby 3.1. Please note that Windows users should use the x64-mingw-ucrt platform gem for Ruby 3.1, and x64-mingw32 for Ruby 2.6–3.0 (see RubyInstaller 3.1.0 release notes).

    This release ends support for:

    Faster, more reliable installation: Native Gem for ARM64 Linux

    This version of Nokogiri ships experimental native gem support for the aarch64-linux platform, which should support AWS Graviton and other ARM Linux platforms. We don't yet have CI running for this platform, and so we're interested in hearing back from y'all whether this is working, and what problems you're seeing. Please send us feedback here: Feedback: Have you used the aarch64-linux native gem?

    Publishing

    This version of Nokogiri opts in to the "MFA required to publish" setting on Rubygems.org. This and all future Nokogiri gem files must be published to Rubygems by an account with multi-factor authentication enabled. This should provide some additional protection against supply-chain attacks.

    A related discussion about Trust exists at #2357 in which I invite you to participate if you have feelings or opinions on this topic.

    Dependencies

    • [CRuby] Vendored libiconv is updated from 1.15 to 1.16. (Note that libiconv is only redistributed in the native windows and native darwin gems; see LICENSE-DEPENDENCIES.md for more information.) [#2206]
    • [CRuby] Upgrade mini_portile2 dependency from ~> 2.6.1 to ~> 2.7.0. ("ruby" platform gem only.)

    Improved

    • {XML,HTML4}::DocumentFragment constructors all now take an optional parse options parameter or block (similar to Document constructors). [#1692] (Thanks, @JackMc!)
    • Nokogiri::CSS.xpath_for allows an XPathVisitor to be injected, for finer-grained control over how CSS queries are translated into XPath.
    • [CRuby] XML::Reader#encoding will return the encoding detected by the parser when it's not passed to the constructor. [#980]
    • [CRuby] Handle abruptly-closed HTML comments as recommended by WHATWG. (Thanks to tehryanx for reporting!)
    • [CRuby] Node#line is no longer capped at 65535. libxml v2.9.0 and later support a new parse option, exposed as Nokogiri::XML::ParseOptions::PARSE_BIG_LINES, which is turned on by default in ParseOptions::DEFAULT_{XML,XSLT,HTML,SCHEMA}. (Note that JRuby already supported large line numbers.) [#1764, #1493, #1617, #1505, #1003, #533]
    • [CRuby] If a cycle is introduced when reparenting a node (i.e., the node becomes its own ancestor), a RuntimeError is raised. libxml2 does no checking for this, which means cycles would otherwise result in infinite loops on subsequent operations. (Note that JRuby already did this.) [#1912]
    • [CRuby] Source builds will download zlib and libiconv via HTTPS. ("ruby" platform gem only.) [#2391] (Thanks, @jmartin-r7!)
    • [JRuby] Node#line behavior has been modified to return the line number of the node in the final DOM structure. This behavior is different from CRuby, which returns the node's position in the input string. Ideally the two implementations would be the same, but at least the behavior is now officially documented and tested. The real-world impact of this change is that the value returned in JRuby is greater by 1 to account for the XML prolog in the output. [#2380] (Thanks, @dabdine!)

    Fixed

    • CSS queries on HTML5 documents now correctly match foreign elements (SVG, MathML) when namespaces are not specified in the query. [#2376]
    • XML::Builder blocks restore context properly when exceptions are raised. [#2372] (Thanks, @ric2b and @rinthedev!)
    • The Nokogiri::CSS::Parser cache now uses the XPathVisitor configuration as part of the cache key, preventing incorrect cache results from being returned when multiple XPathVisitor options are being used.
    • Error recovery from in-context parsing (e.g., Node#parse) now always uses the correct DocumentFragment class. Previously Nokogiri::HTML4::DocumentFragment was always used, even for XML documents. [#1158]
    • DocumentFragment#> now works properly, matching a CSS selector against only the fragment roots. [#1857]
    • XML::DocumentFragment#errors now correctly contains any parsing errors encountered. Previously this was always empty. (Note that HTML::DocumentFragment#errors already did this.)
    • [CRuby] Fix memory leak in Document#canonicalize when inclusive namespaces are passed in. [#2345]
    • [CRuby] Fix memory leak in Document#canonicalize when an argument type error is raised. [#2345]
    • [CRuby] Fix memory leak in EncodingHandler where iconv handlers were not being cleaned up. [#2345]
    • [CRuby] Fix memory leak in XPath custom handlers where string arguments were not being cleaned up. [#2345]
    • [CRuby] Fix memory leak in Reader#base_uri where the string returned by libxml2 was not freed. [#2347]
    • [JRuby] Deleting a Namespace from a NodeSet no longer modifies the href to be the default namespace URL.
    • [JRuby] Fix XHTML formatting of closing tags for non-container elements. [#2355]

    Deprecated

    • Passing a Nokogiri::XML::Node as the second parameter to Node.new is deprecated and will generate a warning. This parameter should be a kind of Nokogiri::XML::Document. This will become an error in a future version of Nokogiri. [#975]
    • Nokogiri::CSS::Parser, Nokogiri::CSS::Tokenizer, and Nokogiri::CSS::Node are now internal-only APIs that are no longer documented, and should not be considered stable. With the introduction of XPathVisitor injection into Nokogiri::CSS.xpath_for there should be no reason to rely on these internal APIs.
    • CSS-to-XPath utility classes Nokogiri::CSS::XPathVisitorAlwaysUseBuiltins and XPathVisitorOptimallyUseBuiltins are deprecated. Prefer Nokogiri::CSS::XPathVisitor with appropriate constructor arguments. These classes will be removed in a future version of Nokogiri.
  • v1.12.5 Changes

    September 27, 2021

    Security

    [JRuby] Address CVE-2021-41098 (GHSA-2rr5-8q37-2w7h).

    In Nokogiri v1.12.4 and earlier, on JRuby only, the SAX parsers resolve external entities (XXE) by default. This fix turns off entity-resolution-by-default in the JRuby SAX parsers to match the CRuby SAX parsers' behavior.

    CRuby users are not affected by this CVE.

    Fixed

    • [CRuby] Document#to_xhtml properly serializes self-closing tags in libxml > 2.9.10. A behavior change introduced in libxml 2.9.11 resulted in emitting start and end tags (e.g., <br></br>) instead of a self-closing tag (e.g., <br/>) in previous Nokogiri versions. [#2324]