
SwiftSoup



SwiftSoup is a pure Swift library for parsing and manipulating HTML on macOS, iOS, tvOS, watchOS, and Linux. Its API combines DOM traversal, CSS selectors, and jQuery-like methods for extracting and transforming data. SwiftSoup implements the WHATWG HTML5 specification and parses HTML into the same DOM that modern browsers produce.

Key Features:

SwiftSoup is designed to handle all types of HTML—whether perfectly structured or messy tag soup—ensuring a logical and reliable parse tree in every scenario.


Swift version compatibility

Swift 5: SwiftSoup >= 2.0.0

Swift 4.2: SwiftSoup 1.7.4

Installation

CocoaPods

SwiftSoup is available through CocoaPods. To install it, simply add the following line to your Podfile:

pod 'SwiftSoup'

Carthage

SwiftSoup is also available through Carthage. To install it, simply add the following line to your Cartfile:

github "scinfu/SwiftSoup"

Swift Package Manager

SwiftSoup is also available through Swift Package Manager. To install it, add the dependency to your Package.swift file:

...
dependencies: [
    .package(url: "https://github.com/scinfu/SwiftSoup.git", from: "2.6.0"),
],
targets: [
    .target( name: "YourTarget", dependencies: ["SwiftSoup"]),
]
...

Usage Examples

Parse an HTML Document

import SwiftSoup

let html = """
<html><head><title>Example</title></head>
<body><p>Hello, SwiftSoup!</p></body></html>
"""

let document: Document = try SwiftSoup.parse(html)
print(try document.title()) // Output: Example
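
The snippets in this section use a bare `try`, which only compiles in a throwing context such as top-level code in main.swift or a `throws` function. Elsewhere, the calls need a do/catch. A minimal sketch, assuming a generic `catch` is enough for your use case; the helper name `pageTitle(of:)` is hypothetical and not part of SwiftSoup:

```swift
import SwiftSoup

/// Hypothetical helper: returns the document title, or nil if parsing fails.
func pageTitle(of html: String) -> String? {
    do {
        let document = try SwiftSoup.parse(html)
        return try document.title()
    } catch {
        // SwiftSoup reports failures by throwing; log the error and fall back to nil.
        print("Failed to read title: \(error)")
        return nil
    }
}

print(pageTitle(of: "<html><head><title>Example</title></head></html>") ?? "no title")
```

Returning an optional keeps error handling in one place; callers that need the failure reason can let the function rethrow instead.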

Select Elements with CSS Query

let html = """
<html><body>
<p class='message'>SwiftSoup is powerful!</p>
<p class='message'>Parsing HTML in Swift</p>
</body></html>
"""

let document = try SwiftSoup.parse(html)
let messages = try document.select("p.message")

for message in messages {
    print(try message.text())
}
// Output:
// SwiftSoup is powerful!
// Parsing HTML in Swift

Extract Text and Attributes

let html = "<a href='https://example.com'>Visit the site</a>"
let document = try SwiftSoup.parse(html)
let link = try document.select("a").first()

if let link = link {
    print(try link.text()) // Output: Visit the site
    print(try link.attr("href")) // Output: https://example.com
}

Modify the DOM

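// Document is a class, so a constant reference is enough to mutate its tree.
let document = try SwiftSoup.parse("<div id='content'></div>")
let div = try document.select("#content").first()
try div?.append("<p>New content added!</p>")
print(try document.html())
// Output (shown on one line; the default pretty-printing adds line breaks and indentation):
// <html><head></head><body><div id="content"><p>New content added!</p></div></body></html>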

Clean HTML for Security (Whitelist)

let dirtyHtml = "<script>alert('Hacked!')</script><b>Important text</b>"
// clean(_:_:) returns an optional String, so unwrap it before printing.
let cleanHtml = try SwiftSoup.clean(dirtyHtml, Whitelist.basic())
print(cleanHtml ?? "") // Output: <b>Important text</b>

Use CSS selectors to find elements

(from jsoup)

Selector overview

Selector combinations

Pseudo selectors

Text content pseudo selectors

Structural pseudo selectors
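
The selector categories named above can be illustrated with one query each. This is a minimal sketch using selector syntax inherited from jsoup; the sample markup and variable names are made up for illustration:

```swift
import SwiftSoup

let html = """
<div class="menu">
  <ul>
    <li><a href="/home">Home</a></li>
    <li><a href="/docs">Docs for Swift</a></li>
    <li><span>No link</span></li>
  </ul>
</div>
"""
let document = try SwiftSoup.parse(html)

// Attribute selector: every <a> that has an href attribute.
let links = try document.select("a[href]")
// Combination: <li> elements that are direct children of a <ul> inside .menu.
let items = try document.select("div.menu ul > li")
// Pseudo selector: <li> elements that contain an <a> descendant.
let itemsWithLinks = try document.select("li:has(a)")
// Text content pseudo selector: <a> elements whose text contains "Swift".
let swiftLinks = try document.select("a:contains(Swift)")
// Structural pseudo selector: <li> elements that are the first child of their parent.
let firstItems = try document.select("li:first-child")

print(links.size(), items.size(), itemsWithLinks.size())
print(try swiftLinks.text())
print(try firstItems.text())
```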

Optimize repeated queries

SwiftSoup automatically caches parsed CSS queries, which speeds up repeated queries and the parsing of related queries.

The cache is controlled through the static property QueryParser.cache. By default, it is initialized with a reasonable size limit. You may replace the cache at any time; however, assigning a new cache instance will discard all previously cached values.

// Remove any cache limits.
QueryParser.cache = QueryParser.DefaultCache(limit: .unlimited)
// Limit to 1000 items. See also documentation for ``QueryParserCache/set(_:_:)``.
QueryParser.cache = QueryParser.DefaultCache(limit: .count(1000))

An alternative is to parse the query upfront and pass an Evaluator instead of a query string. Since Evaluator instances are immutable, they are safe to store in (static) properties or pass across isolation boundaries.

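// Illustrative input (an assumption, not from the original example): any Elements collection works here.
let elements: Elements = try SwiftSoup.parse("<div><p>One</p></div>").select("div")
// Parse the query once and reuse the resulting Evaluator for every element.
let eval = try QueryParser.parse("div > p")
for element in elements {
    print(try element.select(eval).text())
}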
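
Storing a pre-parsed Evaluator in a static property, as mentioned above, might look like the following. This is a sketch only: the `Queries` namespace and its member name are hypothetical, and `try!` is used on the assumption that the query literal is known to be valid:

```swift
import SwiftSoup

// Hypothetical namespace for queries that are parsed once and reused.
enum Queries {
    // try! is acceptable here only because the query string is a fixed, known-valid literal.
    static let paragraphInDiv: Evaluator = try! QueryParser.parse("div > p")
}

let document = try SwiftSoup.parse("<div><p>First</p></div><div><p>Second</p></div>")
// Pass the stored Evaluator instead of a query string; no query parsing happens here.
let paragraphs = try document.select(Queries.paragraphInDiv)
print(try paragraphs.text())
```

Compared with relying on the query cache, this skips the cache lookup as well and makes the set of queries explicit in one place.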

Author

Nabil Chatbi, scinfu@gmail.com

Current maintainer: Alex Ehlke, available for hire for SwiftSoup related work or other iOS projects: alex dot ehlke at gmail

Note

SwiftSoup was ported to Swift from the Java library jsoup.

License

SwiftSoup is available under the MIT license. See the LICENSE file for more info.