mkoehnke / WKZombie
- Monday, April 4, 2016, 03:12:01
Swift
WKZombie is a Swift library for iOS/OSX that navigates websites and collects data without needing a user interface or an API, an approach also known as a headless browser. In addition, it can be used to run automated tests or manipulate websites using JavaScript.
WKZombie is an iOS/OSX web-browser without a graphical user interface. It was developed as an experiment in order to familiarize myself with using functional concepts written in Swift (>= 2.2).
It incorporates WebKit (WKWebView) for rendering and hpple (libxml2) for parsing the HTML content. In addition, it has rudimentary support for parsing and decoding JSON elements. Chaining asynchronous actions makes the code compact and easy to use.
For more information, see Usage.
There are many use cases for a Headless Browser. Some of them are:
The following example is supposed to demonstrate the WKZombie functionality. Let's assume that we want to show all iOS Provisioning Profiles in the Apple Developer Portal.
When using a common web-browser (e.g. Mobile Safari) on iOS, you would typically type in your credentials, sign in and navigate (via links) to the Provisioning Profiles section:
The same navigation process can be reproduced automatically within an iOS/OSX app by chaining WKZombie Actions. In addition, it is now possible to manipulate or display this data natively with UITextField, UIButton and UITableView. Take a look at the demo project to see how to use it.
A WKZombie instance equates to a web session, which can be created using the following line:
let browser = WKZombie(name: "Demo")
Web page navigation is based on Actions, which are executed implicitly when chained with the >>> operator. Each chained action passes its result to the next action. The === operator then starts the execution of the action chain. The following snippet demonstrates how you would use WKZombie to collect all Provisioning Profiles from the Developer Portal:
browser.open(url)
>>> browser.get(by: .Id("name"))
>>> browser.setAttribute("value", value: user)
>>> browser.get(by: .Id("password"))
>>> browser.setAttribute("value", value: password)
>>> browser.get(by: .Name("form"))
>>> browser.submit
>>> browser.get(by: .Attribute("href", "/account"))
>>> browser.click
>>> browser.get(by: .Text("Provisioning Profiles"))
>>> browser.click(then: .Wait(0.5))
>>> browser.getAll(by: .Class("ui-ellipsis bold"))
=== myOutput
In order to output or process the collected data, one can either use a closure or implement a custom function taking the WKZombie optional result as parameter:
func myOutput(result: [HTMLTableColumn]?) {
// handle result
}
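To make the chaining semantics more concrete, here is a minimal, self-contained sketch of how operators like >>> and === can be built. It is not WKZombie's implementation: the `Action` type below is a simplified stand-in that only carries an optional value, the helper functions `double` and `describe` are hypothetical, and the code uses current Swift syntax rather than Swift 2.2.

```swift
// Simplified stand-in for an action: wraps work that eventually
// reports an optional result to a completion handler.
struct Action<T> {
    let run: (@escaping (T?) -> Void) -> Void

    init(run: @escaping (@escaping (T?) -> Void) -> Void) { self.run = run }
    init(value: T?) { self.run = { completion in completion(value) } }

    func start(_ completion: @escaping (T?) -> Void) { run(completion) }

    // Runs this action, then feeds its result into the next one.
    // A nil result short-circuits the rest of the chain.
    func andThen<U>(_ next: @escaping (T) -> Action<U>) -> Action<U> {
        return Action<U> { completion in
            self.start { result in
                guard let value = result else { completion(nil); return }
                next(value).start(completion)
            }
        }
    }
}

precedencegroup ActionChaining {
    associativity: left
    higherThan: ComparisonPrecedence   // so >>> binds tighter than ===
}
infix operator >>> : ActionChaining

func >>> <T, U>(lhs: Action<T>, rhs: @escaping (T) -> Action<U>) -> Action<U> {
    return lhs.andThen(rhs)
}

// Starts the chain and hands the optional result to the output function.
func === <T>(lhs: Action<T>, rhs: @escaping (T?) -> Void) {
    lhs.start(rhs)
}

// Hypothetical actions used only for this demonstration.
func double(_ x: Int) -> Action<Int> { return Action(value: x * 2) }
func describe(_ x: Int) -> Action<String> { return Action(value: "value \(x)") }

var output: String?
Action(value: 21) >>> double >>> describe === { output = $0 }
// output is now "value 42"
```

Nothing runs until === calls `start`, which is why a chain can be assembled first and executed later.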
Actions can also be started manually by calling the start() method:
let action : Action<HTMLPage> = browser.open(url)
action.start { result in
switch result {
case .Success(let page): print(page) // process page
case .Error(let error): print(error) // handle error
}
}
This is certainly the less complicated way, but you have to write a lot more code, which might become confusing when you want to execute Actions successively.
There are currently a few Actions implemented, helping you visit and navigate within a website:
The returned WKZombie Action will load and return a HTML or JSON page for the specified URL.
func open<T : Page>(url: NSURL) -> Action<T>
Optionally, a PostAction can be passed. This is a special wait/validation action, that is performed after the page has finished loading. See PostAction for more information.
func open<T : Page>(then: PostAction)(url: NSURL) -> Action<T>
The returned WKZombie Action will submit the specified HTML form.
func submit<T : Page>(form: HTMLForm) -> Action<T>
Optionally, a PostAction can be passed. See PostAction for more information.
func submit<T : Page>(then: PostAction)(form: HTMLForm) -> Action<T>
The returned WKZombie Action will simulate the click of a HTML link.
func click<T: Page>(link : HTMLLink) -> Action<T>
Optionally, a PostAction can be passed. See PostAction for more information.
func click<T: Page>(then: PostAction)(link : HTMLLink) -> Action<T>
The returned WKZombie Action will search the specified HTML page and return the first element matching the generic HTML element type and passed SearchType.
func get<T: HTMLElement>(by: SearchType<T>)(page: HTMLPage) -> Action<T>
The returned WKZombie Action will search the specified HTML page and return all elements matching the generic HTML element type and passed SearchType.
func getAll<T: HTMLElement>(by: SearchType<T>)(page: HTMLPage) -> Action<[T]>
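The signatures above use the curried declaration syntax of Swift 2.2, which was removed in Swift 3. The sketch below shows the same idea in current Swift, where a function returns a closure instead. `Page` and `get(byID:)` are hypothetical stand-ins, not WKZombie types; the point is only that supplying the first argument yields a function that still expects the page, which is what allows such calls to be placed in a chain.

```swift
// Hypothetical, heavily simplified page model: maps element ids to markup.
struct Page {
    let elementsByID: [String: String]
}

// Curried lookup: the search criterion is fixed first, the page is supplied later.
func get(byID id: String) -> (Page) -> String? {
    return { page in page.elementsByID[id] }
}

let page = Page(elementsByID: ["name": "<input id=\"name\">"])

let findName = get(byID: "name")   // type: (Page) -> String?
let element = findName(page)       // applies the stored criterion to the page
```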
The returned WKZombie Action will set or update an existing attribute/value pair on the specified HTMLElement.
func setAttribute<T: HTMLElement>(key: String, value: String?)(element: T) -> Action<HTMLPage>
Some HTMLElements that implement the HTMLFetchable protocol (e.g. HTMLLink or HTMLImage) contain attributes like "src" or "href" that link to remote objects or data. The following method returns a WKZombie Action that can conveniently download this data:
func fetch<T: HTMLFetchable>(fetchable: T) -> Action<T>
Once the fetch method has been executed, the data can be retrieved and converted. The following example shows how to convert data, fetched from a link, into a UIImage:
let image : UIImage? = link.fetchedContent()
Fetched data can be converted into types, that implement the HTMLFetchableContent protocol. The following types are currently supported:
Note: See the OSX example for more info on how to use this.
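The conversion in `fetchedContent()` is selected by the type annotation on the left-hand side. A rough sketch of how such return-type-driven conversion can work is shown below. The protocol `FetchableContent`, its `instance(from:)` requirement and the `FetchedLink` struct are assumptions made for illustration and do not reproduce WKZombie's HTMLFetchableContent protocol.

```swift
import Foundation

// Hypothetical protocol: a type that can be built from downloaded bytes.
protocol FetchableContent {
    static func instance(from data: Data) -> Self?
}

extension String: FetchableContent {
    static func instance(from data: Data) -> String? {
        return String(data: data, encoding: .utf8)
    }
}

extension Data: FetchableContent {
    static func instance(from data: Data) -> Data? { return data }
}

// Hypothetical element that already holds its fetched bytes.
struct FetchedLink {
    var fetchedData: Data?

    // The caller's type annotation decides which conversion is used.
    func fetchedContent<T: FetchableContent>() -> T? {
        return fetchedData.flatMap { T.instance(from: $0) }
    }
}

let link = FetchedLink(fetchedData: Data("hello".utf8))
let text: String? = link.fetchedContent()   // decoded as UTF-8 text
let raw: Data? = link.fetchedContent()      // returned unchanged
```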
The returned WKZombie Action will transform a HTMLElement into another HTMLElement using the specified function f.
func map<T: HTMLElement, A: HTMLElement>(f: T -> A)(element: T) -> Action<A>
Some Actions, that incorporate a (re-)loading of webpages (e.g. open, submit, etc.), have PostActions available. A PostAction is a wait or validation action, that will be performed after the page has finished loading:
PostAction | Description |
---|---|
Wait (Seconds) | The time in seconds that the action will wait (after the page has been loaded) before returning. This is useful in cases where the page loading has been completed, but some JavaScript/Image loading is still in progress. |
Validate (Javascript) | The action will complete if the specified JavaScript expression/script returns 'true' or a timeout occurs. |
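The two PostActions can be pictured with the following sketch. It is a simplified model under several assumptions: the `perform` helper, its `timeout` and `pollInterval` parameters and the blocking `Thread.sleep` calls are hypothetical, and the `evaluate` closure stands in for running JavaScript inside the web view.

```swift
import Foundation

enum PostAction {
    case wait(TimeInterval)   // pause for a fixed time after loading
    case validate(String)     // poll a JavaScript expression until it is true
}

// Hypothetical helper. Returns true if the post action was satisfied,
// false if validation timed out. Blocks the calling thread for simplicity.
func perform(_ postAction: PostAction,
             timeout: TimeInterval = 1.0,
             pollInterval: TimeInterval = 0.05,
             evaluate: (String) -> Bool) -> Bool {
    switch postAction {
    case .wait(let seconds):
        Thread.sleep(forTimeInterval: seconds)
        return true
    case .validate(let script):
        let deadline = Date().addingTimeInterval(timeout)
        while Date() < deadline {
            if evaluate(script) { return true }
            Thread.sleep(forTimeInterval: pollInterval)
        }
        return false
    }
}

// Simulated page whose script only becomes true on the third check.
var checks = 0
let validated = perform(.validate("document.readyState == 'complete'"), evaluate: { _ in
    checks += 1
    return checks >= 3
})
```

Either way the action returns control to the chain, which matches the table: Validate completes when the script returns true or when the timeout is reached.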
In order to find certain HTML elements within a page, you have to specify a SearchType. The return type of get() and getAll() is generic and determines which tag should be searched for. For instance, the following would return all links with the class book:
let books : Action<[HTMLLink]> = browser.getAll(by: .Class("book"))(page: htmlPage)
The following 6 types are currently available and supported:
SearchType | Description |
---|---|
Id (String) | Returns an element that matches the specified id. |
Name (String) | Returns all elements matching the specified value for their name attribute. |
Text (String) | Returns all elements with inner content, that contain the specified text. |
Class (String) | Returns all elements that match the specified class name. |
Attribute (String, String) | Returns all elements that match the specified attribute name/value combination. |
XPathQuery (String) | Returns all elements that match the specified XPath query. |
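Since the HTML is parsed with hpple, each SearchType ultimately has to become an XPath query. The sketch below shows one way such a mapping could look. The enum is a simplified, non-generic stand-in, the `xPath(forTag:)` method is hypothetical, and the generated queries are assumptions rather than the queries WKZombie actually produces.

```swift
// Simplified stand-in for SearchType; `cssClass` corresponds to Class in the table.
enum SearchType {
    case id(String)
    case name(String)
    case text(String)
    case cssClass(String)
    case attribute(String, String)
    case xPathQuery(String)

    // Builds an XPath query for the given tag name.
    // Values containing single quotes are not escaped in this sketch.
    func xPath(forTag tag: String) -> String {
        switch self {
        case .id(let value):
            return "//\(tag)[@id='\(value)']"
        case .name(let value):
            return "//\(tag)[@name='\(value)']"
        case .text(let value):
            return "//\(tag)[contains(text(),'\(value)')]"
        case .cssClass(let value):
            return "//\(tag)[@class='\(value)']"
        case .attribute(let key, let value):
            return "//\(tag)[@\(key)='\(value)']"
        case .xPathQuery(let query):
            return query   // passed through unchanged
        }
    }
}

let bookLinks = SearchType.cssClass("book").xPath(forTag: "a")
// bookLinks == "//a[@class='book']"
```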
The following Operators can be applied to Actions, which makes chained Actions easier to read:
Operator | Description |
---|---|
>>> | This Operator equates to the andThen() method. Here, the left-hand side Action will be started and the result is used as parameter for the right-hand side Action. |
=== | This Operator starts the left-hand side Action and passes the result as Optional to the function on the right-hand side. |
The returned WKZombie Action will make a bulk execution of the specified action function f with the provided input elements. Once all actions have finished executing, the collected results will be returned.
func batch<T, U>(f: T -> Action<U>)(elements: [T]) -> Action<[U]>
The returned WKZombie Action will execute the specified action (with the result of the previous action execution as input parameter) until a certain condition is met. Afterwards, it will return the collected action results.
func collect<T>(f: T -> Action<T>, until: T -> Bool)(initial: T) -> Action<[T]>
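The behaviour of batch and collect can be approximated with the sketch below. It rests on several assumptions: `Action` is a minimal stand-in that carries only an optional value, batch runs its actions one after another and skips failed ones, and collect stops as soon as `until` returns true. WKZombie's own implementation may differ on each of these points.

```swift
// Minimal stand-in for an action that reports an optional result.
struct Action<T> {
    let run: (@escaping (T?) -> Void) -> Void
    init(run: @escaping (@escaping (T?) -> Void) -> Void) { self.run = run }
    init(value: T?) { self.run = { completion in completion(value) } }
    func start(_ completion: @escaping (T?) -> Void) { run(completion) }
}

// Applies `f` to every element and gathers the results in order.
func batch<T, U>(_ f: @escaping (T) -> Action<U>) -> ([T]) -> Action<[U]> {
    return { elements in
        return Action<[U]> { completion in
            var results: [U] = []
            func step(_ index: Int) {
                guard index < elements.count else { completion(results); return }
                f(elements[index]).start { result in
                    if let value = result { results.append(value) }
                    step(index + 1)
                }
            }
            step(0)
        }
    }
}

// Feeds each result back into `f` until `until` returns true.
func collect<T>(_ f: @escaping (T) -> Action<T>,
                until: @escaping (T) -> Bool) -> (T) -> Action<[T]> {
    return { initial in
        return Action<[T]> { completion in
            var values: [T] = []
            func step(_ input: T) {
                f(input).start { result in
                    guard let value = result else { completion(values); return }
                    values.append(value)
                    if until(value) { completion(values) } else { step(value) }
                }
            }
            step(initial)
        }
    }
}

let square: (Int) -> Action<Int> = { Action(value: $0 * $0) }
var squares: [Int]?
batch(square)([1, 2, 3]).start { squares = $0 }
// squares == [1, 4, 9]

let increment: (Int) -> Action<Int> = { Action(value: $0 + 1) }
var counted: [Int]?
collect(increment, until: { $0 >= 3 })(0).start { counted = $0 }
// counted == [1, 2, 3]
```

A typical use of collect on a real site would be paging: `f` loads the next page and `until` checks whether the last page has been reached.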
This command is useful for debugging. It prints out the current state of the WKZombie browser represented as DOM.
func dump()
When using WKZombie, the following classes are involved when interacting with websites:
This class represents a read-only DOM of a website. It allows you to search for HTML elements using the SearchType parameter.
The HTMLElement class is a base class for all elements in the DOM. It allows you to inspect attributes or the inner content (e.g. text) of that element. Currently, there are 6 subclasses with additional element-specific methods and variables available:
Additional subclasses can be easily implemented and might be added in the future.
As mentioned above, WKZombie has rudimentary support for JSON documents.
For parsing and decoding JSON, the following methods and protocols are available:
The returned WKZombie Action will parse NSData and create a JSON object.
func parse<T: JSON>(data: NSData) -> Action<T>
The following methods return a WKZombie Action that takes a JSONParsable (Array, Dictionary or JSONPage) and decodes it into a Model object. The Model class has to implement the JSONDecodable protocol.
func decode<T : JSONDecodable>(element: JSONParsable) -> Action<T>
func decode<T : JSONDecodable>(array: JSONParsable) -> Action<[T]>
This protocol must be implemented by each class that is supposed to support JSON decoding. The implementation takes a JSONElement (Dictionary<String, AnyObject>) and creates an object instance of that class.
static func decode(json: JSONElement) -> Self?
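A model conforming to such a protocol could look like the sketch below. The protocol and typealias are redeclared locally so the example is self-contained; it uses current Swift syntax with `[String: Any]` instead of AnyObject, and the `Book` fields `title` and `author` as well as the sample JSON are hypothetical.

```swift
import Foundation

// Local stand-ins mirroring the shape described above.
typealias JSONElement = [String: Any]

protocol JSONDecodable {
    static func decode(_ json: JSONElement) -> Self?
}

// Hypothetical model: decoding fails (returns nil) if a field is missing.
struct Book: JSONDecodable {
    let title: String
    let author: String

    static func decode(_ json: JSONElement) -> Book? {
        guard let title = json["title"] as? String,
              let author = json["author"] as? String else { return nil }
        return Book(title: title, author: author)
    }
}

// Parse raw JSON data, then decode the resulting dictionary into a Book.
let data = Data("{\"title\": \"Dracula\", \"author\": \"Bram Stoker\"}".utf8)
let object = try? JSONSerialization.jsonObject(with: data)
let book = (object as? JSONElement).flatMap(Book.decode)
```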
The following example shows how to use JSON parsing/decoding in conjunction with WKZombie:
browser.open(bookURL)
>>> browser.decode
=== myOutput
func myOutput(result: Book?) {
// handle result
}
Mathias Köhnke @mkoehnke
WKZombie is available under the MIT license. See the LICENSE file for more info.
The release notes can be found here.