Composite Fetcher API

Summary

The goal of this proposal is to make the fetcher more flexible using a composite design pattern. We will introduce several Fetcher implementations to answer different needs, such as resource transformation and caching.

Motivation

The Fetcher component provides access to a publication’s resources. It is therefore a natural place to offer extensibility to reading apps. With a composite pattern, we can decorate the fetcher to add custom behaviors.

This will also lead to smaller, more focused implementations that are easier to unit test.

Location in the Toolkit

While the fetcher was historically part of the streamer component, this proposal requires moving it to shared, so that fetchers can be created from the streamer, the navigators and the OPDS libraries.

However, even though the streamer no longer contains the implementation of Fetcher, it is still responsible for assembling the composite fetcher structure for each kind of publication.

Developer Guide

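Publication resources are accessed using a root Fetcher object, which can be composed of sub-fetchers organized in a composite tree structure. Each Fetcher node takes on a particular responsibility: leaf fetchers handle the low-level access to the bytes, while composite fetchers delegate requests to child fetchers, for example to route, transform or cache resources.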

Accessing a Resource

You will never need to use the Fetcher API directly, which is a private detail of Publication. To read a resource, use Publication::get() which will delegate to the internal root Fetcher.

Publication::get() returns a Resource object, which is a proxy interface to the actual resource. The content is fetched lazily in Resource, for performance and medium limitation reasons (e.g. HTTP). Therefore, Publication::get() always returns a Resource, even if the resource doesn’t actually exist. Errors are handled at the Resource level.

let resource = publication.get("/manifest.json")

switch resource.readAsString() {
case .success(let content):
    let manifest = Manifest(jsonString: content)

case .failure(let error):
    print("An error occurred: \(error)")
}

Customizing The Root Fetcher

Fetcher trees are created by parsers, such as r2-streamer and r2-opds, when constructing the Publication object. However, you may want to decorate the root fetcher to modify its behavior, for example to transform or cache resources.

While the composition of the fetcher tree is private, you can wrap the tree in a custom root fetcher, for example by creating a copy of the Publication with a decorated fetcher:

func prepareForIndexing(resource: Resource) -> Resource {
    // ...
}

// The `manifest` and `fetcher` parameters are passed by reference (`inout`), so the closure can overwrite them.
let indexingPublication = publication.copy { (manifest: inout Manifest, fetcher: inout Fetcher) in
    fetcher = TransformingFetcher(fetcher, transformer: prepareForIndexing)
}

Backward Compatibility and Migration

Mobile (Swift & Kotlin)

This proposal is a non-breaking change, since it describes a structure that is mostly internal. The public features are new, such as adding resource transformers and decorating resource access.

Reference Guide

The following Fetcher implementations are here only to draft use cases, so they should be implemented only when actually needed.

Examples of Fetcher Trees

The fetcher tree created by publication parsers can be adapted to fit the characteristics of each format.

CBZ and ZAB (Zipped Audio Book)

These formats are very simple: we just need to access the ZIP entries.

Audiobook Manifest

The resources of a remote audiobook are fetched with HTTP requests, using an HTTPFetcher. However, we can implement an offline cache by wrapping the fetcher in a CachingFetcher.

LCP Protected Package (Audiobook, LCPDF, etc.)

The resources of a publication protected with LCP need to be decrypted. For that, we’re using a DecryptionTransformer embedded in a TransformingFetcher. Any remote resources declared in the manifest are fetched using an HTTPFetcher.

EPUB

The EPUB fetcher is one of the most complex:

Fetcher Interface

Provides access to a Resource from a Link.
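To make the shape of this interface concrete, here is a minimal sketch in Swift. Only `links` and `close()` are named in this proposal; `get(_:)` is assumed to mirror `Publication::get()`. The `Link`, `ResourceError`, `BytesResource`, `FailureResource` and `InMemoryFetcher` types are simplified, hypothetical stand-ins and not the toolkit's actual definitions.

```swift
// Simplified stand-in for the Link core model (assumption: only `href` is needed here).
struct Link: Equatable {
    let href: String
}

// Hypothetical error cases; `notFound` in particular is an assumption.
enum ResourceError: Error, Equatable {
    case notFound
    case unavailable
}

// Proxy to the actual resource: nothing is read until `read()` is called.
protocol Resource {
    var link: Link { get }
    func read() -> Result<[UInt8], ResourceError>
}

protocol Fetcher {
    /// Known resources available through this fetcher.
    var links: [Link] { get }
    /// Always returns a Resource; errors surface when the resource is read.
    func get(_ link: Link) -> Resource
    /// Releases any underlying handle (file, archive, connection).
    func close()
}

struct BytesResource: Resource {
    let link: Link
    let bytes: [UInt8]
    func read() -> Result<[UInt8], ResourceError> { .success(bytes) }
}

struct FailureResource: Resource {
    let link: Link
    let error: ResourceError
    func read() -> Result<[UInt8], ResourceError> { .failure(error) }
}

// Toy leaf fetcher backed by a dictionary, for illustration only.
final class InMemoryFetcher: Fetcher {
    private let entries: [String: [UInt8]]

    init(entries: [String: [UInt8]]) {
        self.entries = entries
    }

    var links: [Link] {
        entries.keys.sorted().map { Link(href: $0) }
    }

    func get(_ link: Link) -> Resource {
        guard let bytes = entries[link.href] else {
            // The lookup failure is deferred to the Resource level.
            return FailureResource(link: link, error: .notFound)
        }
        return BytesResource(link: link, bytes: bytes)
    }

    func close() {}
}

let fetcher = InMemoryFetcher(entries: ["/manifest.json": Array("{}".utf8)])
let missing = fetcher.get(Link(href: "/missing.html"))  // still a Resource
```

Note that `get(_:)` never fails in this sketch: a missing entry only produces an error once the returned resource is read, which matches the behavior described for `Publication::get()`.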

Properties

Methods

Resource Interface

Acts as a proxy to an actual resource by handling read access.

Resource caches the content lazily to optimize access to multiple properties, e.g. length, read(), etc.
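The lazy caching described above could look like the following sketch. `LazyResource`, its `loader` closure and the `loadCount` counter are hypothetical names introduced for illustration, and the choice to cache only successful reads is an assumption, not something this proposal specifies.

```swift
// Hypothetical error cases, restated so the block compiles on its own.
enum ResourceError: Error, Equatable {
    case notFound
    case unavailable
}

final class LazyResource {
    private let loader: () -> Result<[UInt8], ResourceError>
    private var cachedBytes: [UInt8]?
    // Counts actual loads; only here to make the caching observable.
    private(set) var loadCount = 0

    init(loader: @escaping () -> Result<[UInt8], ResourceError>) {
        self.loader = loader
    }

    func read() -> Result<[UInt8], ResourceError> {
        if let bytes = cachedBytes {
            return .success(bytes)
        }
        loadCount += 1
        let result = loader()
        // Assumption: failures are not cached, so a later read can retry.
        if case .success(let bytes) = result {
            cachedBytes = bytes
        }
        return result
    }

    // Derived from the cached content, so it does not trigger a second load.
    var length: Result<Int, ResourceError> {
        read().map { $0.count }
    }

    func readAsString() -> Result<String, ResourceError> {
        read().map { String(decoding: $0, as: UTF8.self) }
    }
}

let resource = LazyResource { .success(Array("hello".utf8)) }
let length = resource.length
let text = resource.readAsString()
```

In this sketch, asking for `length` and then calling `readAsString()` invokes the loader only once.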

Every failable API returns a Result<T, Resource.Error> containing either the value, or a Resource.Error enum with the following cases:

Properties

Methods

Implementations

Resource.Transformer Function Type

typealias Resource.Transformer = (Resource) -> Resource

Implements the transformation of a Resource. It can be used, for example, to decrypt a resource or to inject additional content into it.

If the transformation doesn’t apply, simply return resource unchanged.
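As an illustration, a transformer could be written along these lines. This is only a sketch: the `type` property on `Link`, the `MappedResource` wrapper and the `appendIndexingMarker` function are assumptions made for the example, and the function type is named `ResourceTransformer` here because a dotted type alias name is not valid Swift.

```swift
// Simplified Link with a media type (assumption for this example).
struct Link {
    let href: String
    let type: String?
}

enum ResourceError: Error, Equatable {
    case notFound
    case unavailable
}

protocol Resource {
    var link: Link { get }
    func read() -> Result<[UInt8], ResourceError>
}

typealias ResourceTransformer = (Resource) -> Resource

struct BytesResource: Resource {
    let link: Link
    let bytes: [UInt8]
    func read() -> Result<[UInt8], ResourceError> { .success(bytes) }
}

// Wraps another resource and maps its bytes when they are read.
struct MappedResource: Resource {
    let source: Resource
    let transform: ([UInt8]) -> [UInt8]

    var link: Link { source.link }

    func read() -> Result<[UInt8], ResourceError> {
        source.read().map(transform)
    }
}

// Hypothetical transformer: appends a marker comment to HTML resources only.
func appendIndexingMarker(_ resource: Resource) -> Resource {
    guard resource.link.type == "text/html" else {
        // The transformation doesn't apply: return the resource unchanged.
        return resource
    }
    return MappedResource(source: resource) { bytes in
        bytes + Array("<!-- indexed -->".utf8)
    }
}

let transformer: ResourceTransformer = appendIndexingMarker
```

Because the wrapper only maps the bytes inside `read()`, the transformation stays lazy and errors from the source resource pass through untouched.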

Leaf Fetchers

A leaf fetcher is an implementation of Fetcher handling the actual low-level bytes access. It doesn’t delegate to any other Fetcher.

FileFetcher Class

Provides access to resources on the local file system.

FileFetcher::links contains the recursive list of files in the reachable local paths, sorted alphabetically.

ArchiveFetcher Class

Provides access to entries of an Archive, such as ZIP.

ArchiveFetcher::links returns the archive’s entries list, in their archiving order.

ArchiveFetcher is responsible for the archive lifecycle, and should close it when Fetcher.close() is called. If a Resource tries to access an entry after the archive was closed, Resource.Error.Unavailable can be returned.

The specification of Archive is out of scope for this proposal.
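The lifecycle rule above can be sketched as follows. Since Archive is not specified here, `InMemoryArchive` is a purely hypothetical stand-in, and the `notFound` error case is likewise an assumption.

```swift
struct Link: Equatable {
    let href: String
}

enum ResourceError: Error, Equatable {
    case notFound
    case unavailable
}

protocol Resource {
    var link: Link { get }
    func read() -> Result<[UInt8], ResourceError>
}

protocol Fetcher {
    var links: [Link] { get }
    func get(_ link: Link) -> Resource
    func close()
}

// Hypothetical archive holding its entries in memory, in archiving order.
final class InMemoryArchive {
    let entries: [(path: String, bytes: [UInt8])]
    private(set) var isOpen = true

    init(entries: [(path: String, bytes: [UInt8])]) {
        self.entries = entries
    }

    func close() {
        isOpen = false
    }
}

struct ArchiveResource: Resource {
    let link: Link
    let archive: InMemoryArchive

    func read() -> Result<[UInt8], ResourceError> {
        // Reading after the archive was closed reports `unavailable`.
        guard archive.isOpen else {
            return .failure(.unavailable)
        }
        guard let entry = archive.entries.first(where: { $0.path == link.href }) else {
            return .failure(.notFound)
        }
        return .success(entry.bytes)
    }
}

final class ArchiveFetcher: Fetcher {
    private let archive: InMemoryArchive

    init(archive: InMemoryArchive) {
        self.archive = archive
    }

    // Entries are listed in their archiving order, without sorting.
    var links: [Link] {
        archive.entries.map { Link(href: $0.path) }
    }

    func get(_ link: Link) -> Resource {
        ArchiveResource(link: link, archive: archive)
    }

    // The fetcher owns the archive, so closing the fetcher closes the archive.
    func close() {
        archive.close()
    }
}

let archive = InMemoryArchive(entries: [
    (path: "b.html", bytes: [1]),
    (path: "a.html", bytes: [2]),
])
let archiveFetcher = ArchiveFetcher(archive: archive)
let entryResource = archiveFetcher.get(Link(href: "a.html"))
```

A resource obtained before `close()` keeps working until the fetcher is closed, then starts returning `unavailable`.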

HTTPFetcher Class

Provides access to resources served by an HTTP server.

HTTPFetcher::links returns an empty list, since we can’t know which resources are available.

ProxyFetcher Class

Delegates the creation of a Resource to a closure.
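A possible shape for this class is sketched below. The initializer signature and the empty `links` list are assumptions, as are the simplified `Link`, `Resource` and `BytesResource` types.

```swift
struct Link: Equatable {
    let href: String
}

enum ResourceError: Error, Equatable {
    case notFound
    case unavailable
}

protocol Resource {
    var link: Link { get }
    func read() -> Result<[UInt8], ResourceError>
}

protocol Fetcher {
    var links: [Link] { get }
    func get(_ link: Link) -> Resource
    func close()
}

struct BytesResource: Resource {
    let link: Link
    let bytes: [UInt8]
    func read() -> Result<[UInt8], ResourceError> { .success(bytes) }
}

final class ProxyFetcher: Fetcher {
    private let closure: (Link) -> Resource

    init(_ closure: @escaping (Link) -> Resource) {
        self.closure = closure
    }

    // Assumption: the closure is opaque, so no links can be listed.
    var links: [Link] { [] }

    func get(_ link: Link) -> Resource {
        closure(link)
    }

    // Nothing to release in this sketch.
    func close() {}
}

// Example: serve the requested HREF itself as the resource content.
let echoFetcher = ProxyFetcher { link in
    BytesResource(link: link, bytes: Array(link.href.utf8))
}
```

This kind of fetcher can be handy in unit tests, or to generate a resource on the fly without writing a dedicated Fetcher type.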

Composite Fetchers

A composite fetcher delegates requests to child fetchers.

The links property returns the concatenated list of links from all child fetchers.

Warning: Make sure to forward the Fetcher.close() calls to child fetchers.

RoutingFetcher Class

Routes requests to child fetchers, depending on a provided predicate.

This can be used for example to serve a publication containing both local and remote resources, and more generally to concatenate different content sources.

RoutingFetcher.Route Class

Holds a child fetcher and the predicate used to determine if it can answer a request.

Both the fetcher and accepts properties are public.
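The routing logic could be sketched as follows. The `fetcher` and `accepts` names come from this proposal, but everything else is an assumption: `Route` is modeled as a nested struct for brevity, the first matching route wins, an unmatched request yields a `notFound` resource, and `StubFetcher` only exists to exercise the example.

```swift
struct Link: Equatable {
    let href: String
}

enum ResourceError: Error, Equatable {
    case notFound
    case unavailable
}

protocol Resource {
    var link: Link { get }
    func read() -> Result<[UInt8], ResourceError>
}

protocol Fetcher {
    var links: [Link] { get }
    func get(_ link: Link) -> Resource
    func close()
}

struct BytesResource: Resource {
    let link: Link
    let bytes: [UInt8]
    func read() -> Result<[UInt8], ResourceError> { .success(bytes) }
}

struct FailureResource: Resource {
    let link: Link
    let error: ResourceError
    func read() -> Result<[UInt8], ResourceError> { .failure(error) }
}

final class RoutingFetcher: Fetcher {
    struct Route {
        let fetcher: Fetcher
        let accepts: (Link) -> Bool
    }

    private let routes: [Route]

    init(routes: [Route]) {
        self.routes = routes
    }

    // Concatenated links of all child fetchers, in route order.
    var links: [Link] {
        routes.flatMap { $0.fetcher.links }
    }

    func get(_ link: Link) -> Resource {
        // Assumption: the first route accepting the link handles the request.
        guard let route = routes.first(where: { $0.accepts(link) }) else {
            return FailureResource(link: link, error: .notFound)
        }
        return route.fetcher.get(link)
    }

    // Forward close() to every child fetcher.
    func close() {
        routes.forEach { $0.fetcher.close() }
    }
}

// Hypothetical child fetcher answering every request with its own name.
final class StubFetcher: Fetcher {
    let name: String
    let links: [Link]
    private(set) var isClosed = false

    init(name: String, hrefs: [String]) {
        self.name = name
        self.links = hrefs.map { Link(href: $0) }
    }

    func get(_ link: Link) -> Resource {
        BytesResource(link: link, bytes: Array(name.utf8))
    }

    func close() {
        isClosed = true
    }
}

let local = StubFetcher(name: "local", hrefs: ["/chapter1.html"])
let remote = StubFetcher(name: "remote", hrefs: [])
let router = RoutingFetcher(routes: [
    .init(fetcher: remote, accepts: { $0.href.hasPrefix("https://") }),
    .init(fetcher: local, accepts: { _ in true }),
])
```

Placing the catch-all route last lets the local fetcher act as a fallback for any link that is not remote.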

TransformingFetcher Class

Transforms the content of a child fetcher’s resources using a list of Resource.Transformer functions.
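A minimal sketch of such a fetcher is shown below, assuming the transformers are applied in list order. The initializer signature, the `ResourceTransformer` alias name, and the `StaticFetcher` and `SuffixResource` helpers are hypothetical and only serve the example.

```swift
struct Link: Equatable {
    let href: String
}

enum ResourceError: Error, Equatable {
    case notFound
    case unavailable
}

protocol Resource {
    var link: Link { get }
    func read() -> Result<[UInt8], ResourceError>
}

protocol Fetcher {
    var links: [Link] { get }
    func get(_ link: Link) -> Resource
    func close()
}

typealias ResourceTransformer = (Resource) -> Resource

final class TransformingFetcher: Fetcher {
    private let fetcher: Fetcher
    private let transformers: [ResourceTransformer]

    init(_ fetcher: Fetcher, transformers: [ResourceTransformer]) {
        self.fetcher = fetcher
        self.transformers = transformers
    }

    var links: [Link] { fetcher.links }

    func get(_ link: Link) -> Resource {
        // Each transformer wraps the output of the previous one.
        transformers.reduce(fetcher.get(link)) { resource, transform in
            transform(resource)
        }
    }

    // Forward close() to the child fetcher.
    func close() {
        fetcher.close()
    }
}

// Hypothetical helpers used to exercise the sketch.
struct SuffixResource: Resource {
    let source: Resource
    let suffix: String

    var link: Link { source.link }

    func read() -> Result<[UInt8], ResourceError> {
        source.read().map { $0 + Array(suffix.utf8) }
    }
}

struct StaticResource: Resource {
    let link: Link
    func read() -> Result<[UInt8], ResourceError> { .success(Array("x".utf8)) }
}

final class StaticFetcher: Fetcher {
    let links: [Link] = []
    func get(_ link: Link) -> Resource { StaticResource(link: link) }
    func close() {}
}

let transforming = TransformingFetcher(StaticFetcher(), transformers: [
    { SuffixResource(source: $0, suffix: "1") },
    { SuffixResource(source: $0, suffix: "2") },
])
let transformed = transforming.get(Link(href: "/a")).read()
```

Since the order of the list determines the nesting of the wrappers, transformers that depend on each other (for example decryption before content injection) need to be listed accordingly.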

CachingFetcher Class

Caches resources of a child fetcher on the file system, to implement offline access.

API to be determined in its own proposal.

Rationale and Alternatives

The first design considered was to handle HREF routing and resource caching in other Readium components, such as the navigator. But there are several drawbacks:

Drawbacks and Limitations

The fetcher is an optional component in the Readium architecture. Therefore, other components could bypass the features introduced by the fetcher layer, such as caching and injection.

While this might be fine in some cases, such as for an HTML WebPub, the navigators provided by Readium should use the fetcher as much as possible.

This issue occurs in particular when parsing a manifest containing remote URLs, which could be requested directly by some navigators. To alleviate this problem, a PublicationServer component could:

  1. Serve the Publication through a local HTTP server.
  2. Produce a copy of the Publication for the navigator, modifying the remote manifest links to use the local URLs instead.
    • By serving remote resources, the PublicationServer would then act as a proxy to the remote servers, and allow injection to happen through the Fetcher layer.
  3. To go even further, a Resource.Transformer could replace remote URLs with local ones in the resources’ content itself.
    • A particularly tricky situation is to intercept the external links in a web view, because it will usually trigger the request internally. If the web view doesn’t offer a native interception mechanism, then transforming links in the content itself could be a workaround.

Future Possibilities

While Fetcher is used internally in Publication, it is not tightly coupled to it: its only dependency is on the Link core model. Therefore, it could be used for other purposes.

Some types could be further specified in their own proposal: