The goal of this proposal is to make the fetcher more flexible using a composite design pattern. We will introduce several Fetcher
implementations to answer different needs, such as resource transformation and caching.
The Fetcher component provides access to a publication’s resources, which makes it a natural place to offer extensibility for reading apps. With a composite pattern, we can decorate the fetcher to add custom behaviors.
This will also lead to smaller, more focused implementations that are easier to unit test.
While the fetcher was historically part of the streamer component, this proposal requires moving it to shared, so that fetchers can be created from the streamer, navigator and OPDS libraries.
However, even though the streamer no longer contains the implementation of Fetcher, it is still responsible for assembling the composite fetcher structure for each kind of publication.
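Publication resources are accessed using a root Fetcher object, which can be composed of sub-fetchers organized in a composite tree structure. Each Fetcher node takes on a particular responsibility: leaf fetchers handle the actual low-level byte access, while composite fetchers delegate requests to child fetchers.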
You will never need to use the Fetcher API directly, which is a private detail of Publication. To read a resource, use Publication::get(), which delegates to the internal root Fetcher.
Publication::get() returns a Resource object, which is a proxy interface to the actual resource. The content is fetched lazily in Resource, for performance and medium limitation reasons (e.g. HTTP). Therefore, Publication::get() always returns a Resource, even if the resource doesn’t actually exist. Errors are handled at the Resource level.
let resource = publication.get("/manifest.json")
switch resource.readAsString() {
case .success(let content):
    let manifest = Manifest(jsonString: content)
case .failure(let error):
    print("An error occurred: \(error)")
}
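The lazy behavior described above can be illustrated with a small, self-contained sketch. Everything in it is hypothetical and simplified for illustration: `LazyResource`, `ResourceError`, the `files` dictionary and the free function `get` stand in for `Resource`, `Resource.Error`, the publication content and `Publication::get()`. It also assumes that failures are cached like successes, which the proposal does not specify.

```swift
// Hypothetical, simplified stand-ins: none of these names are the proposal's final API.
enum ResourceError: Error, Equatable {
    case notFound
}

/// A resource proxy that only loads its content on the first read.
final class LazyResource {
    let href: String
    private let load: () -> Result<String, ResourceError>
    private var cached: Result<String, ResourceError>?
    /// Number of times the underlying content was actually loaded.
    private(set) var loadCount = 0

    init(href: String, load: @escaping () -> Result<String, ResourceError>) {
        self.href = href
        self.load = load
    }

    func readAsString() -> Result<String, ResourceError> {
        if let cached = cached { return cached }
        loadCount += 1
        let result = load()
        cached = result  // Assumption: failures are cached too.
        return result
    }
}

/// Stand-in for the content of a publication.
let files = ["/manifest.json": "{}"]

/// Mimics Publication::get(): a resource is returned even for unknown HREFs.
func get(_ href: String) -> LazyResource {
    return LazyResource(href: href) {
        guard let content = files[href] else { return .failure(.notFound) }
        return .success(content)
    }
}

let missing = get("/missing.html")  // Nothing is loaded and no error is raised yet.
switch missing.readAsString() {     // The error only surfaces when reading.
case .success(let content):
    print(content)
case .failure(let error):
    print("An error occurred: \(error)")
}
```

Because the error is carried by the resource itself, calling code needs a single error-handling path, at the point where the content is read.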
Fetcher trees are created by parsers, such as r2-streamer and r2-opds, when constructing the Publication object. However, you may want to decorate the root fetcher to modify its behavior by:
ContentFilter).
While the composition of the fetcher tree is private, you can wrap the tree in a custom root fetcher, either by:
Publication object with a customized Fetcher, for on-the-fly temporary cases. For example, to pre-process resources in a background indexing task.
func prepareForIndexing(resource: Resource) -> Resource {
    // ...
}

// The `manifest` and `fetcher` parameters are `inout`, so the closure can overwrite them.
let indexingPublication = publication.copy { manifest, fetcher in
    fetcher = TransformingFetcher(fetcher: fetcher, transformer: prepareForIndexing)
}
This proposal is a non-breaking change, since it describes a structure that is mostly internal. The public features are new, such as adding resource transformers and decorating resource access.
The following Fetcher implementations are here only to draft use cases, so they should be implemented only when actually needed.
Fetcher Trees
The fetcher tree created by publication parsers can be adapted to fit the characteristics of each format.
These formats are very simple: we only need to access the ZIP entries.
The resources of a remote audiobook are fetched with HTTP requests, using an HTTPFetcher. However, we can implement an offline cache by wrapping the fetcher in a CachingFetcher.
The resources of a publication protected with LCP need to be decrypted. For that, we’re using a DecryptionTransformer embedded in a TransformingFetcher. Any remote resources declared in the manifest are fetched using an HTTPFetcher.
The EPUB fetcher is one of the most complex:
HTTPFetcher is used for remote resources.
Fetcher Interface
Provides access to a Resource from a Link.
links: List<Link>
get(link: Link, parameters: Map<String, String> = {}) -> Resource
Returns the Resource at the given Link.href.
A Resource is always returned, since in some cases, such as HTTP, we can’t know whether it exists before actually fetching it. Therefore, errors are handled at the Resource level.
link: Link
A Link is used because a Fetcher might use its properties, e.g. to transform the resource. Therefore, Publication::get() makes sure that a complete Link is always provided to the Fetcher.
parameters: Map<String, String> = {}
Link::href is templated,
close()
Resource Interface
Acts as a proxy to an actual resource by handling read access.
Resource caches the content lazily to optimize access to multiple properties, e.g. length, read(), etc.
Every failable API returns a Result<T, Resource.Error> containing either the value, or a Resource.Error enum with the following cases:
BadRequest equivalent to a 400 HTTP error.
NotFound equivalent to a 404 HTTP error.
Forbidden equivalent to a 403 HTTP error.
Unavailable equivalent to a 503 HTTP error.
Other(Exception) for any other error, such as HTTP 500.
link: Link
Resource to include additional metadata, e.g. the Content-Type HTTP header in Link::type.
ArchiveFetcher might add a compressedLength property which could then be used by the PositionsService to address this issue.
Cache-Control HTTP header could be used to customize the behavior of a parent CachingFetcher for a given resource.
length: Result<Long, Resource.Error>
read(range: Range<Long>? = null) -> Result<ByteArray, Resource.Error>
Reads the bytes in the given range.
range: Range<Long>? = null
When range is null, the whole content is returned.
readAsString(encoding: Encoding? = null) -> Result<String, Resource.Error>
Reads the full content as a String.
encoding: Encoding? = null
If null, then the encoding is parsed from the charset parameter of link.type using MediaType::parameters, and falls back on UTF-8.
close()
StringResource(link: Link, string: String)
Creates a Resource serving a string.
BytesResource(link: Link, bytes: ByteArray)
Creates a Resource serving an array of bytes.
FailureResource(link: Link, error: Resource.Error)
Creates a Resource that will always return the given error.
Resource.Transformer Function Type
typealias Resource.Transformer = (Resource) -> Resource
Implements the transformation of a Resource. It can be used, for example, to:
dir="rtl" in an HTML document,
If the transformation doesn’t apply, simply return resource unchanged.
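As an illustration, a transformer forcing dir="rtl" in HTML documents could look like the sketch below. The types are hypothetical, trimmed-down stand-ins for the ones specified in this proposal: `ResourceTransformer` is a flat name for Resource.Transformer, `MappedResource` is a helper invented for this example, and the media type check assumes an exact "text/html" value without parameters.

```swift
import Foundation

// Hypothetical, simplified stand-ins for the proposal's types.
struct Link {
    let href: String
    let type: String?
}

enum ResourceError: Error {
    case notFound
}

protocol Resource {
    var link: Link { get }
    func readAsString() -> Result<String, ResourceError>
}

struct StringResource: Resource {
    let link: Link
    let string: String
    func readAsString() -> Result<String, ResourceError> { .success(string) }
}

/// Hypothetical helper: wraps a resource and rewrites its content only when it is read,
/// so the underlying resource stays lazy.
struct MappedResource: Resource {
    let base: Resource
    let transform: (String) -> String
    var link: Link { base.link }
    func readAsString() -> Result<String, ResourceError> {
        base.readAsString().map(transform)
    }
}

// Flat name standing in for Resource.Transformer.
typealias ResourceTransformer = (Resource) -> Resource

/// Injects dir="rtl" into the <html> element of HTML resources.
let injectRTL: ResourceTransformer = { resource in
    // The transformation doesn't apply: return the resource unchanged.
    guard resource.link.type == "text/html" else { return resource }
    return MappedResource(base: resource) { html in
        html.replacingOccurrences(of: "<html", with: "<html dir=\"rtl\"")
    }
}
```

Returning a wrapping resource, rather than reading the content inside the transformer, keeps the transformation as lazy as the resource it decorates.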
A leaf fetcher is an implementation of Fetcher handling the actual low-level bytes access. It doesn’t delegate to any other Fetcher.
FileFetcher Class
Provides access to resources on the local file system.
FileFetcher(paths: [String: String])
paths: [String: String]
href.
FileFetcher(href: String, path: String)
FileFetcher(paths: [href: path])
FileFetcher::links contains the recursive list of files in the reachable local paths, sorted alphabetically.
ArchiveFetcher Class
Provides access to entries of an Archive, such as ZIP.
ArchiveFetcher(archive: Archive)
archive: Archive
Archive to fetch the resources from.
ArchiveFetcher::links returns the archive’s entries list, in their archiving order.
ArchiveFetcher is responsible for the archive lifecycle, and should close it when Fetcher.close() is called. If a Resource tries to access an entry after the archive was closed, Resource.Error.Unavailable can be returned.
The specification of Archive is out of scope for this proposal.
HTTPFetcher Class
Provides access to resources served by an HTTP server.
HTTPFetcher(client: HTTPClient = R2HTTPClient())
client: HTTPClient
HTTPFetcher::links returns an empty list, since we can’t know which resources are available.
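Since the Resource.Error cases are defined by analogy with HTTP status codes, an HTTPFetcher could translate a response status along these lines. This is only a sketch: `ResourceError` and `resourceError(forHTTPStatus:)` are hypothetical names, `other` carries the status code instead of the exception used in the specification, and treating only 2xx codes as success is an assumption.

```swift
// Hypothetical Swift rendering of the Resource.Error cases listed in this proposal.
enum ResourceError: Error, Equatable {
    case badRequest               // 400
    case notFound                 // 404
    case forbidden                // 403
    case unavailable              // 503
    case other(statusCode: Int)   // Simplification: the proposal wraps an exception instead.
}

/// Returns the error matching an HTTP status code, or nil for a successful response.
/// Assumption: only 2xx codes count as success.
func resourceError(forHTTPStatus status: Int) -> ResourceError? {
    switch status {
    case 200..<300: return nil
    case 400: return .badRequest
    case 403: return .forbidden
    case 404: return .notFound
    case 503: return .unavailable
    default: return .other(statusCode: status)
    }
}
```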
ProxyFetcher Class
Delegates the creation of a Resource to a closure.
ProxyFetcher(closure: (Link) -> Resource)
Creates a ProxyFetcher that will call closure when asked for a resource.
ProxyFetcher(closure: (Link) -> String)
Resource from a string.
ProxyFetcher({ link -> StringResource(link, closure(link))})
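A minimal Swift sketch of the two ProxyFetcher constructors, under simplified assumptions: the `Link`, `Resource` and `Fetcher` types below are reduced stand-ins for the ones specified in this proposal, the `parameters` argument of `get()` is omitted, and returning an empty `links` list is a guess since the proposal does not specify it.

```swift
// Hypothetical, simplified stand-ins for the proposal's types.
struct Link {
    let href: String
}

enum ResourceError: Error {
    case notFound
}

protocol Resource {
    var link: Link { get }
    func readAsString() -> Result<String, ResourceError>
}

struct StringResource: Resource {
    let link: Link
    let string: String
    func readAsString() -> Result<String, ResourceError> { .success(string) }
}

protocol Fetcher {
    var links: [Link] { get }
    func get(_ link: Link) -> Resource
    func close()
}

final class ProxyFetcher: Fetcher {
    private let closure: (Link) -> Resource

    /// Calls `closure` each time a resource is requested.
    init(closure: @escaping (Link) -> Resource) {
        self.closure = closure
    }

    /// Convenience variant: wraps the returned string in a StringResource.
    convenience init(closure: @escaping (Link) -> String) {
        self.init(closure: { link -> Resource in
            StringResource(link: link, string: closure(link))
        })
    }

    // Assumption: a proxy cannot enumerate what its closure is able to serve.
    var links: [Link] { [] }

    func get(_ link: Link) -> Resource { closure(link) }

    func close() {}
}

// Usage: serve generated content without touching the file system.
let greeter = ProxyFetcher(closure: { (link: Link) -> String in
    "Hello from \(link.href)"
})
```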
A composite fetcher delegates requests to child fetchers.
The links property returns the concatenated list of links from all child fetchers.
Warning: Make sure to forward the Fetcher.close() calls to child fetchers.
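These two rules (concatenated links, forwarded close() calls) can be sketched with a minimal composite modeled on the RoutingFetcher specified in the next section. The surrounding types are hypothetical, simplified stand-ins, `StubFetcher` is a test double invented for this example, and answering with a FailureResource when no route accepts the link is an assumption rather than something the proposal mandates.

```swift
// Hypothetical, simplified stand-ins for the proposal's types.
struct Link {
    let href: String
}

enum ResourceError: Error, Equatable {
    case notFound
}

protocol Resource {
    var link: Link { get }
    func readAsString() -> Result<String, ResourceError>
}

struct StringResource: Resource {
    let link: Link
    let string: String
    func readAsString() -> Result<String, ResourceError> { .success(string) }
}

struct FailureResource: Resource {
    let link: Link
    let error: ResourceError
    func readAsString() -> Result<String, ResourceError> { .failure(error) }
}

protocol Fetcher {
    var links: [Link] { get }
    func get(_ link: Link) -> Resource
    func close()
}

/// Test double standing in for a leaf fetcher.
final class StubFetcher: Fetcher {
    let name: String
    let links: [Link]
    private(set) var isClosed = false

    init(name: String, hrefs: [String]) {
        self.name = name
        self.links = hrefs.map { Link(href: $0) }
    }

    func get(_ link: Link) -> Resource {
        StringResource(link: link, string: "\(name):\(link.href)")
    }

    func close() { isClosed = true }
}

final class RoutingFetcher: Fetcher {
    struct Route {
        let fetcher: Fetcher
        let accepts: (Link) -> Bool

        init(fetcher: Fetcher, accepts: @escaping (Link) -> Bool = { _ in true }) {
            self.fetcher = fetcher
            self.accepts = accepts
        }
    }

    private let routes: [Route]

    init(routes: [Route]) {
        self.routes = routes
    }

    convenience init(local: Fetcher, remote: Fetcher) {
        self.init(routes: [
            Route(fetcher: local, accepts: { $0.href.hasPrefix("/") }),
            Route(fetcher: remote)
        ])
    }

    /// Concatenates the links of all child fetchers, in route order.
    var links: [Link] { routes.flatMap { $0.fetcher.links } }

    /// Delegates to the first route accepting the link.
    func get(_ link: Link) -> Resource {
        guard let route = routes.first(where: { $0.accepts(link) }) else {
            // Assumption: no matching route is reported as NotFound.
            return FailureResource(link: link, error: .notFound)
        }
        return route.fetcher.get(link)
    }

    /// Forwards close() to every child fetcher.
    func close() { routes.forEach { $0.fetcher.close() } }
}
```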
RoutingFetcher Class
Routes requests to child fetchers, depending on a provided predicate.
This can be used for example to serve a publication containing both local and remote resources, and more generally to concatenate different content sources.
RoutingFetcher(routes: List<RoutingFetcher.Route>)
Creates a RoutingFetcher from a list of routes, which will be tested in the given order.
RoutingFetcher(local: Fetcher, remote: Fetcher)
Routes requests to local if the Link::href starts with /, otherwise to remote.
RoutingFetcher.Route Class
Holds a child fetcher and the predicate used to determine if it can answer a request.
Both the fetcher and accepts properties are public.
RoutingFetcher.Route(fetcher: Fetcher, accepts: (Link) -> Bool = { true })
The default accepts means that the fetcher will accept any link.
TransformingFetcher Class
Transforms the resources’ content of a child fetcher using a list of Resource.Transformer functions.
TransformingFetcher(fetcher: Fetcher, transformers: List<Resource.Transformer>)
Creates a TransformingFetcher from a child fetcher and a list of transformers to apply in the given order.
TransformingFetcher(fetcher: Fetcher, transformer: Resource.Transformer)
TransformingFetcher(fetcher, [transformer])
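The ordering guarantee can be sketched as a fold over the transformer list. As before, the types are hypothetical, simplified stand-ins for the ones specified in this proposal; `EchoFetcher`, `MappedResource` and `mapping` are helpers invented for the example.

```swift
// Hypothetical, simplified stand-ins for the proposal's types.
struct Link {
    let href: String
}

enum ResourceError: Error {
    case notFound
}

protocol Resource {
    var link: Link { get }
    func readAsString() -> Result<String, ResourceError>
}

struct StringResource: Resource {
    let link: Link
    let string: String
    func readAsString() -> Result<String, ResourceError> { .success(string) }
}

protocol Fetcher {
    var links: [Link] { get }
    func get(_ link: Link) -> Resource
    func close()
}

// Flat name standing in for Resource.Transformer.
typealias ResourceTransformer = (Resource) -> Resource

/// Hypothetical helper: rewrites the content of a resource when it is read.
struct MappedResource: Resource {
    let base: Resource
    let transform: (String) -> String
    var link: Link { base.link }
    func readAsString() -> Result<String, ResourceError> {
        base.readAsString().map(transform)
    }
}

/// Hypothetical helper: builds a transformer from a plain string function.
func mapping(_ transform: @escaping (String) -> String) -> ResourceTransformer {
    return { resource in MappedResource(base: resource, transform: transform) }
}

/// Test double standing in for any child fetcher: serves the HREF as content.
struct EchoFetcher: Fetcher {
    let links: [Link] = []
    func get(_ link: Link) -> Resource { StringResource(link: link, string: link.href) }
    func close() {}
}

final class TransformingFetcher: Fetcher {
    private let fetcher: Fetcher
    private let transformers: [ResourceTransformer]

    init(fetcher: Fetcher, transformers: [ResourceTransformer]) {
        self.fetcher = fetcher
        self.transformers = transformers
    }

    convenience init(fetcher: Fetcher, transformer: @escaping ResourceTransformer) {
        self.init(fetcher: fetcher, transformers: [transformer])
    }

    var links: [Link] { fetcher.links }

    /// Applies each transformer to the output of the previous one, in the given order.
    func get(_ link: Link) -> Resource {
        transformers.reduce(fetcher.get(link)) { resource, transform in
            transform(resource)
        }
    }

    /// Forwards close() to the child fetcher.
    func close() { fetcher.close() }
}
```

With this shape, the first transformer in the list wraps the child's resource directly, and each later one wraps the result of the previous transformer.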
CachingFetcher Class
Caches resources of a child fetcher on the file system, to implement offline access.
API to be determined in its own proposal.
The first design considered was to handle HREF routing and resource caching in other Readium components, such as the navigator. However, this has several drawbacks:
Publication level, via an internal Fetcher, means that any component using the Publication will benefit from it.
The fetcher is an optional component in the Readium architecture. Therefore, other components could bypass the features introduced by the fetcher layer, such as caching and injection.
While this might be fine in some cases, such as for an HTML WebPub, the navigators provided by Readium should use the fetcher as much as possible.
This issue occurs in particular when parsing a manifest containing remote URLs, which could be requested directly by some navigators. To alleviate this problem, a PublicationServer component could:
Publication through a local HTTP server.
Publication for the navigator, modifying the remote manifest links to use the local URLs instead.
PublicationServer would then act as a proxy to the remote servers, and allow injection to happen through the Fetcher layer.
Resource.Transformer could replace remote URLs with local ones in the resources content itself.
While Fetcher is used internally in Publication, it is not tightly coupled to it: its only dependency is the Link core model. Therefore, it could be used for other purposes.
Some types could be further specified in their own proposal:
CachingFetcher needs to be well thought out.
CachingFetcher probably needs to be paired with a publication service.
HTMLInjectionTransformer might be mutable to switch scripts or to update a CSS theme.