
May 3, 2018 | 10 min read

Dev

Benji case study: building a REST facade over an object storage

Stéphane Tankoua

Developer


FABERNOVEL TECHNOLOGIES
This article is the second in a series of articles about Benji, the new reactive Scala DSL for object storage developed by FABERNOVEL TECHNOLOGIES. Before going any further, we recommend reading the first one in case you missed it.

Benji is our homemade DSL for object storage, which abstracts operations independently of the underlying storage. In the first article, we discussed Benji’s rationale. In this article, we will see how to use Benji’s DSL and modules through a concrete example: a REST API which exposes all the CRUD operations available on an object storage (creating buckets and objects, listing them, deleting them and so forth). As a bonus, we will use Play! as the web framework to see how Benji integrates with it.

All the code used for this article comes from:

Setting up our Play! project

To facilitate the integration with Play!, Benji provides a module. It is only compatible with Play 2.6+ though. In order to use it, add this dependency to your build.sbt:

"com.zengularity.benji" %% "benji-play" % VERSION

This plug-in provides components to easily provision an object storage, using either runtime dependency injection (with Guice) or compile-time injection. We will dive into the details in the next section.

Before going further, let’s go through the prerequisites for working with Play!. First of all, you have to enable the Benji module for Play! in your application.conf:

play {
 modules.enabled += "play.modules.benji.BenjiModule"
}

Then, for the Benji object storage to be functional, you also have to set the URI of your storage in your configuration file:

benji.uri = "s3:protocol://accessKey:secretKey@host?style=path"

We are now ready to inject Benji’s object storage into our own code.

Dependency Injection

Before exploring the different ways to do Dependency Injection (DI) with Benji components in a Play! application, we should take a step back and first define what DI is. For those unfamiliar with it, DI can be summarized as passing in the components you need (think of parameters) instead of instantiating them explicitly. In the context of Benji, we need to inject the object storage dependency into our controller in order to query it.
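To make the idea concrete, here is a minimal sketch of constructor injection in plain Scala, without any framework. The Storage trait, InMemoryStorage and BucketService are hypothetical names used only for illustration; they are not part of Benji or Play!.

```scala
object DiSketch {
  // Hypothetical stand-in for an object storage; not Benji's actual API.
  trait Storage {
    def bucketNames: List[String]
  }

  // A fixed, in-memory implementation, handy for tests.
  final class InMemoryStorage(names: List[String]) extends Storage {
    def bucketNames: List[String] = names
  }

  // Without DI, the service would build its own storage internally.
  // With DI, the dependency is passed in, so the caller decides which one to use.
  final class BucketService(storage: Storage) {
    def describe: String = storage.bucketNames.mkString(", ")
  }
}
```

Because BucketService only knows about the Storage trait, swapping the in-memory implementation for a real one does not require touching the service.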

Runtime DI

Once you have enabled the Benji module and set the storage URI, the ObjectStorage can be injected as follows:

import javax.inject.Inject
import play.api.mvc.{BaseController, ControllerComponents}
import com.zengularity.benji.ObjectStorage

class BenjiController @Inject()(
  val controllerComponents: ControllerComponents,
  benji: ObjectStorage
) extends BaseController { … }

Here Guice will instantiate an ObjectStorage and inject it into our controller (take a look at the Play! documentation for more details). That’s all: we can now use the object storage within our controller.

Compile time DI

For those who prefer to inject Play! components at compile time, the Benji plug-in for Play! makes this possible too. Let’s see what our ApplicationLoader would look like:

import play.api.ApplicationLoader.Context
import play.api.routing.Router
import play.filters.HttpFiltersComponents
import play.modules.benji.BenjiFromContext
import router.Routes

class CustomComponents(context: Context)
    extends BenjiFromContext(context) with HttpFiltersComponents {

  // The injected Benji ObjectStorage is now available in scope as `benji`
  // (like controllerComponents) and can be used as follows:
  lazy val applicationController = new BenjiController(controllerComponents, benji)
  lazy val router: Router = new Routes(httpErrorHandler, applicationController)
}

In the example above, BenjiFromContext is a DI component provided by the benji-play module (previously added to our build.sbt). By inheriting from BenjiFromContext, we get the ObjectStorage (here named benji). Note that BenjiFromContext is based on BuiltInComponentsFromContext, so all its members are available in scope.

Once the object storage is provisioned, we can create our application controller, which will act as a facade over it.

Creating the controller

Routes

Let’s now create our controller (called BenjiController, as in the DI section). This controller will expose most of the operations available on our ObjectStorage as a REST API, covering CRUD operations on both buckets and objects.

This is how our Play! routes file will look:

# The fully qualified name of the controller is omitted for readability

# Bucket api
GET /api/buckets                                       BenjiController.listBuckets
+ nocsrf
POST /api/buckets/:bucketName                          BenjiController.createBucket(bucketName: String)
+ nocsrf
DELETE /api/buckets/:bucketName                        BenjiController.deleteBucket(bucketName: String)

# Object api
GET /api/buckets/:bucketName/objects                   BenjiController.listObjects(bucketName: String)
GET /api/buckets/:bucketName/objects/:objectName       BenjiController.getObject(bucketName: String, objectName: String)
HEAD /api/buckets/:bucketName/objects/:objectName      BenjiController.objectMetadata(bucketName: String, objectName: String)
+ nocsrf
POST /api/buckets/:bucketName/objects                  BenjiController.createObject(bucketName: String)
+ nocsrf
DELETE /api/buckets/:bucketName/objects/:objectName    BenjiController.deleteObject(bucketName: String, objectName: String)


With these REST contracts defined as HTTP routes, we can now implement the corresponding actions in our BenjiController. Let’s start with the CRUD actions for buckets.

Bucket API

With the routes set, we can start implementing our bucket API. The controller we will use is the one we created in the DI section (BenjiController). Let’s start with the action to list buckets:

def listBuckets = Action.async {
  benji.buckets.collect[List]().map { buckets =>
    Ok(Json.toJson(buckets.map(_.name)))
  }
}

The method buckets on the ObjectStorage returns an object of type BucketsRequest, which provides the method collect to fetch all the buckets on our object storage. Let’s take a look at the signature of this collect method:

def collect[M[_]]()(
 implicit m: Materializer,
 builder: CanBuildFrom[M[_], Bucket, M[Bucket]]
): Future[M[Bucket]]

Since Benji relies on Akka Stream, we need a materializer to fetch all buckets from our object storage.

This method is flexible enough to work with any collection type thanks to CanBuildFrom: here we use collect with List, but it could just as well be Set.

This method returns a collection of bucket metadata wrapped in a Future. Therefore we need to use Play!’s Action.async combinator when defining listBuckets. A bucket’s metadata can be seen as a tuple of the bucket’s name and creation time. Since we only need the name, we map over the result (in the listBuckets action) to select only the bucket name.

Note that collect loads all the metadata in memory, which can cause an error if it doesn’t fit. If you have to deal with a large list of buckets, prefer the apply method, which gives you a reactive Source.

Let’s now look at the action to create buckets:

def createBucket(bucketName: String) =
  Action.async { request =>
    benji.bucket(bucketName).create(failsIfExists = true).map { _ =>
      Created(s"$bucketName created")
    }.recover {
      case BucketAlreadyExistsException(_) => Ok(s"$bucketName already exists")
    }
  }

Before trying to create the bucket, we need to resolve it in the storage. This is where the bucket function on the object storage comes in. It gives us a BucketRef which is, as we can infer, a reference to this bucket.

Then, we use the create function on the BucketRef. It returns a Future[Unit]. The failsIfExists parameter (which defaults to false) makes the future fail if the bucket already exists. In our createBucket action, we set it to true so that we can explicitly handle the exception raised when the bucket already exists.
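The create-then-recover pattern can be tried out without a real storage. The following is only a sketch using the Scala standard library: the in-memory map, the AlreadyExists exception and the createOrReport helper are hypothetical stand-ins, not Benji’s implementation.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object CreateSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical exception, standing in for the one raised by the storage.
  final case class AlreadyExists(name: String) extends Exception(s"$name already exists")

  private val buckets = TrieMap.empty[String, Unit]

  // Mimics the behaviour described above: the future fails only when
  // failsIfExists is true and the bucket is already there.
  def create(name: String, failsIfExists: Boolean = false): Future[Unit] =
    buckets.putIfAbsent(name, ()) match {
      case Some(_) if failsIfExists => Future.failed(AlreadyExists(name))
      case _                        => Future.successful(())
    }

  // Turns both outcomes into a message, as the action does with HTTP results.
  def createOrReport(name: String): Future[String] =
    create(name, failsIfExists = true)
      .map(_ => s"$name created")
      .recover { case AlreadyExists(n) => s"$n already exists" }

  // Blocking helper, for trying the sketch out only.
  def await[A](f: Future[A]): A = Await.result(f, 5.seconds)
}
```

Calling createOrReport twice with the same name yields the "created" message first and the "already exists" message second, without the second call ever producing a failed future.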

Finally, let’s implement the method for deleting a bucket:

case class DeleteBucketForm(ignore: Boolean, recursive: Boolean)

val deleteBucketForm = Form(
  mapping(
    "ignore" -> default(boolean, true),
    "recursive" -> default(boolean, false)
  )(DeleteBucketForm.apply)(DeleteBucketForm.unapply)
)

def deleteBucket(bucketName: String) = 
  Action.async(parse.form(deleteBucketForm)) { request =>
    val delete = benji.bucket(bucketName).delete
    val withIgnore = if (request.body.ignore) delete.ignoreIfNotExists else delete
    val withRecursive = if (request.body.recursive) withIgnore.recursive else withIgnore
    withRecursive.apply().map { _ =>
      NoContent
    }
  }

In this piece of code, we should take a moment to look at the input data received with the HTTP request. We expect two boolean fields (as form data): ignore and recursive (more on that later). These are parsed and made available in the request body, of type DeleteBucketForm.

Let’s now analyze the deleteBucket action. As for bucket creation, we get the BucketRef and then use its delete function. It gives us a DeleteRequest on which we can call these two combinators:

  • ignoreIfNotExists: not to raise an error if the bucket does not exist
  • recursive: to delete the bucket and its content (if not empty)

These two methods also return a DeleteRequest, so we can chain them easily. In the deleteBucket action, we activate these options depending on the ignore and recursive booleans the user set in the deletion form. Then we execute the request to delete the bucket, which in turn returns a Future[Unit] (the apply method needs a Materializer).
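The chaining works because each combinator returns a new request rather than mutating the current one. Here is a minimal sketch of that idea with a hypothetical case class; it models the pattern only and is not how Benji’s DeleteRequest is actually implemented.

```scala
object DeleteSketch {
  // Hypothetical model of a chainable request; each combinator returns a new copy.
  final case class DeleteRequest(
    ignoreMissing: Boolean = false,
    isRecursive: Boolean = false
  ) {
    def ignoreIfNotExists: DeleteRequest = copy(ignoreMissing = true)
    def recursive: DeleteRequest = copy(isRecursive = true)
  }

  // Applies each option only when its flag is set, as in deleteBucket.
  def configure(ignore: Boolean, recursive: Boolean): DeleteRequest = {
    val base = DeleteRequest()
    val withIgnore = if (ignore) base.ignoreIfNotExists else base
    if (recursive) withIgnore.recursive else withIgnore
  }
}
```

Since every step yields an immutable value, the intermediate requests (base, withIgnore) stay valid and can be reused safely.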

Object API

We now have the REST API for CRUD operations on buckets. We have to do the same for objects. Let’s start with the action to list objects:

case class ListObjectForm(batchSize: Option[Long])

val listObjectForm = Form(
  mapping("batchSize" -> optional(longNumber))(ListObjectForm.apply)(ListObjectForm.unapply)
)

def listObjects(bucketName: String) =
  Action.async(parse.form(listObjectForm)) { request =>
    val objects = benji.bucket(bucketName).objects
    val withBatchSize = request.body.batchSize.fold(objects)(batchSize =>
      objects.withBatchSize(batchSize))
    withBatchSize.collect[List]().map(objects => Ok(Json.toJson(objects.map(_.name))))
  }

Here we parse the request body as a ListObjectForm, which consists of a single field, batchSize; we will come back to it shortly.

As usual, we first retrieve the bucket using the bucket method on the object storage. The resulting BucketRef has a method called objects to list all objects stored in this bucket.

It gives us an object of type ListRequest, on which we can call withBatchSize to define the batch size (received in our form data) for fetching objects; as usual, it gives back a ListRequest. Then we can call collect on our ListRequest to actually fetch our objects.

Before moving on, we should take a step back and think about why we need to set a batch size when listing objects. If a bucket contains many objects and we try to fetch them all at once, we risk running into memory errors (as with collect). By defining a batch size, we can easily stream objects from our storage (thanks to Akka Stream). Furthermore, this can improve the performance of our fetching code.
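To illustrate what fetching in batches means, here is a sketch of a lazily paged listing written with the standard library only. It is an analogy, not Benji’s implementation: fetchPage is a hypothetical backend call, and Iterator.unfold requires Scala 2.13 or later.

```scala
object BatchSketch {
  // Hypothetical paged backend: returns at most `batchSize` names from `offset`.
  def fetchPage(all: Vector[String], offset: Int, batchSize: Int): Vector[String] =
    all.slice(offset, offset + batchSize)

  // Lazily walks the listing one page at a time; a page is only
  // fetched when the consumer asks for the next element.
  def pages(all: Vector[String], batchSize: Int): Iterator[Vector[String]] =
    Iterator.unfold(0) { offset =>
      val page = fetchPage(all, offset, batchSize)
      if (page.isEmpty) None else Some((page, offset + page.size))
    }
}
```

Only one page is held at a time as long as the consumer processes pages as they arrive; collecting them all into a list brings back the memory cost discussed above.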

Now, let’s implement the action to download an object:

def getObject(bucketName: String, objectName: String) = Action {
  val data = benji.bucket(bucketName).obj(objectName).get()
  Ok.chunked(data)
}

As when retrieving buckets, we can call the obj method on the BucketRef to get the object by its name (the objectName parameter).

It gives us an object of type ObjectRef. By calling get() on this object reference, we obtain an Akka Stream source (of precise type Source[ByteString, NotUsed]) that we can use with Play! to stream the response. As we stream the response content, we benefit from back-pressure for free.

Let’s see how we can implement the action to get metadata of an object:

def objectMetadata(bucketName: String, objectName: String) = Action.async {
  val objectRef = benji.bucket(bucketName).obj(objectName)
  objectRef.metadata.map { meta =>
    NoContent.withHeaders(meta.toSeq.flatMap {
      case (name, values) => values.map(name -> _)
    }: _*)
  }.recover {
    case _: ObjectNotFoundException => NotFound
  }
}

There is not much to discuss here: we fetch the object metadata using the metadata method (which gives us a result of type Future[Map[String, Seq[String]]]) and add it to the headers of our HTTP response.

Note that if the object doesn’t exist in our storage, an exception of type ObjectNotFoundException is raised. That’s why we need to recover from it and return a relevant HTTP response. Benji provides an exception hierarchy for operations that cannot be completed in the current context (deleting a non-empty bucket, getting a non-existent object…). It is up to the user to handle them and decide how to recover.
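The flattening step used in objectMetadata can be checked in isolation. This small sketch extracts it into a function (toHeaders is a name chosen here for illustration) that turns a multi-valued metadata map into one pair per value, which is the shape withHeaders expects.

```scala
object HeaderSketch {
  // Expands each (name, values) entry into one (name, value) pair per value.
  def toHeaders(meta: Map[String, Seq[String]]): Seq[(String, String)] =
    meta.toSeq.flatMap { case (name, values) => values.map(name -> _) }
}
```

A key with several values produces several pairs sharing the same name, and a key with no values produces none.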

Now we can look at the action to create an object:

def createObject(bucketName: String) = Action.async(parse.multipartFormData) { request =>
  val files = request.body.files.map { file =>
    val source = FileIO.fromPath(file.ref.path)
    val uploaded: Future[NotUsed] =
      source runWith benji.bucket(bucketName).obj(file.filename).put[ByteString]
    uploaded
  }

  if (files.isEmpty) 
    Future.successful(BadRequest("No files to upload"))
  else 
    Future.sequence(files).map { _ =>
      Ok(s"File ${request.body.files.map(_.filename).mkString(",")} uploaded")
    }
}

We will only comment on the essential operations in the code above. Each file received through the HTTP request is put into the object storage.

Let’s take some time to detail this saving step. An ObjectRef has a generic put function, which we can use to save a file in the object storage.

It returns an Akka Stream Sink, which streams the file to the object storage (with back-pressure for free). So in the code above, we just retrieve the object reference (associated with the file name) and then call put on it. Note that, for the file to actually be saved, we run the source obtained from the FileIO.fromPath helper method into our sink, which gives us a Future[NotUsed]. Then we just have to wait for all the saves to complete before returning our HTTP response.
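The final waiting step relies on Future.sequence, which can be sketched on its own. In the example below, upload is a hypothetical stand-in for running a source into the put sink; only the way the futures are combined mirrors the action above.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object UploadSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical upload: completes with the stored name after some async work.
  def upload(name: String): Future[String] = Future(name.trim)

  // Future.sequence turns List[Future[A]] into Future[List[A]], keeping the order;
  // the resulting future fails if any of the uploads fails.
  def uploadAll(names: List[String]): Future[String] =
    if (names.isEmpty) Future.successful("No files to upload")
    else Future.sequence(names.map(upload)).map(done => s"Uploaded ${done.mkString(",")}")

  // Blocking helper, for trying the sketch out only.
  def await[A](f: Future[A]): A = Await.result(f, 5.seconds)
}
```

The uploads are started as soon as their futures are created, so they run concurrently; Future.sequence only gathers their results.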

Finally, let’s see how we can implement the endpoint to delete an object:

case class DeleteObjectForm(ignore: Boolean)

val deleteObjectForm = Form(
  mapping("ignore" -> default(boolean, true))
  (DeleteObjectForm.apply)(DeleteObjectForm.unapply)
)

def deleteObject(bucketName: String, objectName: String) =
  Action.async(parse.form(deleteObjectForm)) { request =>
    val delete = benji.bucket(bucketName).obj(objectName).delete
    val withIgnore = if (request.body.ignore) delete.ignoreIfNotExists else delete
    withIgnore.apply().map { _ => NoContent }
  }

Here we parse the request data as a form object which contains only one boolean attribute: ignore. Once again, we get the reference of the object we want to delete (as in the get action).

Then we call the delete method (on ObjectRef), which gives us a DeleteRequest. When trying to delete an object, it’s possible to ask the Benji object storage not to raise an error if that object doesn’t exist. To do so, we can use ignoreIfNotExists on the DeleteRequest. In the deleteObject action, we apply it depending on the value of the ignore form attribute received with the request. To actually delete the object, we just have to call the apply method on our DeleteRequest.

Conclusion

To sum up this tour of Benji: we have seen how to set up and configure Benji along with its Play! Framework 2 module.

Besides creating our REST facade, we have also explored the Benji API quite thoroughly. We have seen and discussed all operations available on ObjectStorage and all related components (BucketRef, ObjectRef…) to work with buckets and objects.

While discussing the Benji API, we noted that it integrates smoothly with Akka Stream. Using Benji operations, we obtain stream components (Source, Sink) which we can compose easily. Besides getting stream semantics for free, we also benefit from back-pressure when operating on the object storage.

We hope you enjoyed this read. See you soon for our next article, about testing with Benji.
