Representing Resources via API

oeoeaio · May 4, 2018, 12:55am

I have a question about the best approach for representing resource information via our API. Any input welcome.

We have a need to represent resources in slightly different ways depending on the context in which the resource is going to be used. As an extreme example: we have an Api::Admin::IndexEnterpriseSerializer and an Api::Admin::EnterpriseSerializer. The index serializer is much simpler and is designed for use on the admin/enterprises index page which does not require much information about each enterprise in order to render properly. The Api::Admin::EnterpriseSerializer is used to render admin/enterprises/:id/edit and necessarily needs much more information to be included for the page to be rendered properly.

We can use the same route and controller action to request these two different representations of an object, and I think this is good. We use a param called ams_prefix in conjunction with a special controller method render_as_json to specify which representation we want and to render the json.

My question is: is there a better approach to specifying the set of required attributes? My concern is that if we continue using the existing approach, we risk polluting our list of serializers with single-use classes that make only minor modifications to another class. I have seen other platforms use an approach where each attribute that is required by the client is explicitly mentioned in the request. Is that a better practice? It feels like it would lead to much messier client-side code…

Or do we just have a base serializer for each model which serializes some minimal set of attributes and then allows additional attributes to be specified as required. I feel like we are always going to need custom serializers, but perhaps some clever logic could help soak up the need for a whole new serializer when only one or two additional attributes are required?
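To make the "base serializer plus extras" idea concrete, here is a minimal sketch in plain Ruby. It deliberately does not use AMS, and every name in it (BasicEnterpriseSerializer, the extra: option, the attribute lists, the Struct standing in for Enterprise) is hypothetical and only there for illustration:

```ruby
require "json"

# Hypothetical stand-in for the Enterprise model; not the real OFN class.
Enterprise = Struct.new(:id, :name, :owner_email, :visible)

# Hypothetical base serializer: always emits a minimal set of attributes and
# lets the caller opt in to extra ones from an explicit whitelist.
class BasicEnterpriseSerializer
  DEFAULT_ATTRIBUTES  = %i[id name].freeze
  OPTIONAL_ATTRIBUTES = %i[owner_email visible].freeze

  def initialize(enterprise, extra: [])
    @enterprise = enterprise
    # Anything that is not whitelisted is dropped, so a client cannot use
    # the param to call arbitrary methods on the model.
    @attributes = DEFAULT_ATTRIBUTES + (extra.map(&:to_sym) & OPTIONAL_ATTRIBUTES)
  end

  def as_json
    @attributes.map { |name| [name, @enterprise.public_send(name)] }.to_h
  end
end

enterprise = Enterprise.new(1, "Farm Shop", "owner@example.com", true)

# Index page: minimal representation.
puts JSON.generate(BasicEnterpriseSerializer.new(enterprise).as_json)

# Edit page: asks for one extra attribute; "destroy" is ignored because it
# is not in the whitelist.
puts JSON.generate(BasicEnterpriseSerializer.new(enterprise, extra: %w[owner_email destroy]).as_json)
```

The whitelist is the important part of the sketch: without it, letting the request name attributes would expose whatever the model responds to.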

maikel · May 4, 2018, 2:27am

It sounds like a mix of concerns within the serialisers. In my understanding, a serialiser should know how to convert a certain object into a string. But here we are talking about which data to select. I would think that the controller selects the data to load depending on action and params and then passes that to a serialiser for rendering. The serialiser just processes what’s in the object and omits missing parts. Is this difficult with ActiveModel::Serializer?

oeoeaio · May 4, 2018, 6:04am

@maikel Yeah, kind of. As the name suggests, AMS is designed to work with instances of ActiveModel, so the job of the controller is really just to load the instance and pass it to the serializer. If an attribute is specified in a serializer which does not exist on the object, it does not just skip that attribute, it raises an error. It is obviously possible to work around this, but you would be really fighting against the convention set up by AMS…
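For comparison, the "process what’s in the object and omit missing parts" behaviour could be sketched outside AMS roughly like this. It is plain Ruby, not an AMS feature, and LenientSerializer and the two Struct classes are hypothetical names used only for illustration:

```ruby
require "json"

# Hypothetical serializer, independent of AMS: it emits only the attributes
# that the given object actually responds to and skips the rest.
class LenientSerializer
  def initialize(object, attributes)
    @object = object
    @attributes = attributes
  end

  def as_json
    @attributes
      .select { |name| @object.respond_to?(name) }
      .map { |name| [name, @object.public_send(name)] }
      .to_h
  end
end

# Two hypothetical objects a controller might build for different actions.
IndexEnterprise = Struct.new(:id, :name)
EditEnterprise  = Struct.new(:id, :name, :description)

WANTED = %i[id name description].freeze

# The index object has no description, so that key is simply left out.
puts JSON.generate(LenientSerializer.new(IndexEnterprise.new(1, "Farm Shop"), WANTED).as_json)
puts JSON.generate(LenientSerializer.new(EditEnterprise.new(1, "Farm Shop", "Organic veg"), WANTED).as_json)
```

The trade-off is that a misspelled attribute name is silently dropped instead of failing loudly, which is the kind of mistake an error on missing attributes protects against.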

enricostn · May 5, 2018, 10:33am

Yeah, I never liked that coupling between models and serializers in AMS. I generally use something else for serializers, something that is not coupled to models. From PORO objects to https://github.com/ismasan/oat

But I would first like to discuss what we really need in the client: what kind of requests we are making, how much they differ, and why.

sauloperez · May 10, 2018, 5:47pm

If I understand correctly, that means that some pages show a subset of a resource’s attributes while others require them all. Am I right or is there a case that isn’t about the amount but the format of these attributes (can’t think of any other scenario)?

If the former is true, I don’t see what actual problem we’re trying to solve. Is it performance? Complexity on the frontend side? In my experience, I have hardly ever had to work with more than one representation of a REST resource, and I don’t see why our case is different. Just look at all the complexity that we can read throughout the discussion; it feels a bit scary to me.

oeoeaio · May 15, 2018, 12:44pm

Yes, the former, and yes performance is my main concern if we decide to just serialise everything. I also don’t have a very good idea of what “everything” actually is, and by that I mean, where is the boundary?

I guess the real problem is our god objects: Enterprise, Order, User, OrderCycle. We’re not strictly serialising just attributes for a lot of these models; we’re serialising URLs and image locations, associations, and the results of custom methods that don’t belong on the models themselves. I can’t think of a way of defining the boundary of what should and should not be serialised in the general sense; it is really, really context specific. Trying to serialise all the things you need to render a form when all you want to do is list a bunch of objects on an index page would have a severe impact on performance.

Perhaps a good demonstration of the problem is Api::Admin::OrderCycleSerializer, which is used to render the OrderCycle form in the admin section. There are a bunch of methods defined on the serialiser which require access to the OrderCyclePermissions service object, so we can’t really replicate the logic client-side. The logic is complex and query-heavy, but the intention is that it will only ever be used to serialise one instance per request, so it’s not a big deal. This also means that the logic has become very view specific, so it doesn’t belong on the model. We should probably define an OrderCycle decorator, and then serialise the decorated order cycle, rather than polluting the serialiser, but I think what we would then end up with is a serialiser that is just for the decorated order cycle, which can no longer be used for a normal order cycle. Maybe that is fine?

That is essentially what we are doing with the ams_prefix param: defining the decorator to be used when representing the object. It’s just that we’ve mashed the decorator into the serialiser. Perhaps we should just separate them and be explicit about the coupling between the decorator and the serialiser?
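A rough sketch of what that separation might look like, in plain Ruby rather than AMS. Everything here is a hypothetical stand-in: the Struct is not the real OrderCycle, FakePermissions only mimics the role of OrderCyclePermissions, and the decorator and serializer names are invented for the example:

```ruby
require "delegate"
require "json"

# Hypothetical stand-in for the model; the real OrderCycle is far richer.
OrderCycle = Struct.new(:id, :name, :coordinator_id)

# Hypothetical stand-in for the permissions service object.
class FakePermissions
  def initialize(user_id)
    @user_id = user_id
  end

  def editable?(order_cycle)
    order_cycle.coordinator_id == @user_id
  end
end

# View-specific logic lives in the decorator, not in the model or serializer.
class OrderCycleFormDecorator < SimpleDelegator
  def initialize(order_cycle, permissions)
    super(order_cycle)
    @permissions = permissions
  end

  def editable
    @permissions.editable?(__getobj__)
  end
end

# The serializer only lists attributes. It expects a decorated order cycle,
# and the coupling is made explicit in its name.
class OrderCycleFormSerializer
  ATTRIBUTES = %i[id name editable].freeze

  def initialize(decorated_order_cycle)
    @order_cycle = decorated_order_cycle
  end

  def as_json
    ATTRIBUTES.map { |name| [name, @order_cycle.public_send(name)] }.to_h
  end
end

order_cycle = OrderCycle.new(7, "Weekly veg box", 42)
decorated   = OrderCycleFormDecorator.new(order_cycle, FakePermissions.new(42))
puts JSON.generate(OrderCycleFormSerializer.new(decorated).as_json)
```

In this shape the serializer would fail on an undecorated order cycle, because editable is only defined on the decorator. That is the "just for the decorated order cycle" limitation mentioned above.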

sauloperez · May 17, 2018, 3:08pm

performance is my main concern if we decide to just serialise everything.

Before jumping into any solution and its details, let’s step back and check whether that is actually a problem that we have. Everything I’ve seen related to performance so far has nothing to do with serializing attributes; it comes from a very long list of N+1 queries, poor usage of the database’s capabilities (lots of map + select + sort at app level that should be done by Postgres) and slow haml view rendering (which go quite hand in hand).

You can check in Skylight a good sample of Katuma’s performance. We move blindly without data and that is why I brought up the issue of setting up Skylight in other instances as well.

In any case, if serializing was a performance bottleneck worth considering I think you nailed here:

I guess the real problem is our god objects. Enterprise, Order, User, OrderCycle.

IMO having slow serializers is just a consequence of this and not the other way around. I have heard you complain at least once about OrderCycle in particular having way too many responsibilities, and I totally agree. So to me, the effort should go towards slimming down these objects.

Also, given

I can’t think of a way of defining the boundary of what should and should not be serialised in the general sense, it is really really context specific.

I’m afraid we would end up with a large number of context-specific serializers that increase the complexity of the codebase and make it harder to change. I imagine adding a new feature or modifying an existing one would require considerable effort on the serializers: working out which ones should be added, changed, removed, etc.

To sum up, I vote for not investing resources in this until we have empirical data showing that it is indeed a major performance bottleneck. So far, I don’t see serialization as a problem.

sauloperez · May 17, 2018, 3:50pm

I took the time to write a short guide to setting up Skylight: https://github.com/openfoodfoundation/openfoodnetwork/wiki/Skylight-setup

oeoeaio · May 18, 2018, 5:12am

Thanks for the guide, and for those thoughts.

So far, I don’t see serialization as a problem.

I have a hunch (but no data to back it up) that the only reason this is true is that we already have multiple serialisers for some models. I will look into setting up Skylight on Aus production.

sauloperez · May 18, 2018, 8:47am

Yes please! I want to see how OFN performs under much more load.