2020-03-29

JSON Serialization might be your Rails bottleneck

568 words - 2 minutes

Rendering JSON or calling to_json in Rails is slow, and you may benefit from generating the JSON inside the database query itself.

Let’s say you have a method in a controller returning approximately 200 simple JSON objects representing users.

def users
  render json: User.all
end

Everything works, but the endpoint feels slower than it should be, even in the local development environment where latency is zero and there is no user load.

Here’s what the response looks like in the Rails log:

Completed 200 OK in 91ms (Views: 61.9ms | ActiveRecord: 22.5ms | Allocations: 25990)

It’s not terrible, but the time spent rendering the view and the number of allocations seem strangely high.

Problem

JSON serialization in Rails is slow because the JSON transformation (ResultSet#to_json) involves first allocating a Rails model object for each row returned from the query, and then converting each object one by one into its JSON representation. Ruby’s dynamic nature results in a relatively large number of heap allocations and a lot of CPU time spent on the transformation.
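Conceptually, the work looks like this (a plain-Ruby sketch, not the actual Rails internals — Row and its as_json here are illustrative stand-ins):

```ruby
require "json"

# Plain-Ruby stand-in for what the Rails pipeline does per row.
Row = Struct.new(:id, :name) do
  def as_json
    { id: id, name: name }
  end
end

# One object allocation per database row...
rows = [[1, "Ada"], [2, "Grace"]].map { |id, name| Row.new(id, name) }

# ...then one hash allocation per object, and finally the
# JSON string is assembled entirely in Ruby.
json = JSON.generate(rows.map(&:as_json))
```

Every step of that pipeline runs in Ruby, which is where the allocations and CPU time come from.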

What if we could just skip the Ruby processing?

Solution

Instead of returning each row from the database query, we can augment the query to aggregate all results into a single JSON string already in the database. That way we shift the majority of the work onto the more optimized database engine. Postgres has a wide variety of JSON functions we can use to construct the JSON.

In this code snippet we’re creating a helper method, query_to_json, which transforms the given query into a nested table expression for Postgres. The SQL for the expression looks something like array_to_json(array_agg(orig_query)).
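For User.all, the full statement sent to Postgres ends up looking roughly like this (assuming a plain users table; exact quoting varies by Arel version):

```sql
WITH orig_query AS (SELECT "users".* FROM "users")
SELECT array_to_json(array_agg(orig_query)) FROM orig_query
```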

# More efficient replacement for ResultSet#to_json
def query_to_json(query)
  # Keep the original query as a raw SQL literal
  query_sql = Arel.sql(query.to_sql)

  # WITH orig_query AS (<original query>)
  cte_table = Arel::Table.new(:orig_query)
  cte_definition = Arel::Nodes::Grouping.new(query_sql)
  cte = Arel::Nodes::As.new(cte_table, cte_definition)

  # SELECT array_to_json(array_agg(orig_query)) FROM orig_query
  json = Arel::SelectManager.new
    .from(cte_table)
    .project([
      Arel::Nodes::NamedFunction.new(
        "array_to_json",
        [
          Arel::Nodes::NamedFunction.new(
            "array_agg",
            [Arel.sql("orig_query")]
          )
        ]
      )
    ])
    .with(cte)

  # array_agg returns NULL for an empty result set, hence the fallback
  query.connection.execute(json.to_sql).values.first.first || "[]"
end

def users
  render json: query_to_json(User.all)
end

The positive impact on response time is clear:

Completed 200 OK in 41ms (Views: 0.1ms | ActiveRecord: 33.6ms | Allocations: 2206)

While some of the processing time shifted into ActiveRecord, the overall response is more than twice as fast, and the allocation count dropped by an order of magnitude.
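If you want to verify the allocation difference in your own app, a rough counter like this can compare the two approaches (a plain-Ruby sketch, not a Rails API):

```ruby
# Counts how many Ruby objects a block allocates, using GC statistics.
def allocations
  GC.disable
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable
end

# Usage idea (in a Rails console):
#   allocations { User.all.to_json }        # per-row Ruby objects
#   allocations { query_to_json(User.all) } # mostly just the result string
```

The numbers are approximate (background work also allocates), but the gap between the two calls should be obvious.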

If your codebase includes critical paths where JSON generation is involved, consider leveraging the database for serialization.

But I don’t use to_json

If your schema follows the JSON:API specification, fast_jsonapi might be useful.

If you’re using ActiveModel Serializers, postgres_ext-serializers is a way to still leverage the underlying database.