Better Feedback For Long Running Exports With ActionCable

Since Rails 5 is now in beta and ActionCable has been merged in, I wanted to give it a try. At my current job, we have a few reports that take a while to build and export (usually to PDF). For these, a background job compiles the report while the front end polls to see whether the file is ready for download. I'm not a fan of polling, as it creates a lot of wasteful requests. Using WebSockets to receive messages broadcast from the backend seemed like a much more elegant approach. This post uses that simple use case to demonstrate ActionCable.

To demonstrate this use case, we will search the PubMed database and return a list of publications based on a search term. When the user enters a search term, a background job is created (I'm using Sidekiq to handle this through ActiveJob), and on the front end a modal is displayed with a spinner (although you could display a progress bar or similar and update it with information received from the background job). The background job queries the PubMed database and adds each entry to a CSV. For this example I've capped the number of records at 5000, because the search could take an extremely long time depending on how general the user's search term is. As the search results are retrieved, the background job broadcasts the offset it is currently working on (as mentioned earlier, this could be used to display and update a progress indicator). Once the search is complete, a record is created and the file is attached to it. A link is generated for the search result record and a success message is broadcast on the socket. The front end then displays the link so the user can download the file. Let's look at the code!

Backend Code

First let's start on the backend, specifically the channel. For this example, it's actually going to be quite boring. The only thing that needs to be defined is the stream name.

class SearchChannel < ApplicationCable::Channel
  def subscribed
    stream_from "search_#{params[:search_uuid]}"
  end

  def unsubscribed
    # Any cleanup needed when channel is unsubscribed
    Rails.logger.debug('SearchChannel#unsubscribed')
  end
end

In Rails 5 there is a new directory under app called channels, where this code lives. You can use the generator to create a template for this code as well; it also generates the CoffeeScript that creates the subscription to the channel. I decided to create the channel and the JavaScript manually, since it's pretty basic and I prefer JavaScript over CoffeeScript. So let's look at this code: we have a subscribed and an unsubscribed method. The subscribed method sets up the stream to broadcast over. Including the search UUID in the stream name is what lets the job target the specific client that started the search. The unsubscribed method is used to do any cleanup that might be needed. You can also include other methods that allow the JavaScript code to communicate back to the channel. I recommend watching DHH's video on how to create a simple chat program, which goes over how to handle this in more detail.

Now let's look at the background job, which lives in the app/jobs directory.

class PubmedSearchJob < ApplicationJob
  require 'csv'

  queue_as :default
 
  SEARCH_LIMIT = 500
  HARD_LIMIT = 5000

  STATUS_START = 'start'
  STATUS_COMPLETE = 'complete'
  STATUS_FAILED = 'failed'

  HEADERS = ['pubmed_id', 'pubmed_central_id', 'title', 'author_names', 'publication_date', 'journal']

  def perform(search, search_uuid)
    stream = "search_#{search_uuid}"

    ActionCable.server.broadcast(stream, status: STATUS_START)

    offset = 0

    search_result = Pubmed.search(search, offset, SEARCH_LIMIT)
    number_of_articles = search_result.count

    ActionCable.server.broadcast(stream, number_of_articles: number_of_articles)

    search_result_export_filename = "#{Rails.root}/tmp/#{stream}.csv"
    CSV.open(search_result_export_filename, 'wb') do |csv|
      csv << HEADERS

      while search_result.pubmed_ids && ((offset + SEARCH_LIMIT) <= HARD_LIMIT)
        ActionCable.server.broadcast(stream, current_offset: offset)
        articles = Pubmed.fetch(search_result.pubmed_ids)

        articles.each do |article|
          csv << [
            article.pubmed_id,
            article.pubmed_central_id,
            article.title,
            article.author_names,
            article.publication_date,
            article.journal.title
          ]
        end

        offset += SEARCH_LIMIT
        search_result = Pubmed.search(search, offset, SEARCH_LIMIT)
      end
    end

    search_result_export_file = File.open(search_result_export_filename)
    search_result_export = SearchResult.create(search: search, search_uuid: search_uuid, document: search_result_export_file)

    # create always returns a record (truthy), so check that it was actually saved
    if search_result_export.persisted?
      download_link = Rails.application.routes.url_helpers.search_result_path(search_result_export)

      ActionCable.server.broadcast(stream, status: STATUS_COMPLETE, download_link: download_link)
    else
      ActionCable.server.broadcast(stream, status: STATUS_FAILED)
    end
  end
end

I won't go into too much detail since I explained it earlier, but this job conducts the search and builds the results for download. The one thing I do want to point out is the use of ActionCable.server.broadcast. This broadcasts information to any consumers of a specific stream, in this case the stream for this particular search, and you can pass data back to the consumers that are listening. In this example, when the search starts it sends a "start" status; each time it fetches the next batch of entries it sends the current offset, which can be used to give feedback on the progress of the search; and finally it sends a "complete" status along with a link, or a "failed" status if something went wrong.
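On the consumer side, those payloads arrive as plain objects, so the offset messages can be turned into a percentage. The following is only a sketch: createProgressTracker is a hypothetical helper that is not part of the project, and it assumes that number_of_articles is the total number of matches and that the job stops at the 5000-record cap.

```javascript
// Hypothetical helper (not in the original project): remembers the total
// from the number_of_articles broadcast and converts each current_offset
// broadcast into a percentage between 0 and 100.
function createProgressTracker(hardLimit) {
  var total = null;

  return function update(data) {
    if (typeof data.number_of_articles === 'number') {
      // The job never exports more than hardLimit records.
      total = Math.min(data.number_of_articles, hardLimit);
    }

    if (typeof data.current_offset === 'number' && total > 0) {
      return Math.min(100, Math.round((data.current_offset / total) * 100));
    }

    // Not enough information yet to compute a percentage.
    return null;
  };
}
```

The returned function could be called with each message that carries no status, and a non-null result used to set the width of a progress bar instead of showing the spinner.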

Frontend

First, let's take a quick look at the HTML. It contains a form for entering the search term and some markup for a Bootstrap modal.

<div class="container">
  <h1>Pubmed Search</h1>
  <div class="row">
    <div class="col-md-12">
      <form class="form-inline" id="search-form">
        <div class="form-group">
          <label for="search" class="control-label">Search:</label>
          <input type="text" class="form-control" id="search" autocomplete="off">
        </div>
      </form>
    </div>
  </div>
</div>

<div class="modal fade" id="search-status" tabindex="-1" role="dialog" aria-labelledby="search-status-label">
  <div class="modal-dialog" role="document">
    <div class="modal-content">
      <div class="modal-header">
        <button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">&times;</span></button>
        <h4 class="modal-title" id="search-status-label">Search Results Status</h4>
      </div>
      <div class="modal-body">
        <div class="loading" style="display: none;">
          <img src="assets/loading_spinner.gif" width="100">
        </div>
        <div class="search-results" style="display: none;">
          <a href="#" class="search-results-download" target="_blank">Download Search Results</a>
        </div>
      </div>
      <div class="modal-footer">
        <button type="button" class="btn btn-default" data-dismiss="modal">Close</button>
      </div>
    </div>
  </div>
</div>

Now for the JavaScript.

var $search = $('#search');

$search.on('keypress', function(e) {
  if(e.which === 13) {
    var searchTerm = $.trim($search.val());

    if(searchTerm !== undefined && searchTerm !== '') {
      $search.parent().removeClass('has-error');
      
      var searchUUID = guid();

      $.ajax({
        url: '/search',
        method: 'POST',
        data: {
          search: searchTerm,
          search_uuid: searchUUID
        }
      }).then(function() {
        $search.val('');
        displaySearchModal(searchUUID);
      }).fail(function() {
        // display error
      });
    } else {
      $search.parent().addClass('has-error');
    }
    return false;
  }
});

First, an event handler is set up to listen for keypresses on the search field. If the enter key is pressed and the field isn't empty, a search UUID is generated and an AJAX request is made to the backend. The backend then kicks off the job described earlier.
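The guid function called in that handler isn't shown in this post. A minimal sketch of what it might look like, assuming a random version-4-style UUID is good enough here; this is an illustration, not necessarily the implementation the project uses.

```javascript
// Sketch of the guid() helper used above (assumed implementation).
// Builds a version-4-style UUID string from Math.random(), which is
// not cryptographically secure.
function guid() {
  return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function(c) {
    var r = Math.random() * 16 | 0;
    // 'x' takes any hex digit; 'y' is restricted to 8, 9, a or b.
    var v = c === 'x' ? r : (r & 0x3 | 0x8);
    return v.toString(16);
  });
}
```

Since the UUID doubles as the stream name, anyone who can guess it could subscribe to the same stream, so a stronger random source may be worth considering outside of a demo.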

From the previous code, if the background job is successfully created, it calls the displaySearchModal function.

function displaySearchModal(searchUUID) {
  var $searchModal = $('#search-status');

  $searchModal.modal({
    keyboard: false
  });

  $searchModal.find('.loading').show();
  $searchModal.find('.search-results').hide();
  $searchModal.find('.search-results-download').attr('href', '#');

  $searchModal.on('hide.bs.modal', function() {
    console.log('search modal close');
    // handle clean up, potentially cancelling the request
  });

  var search = App.cable.subscriptions.create({
    channel: "SearchChannel",
    search_uuid: searchUUID
  }, {
    connected: function() {
      console.log('connected');
    },

    disconnected: function() {
      console.log('disconnected');
      search = null;
    },

    received: function(data) {
      switch(data.status) {
        case 'complete':
          this.unsubscribe();
          search = null;
          $searchModal.find('.search-results-download').attr('href', data.download_link);
          $searchModal.find('.loading').hide();
          $searchModal.find('.search-results').show();
          break;
        case 'failed':
          this.unsubscribe();
          search = null;
          $searchModal.modal('hide');
          // display error
          break;
        default:
          console.log(data);
      }
    }
  });
}

This displays the modal with a spinner and clears out any previous search results. Then a subscription is created to listen for anything broadcast on the stream. The subscription takes the channel to subscribe to and any parameters to pass to it. If you remember from earlier, the channel definition named the stream using params[:search_uuid]; this is where that value is set. The subscription also has functions that are called when it is connected, when it is disconnected, and when data is received. The received function is called whenever data is broadcast from the backend to the stream. When it receives a status of "complete", it hides the spinner and displays a link for downloading the search results export. When it receives a status of "failed", it unsubscribes and hides the modal. The subscription is also where you would define methods to send data back to the channel. Again, I'll point to DHH's video that I mentioned earlier, as it covers how to do this.
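For a rough idea of what sending data back could look like, ActionCable subscriptions expose a perform function that invokes a method on the channel. The sketch below assumes a hypothetical cancel action on SearchChannel, which the channel in this post does not define, and requestCancel is likewise a made-up helper name.

```javascript
// Hypothetical helper: asks the server to cancel a running search and
// stops listening. Assumes SearchChannel has a `cancel` method, which
// the channel shown earlier does not implement.
function requestCancel(subscription, searchUUID) {
  if (!subscription) {
    return false; // already finished or never subscribed
  }

  // perform(action, data) calls the named method on the channel.
  subscription.perform('cancel', { search_uuid: searchUUID });
  subscription.unsubscribe();
  return true;
}
```

Something like this could be called from the hide.bs.modal handler, passing in the search subscription, so that closing the modal also stops the job.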

And there you have it. This is obviously a simple use case for ActionCable as we are just listening for results from the channel and not sending data to the channel. You can view the complete project here.