Ruby refinements and the sorbet type checker

On the latest project I have been working on, we decided to adopt the sorbet type system from day one, and recently had to write some code similar to the below:

class MyClass
  extend T::Sig

  sig { returns(T::Hash[String, Enumerable]) }
  def personal_emails
    @personal_emails ||= communications.fetch(:email) { [] }
                                       .index_by(&:use_code)
                                       .reject { |k, _v| k.nil? }
  end

  # Simplified output from remote data store
  sig { returns(T::Hash[Symbol, T::Array[Enumerable]]) }
  def communications
    {
      email:   [
        { name: 'Foo', use_code: 'personal' },
        { name: 'Bar', use_code: 'business' },
        { name: 'Bar', use_code: nil }
      ],
      telephone: [
        { name: 'Baz', use_code: 'personal' }
      ]
    }
  end
end
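
For readers without the rest of the application to hand, here is a minimal plain-Ruby sketch of what that chain produces. It makes a couple of assumptions: the records respond to use_code (a hypothetical Email Struct stands in for them), and each_with_object is used in place of ActiveSupport's index_by so that it runs without any gems. The helper name index_by_use_code is made up for the illustration.

```ruby
# Hypothetical record type; the real objects come from the remote data store.
Email = Struct.new(:name, :use_code)

# Plain-Ruby stand-in for index_by(&:use_code) followed by the nil-key reject.
def index_by_use_code(records)
  # Key each record by its use_code (a later duplicate key overwrites an earlier one).
  indexed = records.each_with_object({}) { |record, hash| hash[record.use_code] = record }
  # Drop the entry whose key is nil, as personal_emails does.
  indexed.reject { |key, _value| key.nil? }
end

emails = [
  Email.new('Foo', 'personal'),
  Email.new('Bar', 'business'),
  Email.new('Bar', nil)
]

p index_by_use_code(emails).keys # => ["personal", "business"]
```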

Now, this is fine for a single method, but as we continue and we start mapping more of the fields, we start to get more and more duplication.

sig { returns(T::Hash[String, Enumerable]) }
def personal_emails
  @personal_emails ||= communications.fetch(:email) { [] }
                                     .index_by(&:use_code)
                                     .reject { |k, _v| k.nil? }
end

sig { returns(T::Hash[String, Enumerable]) }
def telephones
  @telephones ||= communications.fetch(:telephone) { [] }
                                .index_by(&:use_code)
                                .reject { |k, _v| k.nil? }
end

My main irritation is the repeated .reject, and there are a few ways to fix it.

Method 1

One way would be to use a utility method. For example:

sig { params(hash: Enumerable).returns(Enumerable) }
def remove_nil_keys(hash)
  hash.reject { |k, _v| k.nil? }
end

But now we sacrifice readability: the reader sees what we do with the result before seeing what we are doing in the first place. To me, the call to remove_nil_keys overshadows the actual operation we are performing. The method also no longer depends on instance state, so we end up moving it to a class method. In short, we now have a MyClass.remove_nil_keys, which makes no sense, and you just know that someone in SomeUnrelatedClass will call MyClass.remove_nil_keys or copy and paste it to wherever else it’s needed.

sig { returns(T::Hash[String, Enumerable]) }
def personal_emails
  @personal_emails ||= remove_nil_keys(communications.fetch(:email) { [] }.index_by(&:use_code))
end

Method 2

Another way would be to create another method that takes a couple of arguments:

sig { params(fetch_item: Symbol, index_by: Symbol).returns(T::Hash[String, Enumerable]) }
def communication_item(fetch_item, index_by)
  communications.fetch(fetch_item) { [] }.index_by(&index_by).reject { |k, _v| k.nil? }
end

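However, now we run into memoization issues, because a single instance variable can no longer cache the result for every combination of arguments. And as soon as this no longer exactly fits our use cases, we are going to end up copy-pasting code and causing duplication again.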
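
To make the memoization problem concrete: a single `@ivar ||= ...` would cache only the first result, whatever arguments come later. One possible workaround, sketched here in plain Ruby, is to cache one result per argument pair. Everything in it is an assumption made for illustration: the CommunicationStore class, the Item Struct, the lookups counter, and each_with_object standing in for index_by.

```ruby
# Hypothetical class for illustration; not the MyClass from this article.
class CommunicationStore
  Item = Struct.new(:name, :use_code)

  attr_reader :lookups # counts how often the underlying data is read

  def initialize
    @lookups = 0
    @items = {}
  end

  # Cache one result per (fetch_item, key) pair instead of one per object.
  def communication_item(fetch_item, key)
    @items[[fetch_item, key]] ||= begin
      records = communications.fetch(fetch_item) { [] }
      indexed = records.each_with_object({}) { |r, hash| hash[r.public_send(key)] = r }
      indexed.reject { |k, _v| k.nil? }
    end
  end

  private

  # Stand-in for the remote data store.
  def communications
    @lookups += 1
    {
      email: [Item.new('Foo', 'personal'), Item.new('Bar', nil)],
      telephone: [Item.new('Baz', 'personal')]
    }
  end
end

store = CommunicationStore.new
store.communication_item(:email, :use_code)
store.communication_item(:email, :use_code)     # served from the cache
store.communication_item(:telephone, :use_code) # different arguments, computed once
p store.lookups # => 2
```

It works, but the extra bookkeeping is a fair amount of machinery just to avoid repeating a .reject.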

I am a firm believer that code dryness is like a tumble dryer: it is up to the user to decide how dry they want their code to be. Sometimes “hanger dry” is sufficient, and a little bit of duplication is ok if it aids readability. For this reason, this seemed like a good time for some ruby refinements.

Method 3

Basically what I need is an Array method that performs .index_by.reject.

Back in the day we would go ahead and monkey patch the Array class to include our new method and unleash it on our entire application, which meant that things could and did break. Once you started piling on gems that all did their own monkey patching, you sometimes got a plethora of strange and ominous errors.

We also need to remember that Array#index_by and Array#reject both return an Enumerator when called without a block. So, for the sake of completeness, our new method can do the same in just a few extra lines, and we can then chain it with other enumerators should the need arise.
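
The block-or-Enumerator pattern can be sketched on its own, away from Array. The example below is only an illustration: Records is a hypothetical collection class (used so that nothing is patched), and each_with_object stands in for ActiveSupport's index_by.

```ruby
# Hypothetical collection class, used so the sketch does not touch Array itself.
class Records
  include Enumerable

  def initialize(items)
    @items = items
  end

  def each(&block)
    @items.each(&block)
  end

  # With a block: build the index and skip nil keys.
  # Without a block: return an Enumerator that calls this method later.
  def index_by_without_nil
    return to_enum(:index_by_without_nil) unless block_given?

    each_with_object({}) do |item, hash|
      key = yield(item)
      hash[key] = item unless key.nil?
    end
  end
end

records = Records.new(%w[apple avocado banana])

# Block form: a later duplicate key overwrites an earlier one.
p records.index_by_without_nil { |word| word[0] }
# => {"a"=>"avocado", "b"=>"banana"}

# Enumerator form, chained with with_index: odd positions return nil and are dropped.
p records.index_by_without_nil.with_index { |word, i| i.even? ? word[0] : nil }
# => {"a"=>"apple", "b"=>"banana"}
```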

If we were to do this using simple monkey patching, we could simply do the following:

class Array
  def index_by_without_nil(&block)
    if block_given?
      index_by(&block).reject { |k, _v| k.nil? }
    else
      to_enum(:index_by_without_nil)
    end
  end
end

But let’s face it: monkey patching might work, but it’s not pretty.

Wow! That was a lot of introduction!

Method 4: Using ruby refinements

If we are to do this using ruby refinements, we need a couple more lines, both to use a refinement instead of a monkey patch and to keep the sorbet type checker happy. The whole solution now looks something like this:

module Util::Array::IndexByWithoutNil
  extend T::Sig
  extend T::Helpers

  include Kernel
  include Enumerable

  abstract!

  refine Array do
    extend T::Sig

    sig { params(block: T.nilable(T.proc.void)).returns(Enumerable) }
    def index_by_without_nil(&block)
      if block_given?
        index_by(&block).reject { |k, _v| k.nil? }
      else
        to_enum(:index_by_without_nil)
      end
    end
  end
end

class MyClass
  extend T::Sig

  using Util::Array::IndexByWithoutNil

  sig { returns(T::Hash[String, Enumerable]) }
  def personal_emails
    @personal_emails ||=
      communications.fetch(:email) { [] }
                    .index_by_without_nil(&:use_code)
  end

  # Simplified output from remote data store
  sig { returns(T::Hash[Symbol, T::Array[Enumerable]]) }
  def communications
    {
      email:   [
        { name: 'Foo', use_code: 'personal' },
        { name: 'Bar', use_code: 'business' },
        { name: 'Bar', use_code: nil }
      ],
      telephone: [
        { name: 'Baz', use_code: 'personal' }
      ]
    }
  end
end
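
Setting sorbet aside for a moment, the scoping that makes refinements safer than a monkey patch can be seen in a small plain-Ruby sketch. The names here (IndexByWithoutNilSketch, WithRefinement, WithoutRefinement) are hypothetical, and each_with_object again stands in for ActiveSupport's index_by so that the example runs without gems.

```ruby
# Hypothetical module name; a cut-down version of the refinement, without sorbet.
module IndexByWithoutNilSketch
  refine Array do
    def index_by_without_nil
      each_with_object({}) do |item, hash|
        key = yield(item)
        hash[key] = item unless key.nil?
      end
    end
  end
end

class WithRefinement
  using IndexByWithoutNilSketch # active from here to the end of this class body

  def call(list)
    list.index_by_without_nil { |word| word[0] }
  end
end

class WithoutRefinement
  # No `using` here, so Array is untouched in this class.
  def call(list)
    list.index_by_without_nil { |word| word[0] }
  rescue NoMethodError
    :not_available
  end
end

p WithRefinement.new.call(%w[apple banana])    # => {"a"=>"apple", "b"=>"banana"}
p WithoutRefinement.new.call(%w[apple banana]) # => :not_available
```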

I was amazed at just how few errors sorbet threw, and resolving them was as easy as including a few ruby modules so that sorbet could find the references we make in the refinement. I am not amazingly happy that we have to include these modules purely to satisfy the type checker, but the overhead of including them is pretty low.

Interestingly we had to call abstract! provided by the T::Helpers module to satisfy some abstract methods defined in Enumerable.

However, with the refinement code above we still get an error reported by sorbet:

demo.rb:33: Method index_by_without_nil does not exist on T::Array[T.untyped] https://srb.help/7003
     33 |          .index_by_without_nil(&:use_code)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is telling us that sorbet, as great as it is, doesn’t understand that our refinement makes this method available. Refinements are not used as much as they possibly should be, so it’s no wonder that refinement support is (as far as I know) not yet included in sorbet.

To get around this, we can write a simple sorbet plugin to generate the required rbi for us.

# sorbet/plugins/index_by_without_nil.rb
source = ARGV[5]
/using Util::Array::IndexByWithoutNil/.match(source) do |_|
  puts <<-RBIDEF
    class ::Array
      def index_by_without_nil(&block); end
    end
  RBIDEF
end

# sorbet/triggers.yml
triggers:
  using: sorbet/plugins/index_by_without_nil.rb

This plugin simply looks out for our using Util::Array::IndexByWithoutNil line and, when it finds it, outputs the method definition. You can find more information on sorbet plugins on the sorbet website.
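
The matching logic can also be pulled into a plain method, which makes it easier to check in isolation. This is only a sketch: rbi_for and USING_PATTERN are hypothetical names, and the way sorbet hands the source to the script (ARGV[5] in the plugin above) is not touched here.

```ruby
# Hypothetical names; mirrors the plugin's logic so it can be exercised without sorbet.
USING_PATTERN = /using Util::Array::IndexByWithoutNil/

# Returns the RBI text for a matching source line, or nil when the line is unrelated.
def rbi_for(source)
  return nil unless USING_PATTERN.match?(source.to_s)

  <<~RBI
    class ::Array
      def index_by_without_nil(&block); end
    end
  RBI
end

puts rbi_for('using Util::Array::IndexByWithoutNil')
p rbi_for('using SomethingElse') # => nil
```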

Conclusion

When you start using sorbet it can be a pain to understand how all its functional parts come together. But after a while, you find it is extremely extensible and worth that extra bit of effort making sure it’s set up for as much of your codebase as you can.

About Paul Trippett

Ruby Architect @ NGA Human Resources

London, UK http://trippett.co.uk