Autoload type loading code, stop loading everything on boot!

MSP_Greg · July 28, 2019, 1:31pm

It seems rather common for plugins to load all their code on SU boot. That’s always seemed odd to me. Given how many plugins are available, it certainly seems possible that a typical user SU session might never use much of the code that is loaded.

Hence, the following seems to work as an autoload style system:

EXT_DIR = File.dirname(__FILE__).force_encoding('UTF-8').freeze

SU_AUTO = {
  DlgBase:   'dlg_base'  ,
  DlgFixed:  'dlg_sub'   ,
  DlgPB:     'dlg_sub'   ,
  DlgVari:   'dlg_sub'   ,
  DlgTable:  'dlg_table' ,
  DlgComp:   'comp'      ,
  Edit:      'edit'      ,
  DlgIE:     'entity'    ,
  DlgExport: 'export'    ,
  DlgImport: 'import'    ,
  DlgInfo:   'info'      ,
  DlgLayer:  'layer'     ,
  DlgMats:   'material'  ,
  DlgIN:     'numeric'   ,
  Obs:       'obs'       ,
  Settings:  'settings'  ,
  DlgTests:  'tests'
}.freeze

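class << self
  # Ruby calls this when a constant in this namespace is referenced but not yet defined.
  def const_missing(name)
    if fn = SU_AUTO[name]
      if Sketchup.load "#{EXT_DIR}/#{fn}"
        # const_defined? avoids re-entering const_missing when the file did not define the constant
        if const_defined?(name, false)
          return const_get(name)
        else
          UI.messagebox "Problem loading file #{fn}"
        end
      else
        UI.messagebox "Sketchup.load couldn't load #{fn}"
      end
    else
      super
    end
    nil
  end
  private :const_missing
end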

The above code would need to be placed in any ‘namespace parent’, and loads children on demand. Paths do need to be taken into account.

In this case, all the ‘child’ files are in the same directory as the parent. This isn’t the norm for a typical gem structure, but we’re not loading with RubyGems…

There can be issues with ‘autoload’ systems when multiple threads are involved, but I suspect that is not common with SU plugins…
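For anyone who wants to try the idea outside SketchUp, here is a minimal sketch of the same const_missing approach in plain Ruby. It assumes Kernel#load as a stand-in for Sketchup.load, and the DemoExt namespace, the 'info' file, and the temp directory are hypothetical names made up for the illustration.

```ruby
require 'tmpdir'

# Hypothetical namespace and file names, for illustration only.
module DemoExt
  DIR = Dir.mktmpdir('demo_ext')

  # Stand-in for a 'child' file that would normally ship with the plugin.
  File.write(File.join(DIR, 'info.rb'), <<~'RUBY')
    module DemoExt
      class DlgInfo
        def title; 'Info'; end
      end
    end
  RUBY

  # Constant name => file name (without extension), as in SU_AUTO.
  AUTO = { DlgInfo: 'info' }.freeze

  def self.const_missing(name)
    fn = AUTO[name]
    return super unless fn
    # Assumption: plain Kernel#load replaces Sketchup.load here.
    load File.join(DIR, "#{fn}.rb")
    return const_get(name) if const_defined?(name, false)
    raise NameError, "#{fn}.rb did not define #{name}"
  end
end

before = DemoExt.const_defined?(:DlgInfo, false) # nothing loaded yet
title  = DemoExt::DlgInfo.new.title              # first reference triggers the load
after  = DemoExt.const_defined?(:DlgInfo, false)
puts [before, title, after].inspect
```

The child file is only read from disk when DemoExt::DlgInfo is first referenced; later references find the constant directly and never reach const_missing.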

MSP_Greg · July 28, 2019, 2:52pm

Apologies for assuming people were familiar with autoload.

Complex apps often have quite a few different paths thru the code. So, correctly loading code can be tedious, and is often made worse by CI testing, which may load the code in a ‘non-standard’ way.

Hence, Ruby has methods Kernel.autoload and Module#autoload which allow one to associate a file with a constant name. If the constant name is not defined, ‘autoload’ will attempt to load the file, and allow the Ruby code to continue.

The code that I posted uses the const_missing method to load the file and return the constant’s value, which is the class/module defined in the file.

Hence, when using autoload type systems, no require/load statements are used, and everything loads on demand.

The standard Ruby autoload methods only work with *.rb or *.so files, so they cannot be used in an encrypted SU plugin.

The above code uses Sketchup.load instead of require.
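For comparison, a minimal sketch of the standard Module#autoload mechanism described above, for plain .rb files only. The LazyDemo namespace and the lazy_report.rb file are hypothetical, written to a temp directory so the example is self-contained.

```ruby
require 'tmpdir'

dir  = Dir.mktmpdir('autoload_demo')
path = File.join(dir, 'lazy_report.rb') # hypothetical file name
File.write(path, <<~'RUBY')
  module LazyDemo
    class Report
      def name; 'report'; end
    end
  end
RUBY

module LazyDemo; end

# Associate the constant with the file; nothing is loaded yet.
LazyDemo.autoload(:Report, path)

pending_before = LazyDemo.autoload?(:Report) # the registered path while still pending
report_name    = LazyDemo::Report.new.name   # first reference requires the file
pending_after  = LazyDemo.autoload?(:Report) # nil once the constant is loaded

puts pending_before.inspect, report_name, pending_after.inspect
```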

DaveR · July 28, 2019, 2:57pm

Greg, I’m not a coder by any means but your post made me think of an option in the Sketchucation Tools extension that allows you to leave extensions normally unloaded and load them on demand. You have the option to load the extension only for the current session or you can set them to load on start up if you want. Have you experimented with that?

MSP_Greg · July 28, 2019, 3:37pm

Dave,

I was unaware of that; it sounds helpful.

What I’m actually referring to is that the code required to give you access to a plugin’s features/UI is often a very small percentage of the plugin’s code. The ‘access’ code can (and should) be separate from the actual ‘feature’ code.

The code I listed is an option that allows the plugin to load the ‘feature’ code on demand, as opposed to loading all the code when the plugin is loaded.

The code I listed is code that plugin creators/coders could use, but not something the plugin user would have access to, or control over.

As an example, a plugin I wrote a long time ago has 19 ruby files, but only three are needed to give the user the UI to use it. Many plugins would load all 19 files instead of 3.

Multiply that by several plugins, and the time it takes SU to load may be increased, along with the memory required, etc…

tt_su · July 29, 2019, 9:25am

Good suggestions Greg! This is something I had on my list to look into after I started looking closer at extension boot times and how Ruby gems often use autoload.

I added this topic to my list of topics to make examples and tutorials for. Might be a good blog post. I’ve been playing with the idea of having extension load times appear in the Extension Manager - making it easier for developers and users to see what’s eating time.

Related to boot times, there’s another thing I’ve been meaning to look into. I have a suspicion that Ruby toolbars (at least on Windows) might be causing a significant hit on boot times.

MSP_Greg · July 29, 2019, 2:18pm

Thomas,

“I have a suspicion that Ruby toolbars (at least on Windows) might be causing a significant hit on boot times.”

Just timed this with a plugin that is adding 29 menu items, 2 submenus, and 13 toolbar buttons (11 are svg). That whole method (create_ui) takes 0.1220 seconds.

Some of the items are only added if files exist in the plugin, so it’s more than just the normal commands used for menus/toolbars…

Timed it using that handy Ruby console router <bg>
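A timing like that can be taken with a few lines of plain Ruby. This is only a sketch: the create_ui below is a hypothetical stand-in that builds strings rather than real menus and toolbars, and the timed helper is an assumed name, not part of the SketchUp API.

```ruby
# Hypothetical stand-in for a plugin's UI setup method.
def create_ui
  items = []
  29.times { |i| items << "menu item #{i}" }
  13.times { |i| items << "toolbar button #{i}" }
  items
end

# Runs the block and returns [block result, elapsed seconds].
def timed
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0]
end

ui_items, seconds = timed { create_ui }
puts format('create_ui built %d items in %.4f seconds', ui_items.size, seconds)
```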

kengey · July 29, 2019, 2:27pm

Is the size of the codebase of huge impact, or rather the amount of initialized objects on startup?

MSP_Greg · July 29, 2019, 3:20pm

I think it’s difficult to determine. All code takes:

time to load from disk
time to parse
time to register all the names in tables
objects added to the object store
larger object store means longer gc times
larger $LOADED_FEATURES means longer lookup times

Any code that’s loaded will always take time, memory, and increase the time it takes for many operations.
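Some of these costs can at least be observed for a single file. The sketch below assumes a hypothetical feature_code.rb written to a temp directory; it records the elapsed time of the require and the growth of $LOADED_FEATURES. The numbers will vary by machine, so none are shown.

```ruby
require 'tmpdir'

dir  = Dir.mktmpdir('load_cost')
path = File.join(dir, 'feature_code.rb') # hypothetical 'feature' file
File.write(path, <<~'RUBY')
  module FeatureCode
    # Objects created at load time stay in the object store for the session.
    TABLE = Array.new(1_000) { |i| "entry #{i}" }.freeze
  end
RUBY

features_before = $LOADED_FEATURES.size
t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
require path
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
features_added = $LOADED_FEATURES.size - features_before

puts format('require took %.5f s, $LOADED_FEATURES grew by %d, %d objects kept alive',
            elapsed, features_added, FeatureCode::TABLE.size)
```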

Will it significantly affect things?

I don’t think anyone has the time to determine the effects, given all the variables, for instance, SSD, or mechanical drive, RAM starved system or not, etc, etc.

Looking at Rails, they go to a lot of trouble to allow ‘autoloading’ of files, as things like threading, forking, etc affect the standard Ruby autoload (which don’t affect the vast majority of SU plugins)

Some of the reason they do so may be to minimize the require statements needed, as load order may be somewhat indeterminate with large applications, especially with CI. I suspect the main reason is to only load code when desired by the app devs.

Specifically for SU, I’m not really interested in installing the ‘top 10’ plugins and seeing how they affect SU startup time. @tt_su has mentioned concern for the issue, so I’ll trust his judgement.

I think ‘lazy-loading’ should be standard practice in plugins, along with things like “don’t leave observers connected unless it affects your plugin’s UI”…

kengey · July 29, 2019, 8:13pm

I agree largely. But I was only wondering how having a codebase of 1000’s separate files that only get loaded when needed but together with poor coding practices compares to a single file codebase with better coding practices, so that all code is there to create objects, but the code will only be called when needed. My release builds, for example, are combined into a single .rb file that can get large in file size, but I try to minimize the use of modules or even singletons. Typically my code contains only 2 modules (to create the namespace) and that’s it. Then there is 1 singleton, simply because one needs a globally accessible object to initialise. All other objects are wrapped in a proxy and injected into dependent objects through a service locator (https://stackify.com/service-locator-pattern/ EDIT: perhaps this is a better link, since I am not passing the service locator to every object, only their dependencies, so the previous link might be confusing: Dependency injection in ASP.NET Core | Microsoft Learn). Objects are thus only created when really needed.

The generic proxy class I use looks like this

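class GenericProxy

  def initialize(item_creation_proc)
    @item = nil
    @item_creation_proc = item_creation_proc
  end #def

  # Forward unknown calls to the wrapped object, creating it on first use.
  def method_missing(method, *args, &block)
    item = item()
    if item.respond_to?(method)
      item.send(method, *args, &block)
    else
      super
    end
  end

  # Paired with method_missing so respond_to? keeps its usual two-argument signature.
  def respond_to_missing?(method, include_private = false)
    item().respond_to?(method, include_private) || super
  end #def

  private

  # Creates the wrapped object on first access and caches it.
  def item()
    return @item ||= @item_creation_proc.call()
  end #def

end #class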

to illustrate what it does


class SlowInitializingClass

  def initialize()
    # behave slow
    sleep(20)
  end #def

  def hello()
    return "hello"
  end #def

end 


class DependendClass

  def initialize(slow_class_instance)
    @slow_instance = slow_class_instance
  end #def

  def method_that_relies_on_slow_one()
    return @slow_instance.hello()
  end #def

end #class

# we need a reference to a SlowInitializingClass instance. But the creation is slow so we wrap it in a proxy
slow_instance_proxy = GenericProxy.new(->() {
  # one could retrieve dependencies for SlowInitializingClass here, if needed...
  return SlowInitializingClass.new()
})

# now we can pass the reference as if it were a SlowInitializingClass instance, but without the slowness. After all, we are not sure if we will ever need to call it.
dependend_instance = DependendClass.new(slow_instance_proxy)

# do other things in code... dependend_instance knows how to get the slow class if needed
# ...
#...
# invoke, so instead of the initializer it is the method that goes slow the first time
dependend_instance.method_that_relies_on_slow_one()

MSP_Greg · July 29, 2019, 11:02pm

@kengey

Ok. My original post was about autoloading/load-on-demand. You’ve replied with a post minimally discussing that, and then discussing a pattern that somehow affects it?

First of all, regarding the concept of creating one source file from many, have you compared the load time of separate files vs the single source file? From your description, I assume you have both available…

Regarding source files, you threw out the number “1,000s”. Might be a little bit high…

On to your other subject…

“but the code will only be called when needed”

That’s true of any well designed app. How does the pattern you use, and the articles you linked to change anything?

Since SU is an app with a UI, there’s no reason to create objects unless they’re required for loading, or required due to a user action. Some dev’s may make a choice to create objects on load, which causes slower loading but quicker response with the UI, but that’s a reasonable choice.

“how having a codebase of 1000’s separate files that only get loaded when needed but together with poor coding practices”

Please clarify what you mean by ‘poor coding practices’.

Finally, I would caution against using patterns designed for one type of language with another. Using a pattern without a clear reason for doing so will just make code more complex.

kengey · July 30, 2019, 6:15am

I am not trying to argue against your post. The purpose of your post is to show how to avoid loading unneeded code, and I am questioning whether it is really the amount of code that affects load time, or the way that code is written and the number of objects that are initialized at load time. And I question whether the amount of code you manage not to load at startup is really that significant. Buttons in a toolbar not only invoke some command but also reflect a certain state in your app to make them checked or grayed.

Poor code? Use modules for everything… they defeat OO and they are initialized immediately even if you do not need them, or only need parts of them. All my extensions up till around 18 months ago used modules quite a lot for what would be singletons, such as data repositories, because that made them globally accessible. That was slow on boot. Because the entire repo would be created and stored in memory even if the user might never use the app. Then I got involved in a few ASP.NET Core projects and learned about seeing everything as a service, using a service provider and constructor dependency injection, and eventually tried to map those same practices onto my Ruby code. It did solve the issues I was experiencing at least.

Anyhow, if you prefer, we could continue this discussion elsewhere, so this one can remain about autoloading.

kengey · July 30, 2019, 6:45am

I have done a comparison on loading write_xlsx. This is the largest library I have combined into one file. Here are all the separate files. It has a dependency on zip, so that is merged into that same large file:

All these files are combined into 1 file with 26k lines:

And here is the comparison.
Separate files:

All into 1 with 26k lines:

There is no one here creating extensions that are made up of 26k lines.

Not so much a Rubyist myself, so I might be interpreting these numbers wrong. But one 26k-line file takes 0.07s while the bunch of separate files takes somewhere around 1.6s. If I am not mistaken, combining the files into one makes loading roughly 23 times faster (1.6 / 0.07 ≈ 23), an improvement of about 2,200%.

Another edit. Of course, the previous comparison is pretty slow because the path containing all the files is the last in $LOAD_PATH. For that reason I did a comparison where I reversed the array:

0.17s vs 0.07s. Still 2.5 times slower than the single file variant, but, 10 times(!) faster than the non reversed original $LOAD_PATH. That makes me think… What if Sketchup by default would’ve just reversed the order of $LOAD_PATH? The number of requires for libraries is I believe significantly lower than the number of requires for code in your own codebase (if you do not happen to combine everything into one file as I do)
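A rough way to reproduce the $LOAD_PATH position effect is sketched below. Everything in it is hypothetical: the part_a/part_b files, the 20 empty decoy directories, and the file counts are made up, and the resulting times depend heavily on the OS and disk, so no timings are claimed.

```ruby
require 'tmpdir'

root = Dir.mktmpdir('load_path_demo')
lib  = File.join(root, 'lib')
Dir.mkdir(lib)

# Two sets of tiny files, so each run requires files that are not yet loaded.
50.times do |i|
  File.write(File.join(lib, "part_a#{i}.rb"), "PART_A#{i} = #{i}\n")
  File.write(File.join(lib, "part_b#{i}.rb"), "PART_B#{i} = #{i}\n")
end

# Empty directories that get searched before lib in the first run.
decoys = Array.new(20) { |i| File.join(root, "decoy#{i}").tap { |d| Dir.mkdir(d) } }

def seconds
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
end

original = $LOAD_PATH.dup

# Run 1: lib is the last entry, behind every other path and the decoys.
$LOAD_PATH.concat(decoys).push(lib)
last_time = seconds { 50.times { |i| require "part_a#{i}" } }
decoys.each { |d| $LOAD_PATH.delete(d) }
$LOAD_PATH.delete(lib)

# Run 2: lib is the first entry.
$LOAD_PATH.unshift(lib)
first_time = seconds { 50.times { |i| require "part_b#{i}" } }
$LOAD_PATH.delete(lib)

puts format('lib last: %.4f s, lib first: %.4f s', last_time, first_time)
```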

But that also makes me think. Libraries are usually split into far more, and far smaller, files than extension developers split their code into. So that might just reverse the issue.

I believe 1 core thing Ruby does wrong: it makes 0 distinction between source code and runtime code. Source code and the way we prefer to keep things nicely separated in different files only benefits the developer. There are a few ways to try to solve that problem, such as autoload, or doing your requires inside a method so that code is required gradually, but I am afraid none of them will ever outperform 1 file per library instead of multiple. If loading separate files gradually is faster, then something is wrong with what that code does upon require when it does not need to.

kengey · July 30, 2019, 11:47am

Finally, I have also done a comparison for a library that is present in the stdlib: rexml.
I have combined rexml/document into 1 file and placed it at the same location rexml/document_combined. I will also attach that file here.
edit: other file with added copyright:
document_combined.rb (219.6 KB)

require "benchmark"

time = Benchmark.measure do
  require "rexml/document"
end
puts time

gives:

require "benchmark"

time = Benchmark.measure do
  require "rexml/document_combined"
end
puts time

gives:

MSP_Greg · July 30, 2019, 3:49pm

@kengey

Thanks for taking the time with your replies.

First, you mentioned the term ‘Rubyist’, which is a term I used in my profile. I should probably change that, but regardless, I’m a coder.

Been coding since the seventies as a teenager, started in CompSci, my ‘Gang of Four Design Patterns’ book is a very early printing, and I’ve written a lot of ‘MSFT’ code, both before and after .NET. I could still be an idiot, but I think most people place me somewhere above that.

First of all, with the multiple files/one file issue, if s is the average size of the files, and n is the number, as the ratio n/s rises, any measurement will be more about timing the OS’s opening of files instead of Ruby’s parsing of them. As to what values of n/s pertain to typical plugins, I don’t know.

.NET languages are essentially strongly typed compiled languages, while Ruby is an ‘untyped’ interpreted language. Ruby is also a ‘console language’, and using a ‘single instance’ of it in a UI based application like SU causes issues not seen when it’s a console app.

So, with SU’s embedded Ruby, one might look at Rubygems and Rails, as both are applications that often run for a long time. Both make use of an ‘autoload’ concept. Multiple files, load on demand.

I think some of the motivation for your ‘one file’ system was based on using gems/std-lib’s. One issue with them is that they may load a lot more files than needed for a particular task that they perform. That makes CI testing easier, as loading the main file also loads everything else. And since many are open source…

So, since one of the articles you cited was .NET with C# examples, maybe we should consider why .NET has so many assemblies/files instead of just one? Do you load all of them in your .NET/C# code? Why don’t C programs load every Windows dll, instead of only the ones they use?

“I believe 1 core thing Ruby does wrong: it makes 0 distinction between source code and runtime code.”

I’m having a hard time with that statement. Feel free to cite any articles/blogs, etc that share that opinion. In what way does the phrase ‘distinction between source code and runtime code’ apply to C#? How does it apply to an interpreted language like Ruby?

Normally, C# compiles several source files into a few runtime files. Ruby does the same in its VM. Where’s the difference?

“Poor code? Use modules for everything”

“All my extensions up till around 18 months ago used modules quite a lot for what would be singletons, such as data repositories, because that made them globally accessible. That was slow on boot. Because the entire repo would be created and stored in memory even if the user might never use the app.”

You seem to be conflating what code runs when it’s loaded by Ruby with whether that code is contained in a class or a module. Modules do not need to run any code when loaded. Most Ruby apps that use classes for singleton objects could be re-written with modules and perform exactly the same way. It’s just the norm to use classes…
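As a sketch of that point, a module can act as a globally accessible singleton without doing any expensive work when its file is loaded. MaterialRepo and its contents are hypothetical; the load counter exists only to show when the data is actually built.

```ruby
# Hypothetical data repository, for illustration only.
module MaterialRepo
  @load_count = 0 # trivial bookkeeping; no data is built at load time

  class << self
    attr_reader :load_count

    # Builds the data on first access, then returns the cached copy.
    def all
      @all ||= build
    end

    private

    # Stand-in for an expensive load (reading files, scanning a model, ...).
    def build
      @load_count += 1
      %w[oak pine steel].freeze
    end
  end
end

count_at_load = MaterialRepo.load_count # module is loaded, data is not built
first  = MaterialRepo.all               # built here
second = MaterialRepo.all               # cached, not rebuilt
puts [count_at_load, MaterialRepo.load_count, first.equal?(second)].inspect
```

The same lazy initialization could be written with a class and a single instance; where the state lives is a separate choice from when it is created.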

Your posts have bridged several topics. Summing up my thoughts:

I believe code should be loaded in an ‘on demand’ fashion.
Using some gems and/or std-libs may load a lot more code than is required. How best to address that is messy.
Combining multiple files into one may be helpful, but may also load a lot of code that is rarely used by the SU user.

One topic I haven’t seen addressed is what should object lifetime be? Even with autoload/load-on-demand, one still has the issue of object lifetime.

So a plugin loads a file that displays/controls/interacts with a user dialog box. After the user closes the dialog, should the dialog instance be destroyed?

If a user has several plugins installed, and opens several dialogs in the course of a day’s work, does that affect SU? Same issue with other objects, whether they’re part of the SU API or native to Ruby. I’m guessing not, but I don’t know…

kengey · July 30, 2019, 4:00pm

Do not take that personally; at least, I did not use that term to address you personally.

Regarding your book, I know you wrote it; you have mentioned that before and I have read it.

What got me into answering here is your title, which states we should stop loading all our code at once. I tried to give several examples showing that the mere fact of loading all code is perhaps not the real issue.

Does everything I write make sense or explain it correctly? Surely not.

MSP_Greg · July 30, 2019, 4:28pm

@kengey

I didn’t take it as a slight, I just recalled that I used it. Re the book, I did not write it, I mentioned it because I bought it long before it was considered a ‘standard read’ about design patterns and OOP programming.

You mentioned a design pattern, and the MSFT article about it is not one of their better articles. It mentions a valid idea, but the idea can easily conflict with the idea of encapsulation.

Also, use of a ‘pattern’ is often touted as a solution to problem x, but problem x may be language dependent. The pattern may be in use, but actually be solving a different issue.

An example would be multiple inheritance. If a Ruby module is included in a class, is it really multiple inheritance? If it’s directly included in several other classes, it is. If it’s only included in one, it’s simply adding methods that could be placed in the class…
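A small sketch of the ‘included in several classes’ case, with hypothetical names (Describable, Layer, Material) chosen only for the illustration:

```ruby
# Hypothetical mixin shared by two otherwise unrelated classes.
module Describable
  def describe
    "#{self.class.name}: #{label}"
  end
end

class Layer
  include Describable
  def label; 'Layer0'; end
end

class Material
  include Describable
  def label; 'Oak'; end
end

# The module sits in the ancestor chain of both classes,
# alongside each class's single superclass.
puts Layer.ancestors.include?(Describable)
puts Material.ancestors.include?(Describable)
puts Layer.new.describe
puts Material.new.describe
```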

kengey · July 30, 2019, 4:37pm

Then I misread that.

If a Ruby module is included, then it tightly couples that module to that class. If it is only used by that class, then why not write it directly in that class? If it is so that you can use it in other classes in the future, then again they are tightly coupled.

DanRathbun · July 30, 2019, 4:40pm

Agree. And many forget that a module is an instance of class Module.

It has been argued in the past, I think by Ken, that the difference is the data of state. Using a module, a coder often uses static data variables to hold state, whereas a class instance only holds the data whilst it exists.
It is up to the coder to ensure that they only hold state data when it is needed, either as an instance of some custom class, or an instance of a Ruby core class (Array, Hash, Struct, etc.)
If however, data is and needs to be kept during the life of the session for quick and repetitive access, then IMO there is no difference between wrapping it up in a custom singleton class instance or holding it in a submodule. (I’ve found it is easier just to use a hash and load and save them from/to JSON files.)

A good coder uses either (module or class) where appropriate.

kengey · July 30, 2019, 4:52pm

The most interesting thing I found out today was that the index in $LOAD_PATH which is used to find the requested files has a huge impact. Reversing the global array gave me x10 better loading times loading the same set of files. 1.6s vs 0.16s or something. Imagine 20s of extensions all loading x files.

DanRathbun · July 30, 2019, 4:57pm

We may not want to reverse the whole thing. And it depends upon what the require call is loading. If from a path relative to the “Plugins” then yes the 4th path in $: we may want to check first. But if loading from the standard library, then we want to check the 1st path in $: first.

But, it is not just $LOAD_PATH that slows require down. It is also the check of the $LOADED_FEATURES array. Both of them hold path strings and string comparison is known to be slow in Ruby.

The require doc says foremost:

“If the filename does not resolve to an absolute path, it will be searched for in the directories listed in $LOAD_PATH ( $: ).”

So a coder can skip the $LOAD_PATH iteration if they use absolute path arguments.

The same if they use require_relative.

Secondly, if they write their code so as to allow multiple loads and use absolute paths, they can use load instead of require, and skip the check for both $: and $". (This only works for plain .rb files, as Sketchup#load is just an alias for the Sketchup::require method.)
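A minimal sketch of the difference, using plain Ruby's require and load with an absolute path. The helper.rb file and the $helper_loads counter are hypothetical and exist only to count how often the file body runs.

```ruby
require 'tmpdir'

dir  = Dir.mktmpdir('abs_path_demo')
path = File.join(dir, 'helper.rb') # hypothetical file, referenced by absolute path
File.write(path, <<~'RUBY')
  # Counts how many times this file's body has been executed.
  $helper_loads = ($helper_loads || 0) + 1
RUBY

# Absolute path: no $LOAD_PATH search, but $LOADED_FEATURES is still consulted.
first_require  = require(path) # true, file executed
second_require = require(path) # false, already recorded as loaded

# load executes the file every time and consults neither $: nor $".
load path
load path

puts [first_require, second_require, $helper_loads].inspect
```

Because load re-executes the file on every call, this only suits files written to tolerate being loaded more than once.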
