Force encoding when using frozen string literals?

I’m looking into using frozen string literals as it appears to be the future and so on.

However, I’m having issues in my registrar when trying to force the encoding to get around that old Windows bug. If I understand it correctly __FILE__ uses correct UTF-8 encoding for its content, but the encoding property of the string is wrongly set. This is why we just want to change the encoding property without converting the string content.

The problem is that force_encoding isn’t allowed under frozen_string_literal. The following file gives me a FrozenError and won’t load.

Is there any way around this?

# frozen_string_literal: true

require "extensions.rb"

# Eneroth Extensions
module Eneroth
  # Scaled Tape Measure
  module ScaledTapeMeasure
    path = __FILE__
    path.force_encoding("UTF-8") if path.respond_to?(:force_encoding)

    # Identifier for this extension.
    PLUGIN_ID = File.basename(path, ".*")

    # Root directory of this extension.
    PLUGIN_ROOT = File.join(File.dirname(path), PLUGIN_ID)

    # Extension object for this extension.
    EXTENSION = SketchupExtension.new(
      "Eneroth Scaled Tape Measure",
      File.join(PLUGIN_ROOT, "main")
    )

    EXTENSION.creator     = "Eneroth"
    EXTENSION.description = "Measure model with respect to a custom scale."
    EXTENSION.version     = "1.0.0"
    EXTENSION.copyright   = "2019, #{EXTENSION.creator}"
    Sketchup.register_extension(EXTENSION, true)
  end
end

I want a force_encoding!! (not bang) method!

Usually, we make a copy of the string __FILE__ with the #dup method.
Then you can force the encoding.

Do you know if dup works on frozen string literals by design, or by accident because it doesn’t copy the frozen state? If Ruby moves to frozen string literals, it would seem logical for that to apply to all strings without exceptions.

    path = __FILE__.encode('UTF-8','UTF-8') if __FILE__.respond_to?(:encode)

… or …

    path = __FILE__.encode('UTF-8','UTF-8') if defined?(Encoding)

Would either of these work?

By design. dup copies the string and returns one that is not frozen.
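A minimal sketch of that behaviour, using a manually frozen string as a stand-in for a frozen literal such as __FILE__. The helper name retaggable? is made up for this illustration:

```ruby
# Hypothetical helper: true if force_encoding succeeds on the string,
# false if the string is frozen. FrozenError (Ruby 2.5+) is a subclass
# of RuntimeError, which is what older Rubies raise here.
def retaggable?(str)
  str.force_encoding("UTF-8")
  true
rescue RuntimeError
  false
end

frozen = "plugin.rb".freeze      # stands in for a frozen string literal

puts frozen.frozen?              # true
puts frozen.dup.frozen?          # false: dup does not copy the frozen state
puts frozen.clone.frozen?        # true: clone does copy it
puts retaggable?(frozen)         # false
puts retaggable?(frozen.dup)     # true
```

So dup, not clone, is the call to use when a mutable copy is needed.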

JFYI, Windows Rubies 2.4 and later have had fixes re encoding issues, and I think Trimble also changed some things. Monkeying around with encoding may not be needed for SU 2018 and later.

For instance, the following all return UTF-8 for encoding in SU 2018 & 2019:

__FILE__
$LOAD_PATH.last
$LOADED_FEATURES.last

This is what I usually do with __FILE__

	f = __FILE__ 
	if defined?(Encoding) 
		if f.encoding.inspect =~ /Windows/i
			file = (f + '').force_encoding('iso-8859-1').encode("UTF-8")
		elsif f.encoding.inspect !~ /UTF/i
			file = f.dup.force_encoding("UTF-8")
		else	
			file = f + ""
		end
	else
		file = f + ""
	end	
	file = file.gsub(/\\/, "/")

I’ve got some folders for Ruby encoding testing.

In SU 2018 the following did not work from the console, but it worked from SU 2019:

load "C:/Greg/Ruby киї/encoding.rb"

To my understanding encode changes the binary content of the string (converts it) while force encoding only changes the encoding property. If I’m correct __FILE__ does return a valid UTF-8 string, only with the wrong encoding property, which causes Ruby to try to convert it later and then messes it up.
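The difference can be seen by comparing bytes. This is only a sketch with an arbitrary sample string, not the actual __FILE__ value:

```ruby
utf8 = "caf\u00E9"   # "café" in UTF-8; the é is the two bytes 0xC3 0xA9

# encode converts: the bytes are rewritten for the target encoding.
converted = utf8.encode("ISO-8859-1")

# force_encoding only retags: the bytes stay exactly as they were.
retagged = utf8.dup.force_encoding("ISO-8859-1")

p utf8.bytes        # [99, 97, 102, 195, 169]
p converted.bytes   # [99, 97, 102, 233]
p retagged.bytes    # [99, 97, 102, 195, 169]
p retagged.length   # 5, since each byte now counts as one character
```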

I think encoding in the console is a separate issue.

That means dup is designed to return an unfrozen copy, not that Ruby is supposed to support individual mutable strings in an environment where strings are otherwise immutable. If such an edge case is to be allowed, I would imagine the memory handling would be quite a bit more complex.

I know this. But I’m asking whether telling the #encode method to treat the source encoding as ‘UTF-8’ regardless, and to create a new ‘UTF-8’ encoded string, will in effect just clone it with the proper encoding property.

… also the test to “fix” encoding could have an extra conditional …

&& path.encoding != Encoding::UTF_8

Object.new.frozen?             #=> false
Object.new.freeze.frozen?      #=> true
Object.new.freeze.dup.frozen?  #=> false

Importantly, one cannot ‘thaw’ an existing object.

Why all that conditional logic?

I’ve been using these two lines for my extensions: I set the result as a constant in the extension namespace and then build paths from that constant. I’ve never had an issue on any platform or SU version.

And seeing how I don’t try to support Ruby 1.x any more I can omit the respond_to as well:

  file = __FILE__.dup
  file.force_encoding('UTF-8')

If I really wanted a one-liner I can use tap:

  file = __FILE__.dup.tap { |f| f.force_encoding('UTF-8') }

#encode doesn’t work properly when the string is already tagged with the wrong encoding. If there are byte sequences in the string that don’t fit the wrong encoding it’s tagged with, it’ll fail or produce junk.
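A small sketch of both failure modes, using the UTF-8 bytes of "киї" tagged with two deliberately wrong encodings. The helper name try_encode and the choice of Windows-1252 and US-ASCII as the wrong tags are just for illustration:

```ruby
# Hypothetical helper: returns the transcoded string, or the exception
# class if the conversion fails.
def try_encode(str)
  str.encode("UTF-8")
rescue EncodingError => e
  e.class
end

original = "\u043A\u0438\u0457"   # "киї", valid UTF-8

# Same bytes, wrong tag. Every byte is a valid Windows-1252 character,
# so encode "succeeds" and converts each byte separately: junk.
as_cp1252 = original.dup.force_encoding("Windows-1252")
junk = try_encode(as_cp1252)
puts junk == original             # false

# Same bytes again, tagged US-ASCII. The high bytes are invalid there,
# so encode raises instead.
as_ascii = original.dup.force_encoding("US-ASCII")
p try_encode(as_ascii)            # Encoding::InvalidByteSequenceError

# Retagging without converting recovers the original text.
puts as_cp1252.dup.force_encoding("UTF-8") == original   # true
```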

I’ve also seen this happen with strings created from ‘backtick’ operations. Nothing to do with SU…

This code is actually used for the ENV paths (that is, ENV['LOCALAPPDATA'] and ENV['APPDATA']). It was built up progressively from user feedback across SketchUp versions 6 to 2019, which I still support in many of my plugins. In particular, problems were detected with Turkish and Russian, hence the complication of the intermediate ISO encoding.

For __FILE__ I used simpler code, just forcing the encoding to UTF-8, but I wonder whether I should also handle the Windows encoding cases.

Anyway, SketchUp does nothing to help… whether in the Ruby runtime, or by providing methods to access a valid location in which to store data.

This should also work as #force_encoding always returns the receiver (created by #dup) …

file = __FILE__.dup.force_encoding('UTF-8')
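Pulling the suggestions in this thread together, one possible sketch is a small helper that never mutates its argument, so it is safe under frozen_string_literal. The name utf8_tagged_copy is made up, and it assumes, as discussed above, that the bytes are already valid UTF-8 and only the tag is wrong:

```ruby
# Hypothetical helper: returns a string tagged as UTF-8 without ever
# mutating the (possibly frozen) argument.
def utf8_tagged_copy(path)
  return path unless defined?(Encoding)             # Ruby 1.8: no encodings
  return path if path.encoding == Encoding::UTF_8   # nothing to fix
  path.dup.force_encoding(Encoding::UTF_8)          # retag an unfrozen copy
end

file = utf8_tagged_copy(__FILE__)
puts file.encoding   # UTF-8
```

Note that when the input is already UTF-8 the same object is returned, frozen or not, which is fine as long as the caller only reads from it.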

Another one-liner …

file = (defined?(Encoding) ? __FILE__.dup.force_encoding('UTF-8') : __FILE__)

I think we tried to configure Ruby to provide the correct encoding for __FILE__ and ENV in the past, without success, because of Ruby issues (mainly with Windows builds). I’m not sure if things have changed since then. It could be worth having another look. (But the last time I checked the Ruby bug reports, they were hesitant to apply a fix due to compatibility concerns.)