module Gem::Text

A collection of text-wrangling methods.

Public Instance Methods

clean_text(text)

Removes any non-printable characters, replacing each with a period, so the text is suitable for printing.

# File rubygems/text.rb, line 10
def clean_text(text)
  text.gsub(/[\000-\b\v-\f\016-\037\177]/, ".")
end
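As a minimal usage sketch: the methods in Gem::Text are instance methods, so the module has to be mixed into an object before they can be called. The TextDemo class below is a hypothetical wrapper for illustration and is not part of RubyGems. Going by the character class in the listing above, tab, newline, and carriage return are left untouched, while the other ASCII control characters and DEL are each replaced with a period.

```ruby
require "rubygems/text"

# Hypothetical wrapper class; Gem::Text only defines instance methods.
class TextDemo
  include Gem::Text
end

demo = TextDemo.new

# \a (bell, 0x07) and \e (escape, 0x1B) are replaced; \t (0x09) is kept.
raw = "bell:\a esc:\e[0m tab:\t end"
puts demo.clean_text(raw).inspect
# prints "bell:. esc:.[0m tab:\t end"
```

Note that only the control byte itself is replaced, so the rest of a terminal escape sequence (here `[0m`) stays in the output.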
format_text(text, wrap, indent=0)

Wraps the text to wrap characters and optionally indents it by indent characters.

# File rubygems/text.rb, line 24
def format_text(text, wrap, indent=0)
  result = []
  work = clean_text(text)

  while work.length > wrap do
    if work =~ /^(.{0,#{wrap}})[ \n]/
      result << $1.rstrip
      work.slice!(0, $&.length)
    else
      result << work.slice!(0, wrap)
    end
  end

  result << work if work.length.nonzero?
  result.join("\n").gsub(/^/, " " * indent)
end
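The wrapping behaviour can be illustrated with a short sketch. TextDemo is again a hypothetical wrapper class, and the sample sentence and widths are arbitrary. Each pass of the loop in the listing above cuts the text at the last space or newline found within the first wrap + 1 characters, falling back to a hard cut at wrap characters when there is none.

```ruby
require "rubygems/text"

# Hypothetical wrapper class used only to call the mixin's methods.
class TextDemo
  include Gem::Text
end

demo = TextDemo.new

sentence = "The quick brown fox jumps over the lazy dog"

# Wrap at 20 characters and indent every resulting line by 2 spaces.
puts demo.format_text(sentence, 20, 2)
# prints:
#   The quick brown fox
#   jumps over the lazy
#   dog
```

Because the input first passes through clean_text, control characters in the source text appear as periods in the wrapped output.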
levenshtein_distance(str1, str2)

Returns a value representing the "cost" of transforming str1 into str2. Vendored version of DidYouMean::Levenshtein.distance from the ruby/did_you_mean gem @ 1.4.0: github.com/ruby/did_you_mean/blob/2ddf39b874808685965dbc47d344cf6c7651807c/lib/did_you_mean/levenshtein.rb#L7-L37

# File rubygems/text.rb, line 54
def levenshtein_distance(str1, str2)
  n = str1.length
  m = str2.length
  return m if n.zero?
  return n if m.zero?

  d = (0..m).to_a
  x = nil

  # to avoid duplicating an enumerable object, create it outside of the loop
  str2_codepoints = str2.codepoints

  str1.each_codepoint.with_index(1) do |char1, i|
    j = 0
    while j < m
      cost = char1 == str2_codepoints[j] ? 0 : 1
      x = min3(
        d[j + 1] + 1, # insertion
        i + 1,      # deletion
        d[j] + cost # substitution
      )
      d[j] = i
      i = x

      j += 1
    end
    d[m] = x
  end

  x
end
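A brief sketch of what the returned "cost" looks like in practice: it is the number of single-character insertions, deletions, and substitutions needed to turn one string into the other. TextDemo is a hypothetical wrapper class, and the candidate list in the second half is invented for illustration; it is not how RubyGems itself picks suggestions.

```ruby
require "rubygems/text"

# Hypothetical wrapper class used only to call the mixin's methods.
class TextDemo
  include Gem::Text
end

demo = TextDemo.new

# kitten -> sitten -> sittin -> sitting: two substitutions and one insertion.
puts demo.levenshtein_distance("kitten", "sitting") # prints 3
puts demo.levenshtein_distance("gem", "gem")        # prints 0
puts demo.levenshtein_distance("", "abc")           # prints 3

# Illustrative use: pick the closest candidate to a mistyped word.
candidates = %w[install uninstall update]
typo = "instal"
puts candidates.min_by { |c| demo.levenshtein_distance(typo, c) }
# prints install
```

The listing above keeps only a single row of the distance table (the array d), so memory use grows with the length of str2 rather than with the product of both lengths.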
truncate_text(text, description, max_length = 100_000)
# File rubygems/text.rb, line 14
def truncate_text(text, description, max_length = 100_000)
  raise ArgumentError, "max_length must be positive" unless max_length > 0
  return text if text.size <= max_length
  "Truncating #{description} to #{max_length.to_s.reverse.gsub(/...(?=.)/,'\&,').reverse} characters:\n" + text[0, max_length]
end
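The source gives no description for truncate_text. Reading the listing above, it appears to return text unchanged when it fits within max_length, and otherwise to return a notice line followed by the first max_length characters, with the limit formatted using thousands separators. The sketch below exercises that reading; TextDemo is a hypothetical wrapper class and the sample strings are arbitrary.

```ruby
require "rubygems/text"

# Hypothetical wrapper class used only to call the mixin's methods.
class TextDemo
  include Gem::Text
end

demo = TextDemo.new

# Text within the limit comes back unchanged.
puts demo.truncate_text("short", "summary", 10)
# prints short

# Longer text is cut to max_length and prefixed with a notice line.
long_text = "x" * 1_500
result = demo.truncate_text(long_text, "log output", 1_000)
puts result.lines.first
# prints Truncating log output to 1,000 characters:
puts result.lines.last.length
# prints 1000

# A non-positive limit raises ArgumentError.
begin
  demo.truncate_text("anything", "summary", 0)
rescue ArgumentError => e
  puts e.message
  # prints max_length must be positive
end
```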