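Module Shellwords
Manipulates strings like the UNIX Bourne shell
This module manipulates strings according to the word parsing rules of the UNIX Bourne shell.
The shellwords() function was originally a port of shellwords.pl, but has been modified to conform to the Shell & Utilities volume of IEEE Std 1003.1-2008, 2016 Edition.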
Usage
You can use Shellwords to parse a string into a Bourne shell friendly Array.
  require 'shellwords'

  argv = Shellwords.split('three blind "mice"')
  argv #=> ["three", "blind", "mice"]
Once you have required Shellwords, you can also call String#shellsplit, an alias for split.
  argv = "see how they run".shellsplit
  argv #=> ["see", "how", "they", "run"]
Both methods treat quotes as special characters, so an unmatched quote raises an ArgumentError.
  argv = "they all ran after the farmer's wife".shellsplit
       #=> ArgumentError: Unmatched quote: ...
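When the input comes from a user or a file, the error usually has to be handled rather than allowed to propagate. The following is a minimal sketch of one way to do that; the helper name safe_shellsplit is hypothetical and not part of Shellwords.

```ruby
require 'shellwords'

# Hypothetical helper (not part of Shellwords): returns the parsed words,
# or nil when the line contains an unmatched quote.
def safe_shellsplit(line)
  line.shellsplit
rescue ArgumentError
  nil
end

safe_shellsplit(%q{the farmer's wife})    # unmatched single quote, so nil
safe_shellsplit(%q{the "farmer's" wife})  #=> ["the", "farmer's", "wife"]
```

Placing the apostrophe inside double quotes, as in the second call, is one way to make such a line parse.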
Shellwords also provides methods that do the opposite. Shellwords.escape, or its alias String#shellescape, escapes shell metacharacters in a string so that it can be used in a command line.
  filename = "special's.txt"

  system("cat -- #{filename.shellescape}")
  # runs "cat -- special\\'s.txt"
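Note the '--'. Without it, cat(1) treats the following argument as a command line option if it starts with '-'. Shellwords.escape is guaranteed to convert a string into a form that a Bourne shell parses back to the original string, but it is the programmer's responsibility to make sure that passing an arbitrary argument to a command does no harm.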
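The round-trip guarantee can be checked without starting a shell by feeding the escaped form back to Shellwords.split, which follows the same quoting rules. This is only a sketch; the helper name round_trips? is hypothetical.

```ruby
require 'shellwords'

# Hypothetical helper: true when escaping and then splitting returns the
# original string as a single word.
def round_trips?(str)
  Shellwords.split(Shellwords.escape(str)) == [str]
end

samples = ["special's.txt", "-rf", "two words", "a\nb", "$HOME; rm *", ""]
samples.all? { |s| round_trips?(s) }  #=> true
```

Note that "-rf" round-trips unchanged: escaping protects the argument from the shell, not from the command that receives it, which is why the '--' separator is still needed.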
Shellwords also comes with a core extension for Array, Array#shelljoin.
  dir = "Funny GIFs"
  argv = %W[ls -lta -- #{dir}]
  system(argv.shelljoin + " | less")
  # runs "ls -lta -- Funny\\ GIFs | less"
You can use this method to build a complete command line out of an array of arguments.
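As an illustration of building a longer command line, the sketch below joins two argument arrays into a pipeline string. The helper name pipeline is hypothetical; only the "|" between the two halves is left unescaped.

```ruby
require 'shellwords'

# Hypothetical helper: escapes each argument array separately and connects
# the two commands with an unescaped pipe.
def pipeline(left, right)
  "#{left.shelljoin} | #{right.shelljoin}"
end

pipeline(%w[ls -lta --] + ["Funny GIFs"], %w[head -n 5])
#=> "ls -lta -- Funny\\ GIFs | head -n 5"
```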
Authors
- 青山 和久
- 武舎 昭紀 <[email protected]>
Contact
- 武舎 昭紀 <[email protected]> (current maintainer)
Constants
- VERSION
The version number string.
Public Class Methods
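shellescape(str)
Escapes a string so that it can be safely used in a Bourne shell command line. str can be a non-string object that responds to to_s.
str must not contain NUL characters because of the nature of the exec system call.
Note that the resulting string should be used unquoted; it is not intended for use inside double quotes or single quotes.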
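The effect of wrapping an escaped string in quotes can be seen without running a shell. The sketch below uses Shellwords.split as a stand-in for the shell's own word parsing, which is an assumption made for illustration; the helper name echoed_argument is hypothetical.

```ruby
require 'shellwords'

# Hypothetical helper: parses a command line with Shellwords.split and
# returns the last word, i.e. the argument the command would receive.
def echoed_argument(command_line)
  Shellwords.split(command_line).last
end

escaped = Shellwords.escape("It's here")

echoed_argument("echo #{escaped}")      #=> "It's here"
echoed_argument(%Q{echo "#{escaped}"})  #=> "It\\'s\\ here" (backslashes kept)

# Non-string objects are converted with to_s first.
Shellwords.escape(42)                   #=> "42"
```

Inside double quotes the backslashes before the apostrophe and the space are no longer escape characters, so they reach the command as literal text.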
  argv = Shellwords.escape("It's better to give than to receive")
  argv #=> "It\\'s\\ better\\ to\\ give\\ than\\ to\\ receive"
String#shellescape is a shorthand for this function.
  argv = "It's better to give than to receive".shellescape
  argv #=> "It\\'s\\ better\\ to\\ give\\ than\\ to\\ receive"

  # Search files in lib for method definitions
  pattern = "^[ \t]*def "
  IO.popen("grep -Ern -e #{pattern.shellescape} lib") { |grep|
    grep.each_line { |line|
      file, lineno, matched_line = line.split(':', 3)
      # ...
    }
  }
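It is the caller's responsibility to encode the string in the right encoding for the shell environment where the string is used.
Multibyte characters are treated as multibyte characters, not as bytes.
Returns an empty quoted String if str has a length of zero.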
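The two edge cases above can be illustrated briefly. This sketch assumes a UTF-8 source file and uses made-up argument values.

```ruby
require 'shellwords'

# An empty string becomes a pair of quotes so that the shell still sees an
# (empty) argument instead of skipping it.
Shellwords.escape("")   #=> "''"

# A multibyte character gets one backslash in front of the whole character.
Shellwords.escape("é")  #=> "\\é"

# The empty argument survives a join/split round trip.
args = ["printf", "%s|%s", "", "x"]
line = Shellwords.join(args)    #=> "printf \\%s\\|\\%s '' x"
Shellwords.split(line) == args  #=> true
```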
  # File shellwords.rb, line 158
  def shellescape(str)
    str = str.to_s

    # An empty argument will be skipped, so return empty quotes.
    return "''".dup if str.empty?

    # Shellwords cannot contain NUL characters.
    raise ArgumentError, "NUL character" if str.index("\0")

    str = str.dup

    # Treat multibyte characters as is. It is the caller's responsibility
    # to encode the string in the right encoding for the shell
    # environment.
    str.gsub!(/[^A-Za-z0-9_\-.,:+\/@\n]/, "\\\\\\&")

    # A LF cannot be escaped with a backslash because a backslash + LF
    # combo is regarded as a line continuation and simply ignored.
    str.gsub!(/\n/, "'\n'")

    return str
  end
shelljoin(array)
Builds a command line string from an argument list, array.
All elements are joined into a single string with fields separated by a space. Each element is stringified using to_s and escaped for the Bourne shell. See also Shellwords.shellescape.
  ary = ["There's", "a", "time", "and", "place", "for", "everything"]
  argv = Shellwords.join(ary)
  argv #=> "There\\'s a time and place for everything"
Array#shelljoin is a shortcut for this function.
  ary = ["Don't", "rock", "the", "boat"]
  argv = ary.shelljoin
  argv #=> "Don\\'t rock the boat"
You can also mix non-string objects into the elements, as allowed in Array#join.
  output = `#{['ps', '-p', $$].shelljoin}`
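A further sketch of mixing non-string elements, here an Integer and a Symbol. The helper name kill_command and its signal parameter are hypothetical; the command string is only built, not executed.

```ruby
require 'shellwords'

# Hypothetical helper: builds a kill(1) command line. Each element is
# converted with to_s before it is escaped.
def kill_command(pid, signal: "TERM")
  ["kill", "-s", signal, pid].shelljoin
end

kill_command(1234)                 #=> "kill -s TERM 1234"
kill_command(1234, signal: :KILL)  #=> "kill -s KILL 1234"
```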
  # File shellwords.rb, line 208
  def shelljoin(array)
    array.map { |arg| shellescape(arg) }.join(' ')
  end
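shellsplit(line)
Splits a string into an array of tokens in the same way the UNIX Bourne shell does.
  argv = Shellwords.split('here are "two words"')
  argv #=> ["here", "are", "two words"]
line must not contain NUL characters because of the nature of the exec system call.
Note, however, that this is not a command line parser. Shell metacharacters other than single quotes, double quotes, and the backslash are not treated as metacharacters.
  argv = Shellwords.split('ruby my_prog.rb | less')
  argv #=> ["ruby", "my_prog.rb", "|", "less"]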
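One consequence is that the result cannot tell a bare operator from a quoted one, and variables and globs stay unexpanded. The sketch below illustrates this; the constant OPERATORS and the helper operator_words are hypothetical names chosen for the example.

```ruby
require 'shellwords'

# Hypothetical list of words that a real shell would treat as operators.
OPERATORS = %w[| || && ; & > >> <].freeze

# Hypothetical helper: returns the words that look like shell operators
# after splitting. Shellwords passes them through as ordinary text.
def operator_words(line)
  Shellwords.split(line).select { |word| OPERATORS.include?(word) }
end

operator_words('ruby my_prog.rb | less')    #=> ["|"]
operator_words('ruby my_prog.rb "|" less')  #=> ["|"] (the quoting is gone)
Shellwords.split('echo $HOME *.rb')         #=> ["echo", "$HOME", "*.rb"]
```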
String#shellsplit is a shortcut for this function.
  argv = 'here are "two words"'.shellsplit
  argv #=> ["here", "are", "two words"]
  # File shellwords.rb, line 90
  def shellsplit(line)
    words = []
    field = String.new
    line.scan(/\G\s*(?>([^\0\s\\\'\"]+)|'([^\0\']*)'|"((?:[^\0\"\\]|\\[^\0])*)"|(\\[^\0]?)|(\S))(\s|\z)?/m) do |word, sq, dq, esc, garbage, sep|
      if garbage
        b = $~.begin(0)
        line = $~[0]
        line = "..." + line if b > 0
        raise ArgumentError, "#{garbage == "\0" ? 'Nul character' : 'Unmatched quote'} at #{b}: #{line}"
      end
      # 2.2.3 Double-Quotes:
      #
      #   The <backslash> shall retain its special meaning as an
      #   escape character only when followed by one of the following
      #   characters when considered special:
      #
      #   $ ` " \ <newline>
      field << (word || sq || (dq && dq.gsub(/\\([$`"\\\n])/, '\\1')) || esc.gsub(/\\(.)/, '\\1'))
      if sep
        words << field
        field = String.new
      end
    end
    words
  end
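The backslash handling in the source above differs inside and outside double quotes. The following sketch shows the difference with a few made-up inputs; Ruby single-quoted literals are used so that each backslash reaches Shellwords.split unchanged.

```ruby
require 'shellwords'

# Inside double quotes, a backslash is removed only before $ ` " \ or newline.
Shellwords.split('"a\$b"')  #=> ["a$b"]
Shellwords.split('"a\nb"')  #=> ["a\\nb"] (the backslash and the "n" are kept)

# Outside quotes, a backslash escapes whichever character follows it.
Shellwords.split('a\ b')    #=> ["a b"]
Shellwords.split('a\nb')    #=> ["anb"]
```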
Private Instance Methods
shellescape, shelljoin, and shellsplit are also defined as private instance methods. Their documentation and source are identical to the entries under Public Class Methods above.