myvars = scanf(myformat)
myvars = anyString.scanf(myformat)
myvars = anyIO.scanf(myformat)
scanf always returns an array of matches ("conversions"):
str = "123 -456"
n1, n2 = str.scanf("%d %d") # => 123, -456
array = str.scanf("%d %d") # => [123, -456]
Use the "var, =" syntax to store only the first conversion and discard the rest:
first, = str.scanf("%d %d") # => 123
Any specifier can include a maximum field width. For a field width of n, matching stops when n characters have been matched, or when a character causes the match to fail, whichever comes first.
str = "123456"
str.scanf("%4d %2d") # => [1234, 56]
str = "123X45"
str.scanf("%6d") # => [123]
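For readers who think in regular expressions, the field-width rule above can be approximated in core Ruby. This is only a rough sketch of the matching behaviour, not how the scanf library is implemented; the helper name width_limited_int is hypothetical and exists purely for illustration.

```ruby
# Rough regex analogue of a width-limited %d (hypothetical helper):
# skip leading whitespace, allow an optional sign, then take at most
# `width` digits. Matching stops early at the first non-digit.
def width_limited_int(input, width)
  m = input.match(/\A\s*([-+]?\d{1,#{width}})/)
  m && m[1].to_i
end

p width_limited_int("123456", 4) # => 1234
p width_limited_int("123X45", 6) # => 123
```

In both calls the match ends at whichever limit is reached first: the width in the first case, the invalid character "X" in the second.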
The string specifier %s matches only non-whitespace, and terminates on finding whitespace (even if a width has been specified).
str = " Alan Turing "
str.scanf("%s %s") # => ["Alan", "Turing"]
str.scanf("%8s %8s") # => ["Alan", "Turing"]
The character specifier %c ignores leading whitespace ONLY if preceded by whitespace in the format string. Note that %c returns strings (even if it matches digits).
str = "123 456"
str.scanf("%d%c%d") # => [123, " ", 456]
str.scanf("%d %c%d") # => [123, "4", 56]
%c with a width specifier matches internal whitespace. It ignores leading whitespace only if preceded by whitespace in the format string.
str = "42 is the key"
str.scanf("%d%20c") # => [42, " is the key"]
str.scanf("%d %20c") # => [42, "is the key"]
Newlines count as whitespace:
str = "12 345\n 567 \n678"
str.scanf("%d %d %d %d") # => [12, 345, 567, 678]
Literal non-whitespace characters in the format string match themselves in the input string.
book = "The Red and the Black"
book.scanf("The %s and the %s") # => ["Red", "Black"]
Whitespace in the input between sequences of non-whitespace literal characters is ignored:
str = "23 hrs, 42 min"
str.scanf("%d hrs, %d mins") # => [23, 42]
str = "23 hrs, 42 min"
str.scanf("%dhrs,%dmins") # => [23, 42]
Whitespace in the format string is ignored in matching literal characters:
str = "23hrs,42mins"
hr, min = str.scanf("%d h r s ,%dm i n s") # => 23, 42
When the "assignment suppression" flag appears (the character '*' following the '%'), a match is required as usual but no corresponding value is returned.
str = "James K. Polk"
str.scanf("%s %*s %s") # => ["James", "Polk"]
str = "123 234 345 456 567"
str.scanf("%d %*d %*d %*d %d") # => [123, 567]
%o converts an octal string with or without the leading zero.
str = "345 0345"
str.scanf("%o %o") # => [229, 229]
An invalid octal digit terminates the match, even when a field width is present.
str = "2378"
str.scanf("%5o%c") # => [159, "8"] (i.e. [0237, "8"])
The %x specifier (which can also be written as "%X") converts hexadecimal integers in the same way. The leading 0x (or, equivalently, 0X) in the input string is optional. Hex digits are case-insensitive.
str = "beef 0xbeef BEEF 0Xbeef 0xBEef"
str.scanf("%x%x%x%x%x") # => [48879, 48879, 48879, 48879, 48879]
An invalid hexadecimal digit terminates the match, even when a field width is present.
str = "beefsteak"
str.scanf("%6x%s") # => [48879, "steak"]
The %i specifier will convert an integer from any of the common bases. The leading 0 or 0x must be specified for octal and hex numbers; otherwise, the number will be assumed to be decimal.
str = "345 0345 0x345"
str.scanf("%i %i %i") # => [345, 229, 837]
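The prefix convention described for %i is the same one Ruby's core Kernel#Integer applies when it is given a string and no explicit base. The sketch below uses only core Ruby to reproduce the three values; the helper name detect_base is hypothetical, and this is an analogy rather than a description of how scanf performs the conversion.

```ruby
# Kernel#Integer with no base argument honours radix prefixes:
# a leading 0 means octal, a leading 0x means hexadecimal,
# and anything else is read as decimal. (Hypothetical helper name.)
def detect_base(token)
  Integer(token)
end

p %w[345 0345 0x345].map { |t| detect_base(t) } # => [345, 229, 837]
```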
scanf for Ruby allows floats written in any format which is meaningful to Ruby's String#to_f method. On conversion, the float will be represented in a standardized way.
str = "1.2, +1.2, 1.2e34, 1.2e+34, 1.2e-34"
str.scanf("%f,%f,%f,%f,%f") # => [1.2, 1.2, 1.2e+34, 1.2e+34, 1.2e-34]
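Since the accepted float formats are defined in terms of String#to_f, it can help to see what to_f does with each token on its own. The following uses only core Ruby; the helper name tokens_to_floats is hypothetical.

```ruby
# Convert each token with String#to_f, the method the text above refers to.
# (Hypothetical helper name, for illustration only.)
def tokens_to_floats(tokens)
  tokens.map(&:to_f)
end

p tokens_to_floats(["1.2", "+1.2", "1.2e34", "1.2e+34", "1.2e-34"])
# => [1.2, 1.2, 1.2e+34, 1.2e+34, 1.2e-34]
```

The printed form shows the standardized representation: an explicit sign on the input is dropped for positive values, and exponents are always printed with a sign.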
Character classes are similar to those in regular expressions. However, unless a field width is provided, scanf will by default match any number of consecutive characters matching the class (similar to the '+' quantifier in a regular expression). The conversion always results in a string, even if the characters are digits. Leading whitespace is NOT ignored.
str = "TX78754"
str.scanf("%[A-Z]%[0-9]") # => ["TX", "78754"]
str = "(512) 459-2222... pizza"
phone, = str.scanf("%[()1-9 -]") # => "(512) 459-2222"
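The comparison with the '+' quantifier can be made concrete with an ordinary Ruby regular expression. This is a rough analogue of the format "%[A-Z]%[0-9]" written in core Ruby, not the library's implementation; the helper name class_pair_sketch is hypothetical.

```ruby
# Rough regex analogue of "%[A-Z]%[0-9]" (hypothetical helper):
# each class matches one or more characters, anchored at the start of
# the input, and no leading whitespace is skipped. Results stay strings.
def class_pair_sketch(input)
  m = input.match(/\A([A-Z]+)([0-9]+)/)
  m ? m.captures : []
end

p class_pair_sketch("TX78754")  # => ["TX", "78754"]
p class_pair_sketch(" TX78754") # => []
```

The second call returns an empty array because the leading space is not part of the class, which mirrors the rule that character classes do not ignore leading whitespace.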
When called with a block, scanf repeatedly scans the input, yielding a new array of results every time it matches the format string. This is a convenient way of doing a series of scanf calls on a single string (or stream).
str = <<-EOS
Beethoven 1770
Bach 1685
Handel 1685
Scarlatti 1685
Brahms 1833
EOS
str.scanf("%s%d") { |name, year| puts "#{name} was born in #{year}." }
# => Beethoven was born in 1770.
#    Bach was born in 1685.
#    Handel was born in 1685.
#    Scarlatti was born in 1685.
#    Brahms was born in 1833.
names = str.scanf("%s%d") { |name, year| name.upcase }
# => ["BEETHOVEN", "BACH", "HANDEL", "SCARLATTI", "BRAHMS"]
You can do the same thing with an IO stream:
fh = File.open("somefile", "rb") # "rb" is for Windows's benefit
fh.scanf("%s%d") { |str, num| "#{str} goes with #{num}" }