Path: doc/EXAMPLES
Modified: Mon Sep 02 09:34:48 EDT 2002

scanf for Ruby, v.1.1.0

EXAMPLES

Variants of scanf for Ruby

Return values

scanf always returns an array of matches ("conversions"):

    str = "123 -456"
    n1, n2 = str.scanf("%d %d")       #  => 123, -456
    array  = str.scanf("%d %d")       #  => [123, -456]

Use the "var," syntax (a trailing comma) to store only the first conversion and discard the rest:

    first, = str.scanf("%d %d")       #  => 123
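
The trailing comma is ordinary Ruby multiple assignment rather than a scanf feature. A minimal sketch, using a literal array as a stand-in for the array that scanf returns (the variable names are illustrative only):

```ruby
# Stand-in for the array returned by "123 -456".scanf("%d %d").
conversions = [123, -456]

n1, n2 = conversions         # n1 is 123, n2 is -456
first, = conversions         # trailing comma: first is 123, the rest is discarded
head, *rest = conversions    # a splat keeps the remainder instead: rest is [-456]

p [n1, n2, first, head, rest]
```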

Width specifiers

Any specifier can include a maximum field width. For a field width of n, matching stops when n characters have been matched, or when a character causes the match to fail, whichever comes first.

    str = "123456"
    str.scanf("%4d %2d")              #  => [1234, 56]
    str = "123X45"
    str.scanf("%6d")                  #  => [123]
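
The stopping rule can be pictured as a bounded regular-expression quantifier. The sketch below is only an analogy written with Ruby's core Regexp class; it is not how scanf is implemented, and the helper name take_digits is hypothetical:

```ruby
# Hypothetical helper: read at most `width` leading decimal digits from `input`.
# Returns [integer_or_nil, remaining_input].
def take_digits(input, width)
  m = /\A\d{1,#{width}}/.match(input)
  return [nil, input] unless m
  [m[0].to_i, m.post_match]
end

first, rest = take_digits("123456", 4)   # width reached after "1234"
second, _   = take_digits(rest, 2)       # the remaining "56"
p [first, second]                        # => [1234, 56]

p take_digits("123X45", 6)               # => [123, "X45"]; "X" ends the match early
```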

Whitespace handling

The string specifier %s matches only non-whitespace, and terminates on finding whitespace (even if a width has been specified).

    str = "  Alan Turing  "
    str.scanf("%s %s")                #  => ["Alan", "Turing"]
    str.scanf("%8s %8s")              #  => ["Alan", "Turing"]

The character specifier %c ignores leading whitespace ONLY if preceded by whitespace in the format string. Note that %c returns strings (even if it matches digits).

    str = "123 456"
    str.scanf("%d%c%d")               #  => [123, " ", 456]
    str.scanf("%d %c%d")              #  => [123, "4", 56]

%c with a width specifier matches internal whitespace. It ignores leading whitespace only if preceded by whitespace in the format string.

    str = "42   is the key"
    str.scanf("%d%20c")               #  => [42, "   is the key"]
    str.scanf("%d %20c")              #  => [42, "is the key"]

Newlines count as whitespace:

    str = "12 345\n 567 \n678"
    str.scanf("%d %d %d %d")          #  => [12, 345, 567, 678]

Matching literal non-whitespace characters

Literal non-whitespace characters in the format string match themselves in the input string.

    book = "The Red and the Black"
    book.scanf("The %s and the %s")   #  => ["Red", "Black"]

Whitespace between sequences of non-whitespace literal characters is ignored:

    str = "23 hrs, 42 mins"
    str.scanf("%d hrs, %d mins")      #  => [23, 42]

    str = "23   hrs,   42    mins"
    str.scanf("%dhrs,%dmins")         #  => [23, 42]

Whitespace in the format string is ignored in matching literal characters:

    str = "23hrs,42mins"
    hr, min = str.scanf("%d h r s ,%dm i n s")
                                      #  => 23, 42

Assignment suppression

When the "assignment suppression" flag appears (the character '*' following the '%'), a match is required as usual but no corresponding value is returned.

    str = "James K. Polk"
    str.scanf("%s %*s %s")            #  => ["James", "Polk"]

    str = "123 234 345 456 567"
    str.scanf("%d %*d %*d %*d %d")    #  => [123, 567]

Matching octal and hexadecimal integers

%o converts an octal string with or without the leading zero.

    str = "345 0345"
    str.scanf("%o %o")                #  => [229, 229]

An invalid octal digit terminates the match, even when a field width is present.

    str = "2378"
    str.scanf("%5o%c")                #  => [159, "8"] (i.e. [0237, "8"])

The %x specifier (which can also be written as "%X") converts hexadecimal integers in the same way. The leading 0x (or, equivalently, 0X) in the input string is optional. Hex digits are case-insensitive.

    str = "beef 0xbeef BEEF 0Xbeef 0xBEef"
    str.scanf("%x%x%x%x%x")           #  => [48879, 48879, 48879, 48879, 48879]

An invalid hexadecimal digit terminates the match, even when a field width is present.

    str = "beefsteak"
    str.scanf("%6x%s")                #  => [48879, "steak"]

The %i specifier will convert an integer from any of the common bases. The leading 0 or 0x must be specified for octal and hex numbers; otherwise, the number will be assumed to be decimal.

    str = "345 0345 0x345"
    str.scanf("%i %i %i")             #  => [345, 229, 837]
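
The integer values in this section can be cross-checked with Ruby's core conversion methods. This is a sketch for verifying the arithmetic only; it makes no claim about how scanf performs the conversions internally:

```ruby
# Octal: String#oct accepts the digits with or without a leading zero.
p "345".oct          # => 229
p "0345".oct         # => 229
p "237".oct          # => 159

# Hexadecimal: String#hex accepts an optional 0x prefix and either letter case.
p "beef".hex         # => 48879
p "0xBEef".hex       # => 48879

# Kernel#Integer picks the base from the prefix, much as described for %i.
p Integer("345")     # => 345
p Integer("0345")    # => 229
p Integer("0x345")   # => 837
```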

Floating point decimal numbers

scanf for Ruby allows floats written in any format which is meaningful to Ruby's String#to_f method. On conversion, the float will be represented in a standardized way.

    str = "1.2, +1.2, 1.2e34, 1.2e+34, 1.2e-34"
    str.scanf("%f,%f,%f,%f,%f")       #  => [1.2, 1.2, 1.2e+34, 1.2e+34, 1.2e-34]
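
Since the accepted formats are those understood by String#to_f, the conversions above can be reproduced with that method directly. A minimal sketch using only core Ruby, splitting the fields by hand instead of calling scanf:

```ruby
fields = "1.2, +1.2, 1.2e34, 1.2e+34, 1.2e-34".split(/,\s*/)
floats = fields.map(&:to_f)
p floats   # => [1.2, 1.2, 1.2e+34, 1.2e+34, 1.2e-34]
```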

Character classes

Character classes are similar to those in regular expressions. However, unless a field width is provided, scanf will by default match any number of consecutive characters matching the class (similar to the '+' quantifier in a regular expression). The conversion always results in a string, even if the characters are digits. Leading whitespace is NOT ignored.

    str = "TX78754"
    str.scanf("%[A-Z]%[0-9]")         #  => ["TX", "78754"]
    str = "(512) 459-2222... pizza"
    phone, = str.scanf("%[()1-9 -]")  #  => "(512) 459-2222"
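
For comparison, here are the same two matches written as ordinary regular expressions with an explicit '+' quantifier. This is a core-Ruby analogy to the behaviour described above, not scanf's own implementation:

```ruby
zip = "TX78754"
p zip.match(/\A([A-Z]+)([0-9]+)/).captures    # => ["TX", "78754"]

line = "(512) 459-2222... pizza"
p line[/\A[()1-9 -]+/]                         # => "(512) 459-2222"

# Leading whitespace is not skipped: anchored at the start, the class must
# match the very first character of the input.
p " TX78754"[/\A[A-Z]+/]                       # => nil
```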

Calling scanf with a block

When called with a block, scanf repeatedly scans the input, yielding a new array of results every time it matches the format string. This is a convenient way of doing a series of scanf calls on a single string (or stream).

    str = <<-EOS
      Beethoven  1770
      Bach       1685
      Handel     1685
      Scarlatti  1685
      Brahms     1833
    EOS

    str.scanf("%s%d") { |name, year| puts "#{name} was born in #{year}." }
    # =>  Beethoven was born in 1770.
    #     Bach was born in 1685.
    #     Handel was born in 1685.
    #     Scarlatti was born in 1685.
    #     Brahms was born in 1833.

    names = str.scanf("%s%d") { |name, year| name.upcase }
    # => ["BEETHOVEN", "BACH", "HANDEL", "SCARLATTI", "BRAHMS"]

You can do the same thing with an IO stream:

    # The block form of File.open closes the file when the block returns.
    pairs = File.open("somefile", "rb") do |fh|   # "rb" is for Windows's benefit
      fh.scanf("%s%d") { |str, num| "#{str} goes with #{num}" }
    end