Evgenii Legotckoi, June 21, 2016, 1:52 a.m.

User Guide #05 - Ruby - Regular expressions

Let's put together a more interesting program. This time we test whether a string fits a description, encoded into a concise pattern.

There are some characters and character combinations that have special meaning in these patterns, including:

[] - range specification (e.g., [a-z] means a letter in the range a to z)
\w - word character (letter, digit, or underscore); same as [0-9A-Za-z_]
\W - non-word character (neither letter, digit, nor underscore)
\s - space character; same as [ \t\n\r\f]
\S - non-space character
\d - digit character; same as [0-9]
\D - non-digit character
\b - backspace (0x08) (only if in a range specification)
\b - word boundary (if not in a range specification)
\B - non-word boundary
* - zero or more repetitions of the preceding
+ - one or more repetitions of the preceding
{m,n} - at least m and at most n repetitions of the preceding
? - at most one repetition of the preceding; same as {0,1}

| - either preceding or next expression may match
() - grouping
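A few of these pieces in action, as a minimal sketch. The helper name `matches?` and the sample strings are made up for illustration and are not part of the guide.

```ruby
# Hypothetical helper: true if the pattern matches anywhere in the string.
def matches?(pattern, string)
  !(pattern =~ string).nil?
end

puts matches?(/[a-z]/, "ABC")            # false: no lower-case letter
puts matches?(/\d+/, "room 42")          # true: one or more digits
puts matches?(/\bcat\b/, "concat")       # false: "cat" is not a whole word here
puts matches?(/^a{2,3}$/, "aaaa")        # false: four a's exceed the {2,3} limit
puts matches?(/^(red|green)$/, "green")  # true: either alternative may match
puts matches?(/^colou?r$/, "color")      # true: the u is optional
```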


The common term for patterns that use this strange vocabulary is regular expressions. In Ruby, as in Perl, they are generally surrounded by forward slashes rather than double quotes. If you have never worked with regular expressions before, they probably look anything but regular, but you would be wise to spend some time getting familiar with them. They have an efficient expressive power that will save you headaches (and many lines of code) whenever you need to do pattern matching, searching, or other manipulations on text strings.

For example, suppose we want to test whether a string fits this description: "Starts with lower case f, which is immediately followed by exactly one upper case letter, and optionally more junk after that, as long as there are no more lower case characters." If you're an experienced C programmer, you've probably already written about a dozen lines of code in your head, right? Admit it; you can hardly help yourself. But in Ruby you need only request that your string be tested against the regular expression /^f[A-Z][^a-z]*$/.
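As a minimal sketch, the description above can be written as the pattern /^f[A-Z][^a-z]*$/ and tried on a few strings. The method name `fits_description?` and the sample strings are assumptions for illustration.

```ruby
# Hypothetical helper wrapping the pattern built from the description:
#   ^f       the line starts with a lower-case f
#   [A-Z]    an upper-case letter comes right after the f
#   [^a-z]*  then any amount of "junk" that is not a lower-case letter
#   $        up to the end of the line
def fits_description?(s)
  !(s =~ /^f[A-Z][^a-z]*$/).nil?
end

puts fits_description?("fX")      # true
puts fits_description?("fX-42!")  # true
puts fits_description?("fXy")     # false: lower-case y after the capital
puts fits_description?("Fx")      # false: does not start with lower-case f
```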

How about "Contains a hexadecimal number enclosed in angle brackets"? No problem.

ruby> def chab(s)   # "contains hex in angle brackets"
    |    (s =~ /<0(x|X)(\d|[a-f]|[A-F])+>/) != nil
    | end
  nil
ruby> chab "Not this one."
  false
ruby> chab "Maybe this? {0x35}"    # wrong kind of brackets
  false
ruby> chab "Or this? <0x38z7e>"    # bogus hex digit
  false
ruby> chab "Okay, this: <0xfc0004>."
  true
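The same test can probably be written a little more compactly. The sketch below is a hypothetical variant, here called `chab2`, that folds the three-way alternation into one range specification and uses `match?` (available in Ruby 2.4 and later) to get true or false directly.

```ruby
# Hypothetical variant of chab: one character class replaces
# (\d|[a-f]|[A-F]), and match? returns a boolean instead of a position.
def chab2(s)
  s.match?(/<0[xX][0-9a-fA-F]+>/)
end

puts chab2("Not this one.")            # false
puts chab2("Maybe this? {0x35}")       # false
puts chab2("Or this? <0x38z7e>")       # false
puts chab2("Okay, this: <0xfc0004>.")  # true
```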

Though regular expressions can be puzzling at first glance, you will quickly gain satisfaction in being able to express yourself so economically.

Here is a little program to help you experiment with regular expressions. Store it as regx.rb and run it by typing "ruby regx.rb" at the command line.

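# Requires an ANSI terminal!

st = "\033[7m"   # escape sequence: switch to reverse video
en = "\033[m"    # escape sequence: reset text attributes

while true       # the TRUE constant is gone from current Ruby versions
  print "str> "
  STDOUT.flush
  str = gets
  break if not str
  str.chomp!     # drop the trailing newline
  print "pat> "
  STDOUT.flush
  re = gets
  break if not re
  re.chomp!
  # Compile the typed text into a Regexp; a plain String would be
  # searched for literally. \& in the replacement is the matched text.
  str.gsub! Regexp.new(re), "#{st}\\&#{en}"
  print str, "\n"
end
print "\n"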

The program requires input twice, once for a string and once for a regular expression. The string is tested against the regular expression, then displayed with all the matching parts highlighted in reverse video. Don't mind details now; an analysis of this code will come soon.
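For trying the highlighting step without typing at the prompts, here is a minimal non-interactive sketch. The method name `highlight` is an assumption for illustration; it compiles the pattern text with `Regexp.new` and wraps each match in the same two ANSI escape sequences.

```ruby
# Hypothetical helper: wrap every match of the pattern in reverse-video codes.
def highlight(str, pattern_source)
  st = "\033[7m"                   # start reverse video
  en = "\033[m"                    # reset attributes
  pattern = Regexp.new(pattern_source)
  # In the replacement string, \& stands for the text that matched.
  str.gsub(pattern, "#{st}\\&#{en}")
end

# p shows the escape codes literally instead of sending them to the terminal.
p highlight("foobar", "^fo+")
```

Using `gsub` rather than `gsub!` leaves the original string untouched and returns the highlighted copy.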

str> foobar
pat> ^fo+
foobar
~~~

In these listings the matched part is marked with "~~~" on the following line; in the actual program output it appears in reverse video.
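On a terminal without reverse video, a similar marker line could be produced in code. This is only a sketch; `underline` is a hypothetical helper, not part of regx.rb.

```ruby
# Hypothetical helper: build a line with "~" under every matched character.
def underline(str, pattern)
  marks = " " * str.length
  str.scan(pattern) do
    m = Regexp.last_match          # MatchData for the current match
    marks[m.begin(0)...m.end(0)] = "~" * (m.end(0) - m.begin(0))
  end
  marks.rstrip                     # drop the padding after the last match
end

str = "foobar"
puts str
puts underline(str, /^fo+/)
```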

Let's try several more inputs.

str> abc012dbcd555
pat> \d
abc012dbcd555
   ~~~    ~~~

If that surprised you, refer to the table at the top of this page: \d has no relationship to the character d, but rather matches a single digit.
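The difference is easy to check with `String#scan`, which collects every match into an array. The two method names here are made up for illustration.

```ruby
# Hypothetical helpers contrasting \d (any digit) with a plain d (the letter).
def digits_in(str)
  str.scan(/\d/)
end

def letter_ds_in(str)
  str.scan(/d/)
end

p digits_in("abc012dbcd555")     # ["0", "1", "2", "5", "5", "5"]
p letter_ds_in("abc012dbcd555")  # ["d", "d"]
```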

What if there is more than one way to correctly match the pattern?

str> foozboozer
pat> f.*z
foozboozer
~~~~~~~~

foozbooz is matched instead of just fooz, since a regular expression matches the longest possible substring.
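When the shorter match is wanted, Ruby also accepts a `?` after a repetition to make it match as little as possible. That form is not in the table at the top of this page, so treat the following as a side sketch; the method names are made up for illustration.

```ruby
# Hypothetical helpers: the same pattern with a greedy and a lazy repetition.
def greedy_match(str)
  str[/f.*z/]    # .* takes as many characters as it can
end

def lazy_match(str)
  str[/f.*?z/]   # .*? stops at the first z that completes a match
end

p greedy_match("foozboozer")  # "foozbooz"
p lazy_match("foozboozer")    # "fooz"
```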

Here is a pattern to isolate a colon-delimited time field.

str> Wed Feb  7 08:58:04 JST 1996
pat> [0-9]+:[0-9]+(:[0-9]+)?
Wed Feb  7 08:58:04 JST 1996
           ~~~~~~~~

"=~" is a matching operator with respect to regular expressions; it returns the position in a string where a match was found, or nil if the pattern did not match.
