A regular expression is a sequence of characters that defines a search pattern. It is used mainly for matching patterns in strings, for example in “find and replace”-like operations.

Check regular expressions:

SMS message: “01.01.2016 12:12 Card 4444, balance 12345.67 USD. Bonus 0 USD”

Simple template for balance:

  • Find the part of the text that describes the value, for example “balance 12345.67 USD” or “Bonus 0 USD”.
  • Replace the value you want to find with “(.*?)”. The result should be “balance (.*?) USD” or “Bonus (.*?) USD”.
  • If the text contains the symbol “.” or “*”, put “\” in front of each such symbol, for example “balance (.*?) USD\.”, which matches “balance 12345.67 USD.” in the SMS above.
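As a minimal sketch, the steps above can be tried in Python with the standard `re` module. The choice of Python and the variable names are assumptions made for illustration; the templates themselves are the ones from the list.

```python
import re

# Example SMS from the text above
sms = "01.01.2016 12:12 Card 4444, balance 12345.67 USD. Bonus 0 USD"

# The value to extract is replaced by the capturing group (.*?)
balance_pattern = re.compile(r"balance (.*?) USD")
bonus_pattern = re.compile(r"Bonus (.*?) USD")

balance = balance_pattern.search(sms).group(1)
bonus = bonus_pattern.search(sms).group(1)

# A literal "." must be escaped; re.escape does this for a fixed piece of text
escaped_suffix = re.escape("USD.")
balance_with_dot = re.search(r"balance (.*?) " + escaped_suffix, sms).group(1)

print(balance)           # 12345.67
print(bonus)             # 0
print(escaped_suffix)    # USD\.
print(balance_with_dot)  # 12345.67
```

The lazy group “(.*?)” stops at the first “ USD” that follows, which is why the balance does not swallow the rest of the message.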

Search for the balance only for card 4444:

The template will be “4444.*balance (.*?) USD”, where “.*” means any character, zero or more times. This pattern finds a value only in an SMS whose text has the form “4444[any text]balance [value] USD”.
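A short Python sketch of this card-specific template follows. The second SMS, with card 5555, is a made-up example used only to show a message that should not match.

```python
import re

# Template from the text: only matches when "4444" appears before "balance"
pattern = re.compile(r"4444.*balance (.*?) USD")

sms_4444 = "01.01.2016 12:12 Card 4444, balance 12345.67 USD. Bonus 0 USD"
# Hypothetical SMS for another card, invented for this example
sms_5555 = "01.01.2016 12:12 Card 5555, balance 99.00 USD. Bonus 0 USD"

match_4444 = pattern.search(sms_4444)
match_5555 = pattern.search(sms_5555)

print(match_4444.group(1))  # 12345.67
print(match_5555)           # None
```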

Search for the balance for any card except 4444:

The template will be “^(?!.*4444).*balance (.*?) USD”, where “^(?!.*4444)” checks, from the start of the text, that “4444” does not appear anywhere in it. The template finds a value only if the SMS text does not contain 4444.
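The negative check can be sketched in Python in the same way. The SMS for card 5555 is again a hypothetical message invented for the example.

```python
import re

# (?!.*4444) is a negative lookahead: it fails if "4444" occurs anywhere after the start
pattern = re.compile(r"^(?!.*4444).*balance (.*?) USD")

sms_4444 = "01.01.2016 12:12 Card 4444, balance 12345.67 USD. Bonus 0 USD"
# Hypothetical SMS for another card, invented for this example
sms_5555 = "01.01.2016 12:12 Card 5555, balance 99.00 USD. Bonus 0 USD"

match_4444 = pattern.search(sms_4444)
match_5555 = pattern.search(sms_5555)

print(match_4444)           # None
print(match_5555.group(1))  # 99.00
```

Because the check looks for “4444” anywhere in the text, a message for another card that happens to contain 4444 elsewhere (for instance in an amount) would also be skipped.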

Special characters:

  • \d or [0-9] – a digit
  • \D or [^0-9] – a non-digit character
  • \s or [ \f\n\r\t\v] – a whitespace character
  • \S or [^ \f\n\r\t\v] – a non-whitespace character
  • \w or [[:word:]] – a letter, a digit, or an underscore
  • \W or [^[:word:]] – any character except a letter, a digit, or an underscore
  • ^ – start of text
  • $ – end of text
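The following Python sketch shows a few of these special characters applied to the example SMS. It is an illustration only; note that Python's `re` module does not support the “[[:word:]]” bracket form, so the sketch uses the backslash forms.

```python
import re

sms = "01.01.2016 12:12 Card 4444, balance 12345.67 USD. Bonus 0 USD"

# \d+ : every run of digits in the text
digit_runs = re.findall(r"\d+", sms)

# \S+ : every run of non-whitespace characters (the pieces between spaces)
tokens = re.findall(r"\S+", sms)

# ^ anchors the match at the start of the text, $ at the end
starts_with_date = re.search(r"^\d\d\.\d\d\.\d{4}", sms) is not None
last_word = re.search(r"\w+$", sms).group()

print(digit_runs)
print(tokens)
print(starts_with_date, last_word)
```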

Using special characters to search for the time:

To find the time in the SMS from the first example, the template could be “01.01.2016 (.*?) Card”. But this is the wrong template, because the date can be any date, not only 01.01.2016. So use special characters instead:

  • “\s(.*?)Card” – searches for the text “[space][value]Card”.
  • “[\d]{4}\s(.*?)Card” – searches for the text “[four digits][space][value]Card”, where “[\d]” is a digit and “{4}” is the number of digits.
  • “[\d]{2}\.[\d]{2}\.[\d]{4}\s(.*?)\sCard” – searches for the text “[two digits].[two digits].[four digits][space][value][space]Card”, where “[\d]{2}\.[\d]{2}\.[\d]{4}” is the full date format.
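To compare the time templates, here is a small Python sketch. The second SMS, with a different date, is a hypothetical message invented to show why the fixed-date template is not enough.

```python
import re

sms = "01.01.2016 12:12 Card 4444, balance 12345.67 USD. Bonus 0 USD"
# Hypothetical SMS with another date, invented for this example
other_sms = "15.03.2017 09:30 Card 4444, balance 10.00 USD. Bonus 0 USD"

fixed_date = re.compile(r"01.01.2016 (.*?) Card")
after_space = re.compile(r"\s(.*?)Card")
full_date = re.compile(r"[\d]{2}\.[\d]{2}\.[\d]{4}\s(.*?)\sCard")

# The fixed-date template only works for that one date
fixed_on_other = fixed_date.search(other_sms)

# Without \s before "Card", the captured value keeps the trailing space
loose_time = after_space.search(sms).group(1)

# The full date format works for any date and captures the time cleanly
time_first = full_date.search(sms).group(1)
time_other = full_date.search(other_sms).group(1)

print(fixed_on_other)    # None
print(repr(loose_time))  # '12:12 '
print(time_first)        # 12:12
print(time_other)        # 09:30
```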