What are regular expressions?

bri189a

I've been doing this forever. I see these regex problems from time to time but never look into them. Why? No interest. But maybe they'd be useful to me if I saw them used for something I'm currently doing another way that takes more work. What exactly are they, and what are they good for? The syntax looks extremely convoluted, too convoluted for the high-level languages we work in at least. Can someone show me an example where it is better to use one than to write your own procedure?

As an example I pulled off this from one of the websites mentioned in the sticky:

^[A-Za-z0-9](([_\.\-]?[a-zA-Z0-9]+)*)@([A-Za-z0-9]+)(([\.\-]?[a-zA-Z0-9]+)*)\.([A-Za-z]{2,})$

It is supposed to do the following:
reject an IP address as the domain name: hello@154.145.68.12
reject literal addresses: "hello, how are you?"@world.com
allow numeric domain names
require at least 2 letters after the last "."
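
For anyone who wants to check those claims rather than read the pattern symbol by symbol, here is a minimal sketch in Python. Python's re module is used only for illustration, and the function name and sample addresses other than the two quoted above are made up for the test.

```python
import re

# The pattern quoted above, joined back onto a single line.
EMAIL_PATTERN = re.compile(
    r"^[A-Za-z0-9](([_\.\-]?[a-zA-Z0-9]+)*)@([A-Za-z0-9]+)"
    r"(([\.\-]?[a-zA-Z0-9]+)*)\.([A-Za-z]{2,})$"
)

def looks_like_email(text):
    """Return True if text matches the quoted pattern."""
    return EMAIL_PATTERN.match(text) is not None

samples = [
    "hello@world.com",                   # ordinary address
    "hello@154.145.68.12",               # IP address as the domain
    '"hello, how are you?"@world.com',   # quoted literal address
    "hello@123.com",                     # numeric domain name
    "hello@world.c",                     # only one letter after the last dot
]

for sample in samples:
    print(sample, looks_like_email(sample))
```

The "hello" in the description is just sample input. The pattern never mentions it because it describes the shape of a whole address, not one particular word.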

I don't see "hello" anywhere in that crazy string up there, and if
I was trying to just filter out "hello" and quotation marks I'd have:

// IndexOf returns -1 (not 0) when the substring is absent
if (input.IndexOf("hello") != -1 || input.IndexOf("\"") != -1)
    SomeError();

Why am I going to spend 20 minutes trying to figure out what the heck all
that syntax is getting at?! Somebody reading someone else's
code could easily figure out what I typed.
 
RegExp are very powerful, as you can see from the example you posted. Try to do the same using if statements :D
It's not a good example for learning how to use RegExp, though, because it's very complicated.
RegExp are also often faster, which matters if you filter stuff (e.g. files) very often.
Bottom line: they're often faster and more elegant, but harder to read.
Find a tutorial about RegExp and you'll learn step by step.
IMHO RegExp are not something that Perl scripters use to show off but a powerful tool for the serious programmer. Sooner or later everybody will come across a point where RegExp will/should be the tool of choice.
 
^[A-Za-z0-9](([_\.\-]?[a-zA-Z0-9]+)*)@([A-Za-z0-9]+)(([\.\-]?[a-zA-Z0-9]+)*)\.([A-Za-z]{2,})$

There are many tutorials out there on RegEx but to give a quick explanation of the above...

^ means the beginning of your string
[] is a character class: it matches any one of the characters listed inside it, and the - means through, so A-Z means "everything A through Z"

so ^[A-Za-z0-9] means the string must begin with a letter (upper or lowercase) or a digit

--------------
some more info:

* stands for 0 or more of the previous item, so 0* means zero or more 0s, and [A-Za-z]* means zero or more letters (upper and lower case)

+ stands for 1 or more of the previous item, so [a-zA-Z0-9]+ means at least one letter or digit

? stands for 1 or none of the previous item

\ is used to escape characters with special meanings, such as . (which stands for any single character), * or ?

$ means the end of a string
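
The quickest way to get a feel for these symbols is to try each one on a tiny input. The sketch below uses Python's re module purely for illustration. The helper name and sample strings are made up. Note that * accepts zero occurrences, while + requires at least one.

```python
import re

def matches(pattern, text):
    """Return True if the pattern is found anywhere in text."""
    return re.search(pattern, text) is not None

# (pattern, sample text) pairs, one per symbol discussed above
examples = [
    (r"^[A-Za-z0-9]", "x_1"),   # starts with a letter or digit
    (r"^[A-Za-z0-9]", "_x1"),   # starts with an underscore instead
    (r"^0*$", ""),              # * accepts zero occurrences
    (r"^0+$", ""),              # + needs at least one occurrence
    (r"^colou?r$", "color"),    # ? makes the u optional
    (r"^a.c$", "abc"),          # unescaped . matches any single character
    (r"^a\.c$", "abc"),         # escaped \. matches only a literal dot
]

for pattern, text in examples:
    print(repr(pattern), repr(text), matches(pattern, text))
```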

----------------

RegEx is extremely useful in some languages (PHP, Perl, Unix admin) but I can see how you avoid using them in .Net, as there's so much functionality already in .Net that you don't have in these other environments.

Personally I find RegEx a cornerstone of more traditional programming and something any serious coder knows. Even if .Net can help you get around using RegEx, I highly recommend you learn it anyway. You'll find it very powerful for doing matches that might be complicated otherwise. Say you wanted to find a time in a string: just match for "\d\d:\d\d\s?(AM|PM)" and it will look for 2 digits, a colon, 2 digits, 1 optional whitespace character and then AM or PM.
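
To make the time example concrete, here is a small sketch of it in Python. The language choice, function name and sample sentences are assumptions for illustration. The pattern itself is the one quoted above.

```python
import re

# 2 digits, colon, 2 digits, optional whitespace character, then AM or PM
TIME_PATTERN = re.compile(r"\d\d:\d\d\s?(AM|PM)")

def find_times(text):
    """Return every substring of text that matches the time pattern."""
    # group(0) is the whole match; the (AM|PM) group alone would be group(1)
    return [m.group(0) for m in TIME_PATTERN.finditer(text)]

print(find_times("Lunch at 12:15PM, meeting at 09:30 AM"))
print(find_times("call at 9:30 AM"))
```

The second call finds nothing because the pattern insists on two digits before the colon. Changing the first \d\d to \d\d? would accept a single-digit hour as well. Doing the same search with IndexOf-style calls would mean hand-writing the digit checks at every position in the string.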
 