Xe Blog

Saturday, 10 October 2015

Regular Expressions ~ Hands On!

Posted by Unknown at 21:18 Labels: javascript

Sooner or later you'll run across a regular expression. With their cryptic syntax, confusing documentation and steep learning curve, most developers settle for copying and pasting them from Stack Overflow and hoping they work. But what if you could decode regular expressions and harness their power? In this article, I'll show you why you should take a second look at regular expressions, and how you can use them in the real world.


Why Regular Expressions?

Why bother with regular expressions at all? Why should you care?
  • Matching: Regular expressions are great at determining if a string matches some format, such as a phone number, email or credit card number.
  • Replacement: Regular expressions make it easy to find and replace patterns in a string. For example, text.replace(/\s+/g, " ") replaces all chunks of whitespace in text, such as " \n\t ", with a single space.
  • Extraction: It's easy to extract pieces of information from a pattern with regular expressions. For example, name.match(/^(Mr|Ms|Mrs|Dr)\.?\s/i)[1] extracts a person's title from a string, such as "Mr" from "Mr. Schropp".
  • Portability: Almost every major language has a regular expression library. The syntax is mostly standardized, so you don't have to worry about relearning regexes when you switch languages.
  • Coding: When writing code, you can use regular expressions to search through files with tools such as find and replace in Atom or ack on the command line.
  • Clear and Concise: If you're comfortable with regular expressions, you can perform some pretty tricky operations with a very small amount of code.
  • Fame and Glory: Regular expressions will give you superpowers. 
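To make the first three points concrete, here is a small sketch in JavaScript. The ZIP code pattern and the extractTitle helper are assumptions made up for this illustration; only the whitespace and title patterns come from the list above.

```javascript
// Matching: does the string look like a five-digit ZIP code? (hypothetical example)
const isZip = /^\d{5}$/.test("90210");

// Replacement: collapse each run of whitespace into a single space.
const tidy = "a \n\t b".replace(/\s+/g, " ");

// Extraction: pull the title out of a name with a capture group.
// extractTitle is a hypothetical helper name.
function extractTitle(name) {
  const match = name.match(/^(Mr|Ms|Mrs|Dr)\.?\s/i);
  // match() returns null when nothing matches, so guard before indexing.
  return match ? match[1] : null;
}

console.log(isZip);                       // true
console.log(tidy);                        // a b
console.log(extractTitle("Mr. Schropp")); // Mr
console.log(extractTitle("Schropp"));     // null
```

Note the null check in extractTitle: indexing the result of match() directly throws when the string has no title.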

How to Write Regular Expressions

The best way to learn regular expressions is by using an example. Let's say you're building a web page with a phone number input. Because you're a rockstar developer, you decide to display a checkmark when the phone number is valid and an X when it's invalid.
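<!-- Markup: the input plus one label per validation state -->
<input id="phone-number" type="text">
<label class="valid" for="phone-number"><img src="check.svg"></label>
<label class="invalid" for="phone-number"><img src="x.svg"></label>
/* CSS: hide each label unless the input's data-validation value matches it */
input:not([data-validation="valid"]) ~ label.valid,
input:not([data-validation="invalid"]) ~ label.invalid {
  display: none;
}
// JavaScript (jQuery): revalidate as the user types and when the field loses focus
$("input").on("input blur", function(event) {
  if (isPhoneNumber($(this).val())) {
    $(this).attr({ "data-validation": "valid" });
    return;
  }

  // Only flag the value as invalid once the user leaves the field
  if (event.type === "blur") {
    $(this).attr({ "data-validation": "invalid" });
  }
  else {
    $(this).removeAttr("data-validation");
  }
});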
With the above code, whenever a person types or pastes a valid number into the input, the check image is displayed. When the user blurs the input and the value is invalid, the error X is displayed.
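If the branching in that handler is hard to follow, it may help to see the same decision as a plain function with no DOM involved. This is only a sketch: nextValidationState is a hypothetical name, and it returns null to mean "remove the data-validation attribute".

```javascript
// Hypothetical helper: returns the value for data-validation,
// or null when the attribute should be removed.
function nextValidationState(isValid, eventType) {
  if (isValid) {
    return "valid"; // a valid number shows the checkmark on any event
  }
  // An invalid value is only flagged once the user leaves the field.
  return eventType === "blur" ? "invalid" : null;
}

console.log(nextValidationState(true, "input"));  // valid
console.log(nextValidationState(false, "blur"));  // invalid
console.log(nextValidationState(false, "input")); // null
```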
Since you know that U.S. phone numbers are made up of ten digits, your first pass at isPhoneNumber looks like this:
function isPhoneNumber(string) {
  return /\d\d\d\d\d\d\d\d\d\d/.test(string);
}
This function contains a regular expression between the / characters with ten \d's, or digit characters. The test method returns true if the regex matches the string and false if it doesn't. If you run isPhoneNumber("5558675309"), it returns true! Woohoo!
However, writing ten \d's is a little redundant. Luckily, you can use curly braces to accomplish the same thing.
function isPhoneNumber(string) {
  return /\d{10}/.test(string);
}
Sometimes, when people type in phone numbers, they start with a leading 1. Wouldn't it be nice if your regex could handle those cases? You can with the ? character!
function isPhoneNumber(string) {
  return /1?\d{10}/.test(string);
}
The ? symbol means zero or one of the preceding character, so now isPhoneNumber returns true for both "5558675309" and "15558675309"!
So far, isPhoneNumber is pretty good, but you're missing one key thing: regexes are more than happy to match parts of a string. As it stands, isPhoneNumber("555555555555555555") returns true because that string contains ten digits in a row. You can fix this problem by using the ^ and $ anchors.
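One way to see what the ? actually changes is to look at the text each pattern matches, instead of just true or false. The sketch below uses match(), whose first array element is the matched text; the variable names are made up for this example.

```javascript
const digitsOnly = /\d{10}/;        // ten digits in a row
const withLeadingOne = /1?\d{10}/;  // an optional 1, then ten digits

const input = "15558675309";

// match() returns an array whose first element is the matched text.
const withoutOptional = input.match(digitsOnly)[0];
const withOptional = input.match(withLeadingOne)[0];

console.log(withoutOptional); // 1555867530 (stops after ten digits)
console.log(withOptional);    // 15558675309 (the leading 1 is included)
```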
function isPhoneNumber(string) {
  return /^1?\d{10}$/.test(string);
}
Roughly, ^ matches the beginning of the string and $ matches the end, so now your regex will match the whole phone number.
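Here is a quick side-by-side sketch of the unanchored and anchored patterns, using the same inputs as above plus one extra eleven-digit string that does not start with 1. The variable names are only for illustration.

```javascript
const unanchored = /1?\d{10}/;
const anchored = /^1?\d{10}$/;

const tooLong = "555555555555555555";

console.log(unanchored.test(tooLong));      // true: ten digits appear somewhere
console.log(anchored.test(tooLong));        // false: the whole string must fit
console.log(anchored.test("5558675309"));   // true
console.log(anchored.test("15558675309"));  // true
console.log(anchored.test("25558675309"));  // false: eleven digits, no leading 1
```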

Getting Serious

You released your page, and it's a smashing success, but there's one major problem. In the U.S., there are many common ways to write a phone number:
  • (234) 567-8901
  • 234-567-8901
  • 234.567.8901
  • 234/567-8901
  • 234 567 8901
  • +1 (234) 567-8901
  • 1-234-567-8901
While your users could leave out the punctuation, it's much easier for them to type out a formatted number.
While you could write a regular expression to handle all of those formats, it's probably a bad idea. Even if you nail every format in this list, it's very easy to miss one. Besides, you really only care about the data, not how it's formatted. So, instead of worrying about punctuation, why not strip it out?
function isPhoneNumber(string) {
  return /^1?\d{10}$/.test(string.replace(/\D/g, ""));
}
The replace method swaps every match of \D, which matches any non-digit character, for an empty string. The g, or global flag, tells replace to act on every match of the regular expression instead of just the first.
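As a sanity check, you could run every format from the list above through this version. This is only a sketch; the formats array simply copies the list, and the last line shows a side effect of stripping that is worth knowing about.

```javascript
function isPhoneNumber(string) {
  return /^1?\d{10}$/.test(string.replace(/\D/g, ""));
}

const formats = [
  "(234) 567-8901",
  "234-567-8901",
  "234.567.8901",
  "234/567-8901",
  "234 567 8901",
  "+1 (234) 567-8901",
  "1-234-567-8901",
];

// Every format reduces to the same ten or eleven digits.
const allValid = formats.every(function (format) {
  return isPhoneNumber(format);
});
console.log(allValid); // true

// Side effect: any non-digit text is stripped too, so this also passes.
console.log(isPhoneNumber("call 234-567-8901 now")); // true
```

Whether that last result is acceptable depends on your form. If it isn't, you may want to reject letters before stripping the punctuation.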

Getting Even More Serious

Everybody loves your phone number page, and you're the king of the water cooler at work. However, being the pro that you are, you want to take things one step further.
The North American Numbering Plan is the phone number standard used in the U.S., Canada, and twenty-three other countries. This system has a few simple rules:
  1. A phone number ((234) 567-8901) is broken up into three pieces: The area code (234), the exchange code (567) and the subscriber number (8901).
  2. For the area code and exchange code, the first digit can be 2 through 9 and the second and third digits can be 0 through 9.
  3. The exchange code cannot have 1 as the third digit if 1 is also the second digit.
Your regex already works for the first rule, but it breaks the second and third. For now, let's only worry about the second rule. The new regular expression needs to look something like the following:
/^1?<AREA CODE><EXCHANGE CODE><SUBSCRIBER NUMBER>$/
The subscriber number is easy; it's four digits.
/^1?<AREA CODE><EXCHANGE CODE>\d{4}$/
The area code is a little trickier. You need a digit between 2 and 9, followed by two more digits. To accomplish that, you can use a character set! A character set lets you specify a group of characters to choose from.
/^1?[23456789]\d\d<EXCHANGE CODE>\d{4}$/
That's great, but it's annoying to type out all the characters between 2 and 9. Clean it up with a character range.
/^1?[2-9]\d\d<EXCHANGE CODE>\d{4}$/
That's better! Since the exchange code follows the same format as the area code, you could duplicate that part of your regex to finish off the number.
/^1?[2-9]\d\d[2-9]\d\d\d{4}$/
But wouldn't it be nice if you didn't have to copy and paste the area code section of your regex? You can simplify it by using a group! Groups are formed by wrapping characters in parentheses.
/^1?([2-9]\d\d){2}\d{4}$/
Now, [2-9]\d\d is contained in a group and {2} specifies that that group should occur twice.
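Groups also capture what they match, which ties back to the extraction use case from earlier. One thing to watch: in JavaScript, a repeated group only remembers its last repetition. The sketch below shows that, plus a variant with one group per piece; splitPhoneNumber is a hypothetical helper and assumes the punctuation has already been removed.

```javascript
// Assumption: the number has already been stripped down to digits.
const digits = "12345678901";

// A repeated group only keeps its last repetition (the exchange code here).
const repeated = digits.match(/^1?([2-9]\d\d){2}\d{4}$/);
console.log(repeated[1]); // 567

// One group per piece makes every part available.
// splitPhoneNumber is a hypothetical helper name.
function splitPhoneNumber(digits) {
  const match = digits.match(/^1?([2-9]\d\d)([2-9]\d\d)(\d{4})$/);
  if (!match) {
    return null; // not a number in the expected format
  }
  const [, areaCode, exchangeCode, subscriberNumber] = match;
  return { areaCode, exchangeCode, subscriberNumber };
}

console.log(splitPhoneNumber(digits));
// { areaCode: '234', exchangeCode: '567', subscriberNumber: '8901' }
```

For plain validation the repeated group is fine, since test() only cares whether the whole pattern matches.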
That's it! Here's what the final isPhoneNumber function looks like:
function isPhoneNumber(string) {
  return /^1?([2-9]\d\d){2}\d{4}$/.test(string.replace(/\D/g, ""));
}
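The third rule was set aside above. If you wanted to cover it as well, one possible approach is a negative lookahead, a feature this article doesn't otherwise cover: (?!...) succeeds only when the pattern inside it does not match at that position. The sketch below is an assumption about how you might do it, not part of the original function, and isNanpNumber is a hypothetical name.

```javascript
// Hypothetical variant of isPhoneNumber that also applies rule 3.
function isNanpNumber(string) {
  // (?![2-9]11) rejects exchange codes whose second and third digits are both 1.
  const pattern = /^1?[2-9]\d\d(?![2-9]11)[2-9]\d\d\d{4}$/;
  return pattern.test(string.replace(/\D/g, ""));
}

console.log(isNanpNumber("(234) 567-8901")); // true
console.log(isNanpNumber("(234) 911-8901")); // false: exchange code ends in 11
console.log(isNanpNumber("(234) 167-8901")); // false: exchange code starts with 1
console.log(isNanpNumber("(123) 456-7890")); // false: area code starts with 1
```

The group with {2} is gone here because the lookahead applies only to the exchange code, so the two pieces are no longer identical.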

When to Avoid Regular Expressions

Regular expressions are great, but there are some problems you just shouldn't tackle with them.
  • Don't be too strict. There's little value in being too strict with regular expressions. For phone numbers, even if we did match all of the rules in NANP, there's still no way to know if a phone number is real. If I rattled off the number (555) 555-5555, it matches the pattern but it's not a real phone number.
  • Don't write an HTML parser. While it's fine to use regexes to parse simple things, they're not useful for parsing entire languages. Without getting too technical, you're not going to have a good time parsing non-regular languages with regular expressions.
  • Don't use them for really complicated strings. The full regex for emails is 6,318 characters long. A simple, imperfect one looks like this: /^[^@]+@[^@]+\.[^@\.]+$/. As a general rule of thumb, if your regular expression is longer than a line of code, it might be time to look for another solution.
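To get a feel for what "simple, imperfect" means for the email pattern above, here is a short sketch. looksLikeEmail is a hypothetical helper name, and the sample addresses are made up.

```javascript
// Hypothetical helper wrapping the simple email pattern from the list above.
function looksLikeEmail(string) {
  return /^[^@]+@[^@]+\.[^@\.]+$/.test(string);
}

console.log(looksLikeEmail("user@example.com"));  // true
console.log(looksLikeEmail("user@example"));      // false: no dot after the @
console.log(looksLikeEmail("user@@example.com")); // false: two @ signs
console.log(looksLikeEmail("a b@example.com"));   // true: spaces slip through
```

The last case is the imperfect part. The pattern only checks the rough shape of an address, which is often all a form needs.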

Wrapping Up

In this article, you've learned when to use regular expressions and when to avoid them, and you've experienced the process of writing one. Hopefully regular expressions seem a bit less ominous, and maybe even intriguing. If you use a regex to solve a tricky problem, let me know in the comments!