Validate IP Address
LeetCode Problem #468 (Medium)
Problem Statement
Write a function to check whether an input string is a valid IPv4 address or IPv6 address or neither.
Problem Explanation
IPv4 Addresses
IPv4 addresses are canonically represented in dot-decimal notation, which consists of four decimal numbers, each ranging from 0 to 255, separated by dots ("."), e.g., 172.16.254.1.
Leading zeros are not allowed in an IPv4 address. For example, the address 172.16.254.01 is invalid.
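The IPv4 rules above can be checked by hand without any regex. The following is a minimal Python sketch, not a reference solution; the function name is_valid_ipv4 is a hypothetical helper introduced for illustration.

```python
def is_valid_ipv4(address: str) -> bool:
    """Return True if address is dot-decimal IPv4 with no leading zeros."""
    parts = address.split(".")
    if len(parts) != 4:
        return False
    for part in parts:
        # Only ASCII digits are allowed; this also rejects empty parts and signs.
        if not part or not all(c in "0123456789" for c in part):
            return False
        # A leading zero is acceptable only when the part is exactly "0".
        if len(part) > 1 and part[0] == "0":
            return False
        # Each number must fit in the range 0..255.
        if int(part) > 255:
            return False
    return True


print(is_valid_ipv4("172.16.254.1"))   # True
print(is_valid_ipv4("172.16.254.01"))  # False: leading zero
```

Splitting on the dot first keeps each rule (count, digits only, no leading zero, range) as a separate, readable check.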
IPv6 Addresses
IPv6 addresses are represented as eight groups of four hexadecimal digits, each group representing 16 bits. The groups are separated by colons (":").
For example, the address 2001:0db8:85a3:0000:0000:8a2e:0370:7334 is valid. Leading zeros within a group may be omitted, and lowercase hexadecimal letters may be written in uppercase, so 2001:db8:85a3:0:0:8A2E:0370:7334 is also a valid IPv6 address.
However, for simplicity, a run of zero-valued groups may not be replaced by a single empty group written as two consecutive colons (::). For example, 2001:0db8:85a3::8A2E:0370:7334 is an invalid IPv6 address.
Extra leading zeros that make a group longer than four hexadecimal digits are also invalid. For example, the address 02001:0db8:85a3:0000:0000:8a2e:0370:7334 is invalid.
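The IPv6 rules can be sketched the same way: split on colons, then require exactly eight groups of one to four hexadecimal digits. This is a minimal Python illustration under those stated rules; the names HEX_DIGITS and is_valid_ipv6 are hypothetical.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def is_valid_ipv6(address: str) -> bool:
    """Return True if address is eight colon-separated groups of 1-4 hex digits."""
    groups = address.split(":")
    if len(groups) != 8:
        return False
    for group in groups:
        # An empty group (from "::") or a group of 5+ digits is rejected here.
        if not 1 <= len(group) <= 4:
            return False
        # Both lowercase and uppercase hexadecimal digits are accepted.
        if not all(c in HEX_DIGITS for c in group):
            return False
    return True


print(is_valid_ipv6("2001:db8:85a3:0:0:8A2E:0370:7334"))          # True
print(is_valid_ipv6("2001:0db8:85a3::8A2E:0370:7334"))            # False
print(is_valid_ipv6("02001:0db8:85a3:0000:0000:8a2e:0370:7334"))  # False
```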
Testcases
Prerequisite Knowledge
Regular Expressions
A regular expression (regex) is a sequence of characters that specifies a search pattern. Common applications include string searching, such as keyword search in search engines, and find/replace operations, such as those in word processors.
Each regular expression consists of characters that have either a literal meaning or a special meaning ("metacharacters").
Consider the regex string below:
a[a-z]
Here, a is a character with a literal meaning: a match must begin with the letter a. [a-z] indicates that the character following a can be any lowercase letter in the range a to z.
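To try the literal-plus-range idea, here is a small Python sketch using the standard re module. It assumes the pattern under discussion is a[a-z], which is inferred from the explanation; the helper name matches is hypothetical.

```python
import re

# Assumed pattern: a literal "a" followed by exactly one lowercase letter.
PATTERN = re.compile(r"a[a-z]")


def matches(text: str) -> bool:
    """Return True when the whole string matches the pattern."""
    return PATTERN.fullmatch(text) is not None


print(matches("ab"))  # True
print(matches("aZ"))  # False: "Z" is outside the range a-z
print(matches("ba"))  # False: the first character is not the literal "a"
```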
Range Specification
The range of characters or literals in a string is one of the simplest criteria used in regex. The range is specified using the square bracket notation "[]". To include a literal bracket as part of the pattern, it must be escaped, as in "[\\[0-9]".
Consider the regex string below:
This expression specifies the range containing one uppercase character, one lowercase character, and one digit from 0 to 9.
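The sketch below illustrates range notation in Python. All three patterns are assumptions chosen for illustration: [A-Za-z0-9] matches a single character from a combined range, [A-Z][a-z][0-9] matches three characters in sequence, and [\[0-9] shows the escaped bracket. In a Python raw string one backslash is enough; the doubled backslash in the text is presumably the form needed inside an ordinary string literal.

```python
import re

# Hypothetical patterns, chosen only to illustrate the bracket notation.
ANY_ALNUM = re.compile(r"[A-Za-z0-9]")             # one letter or digit
UPPER_LOWER_DIGIT = re.compile(r"[A-Z][a-z][0-9]")  # three characters in order
BRACKET_OR_DIGIT = re.compile(r"[\[0-9]")           # a literal "[" or a digit


def full(pattern: re.Pattern, text: str) -> bool:
    """Return True when the whole string matches the given pattern."""
    return pattern.fullmatch(text) is not None


print(full(ANY_ALNUM, "G"))            # True
print(full(ANY_ALNUM, "Gk7"))          # False: more than one character
print(full(UPPER_LOWER_DIGIT, "Gk7"))  # True
print(full(BRACKET_OR_DIGIT, "["))     # True
print(full(BRACKET_OR_DIGIT, "]"))     # False
```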
Repeated Pattern
Repeated patterns can be searched using specific regex syntax.
To match one or more characters satisfying the given range criteria, the plus operator (+) is used. [a-z]+ matches strings like "a", "abc", "helloworld", etc. This pattern never matches an empty string.
If a particular sequence is optional, i.e. we need to match zero or more characters, the asterisk operator (*) is used. [a-z]* matches strings like "a", "abc", "helloworld", etc. It also accepts the empty string "" as a valid match.
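The difference between the two operators shows up only on the empty string. A short Python check, with the hypothetical helper name classify:

```python
import re

ONE_OR_MORE = re.compile(r"[a-z]+")
ZERO_OR_MORE = re.compile(r"[a-z]*")


def classify(text: str) -> tuple:
    """Return (matches [a-z]+, matches [a-z]*) for the whole string."""
    return (
        ONE_OR_MORE.fullmatch(text) is not None,
        ZERO_OR_MORE.fullmatch(text) is not None,
    )


print(classify("abc"))  # (True, True)
print(classify(""))     # (False, True): only * accepts the empty string
print(classify("ab1"))  # (False, False): "1" is outside a-z
```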
Example
[a-zA-Z_][a-zA-Z_0-9]*\\.[a-zA-Z_0-9]+
The regex shown in the example above can be interpreted as:
[a-zA-Z_]: Match a letter (lowercase or uppercase) or an underscore.
[a-zA-Z_0-9]*: Match zero or more characters, each of which may be a letter, an underscore, or a digit.
\\.: Match a literal dot.
[a-zA-Z_0-9]+: Match one or more characters, each of which may be a letter, an underscore, or a digit.
Valid Expression: textfile.txt
Invalid Expression: 123file.t_x_t (the first character is a digit)
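The two sample strings can be checked in Python. This sketch assumes the example pattern is the concatenation of the pieces listed above; the names FILENAME and is_valid_name are hypothetical. In a raw string the literal dot is written with a single backslash.

```python
import re

# Assumed pattern: identifier-like name, a literal dot, then a non-empty suffix.
FILENAME = re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*\.[a-zA-Z_0-9]+")


def is_valid_name(text: str) -> bool:
    """Return True when the whole string matches the assumed pattern."""
    return FILENAME.fullmatch(text) is not None


print(is_valid_name("textfile.txt"))   # True
print(is_valid_name("123file.t_x_t"))  # False: starts with a digit
```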
Code
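What follows is a minimal regex-based sketch in Python that combines the range and repetition syntax described above. It is one possible approach under stated assumptions, not a definitive solution: the function name valid_ip_address and the return strings "IPv4", "IPv6", and "Neither" are assumptions following the usual convention for this problem.

```python
import re

# One IPv4 number: 0-255 with no leading zeros.
IPV4_CHUNK = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])"
# Four such numbers separated by literal dots.
IPV4 = re.compile(rf"{IPV4_CHUNK}(?:\.{IPV4_CHUNK}){{3}}")

# One IPv6 group: 1 to 4 hexadecimal digits, either case.
IPV6_CHUNK = r"[0-9a-fA-F]{1,4}"
# Eight such groups separated by colons; "::" is not accepted.
IPV6 = re.compile(rf"{IPV6_CHUNK}(?::{IPV6_CHUNK}){{7}}")


def valid_ip_address(query: str) -> str:
    """Classify query as "IPv4", "IPv6", or "Neither"."""
    # fullmatch requires the entire string to match, so trailing
    # characters such as an extra dot or newline are rejected.
    if IPV4.fullmatch(query):
        return "IPv4"
    if IPV6.fullmatch(query):
        return "IPv6"
    return "Neither"


print(valid_ip_address("172.16.254.1"))                      # IPv4
print(valid_ip_address("2001:db8:85a3:0:0:8A2E:0370:7334"))  # IPv6
print(valid_ip_address("256.256.256.256"))                   # Neither
```

Precompiling both patterns once keeps the function body to two checks, and encoding the 0 to 255 range as alternatives avoids a separate numeric comparison.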