/http:\/\/(?:www\.)?([a-z0-9\-]+)(?:\.[a-z\.]+[\/]?).*/i

http:\/\/ matches the literal "http://" part.
(?:www\.)? is a non-capturing group that matches zero or one "www.".
([a-z0-9\-]+) is a capturing group that matches one or more characters from the ranges a-z and 0-9, plus the hyphen. This is the part you wanted to extract.
(?:\.[a-z\.]+[\/]?) is a non-capturing group that matches the TLD part (e.g. ".com", ".co.uk") followed by zero or one "/".
.* matches the rest of the URL.
The trailing i flag makes the whole match case-insensitive.
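
To see how the pieces fit together, here is a minimal sketch that runs the same pattern in Python. The original uses /.../i delimiters, so this is a transcription and not necessarily the language you are working in: the delimiters are dropped and re.IGNORECASE stands in for the i flag. The helper name extract_name is hypothetical and exists only for this illustration.

```python
import re

# Same pattern as in the answer, without the /.../i delimiters.
# re.IGNORECASE plays the role of the trailing "i" flag.
PATTERN = re.compile(
    r"http:\/\/(?:www\.)?([a-z0-9\-]+)(?:\.[a-z\.]+[\/]?).*",
    re.IGNORECASE,
)


def extract_name(url):
    """Return the first capturing group, or None if the URL does not match.

    Hypothetical helper, not part of the original answer.
    """
    match = PATTERN.match(url)
    return match.group(1) if match else None


print(extract_name("http://www.example.com/path"))  # example
print(extract_name("http://example.co.uk"))         # example
print(extract_name("http://WWW.My-Site.ORG/"))      # My-Site
print(extract_name("https://example.com"))          # None
```

Two things worth noticing from this sketch: the pattern only accepts "http://", so an "https://" URL does not match, and for a host with a subdomain other than "www" (say "http://blog.example.com") the capturing group returns the subdomain ("blog"), because everything after the first dot is consumed by the TLD group.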