Parsing search terms from CGI.http_referer using regular expressions


Today someone over at the House of Fusion mailing list asked how to parse out search terms from an incoming referrer link. Several people responded with suggestions, including looping over the referrer. Then I put forth the idea of using a simple regex to extract the required string. Here's what I came up with; maybe it'll help you: [?|&]q=[^&]+

This searches for a literal string 'q=' that is immediately preceded by a ? or an & and looks for any text after the = but before an &.

  • [?|&] Either ? or &. Inside a character class the | is a literal pipe rather than "or", so [?&] would be enough
  • q= Literal string, used to know where to start the match
  • [^&]+ The actual string to match: one or more characters, none of which is an &
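To see what this first pattern returns, here is a minimal sketch. It is written in Python purely for illustration (the original question was about ColdFusion), and the referrer URL is made up.

```python
import re

# Hypothetical referrer, made up for illustration.
referrer = "http://www.example.com/search?hl=en&q=regex+tutorial&start=10"

# The pattern from the breakdown above: ? or &, then q=, then everything up to the next &.
pattern = re.compile(r"[?|&]q=[^&]+")

match = pattern.search(referrer)
whole_match = match.group(0) if match else ""
print(whole_match)  # &q=regex+tutorial

# The three-character prefix (& or ?, then q=) still has to be stripped by hand.
terms = whole_match[3:]
print(terms)  # regex+tutorial
```

Note that the match carries the leading ? or & as well as the q=, so three characters have to be removed to get at the terms themselves.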

That worked really well, but I didn't like the fact that the q= (along with the leading ? or &) is also returned in the match, especially when that meant I'd have to remove that string from my final result. So I asked on Twitter and was referred to Ben Nadel's post on REMatchGroup, where I found out about positive look-behinds. That did the trick. So here's the new regex: (?<=[?|&]q=)[^&]+

This regular expression uses a feature called positive look-behind. It basically says "only match the target string if it's immediately preceded by another string". Let's break it down.

  • ( Opens the look-behind group
  • ?<= Marks the group as a positive look-behind
  • [?|&]q= The same ? or & character class as before, followed by the literal string q=
  • ) Closes the look-behind group
  • [^&]+ The actual string to match: one or more characters, none of which is an &
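The look-behind version can be sketched the same way. Again this is Python used only for illustration, not the ColdFusion code from the original discussion; the function name extract_search_terms and the sample URLs are hypothetical.

```python
import re

# Positive look-behind: require ?q= or &q= immediately before the match,
# without including that prefix in the returned text.
LOOKBEHIND_PATTERN = re.compile(r"(?<=[?|&]q=)[^&]+")


def extract_search_terms(referrer):
    """Return the raw value of the q parameter, or "" if there is none."""
    match = LOOKBEHIND_PATTERN.search(referrer)
    return match.group(0) if match else ""


# Hypothetical referrers, made up for illustration.
print(extract_search_terms("http://www.example.com/search?hl=en&q=regex+tutorial&start=10"))  # regex+tutorial
print(extract_search_terms("http://www.example.com/?faq=1"))  # prints an empty line
```

The returned value is still URL-encoded (spaces arrive as +), so it would need decoding before being displayed. A parameter such as faq= is skipped because the q is not immediately preceded by ? or &.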

Hope this helps you out. It was a great challenge, and I learned something new about regex along the way.

