regex dash – inside bracket []

We have a requirement to remove some illegal character in file name like slash etc… So I put on a regex in our code:

[^a-zA-Z0-9.-_]

I assume the above regex will match everything that is not number/char/dot/dash/underscore.

Turns out I was wrong! The - is special case inside [...] that is used for range. It should be in the beginning or in the last or escaped. Otherwise it will match all the character that is in between . and _ in ASCII character set. So in my case, the .-_ part will try to match characters from 46(.)-95(_) in ASCII char table.

The correct one should be putting it in the last  [^a-zA-Z0-9._-] . Or just escape it: [^a-zA-Z0-9.\-_]

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s