I’ve had an open issue for the last few weeks regarding why “odd” characters were showing up in Custom Property fields in my company’s Solarwinds Orion/Network Performance Monitor install. I detected them when I put a business-rules-integrity-checker in place, but I didn’t know where they came from. I cleaned up all the existing cases and waited to see if it would happen again. It did, and now I know where they came from and, I think, why.
Even after I understood the problem I couldn’t Google up any references to this situation, so I’m documenting it. One caveat – we’re running a non-current version of NPM (10.1.3; yes, we’re working on upgrading) so the issue I’m going to describe may not exist in later versions.
What’s The Problem?
I didn’t know what these characters were when I started, but I do now, and explaining the situation works best if I spoil the ending – the characters are Unicode U+200B (‘ZERO WIDTH SPACE’, encoded in UTF-8 as 0xE2 0x80 0x8B). The reason the problem went undetected for a while is that the glyph for this character is… nothing. It’s pretty much defined as an invisible character. Nearly every Windows tool I could use to look at this data knew how to interpret this character, so it was not possible to detect the situation with a visual review of the data. This includes any web browser, the Solarwinds Custom Property Editor, and SQL Server Management Studio.
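To make the "invisible" part concrete, here is a minimal Python sketch (not something I ran against the Orion database). The sample strings and the helper name describe_invisibles are made up for illustration; the only facts it relies on are that U+200B is in the Unicode "Cf" (format) category and that its UTF-8 encoding is three bytes.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE

# Illustrative values only; both print identically in most tools.
clean = "ExampleDepartmentName"
dirty = "Example" + ZWSP + "Department" + ZWSP + "Name"

def describe_invisibles(text):
    """Return (index, code point, Unicode name) for every format-category (Cf) character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]

print(ZWSP.encode("utf-8"))        # b'\xe2\x80\x8b'
print(len(clean), len(dirty))      # 21 23
print(clean == dirty)              # False
print(describe_invisibles(dirty))
```

The two strings look the same on screen, but the lengths differ and the equality test fails, which is exactly the situation a visual review cannot catch.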
The reason this is still a problem even though you can’t see the characters is that they still affect SQL matching, and trying to send an alert to an email address with an embedded, invisible Unicode character also doesn’t go so well.
The following screenshots demonstrate how queries are affected. I’ve artificially set up two nodes with long strings in the Department field. The Node with “GOOD” does not have the hidden characters; the Node with “BAD” does.
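The same effect can be reproduced without Orion. This is a rough sketch using an in-memory SQLite database as a stand-in for SQL Server; the table name Nodes and the columns Caption and Department are assumptions made for the example, and SQL Server's matching can also depend on the collation in use.

```python
import sqlite3

ZWSP = "\u200b"  # ZERO WIDTH SPACE

conn = sqlite3.connect(":memory:")
# Hypothetical schema, loosely modelled on the GOOD/BAD test nodes.
conn.execute("CREATE TABLE Nodes (Caption TEXT, Department TEXT)")
conn.executemany(
    "INSERT INTO Nodes VALUES (?, ?)",
    [
        ("GOOD", "ExampleDepartmentName"),
        ("BAD", "Example" + ZWSP + "Department" + ZWSP + "Name"),
    ],
)

def captions(where, params=()):
    """Run a SELECT with the given WHERE clause and return the matching captions."""
    rows = conn.execute(
        f"SELECT Caption FROM Nodes WHERE {where} ORDER BY Caption", params
    )
    return [row[0] for row in rows]

# An exact match only finds the node without hidden characters.
exact = captions("Department = ?", ("ExampleDepartmentName",))
# Searching for the hidden character itself finds the contaminated node.
contaminated = captions("instr(Department, ?) > 0", (ZWSP,))

print(exact)         # ['GOOD']
print(contaminated)  # ['BAD']
```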
Where Are Those Characters Coming From?
Turns out the characters get into the database when an operator copies a Custom Property via a web browser from the Custom Properties display box and then pastes it back in for a different Node. This is actually fairly common for us since we use Custom Properties for grouping, alerting, etc. The example strings used for this post were created the same way. The GOOD Node had “ExampleDepartmentName” typed in. That was then copied from the GOOD Node’s Node Details page and pasted into the Department field of the BAD Node.
After ruling out a browser-specific issue, I captured the HTTP session itself, upstream of the browser, using Wireshark. This showed me that Solarwinds itself is inserting these characters:
I’ve highlighted a few different things in this image:
- The GOOD highlighted in blue demonstrates that I am viewing the Custom Property for the Node which DOES NOT have the invisible chars in it in the database (as demonstrated by the queries above).
- The grey and red highlights demonstrate that, despite the characters not being in the database, they are in the string as it is sent over the wire from Solarwinds to the web browser (in other words, before the web browser has a possible chance to mangle anything).
- The green highlight shows that Solarwinds isn’t just inserting these characters into the Custom Property values, it’s inserting them into the Custom Property field names as well.
Why Would Solarwinds Do That?
I’m almost positive that Solarwinds is doing this to facilitate page layout. If a Custom Property name or value was extremely long and didn’t have a natural place to break, either the widget would be wider than the page layout allowed, or the text would overflow the display area and not be visible. As a quick demonstration, I put a very long string in a field and displayed it:
One additional item I figured out after the fact, and which lends further support to the idea that this is all about layout – Solarwinds doesn’t do this unconditionally. It appears that in some cases when Solarwinds detects a breakable character in the value, it doesn’t insert the invisible spaces. I didn’t realize this because all of my testing was using our original data, which uses long email addresses. I tried to use “Example Department Name” for this post and suddenly no more invisibles. I switched to “ExampleDepartmentName”, and the characters came back.
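I have no visibility into the actual Solarwinds code, so the following is purely a guess at the kind of logic that would produce what I observed. The function name add_break_hints and the every parameter are hypothetical, and the real rule for where the characters land is unknown to me.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE

def add_break_hints(value, every=5):
    """Hypothetical layout helper: if the value has no whitespace to break on,
    insert a zero-width space every `every` characters so a browser can wrap it."""
    if any(ch.isspace() for ch in value):
        return value  # a natural break point exists, leave the value alone
    chunks = [value[i:i + every] for i in range(0, len(value), every)]
    return ZWSP.join(chunks)

with_spaces = add_break_hints("Example Department Name")
without_spaces = add_break_hints("ExampleDepartmentName")

print(with_spaces == "Example Department Name")  # True
print(len(without_spaces))                       # 25
```

Anything copied from the rendered output of a helper like this carries the hints along with it, which matches the copy-and-paste behavior described earlier.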
So Now What?
So now nothing really, except I sleep better at night now that I understand where the characters are coming from. I’ve advised my coworkers that they’re better off copying from the Custom Properties edit page rather than the display widget, but some people will ignore me or forget. For those cases I have a DB auditor that will tell me when those characters are present and I can clean them up manually.
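For anyone who wants a similar check, here is a minimal sketch of the idea, not my actual auditor. It assumes the Custom Property values have already been pulled out of the database as (node, property name, value) tuples; the function name audit and the sample rows are made up for illustration.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE

def audit(rows):
    """Given (node, property_name, value) tuples, return
    (node, property_name, cleaned_value) for each value containing a ZWSP."""
    findings = []
    for node, prop, value in rows:
        if value and ZWSP in value:
            findings.append((node, prop, value.replace(ZWSP, "")))
    return findings

# Made-up sample data standing in for a query result.
sample = [
    ("GOOD", "Department", "ExampleDepartmentName"),
    ("BAD", "Department", "Example" + ZWSP + "Department" + ZWSP + "Name"),
    ("EMPTY", "Department", None),
]

for node, prop, cleaned in audit(sample):
    print(f"{node}.{prop} contains hidden characters; cleaned value: {cleaned}")
```

The cleaned value is returned rather than written back, so the actual fix stays a manual, reviewed step.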
As a general rant, I dislike using Unicode for purposes like this. Because the characters are rendered literally at both the display and markup layers, it’s really, really hard to tell what’s going on. A variation of this same problem is what caused page layout oddities in WordPress with double-spaces after a period. You see a layout issue and you immediately view-source, and you can’t see the problem because the markup layer is rendering the characters the same as the display layer. If the software authors used HTML markup (&nbsp; for non-breaking space instead of Unicode U+00A0, or &zwnj; (zero-width non-joiner) or &#8203; instead of Unicode U+200B) it would render the same in the display layer, but it would be obvious in the markup layer that the characters were present.
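As a small illustration of the markup-layer alternative, Python's built-in xmlcharrefreplace error handler turns any non-ASCII character into a numeric HTML character reference. This is just a sketch of the idea; the helper name reveal_non_ascii is mine, and it escapes every non-ASCII character, not only the invisible ones.

```python
def reveal_non_ascii(text):
    """Replace every non-ASCII character with its numeric HTML character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

dirty = "Example\u200bDepartment\u200bName"
print(reveal_non_ascii(dirty))                # Example&#8203;Department&#8203;Name
print(reveal_non_ascii("two\u00a0words"))     # two&#160;words
```

A browser renders the escaped output exactly like the raw characters, but anyone reading the source can see that something is there.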