Tags list includes '#' and '@'

Shida · November 16, 2017, 8:50pm

The change hasn’t been deployed to the apps yet. Right now it’s only on the web version.

Alex_Pasternak · November 17, 2017, 5:42pm

I still have 3 tags reported for # in the my tag list. What is the expected behavior when a hashmark is used in regular sentences? E.g., if I have notes like “need to reduce # of bugs”, is that supposed to count as a tag?

I wanted to find which 3 items are being counted, but this is very hard. Clicking # just launches a search with “#” character, which identifies hundreds of items. If I search for "# " with space to find sentences like above, I get more than 3 results.

Shida · November 17, 2017, 9:48pm

It should not be recognized… I can’t figure out why you can still see # in tag list… Guess what we can do is manually exclude # and @ from the tag list for now while we continue to debug the issue.

Let me know if you can find an occurrence of # that’s being recognized as a tag.

Joseph_Fieber · November 18, 2017, 1:16am

Strange that we both have 3 results for #. Related to this is a feature request I’m going to make. The ability to ‘hide’ or ‘ignore’ certain tags. I’m thinking of, similar to you, those non-tag uses of the #. Like an address that includes something like ‘Suite #200’. It would be unreasonable to expect the software to properly predict every use of a tag, so having the ability to ‘ignore’ specific instances in our documents would be a good work-around to keep our lists clean, without missing those things we really want to be tags.

Alex_Pasternak · November 18, 2017, 2:01am

While I support in general the feature to hide certain tags, that’s almost unworkable for me, as I have too many I’d have to do that for. My dream feature would be to ignore tags starting with a number, which would also fix your “Suite #200” example.

We use BitBucket here for code reviews, and pull requests are numerically numbered. So any time I reference an email subject, for example, in DL that’s a tag in the list. Just a small taste:

I’d love to ignore these en masse.

@Shida, the tags are probably being indexed from real data like this, typically when people use # as shorthand for the word “number”, which is quite common. From my real data:

Automation result for Prod build # 582 – 591
Wrong version # in installer

Erica · November 20, 2017, 11:07pm

Could you refresh (web) or restart (desktop) and see if that clears it? What Shida meant is that examples like those shouldn’t be indexed at all, so they might be ghost tags and are clearable on refresh/restart.

Let me know if that doesn’t work.

Erica · November 20, 2017, 11:09pm

Yeah, I agree.

@Joseph_Fieber maybe there should be an option to ignore tags that only contain numbers. That should eliminate these en masse without any manual work.

Alex_Pasternak · November 20, 2017, 11:25pm

I’ve restarted Firefox since then, which certainly refreshed the page. Also just now logged out and back in. The # (3) is still there, sorry, along with its useful friends. (In case you may ask, #1x1 is a non-junk tag, but I would be happy to rename it to #OneOnOne if numerics became ignored)

Erica · November 20, 2017, 11:31pm

#1x1 is fine because it’s not 100% numeric (the “x” in the middle).

And that’s weird, after restarting the browser it still doesn’t work (logging out and back in is not required but I appreciate that).

Here’s a screenshot from latest production and you can see # alone is not longer indexed.

Alex_Pasternak · November 21, 2017, 2:16pm

I’m happy to try something else, if you have suggestions. I agree that with newly created data, the empty tag does not seem to appear. But for historic data, it seems to be there.

Since you can’t click the tag to see its instances, only launch a search with “#” as default, it’s really impossible to know what it’s indexing (for me). I tried adding ##, various combinations of spaces, etc. My 2 suggestions from real data from above also don’t make that tag register in a new doc.

However, the # (3) for me is a super-minor issue, compared to the others. This can be the bottom of the backlog With 100% numeric tags removed, the list is already really cleaned up. Now if trailing punctuation could just be ignored, the tag list would be nearly perfect. After much manual cleanup, I still have stuff like this:

Erica · November 22, 2017, 1:14am

That’s weird, as we re-count the tags every time you restart/refresh, so it shouldn’t matter which tags are new and which are historical I hope we can figure that out soon.

Erica · December 8, 2017, 4:44pm

Just to follow up on the status of this bug, since we’re cleaning up the existing bug reports.

Is the original bug (tag list includes # and @) still a problem?

I understand there are other related requests, like ignoring numeric-only tags, could you please make a feature request for that? Thanks in advance!

@Joseph_Fieber @Alex_Pasternak

Alex_Pasternak · December 8, 2017, 4:55pm

No empty @ tag, but I’m sorry to report that I still have “# (3)” showing in the tag pane. Even after signing out and back in. Not at all a major issue for me though.

Joseph_Fieber · January 11, 2018, 5:53pm

Same as Alex. No more empty @ tag, but still have “# (3)” which when clicked pulls up search of every “#” that exists.

Erica · January 12, 2018, 12:07am

Weird, you even have the same count of # as @Alex_Pasternak…

Could you try showing checked items and note and see if you can find those 3 mysterious tags?

Joseph_Fieber · January 12, 2018, 12:41am

Yea, strange. Like I said, if I click it I just see a search for every ‘#’ in the document (hundreds). I’ve scrolled through a bit, I can’t tell how it would be picking out any 3 in particular. I thought maybe it was finding just ‘#’ by itself, without being part of a larger tag, but when I search for # followed by a space, there are dozens, so it’s not that.

Erica · January 12, 2018, 12:44am

I just thought of something: do you have the habit of using unicode characters after #?

Alex_Pasternak · January 12, 2018, 2:20am

I’m glad someone else is echoing this. The solution should be to tie clicking tags to actual indexed results, not just to a search default. Then you would immediately see where empty # or any other bad tags lead.

Joseph_Fieber · January 12, 2018, 3:43am

It’s possible. I have some different types of coding snippets here and there. Some of it is marked as code, but not everything. Will probably go through and mark it better once the option for multiline code becomes available. I agree with Alex on tags leading to indexed results!

Erica · January 12, 2018, 10:08pm

Tie items to the tag itself is a good idea to avoid things like this, also to avoid scenarios where you search for “#t” and end up seeing “#tag, #test, #teehee”.

Unfortunately the way tag list is done right now, it’s more of a helper that helps you list, sort, and search for tags. It’s not treating tags as anything more than a string right now, but it should. That’s another feature request on its own though.

Regarding the mysterious #/@, the thought just came to me that it might be caused by invisible unicode characters, but that’s kind of hard to debug. Of course it’s not possible if you never copy from other sources or use unicode extensively. We’ve seen this case where people use some unicodes that invalidates the OPML file (because it’s an invalid XML file in the first place), thus preventing the OPML file from being imported at all, because an invalid XML file cannot be properly parsed.