Collapsing item on mobile not getting synced to web

This “last synced version” is stored per item/node, right? And this per-node version is sent to the server for all nodes in the document on every sync? Or are the version numbers of only a subset of nodes sent?

Also, is this version number specifically tied to collapse state changes, or is it a more general version number for the node?

No, only one “last synced version” is stored on the client per document. It’s only used for syncing collapsed changes, and it is different from the versioning/syncing structures used for syncing document content.

The last synced version for collapse state is sent to the server, and the server returns all nodes that have collapsed version numbers higher than the client’s last version, which is the subset of all nodes that had their collapsed state changed since the last time the client synced with the server.
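To make the protocol described above concrete, here is a minimal Python sketch. It is not Dynalist's code: the class names, the in-memory storage, and the rule that every collapse write bumps the counter are all assumptions made for illustration. Only the idea of a single per-document "last synced version" on the client and per-node collapse versions on the server comes from the description.

```python
from dataclasses import dataclass, field


@dataclass
class CollapseServer:
    """Toy server: one counter per document, one (collapsed, version) pair per node."""
    version: int = 0
    nodes: dict = field(default_factory=dict)  # node_id -> (collapsed, version)

    def set_collapsed(self, node_id: str, collapsed: bool) -> None:
        # Assumption: every collapse write bumps the document-wide counter.
        self.version += 1
        self.nodes[node_id] = (collapsed, self.version)

    def changes_since(self, last_synced: int):
        # Return only the nodes whose collapse version is newer than the client's.
        changes = [
            {"node_id": node_id, "collapsed": collapsed}
            for node_id, (collapsed, version) in self.nodes.items()
            if version > last_synced
        ]
        return self.version, changes


@dataclass
class CollapseClient:
    """Toy client: stores a single last synced version for the whole document."""
    last_synced: int = 0
    collapsed: dict = field(default_factory=dict)

    def sync(self, server: CollapseServer):
        self.last_synced, changes = server.changes_since(self.last_synced)
        for change in changes:
            self.collapsed[change["node_id"]] = change["collapsed"]
        return changes
```

In this model a client that is already up to date receives an empty list, which is why a collapse change the server never recorded can never reach it.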

@Shida I still experience this frequently and I’m really curious what’s causing it. Based on the protocol described I can think of a few possible scenarios:

  1. There’s some race condition, like you described above, where the collapse state is applied before the item’s children are synced down. In this case, is there a console error message when a collapse is applied to an empty item, or is it silently ignored? You noted that this shouldn’t happen, but maybe there’s some small edge case here related to Chrome/Windows backgrounding a tab (I’ve only seen this bug when resuming my PC after being asleep for a while).

  2. The client side last synced version is somehow bumped without getting the collapse state changes.

  3. There’s some client side bug that has to do with it being the last item in the list combined with some race condition causing its collapse state to be out of sync.

The first two scenarios don’t line up with the behavior I observed where collapsing/recollapsing the “problem item” doesn’t sync to the broken client, as those should be properly reflected on update even if we missed the initial update.

I’m going to leave the network tab open for a few days and hopefully this repros, then I’ll be able to see if there are collapse_changes in the update bundle_binary which should determine whether it’s a server side or client side issue.


@shida I finally have a repeatable repro! It’s a fun little edge case that I backed into from thinking about the protocol (thanks for going into the details there, it wouldn’t have been possible without that!).

Have Dynalist open in two tabs, A and B. Open dev console in tab A.

  1. Create item “Foo” in A with child item “Bar” and collapse “Foo”. Make sure it syncs to B.
  2. In the “network” tab of dev console on A change from “Online” to “Offline”.
  3. In A, zoom into the “Foo” item and delete “Bar”. The item is now no longer collapsed in A since it has no children.
  4. In B, zoom into the “Foo” item and add a new item “Baz”. It should remain collapsed and now have two children.
  5. Change A back to “Online” and sync changes (e.g. Ctrl-s).

Tab A should now be in an inconsistent state where it’s uncollapsed with child “Baz”, as opposed to B and other clients, which will have “Baz” and be collapsed. No matter how many times you sync or perform new changes, tab A will remain in this inconsistent state.

The root cause is that an offline tab can move into an uncollapsed state while other clients (and thus the server) never become uncollapsed. This means there are no collapse_changes to sync down to the uncollapsed client since from the perspective of the server the item never became uncollapsed (all other clients always had at least one child).
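The repro steps and root cause above can be replayed in a small simulation. This is only a sketch under assumptions: the dictionaries, the function names, and the rule that the client uncollapses locally (without telling the server) when the last child is deleted are my guesses at the behavior, not the real client.

```python
def delete_child_offline(tab, child):
    """Tab deletes a child while offline."""
    tab["children"].remove(child)
    if not tab["children"]:
        # Assumption: an item with no children becomes uncollapsed locally,
        # and this is never recorded as a collapse change on the server.
        tab["collapsed"] = False


def reconnect(tab, server, deleted):
    """Upload the offline deletions, then pull content and collapse changes."""
    for child in deleted:
        if child in server["children"]:
            server["children"].remove(child)
    tab["children"] = list(server["children"])
    # Collapse state is only sent down if its version is newer than the tab's.
    if server["collapse_version"] > tab["last_synced"]:
        tab["collapsed"] = server["collapsed"]
        tab["last_synced"] = server["collapse_version"]


def run_offline_repro():
    server = {"children": ["Bar"], "collapsed": True, "collapse_version": 1}
    tab_a = {"children": ["Bar"], "collapsed": True, "last_synced": 1}

    delete_child_offline(tab_a, "Bar")   # step 3, in offline tab A
    server["children"].append("Baz")     # step 4, tab B adds "Baz"
    reconnect(tab_a, server, ["Bar"])    # step 5, tab A comes back online
    return tab_a, server
```

After `run_offline_repro()`, tab A holds the child “Baz” but is uncollapsed, while the server still says collapsed, and no later sync changes that because the server's collapse version never moved.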

When I’ve experienced this problem I don’t think it was exactly like this repro, but it likely has the same root cause, so if the fix is fundamental enough it should hopefully resolve it (as opposed to just patching this one edge case). I still need to figure out what the exact case is when I’ve hit this in the past…

One robust solution is to always provide syncing clients the collapse state of new items’ parents, which should hopefully resolve this issue.

EDIT: Thinking about this a bit more, it’s likely that when I’ve experienced this I’ve quickly deleted the last item in a collapsed child and then put my computer to sleep before the collapsed change is synced, but after the item deletion synced (was able to repro locally that that can occur). I think even switching tabs is sometimes sufficient to prevent the collapsed state syncing from occurring.

Wow nice debugging. I think that explains the problem well and is probably what’s happening here. I’ll see how we can get this fixed!


@shida I just had this happen again and I’m positive that the collapse event (on mobile) happened after the diff version stored by the web client, so it’s a different issue than the one reproed above (that earlier one only applies if from the server side it doesn’t think there were any collapse events).

I have two theories:

  1. The client collapse event is somehow not bumping the collapse version number. Maybe from the client’s perspective the item collapse state hasn’t changed since the last sync (the item deletion and addition happened from other clients and the API, so before syncing down it’s “collapsed”), so when I collapse the item it doesn’t increment the collapse document version even though it does sync the collapse state back to the server (verified by seeing the item collapsed in a new client). Is this possible? When it decides to increment the version number does it diff to the last sync state and only increment if they differ, or is it a simple “did any collapse events happen”? If it’s the former, this could be the issue. More generally, is incrementing the version done client side or server side? I think this only makes sense if it’s done client side.

  2. There’s a race when a stale client updates. Is something like the following possible?
    a. Client sees update that removes item, making it uncollapsed.
    b. Client thinks it’s now up to date since it processed an update and pulls down collapse diffs, which include the collapse. Since the item has no children in the client it’s a no-op. The document version number on the client is updated, so this collapse diff is essentially lost.
    c. Client sees update that adds item. Since the collapse state was already applied (and ignored) the item is now improperly uncollapsed.

Before the client pulls down the collapse diffs, does it ensure it’s fully synced to the latest version and not just processed partial updates? If so, then this second theory doesn’t make sense.

I was able to repro theory #2 (update race)!

Here’s what I see in the network tab (some contents removed for brevity/privacy):

(1) bundle_binary

"user_version":1931
"added":[]
"removed":["T8FQfQlQbCi06qUBDmIkzRLd"]
{"id":"T8FQfQlQbCi06qUBDmIkzRLd","parent_old":"y8JtkASdlDsmcq-4Odz0n-UJ","index_old":0,"index_new":0}

(2) bundle_binary

"user_version":1933
"collapse_changes":[{"node_id":"y8JtkASdlDsmcq-4Odz0n-UJ","collapsed":true}]

(3) bundle_binary

"user_version":1933
"added":["gFtYTJrJ2Zx7USbnzLcCYbvM"]
"removed":[]
{"id":"gFtYTJrJ2Zx7USbnzLcCYbvM","parent_old":"","parent_new":"y8JtkASdlDsmcq-4Odz0n-UJ","index_old":0,"index_new":1}

So it seems that we aren’t guaranteed to be fully synced before we fetch the collapse changes.

From the server’s perspective the collapse event must occur on or after the new item is added, so this just appears to be a client syncing issue (eagerly fetching collapse changes before all changes synced down).
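Replaying the three bundles from the network tab against a toy client shows why the order matters. This is a sketch, not the real bundle_binary handling: the simplified bundle shape, the `replay` function, the assumption that the parent started out collapsed, and the rule that a collapse on a childless node is a no-op are all assumptions.

```python
PARENT = "y8JtkASdlDsmcq-4Odz0n-UJ"
OLD_CHILD = "T8FQfQlQbCi06qUBDmIkzRLd"
NEW_CHILD = "gFtYTJrJ2Zx7USbnzLcCYbvM"

# Simplified stand-ins for bundles (1), (2) and (3) above.
REMOVE = {"removed": [(OLD_CHILD, PARENT)]}
COLLAPSE = {"collapse_changes": [{"node_id": PARENT, "collapsed": True}]}
ADD = {"added": [(NEW_CHILD, PARENT)]}


def replay(bundles, children, collapsed):
    """Apply bundles in the given order and return the resulting client state."""
    children = {node: list(kids) for node, kids in children.items()}
    collapsed = dict(collapsed)
    for bundle in bundles:
        for node_id, parent in bundle.get("removed", []):
            children[parent].remove(node_id)
            if not children[parent]:
                collapsed[parent] = False  # childless items are uncollapsed
        for change in bundle.get("collapse_changes", []):
            # Assumption: collapsing a node with no children is silently ignored.
            if children.get(change["node_id"]):
                collapsed[change["node_id"]] = change["collapsed"]
        for node_id, parent in bundle.get("added", []):
            children.setdefault(parent, []).append(node_id)
    return children, collapsed


start_children = {PARENT: [OLD_CHILD]}
start_collapsed = {PARENT: True}

# Order seen in the network tab: the collapse arrives before the add.
_, observed = replay([REMOVE, COLLAPSE, ADD], start_children, start_collapsed)
# Order the server intended: the collapse arrives after the add.
_, intended = replay([REMOVE, ADD, COLLAPSE], start_children, start_collapsed)
```

With these assumptions `observed[PARENT]` ends up `False` and `intended[PARENT]` ends up `True`, matching the behavior described in theory #2.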

I’m curious if the same fix can be used for both of these bugs. If, whenever an item is added, the update includes the collapse state of its parent in the meta, that may work (unless add/remove updates can be processed out of order). A fix that only ensures we are fully synced before grabbing collapse changes will not fix the earlier bug.

I’ve actually now reproed theory #1 as well (client doesn’t properly bump version number). Although I think this is essentially the same problem as the very first repro.

Repro steps:

  1. Open Dynalist in a tab (A).
  2. Open Dynalist in a new tab (B). Create an item with a child and collapse it. Ensure this is fully synced to tab A (including being collapsed).
  3. In tab B, verify the current document version number by inspecting user_version for the bundle_binary that is pulled down after hitting Ctrl-s in the network tab.
  4. In tab B, zoom into the collapsed item and delete the child, then immediately create a new child item and zoom out and collapse the item.
  5. Don’t Ctrl-s, just wait for the page to naturally sync.
  6. Open Dynalist in a new tab (C) and ensure that the item is collapsed, meaning that the server thinks the item is collapsed.
  7. In tab A, wait for it to naturally sync. If the item is uncollapsed, the issue reproed. Otherwise repeat from the top until this occurs.
  8. Inspect the user_version in tab A (or any tab for that matter). It is likely the same as in step 3, meaning the version wasn’t bumped even after the item was collapsed.

Based on this, I assume the version bump is done on the server side, and since all it saw was that the item was collapsed before and after the client synced it didn’t think there was a difference.
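The assumption in the previous paragraph can be written down as a tiny model. This is a guess at the server behavior, sketched in Python with hypothetical names (`upload_collapse_state`, a `version` counter standing in for whatever drives the collapse diff); it is not taken from Dynalist.

```python
def upload_collapse_state(server, node_id, collapsed):
    """Store an uploaded collapse state; bump the version only if it differs."""
    if server["collapsed"].get(node_id, False) != collapsed:
        server["collapsed"][node_id] = collapsed
        server["version"] += 1
        return True
    # Same state as before: nothing is recorded, so no other client is told.
    return False


server = {"version": 10, "collapsed": {"item": True}}

# Tab B deleted the child, added a new one and re-collapsed between two syncs,
# so the only collapse state the server ever receives is "collapsed" again.
bumped = upload_collapse_state(server, "item", True)
```

Here `bumped` is `False` and `server["version"]` is still 10, so a tab that uncollapsed the item locally when it saw the deletion gets no collapse change to correct itself.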


Yeah, the version handling is all server-side. Will think of a better solution in the coming days!


Will be fixed in the upcoming release on web, with mobile release coming a bit later!


I’m wondering what the rule is here. Is it that there is ONE collapsed state for a node, irrespective of who looks at it?

It’s stored on the server side as a per-user/per-document blob, which contains all the nodes’ collapsed state.

Per user but not per device? Okay. Just wondering.

It should be synced across devices, at least on web. Desktop might work a bit differently.

@Shida Have any changes gone out related to this already? I noticed something odd today on mobile I’ve never seen before (even if mobile code hasn’t changed, maybe affected by server side logic change). I had an item with no children, and I’m sure that my mobile client was up to date. I added two child items using Quick Dynalist (API), and when I reopened the android app the item was collapsed. This shouldn’t happen since the item was childless (therefore uncollapsed), and then new child items were added. Could recent changes have caused this unexpected behavior?

It definitely looks like this change has started rolling out as the first repro I gave is fixed (thanks for that!). Although I’m seeing some other weird behavior now, likely due to these changes:

On web:

  1. Create item with one child and collapse.
  2. Zoom into item and delete the child.
  3. Zoom out (item will not have any children, so is uncollapsed).
  4. Zoom into the item again and add a child.
  5. Zoom out, and observe that the item is collapsed again.

NOTE: If on step 4 you add the child while zoomed out it will properly be uncollapsed. It’s only an issue when you add the child while zoomed in to the parent item.

I also have a new repro (slightly different than the original one, but in a similar vein) that shows that this isn’t fixed.

On the web:

  1. Create item with one child and collapse (Foo -> Bar).
  2. Go offline (dev tools -> network).
  3. Zoom in and delete the child item.
  4. Zoom out and add a new child item “Baz”. The item will be uncollapsed.
  5. In a new tab, zoom into the item and add a second item “Qux”.
  6. Reconnect the first client and sync up.
  7. Verify the reconnected client is uncollapsed, while all other clients are still collapsed (even though all clients do show the proper child items “Baz” and “Qux”).

I’m guessing that in this recent update there’s some tentatively_collapsed state that’s kept for nodes with deleted children that haven’t synced up yet, is that correct? But when you add a new child item zoomed out the client “knows” that it is no longer collapsed, leading us back to the original bug. This type of logic likely also explains the other issue I reported one post above.

In this case I actually think it makes more sense for the server-side state to be “uncollapsed” if we’re following a “last-write-wins” model. This is made trickier by the fact that we don’t want to uncollapse items if they only became empty (no new children added), like in the original repro.

Here’s an idea for what might fix this: For each node that experiences a collapse state change keep track of the following (clearing out this state after a sync occurs):

  1. The last synced collapse state for the node.
  2. The current collapse state for the node (e.g. if you explicitly uncollapsed the item OR deleted all children, it is uncollapsed).
  3. Whether the collapse state change is tentative or definite. An explicit collapse/uncollapse is a definite change, whereas an item becoming uncollapsed due to all children being deleted is tentative. Adding the first child to an item is also a definite change (since the item is now visibly uncollapsed).

If a change is tentative, after a sync it should apply the following state changes (in this order of precedence):

  1. Any new collapse state changes synced down for that node.
  2. The last synced collapse state for the node.

If a change is definite, it should override any collapse state changes it sees and sync up its collapse state to the server (last-write-wins).

(this also relies on the server side explicitly setting nodes without any children as “uncollapsed”, including bumping the user version number when this occurs. I think this was the old behavior, but after the update this seems to have changed.)

It looks like the current solution is doing something like 1 and 2, but maybe it’s missing the 3rd part?

If we follow this new model, then the above repro should result in the server-side incrementing the user version number, and forcing all other clients to be uncollapsed. For the original repro, the reconnecting client’s change would only be tentative, so once it resyncs it would re-apply the original collapse state it had for the item (collapsed), which would make it consistent with the other clients.
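The tentative/definite proposal above can be sketched as a small resolution function. All names here (`PendingCollapse`, `resolve_after_sync`) are hypothetical, and the sketch only covers what one client does with one node after a sync; it is an illustration of the idea, not a description of what Dynalist implemented.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PendingCollapse:
    last_synced: bool  # 1. collapse state of the node at the last sync
    current: bool      # 2. collapse state of the node right now
    definite: bool     # 3. True for an explicit toggle or a first child added,
                       #    False when the node only became empty


def resolve_after_sync(pending: PendingCollapse,
                       remote: Optional[bool]) -> Tuple[bool, bool]:
    """Return (final_state, should_upload) for one node after a sync.

    `remote` is the collapse state synced down for the node, or None if the
    server sent no collapse change for it.
    """
    if pending.definite:
        # Last-write-wins: keep the local state and push it to the server.
        return pending.current, True
    if remote is not None:
        # Tentative change: a newer remote collapse state takes precedence.
        return remote, False
    # Tentative change and nothing new from the server: restore the old state.
    return pending.last_synced, False


# Original repro: the node only became empty while offline (tentative).
original = resolve_after_sync(PendingCollapse(True, False, False), None)
# New repro: a first child was added while zoomed out (definite).
newer = resolve_after_sync(PendingCollapse(True, False, True), None)
```

Under this model `original` is `(True, False)`, so the reconnecting client goes back to collapsed, and `newer` is `(False, True)`, so the uncollapsed state is uploaded and every other client follows it.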

I followed your steps and got a repro. Upon further investigation, this is caused by a bug in the sync algorithm: whenever you upload changes while there are remote changes still to be downloaded, the server rejects the upload, and the collapse changes were accidentally discarded instead of being uploaded again on retry.
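As an illustration of the behavior a fix needs, here is a minimal sketch of an upload loop that keeps the collapse changes across a rejected upload. `FakeServer`, `sync`, and the batch layout are invented for this example and are not the actual sync algorithm.

```python
class FakeServer:
    """Rejects uploads while the client still has remote changes to download."""

    def __init__(self, remote_pending: bool):
        self.remote_pending = remote_pending
        self.received = []

    def upload(self, batch) -> bool:
        if self.remote_pending:
            return False  # client must download first, then retry
        self.received.append(batch)
        return True

    def download(self) -> None:
        self.remote_pending = False


def sync(server, content_changes, collapse_changes, max_attempts=3) -> bool:
    """Upload both kinds of changes together, retrying after a rejection."""
    for _ in range(max_attempts):
        # The batch is rebuilt from the same pending lists on every attempt,
        # so a rejected upload does not drop the collapse changes.
        batch = {
            "content": list(content_changes),
            "collapse_changes": list(collapse_changes),
        }
        if server.upload(batch):
            return True
        server.download()
    return False
```

The point of the sketch is only that the pending collapse changes live outside the attempt loop, so the retry sends the same data as the rejected attempt.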

A fix has been prepared and has been deployed to web.


@shida Thanks for fixing that. I still have some questions about the new “collapsed item with no children” behavior I’m seeing (repro steps here: Collapsing item on mobile not getting synced to web).

This new behavior manifests when you add the first child to an item with no children that was previously in a collapsed state before its last child was removed, using either of these two methods:

  1. Zoom into the item and add the first child.
  2. Add the first child via API (e.g. Quick Dynalist).

Previously an item with no children was always considered uncollapsed, so adding the first child would have it remain uncollapsed. With the new behavior the previously collapsed item becomes collapsed when the first child is added using these two methods, but is uncollapsed if added by hitting “Enter” and indenting.

Was this change intentional? It’s slightly confusing from a usability perspective that the collapse state is sticky, as there’s no way to tell the difference between childless items that are “collapsed” and ones that are “uncollapsed”.

Overall I’m fine with this new behavior as it fits in with how I use collapsed items + API, but I just wanted to make sure that this change was intentional.

Also, do you have an ETA for when the Android app will be updated? Since the Android app does not exhibit this behavior yet, an item is now sometimes uncollapsed there even when it’s collapsed on all other clients, which is kind of annoying.

Yes, it was intentional. Previously the “uncollapsing when last child is removed” behavior was done at a deeper level that didn’t allow the change to be tracked and synced, so we had to remove that behavior to allow proper sync of collapse states. Now it only does that when you indent items into a collapsed item, so the user experience remains relatively close (but there are edge cases as you mentioned).
