After the web app recognizes the document version number has changed, how does it figure out what has changed? Is there a log of changes that have occurred in each document version that are replayed to reach the final state, or does it simply update its current state to the latest user data state (for the latest document version)?
I can think of one case where it could fail - If at the time of syncing, the item on the receiving client's side doesn't have any child item, then it fails to collapse. Later once the child items are added, it is not collapsed since it's considered synced.
The only issue is - when syncing a document, the contents are synced first, then the collapsed states are applied. This means that case shouldn't happen.
In either case, to answer your question, each node id is tagged with a version ID and it's updated to a new version when a client changes the collapsed state. The sync protocol sends out a list of items for which the collapsed states have changed, and their resulting values (true/false). The version is per-user per-document, and is different from the document version because it's different for each individual user.
As it still was inconsistent after uncollapsing/recollapsing repeatedly on the mobile side, I don't think this could be the issue.
Is this pushed from the server side to each syncing client, or does the mobile client send out a list of items whose collapsed states have changed, which the server then replays to each syncing client?
If it's server side, how does it determine which items it should send to the web client, does it internally track the synced node version ID for each connected session to identify which nodes differ and which updates to send? If so, if the client was offline for a while could this client session be cleared from the server side, leading to inconsistency?
If these updates are pushed from the mobile side, is it possible if I closed the app too quickly they weren't sent out? Are these updates the only way that the server detects collapse state? If so then this argument doesn't hold water as new clients showed the proper collapse state.
On the server side, we store the latest collapsed/expanded state and the version number in which it was changed. Each client holds its own "last synced version", which is presented to the server on sync. The server selects all collapse/expand states from the document that have a version number strictly above the one presented and sends the list back to the client, along with the current version number.
This "last synced version" is stored per item/node right? And this per-node version is sent to the server for all nodes in the document on every sync? Or are the version numbers of only a subset of nodes sent?
Also, is this version number specifically tied to collapse state changes, or is it a more general version number for the node?
No, only one "last synced version" is stored on the client per document. It's only used for syncing collapsed changes, and it is different than the versioning/syncing structures used for syncing document content.
The last synced version for collapse state is sent to the server, and the server returns all nodes that have collapsed version numbers higher than the client's last version, which is the subset of all nodes that had their collapsed state changed since last time the client had synced with the server.
@Shida I still experience this frequently and I'm really curious what's causing it. Based on the protocol described I can think of a few possible scenarios:
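To make the exchange described above concrete, here is a minimal sketch of the collapse sync as I understand it from this thread. The class and method names (`CollapseStore`, `set_collapsed`, `changes_since`) are made up for illustration and are not Dynalist's actual code; only the `node_id`/`collapsed` field names come from the sync payloads quoted in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class CollapseStore:
    """Hypothetical server-side store for one user and one document."""
    version: int = 0
    # node_id -> (collapsed, version at which it last changed)
    states: dict[str, tuple[bool, int]] = field(default_factory=dict)

    def set_collapsed(self, node_id: str, collapsed: bool) -> None:
        """Record a change and tag the node with a fresh version."""
        self.version += 1
        self.states[node_id] = (collapsed, self.version)

    def changes_since(self, last_synced: int) -> tuple[list[dict], int]:
        """Return nodes changed strictly after last_synced, plus the current version."""
        changes = [
            {"node_id": node_id, "collapsed": collapsed}
            for node_id, (collapsed, changed_at) in self.states.items()
            if changed_at > last_synced
        ]
        return changes, self.version


store = CollapseStore()
store.set_collapsed("foo", True)    # becomes version 1
store.set_collapsed("bar", True)    # becomes version 2

# A client that last synced at version 1 only receives the change to "bar".
changes, current = store.changes_since(1)
```

The client would then store `current` as its new last synced version, so the next sync only asks for anything newer.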
- There's some race condition, like you described above, where the collapse state is applied before the item's children are synced down. In this case, is there a console error message when a collapse is applied to an empty item, or is it silently ignored? You noted that this shouldn't happen, but maybe there's some small edge case here related to Chrome/Windows backgrounding a tab (I've only seen this bug when resuming my PC after being asleep for a while).
- The client side last synced version is somehow bumped without getting the collapse state changes.
- There's some client side bug that has to do with it being the last item in the list combined with some race condition causing its collapse state to be out of sync.
The first two scenarios don't line up with the behavior I observed where collapsing/recollapsing the "problem item" doesn't sync to the broken client, as those should be properly reflected on update even if we missed the initial update.
I'm going to leave the network tab open for a few days and hopefully this repros, then I'll be able to see if there are `collapse_changes` in the update `bundle_binary`, which should determine whether it's a server side or client side issue.
@shida I finally have a repeatable repro! It's a fun little edge case that I backed into from thinking about the protocol (thanks for going into the details there, it wouldn't have been possible without that!).
- Have Dynalist open in two tabs, A and B. Open dev console in tab A.
- Create item "Foo" in A with child item "Bar" and collapse "Foo". Make sure it syncs to B.
- In the "network" tab of dev console on A change from "Online" to "Offline".
- In A, zoom into the "Foo" item and delete "Bar". The item is now no longer collapsed in A since it has no children.
- In B, zoom into the "Foo" item and add a new item "Baz". It should remain collapsed and now have two children.
- Change A back to "Online" and sync changes (e.g. `Ctrl-s`).
Tab A should now be in an inconsistent state where it's uncollapsed with child "Baz", as opposed to B and other clients, which will have "Baz" and be collapsed. No matter how many times you sync or perform new changes, tab A will remain in this inconsistent state.
The root cause is that an offline tab can move into an uncollapsed state while other clients (and thus the server) never become uncollapsed. This means there are no `collapse_changes` to sync down to the uncollapsed client since from the perspective of the server the item never became uncollapsed (all other clients always had at least one child).
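The repro can be modeled in a few lines. This is only a toy simulation under my own assumptions, not Dynalist code: the `Client` class is hypothetical, and it assumes that a client silently uncollapses an item when its last child is removed, without reporting that as a collapse change.

```python
class Client:
    """Hypothetical client model: one item "Foo" with children and a collapsed flag."""

    def __init__(self) -> None:
        self.children: set[str] = {"Bar"}
        self.collapsed = True
        self.last_synced = 1   # collapse version this client has already seen

    def delete_child(self, name: str) -> None:
        self.children.discard(name)
        if not self.children:
            # Assumption: a childless item is shown uncollapsed, and no
            # collapse change is reported to the server for this.
            self.collapsed = False

    def add_child(self, name: str) -> None:
        self.children.add(name)

    def apply_collapse_changes(self, changes: list[bool], version: int) -> None:
        for collapsed in changes:
            self.collapsed = collapsed
        self.last_synced = version


# Server view of "Foo": collapsed at collapse version 1, never changed since.
server_collapsed, server_version = True, 1

a, b = Client(), Client()
a.delete_child("Bar")          # tab A, offline: Foo becomes uncollapsed locally
b.add_child("Baz")             # tab B, online: Foo stays collapsed

# Tab A comes back online: content merges, so Foo has only "Baz" everywhere.
b.children.discard("Bar")
a.add_child("Baz")

# Collapse sync: nothing changed after version 1, so the list is empty.
changes = [server_collapsed] if server_version > a.last_synced else []
a.apply_collapse_changes(changes, server_version)
```

After the sync both tabs agree on the content, but `a.collapsed` stays `False` while `b.collapsed` is `True`, and no later sync has anything to send that would correct it.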
When I've experienced this problem I don't think it was exactly like this repro, but it likely has the same root cause, so if the fix is fundamental enough it should hopefully resolve it (as opposed to just patching this one edge case). I still need to figure out what the exact case is when I've hit this in the past...
One robust solution is to always provide syncing clients the collapse state of new items' parents, which should hopefully resolve this issue.
EDIT: Thinking about this a bit more, it's likely that when I've experienced this I've quickly deleted the last item in a collapsed child and then put my computer to sleep before the collapsed change is synced, but after the item deletion synced (was able to repro locally that that can occur). I think even switching tabs is sometimes sufficient to prevent the collapsed state syncing from occurring.
Wow nice debugging. I think that explains the problem well and is probably what's happening here. I'll see how we can get this fixed!
@shida I just had this happen again and I'm positive that the collapse event (on mobile) happened after the diff version stored by the web client, so it's a different issue than the one reproed above (that earlier one only applies if from the server side it doesn't think there were any collapse events).
I have two theories:
- The client collapse event is somehow not bumping the collapse version number. Maybe from the client's perspective the item collapse state hasn't changed since the last sync (the item deletion and addition happened from other clients and the API, so before syncing down it's "collapsed"), so when I collapse the item it doesn't increment the collapse document version even though it does sync the collapse state back to the server (verified by seeing the item collapsed in a new client). Is this possible? When it decides to increment the version number does it diff to the last sync state and only increment if they differ, or is it a simple "did any collapse events happen"? If it's the former, this could be the issue. More generally, is incrementing the version done client side or server side? I think this only makes sense if it's done client side.
- There's a race on stale clients update. Is something like the following possible?
  a. Client sees update that removes item, making it uncollapsed.
  b. Client thinks it's now up to date since it processed an update and pulls down collapse diffs which includes the collapse. Since item has no children in the client it's a no-op. The document version number on the client is updated, so this collapse diff is essentially lost.
  c. Client sees update that adds item. Since the collapse state was already applied (and ignored) the item is now improperly uncollapsed.

Before the client pulls down the collapse diffs, does it ensure it's fully synced to the latest version and not just processed partial updates? If so, then this second theory doesn't make sense.
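Steps a to c of the second theory can be replayed with a small sketch. The `replay` function and its event names are invented for illustration, and it assumes (as guessed above) that a collapse applied to a childless item is silently dropped.

```python
def replay(events: list[tuple[str, str]]) -> bool:
    """Apply events in order to a single parent item; return its final collapsed flag.

    Assumption: a collapse applied to a childless item is silently dropped.
    """
    children: set[str] = {"old"}
    collapsed = False
    for kind, value in events:
        if kind == "remove":
            children.discard(value)
            if not children:
                collapsed = False
        elif kind == "add":
            children.add(value)
        elif kind == "collapse":
            if children:          # no-op when there is nothing to hide
                collapsed = True
    return collapsed


# Stale client: the collapse diff arrives between the removal and the addition.
racy = replay([("remove", "old"), ("collapse", ""), ("add", "new")])

# Fully synced first: both content updates land before the collapse diff.
ordered = replay([("remove", "old"), ("add", "new"), ("collapse", "")])
```

Under these assumptions `racy` ends up uncollapsed while `ordered` ends up collapsed, so the outcome depends only on whether the collapse diff is fetched before or after the remaining content updates.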
I was able to repro theory #2 (update race)!
Here's what I see in the network tab (some contents removed for brevity/privacy):
(1) bundle_binary
"user_version":1931
"added":[]
"removed":["T8FQfQlQbCi06qUBDmIkzRLd"]
{"id":"T8FQfQlQbCi06qUBDmIkzRLd","parent_old":"y8JtkASdlDsmcq-4Odz0n-UJ","index_old":0,"index_new":0}
(2) bundle_binary
"user_version":1933
"collapse_changes":[{"node_id":"y8JtkASdlDsmcq-4Odz0n-UJ","collapsed":true}]
(3) bundle_binary
"user_version":1933
"added":["gFtYTJrJ2Zx7USbnzLcCYbvM"]
"removed":[]
{"id":"gFtYTJrJ2Zx7USbnzLcCYbvM","parent_old":"","parent_new":"y8JtkASdlDsmcq-4Odz0n-UJ","index_old":0,"index_new":1}
So it seems that we aren't guaranteed to be fully synced before we fetch the collapse changes.
From the server's perspective the collapse event must occur on or after the new item is added, so this just appears to be a client syncing issue (eagerly fetching collapse changes before all changes synced down).
I'm curious if the same fix can be used for both of these bugs. If whenever an item is added it includes the collapse state of its parent in the `meta` that may work (unless add/remove updates can be processed out of order). A fix that only ensures we are fully synced before grabbing collapse changes will not fix the earlier bug.
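Roughly what I have in mind for that fix, as a sketch only. The `parent_collapsed` field and the `apply_add` function are hypothetical; I don't know what the real `meta` payload looks like or how the client applies it.

```python
def apply_add(item: dict, update: dict) -> None:
    """Apply an "added" update to its parent item.

    Assumption: the update carries a hypothetical "parent_collapsed" field in
    its meta, holding the server's collapse state for the parent.
    """
    item["children"].append(update["id"])
    parent_collapsed = update.get("meta", {}).get("parent_collapsed")
    if parent_collapsed is not None:
        item["collapsed"] = parent_collapsed


# The parent lost its last child on this client, so it is locally uncollapsed.
parent = {"children": [], "collapsed": False}
apply_add(parent, {"id": "new-child", "meta": {"parent_collapsed": True}})

# An update without the extra field leaves the collapse state untouched.
other = {"children": [], "collapsed": False}
apply_add(other, {"id": "new-child"})
```

Because the collapse state travels with the add itself, the client no longer depends on a separate collapse diff arriving at the right moment.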
I've actually now reproed theory #1 as well (client doesn't properly bump version number). Although I think this is essentially the same problem as the very first repro.
Repro steps:
- Open Dynalist in a tab (A).
- Open Dynalist in a new tab (B). Create an item with a child and collapse it. Ensure this is fully synced to tab A (including being collapsed).
- In tab B, verify the current document version number by inspecting `user_version` for the `bundle_binary` that is pulled down after hitting `Ctrl-s` in the network tab.
- In tab B, zoom into the collapsed item and delete the child, then immediately create a new child item and zoom out and collapse the item.
- Don't `Ctrl-s`, just wait for the page to naturally sync.
- Open Dynalist in a new tab (C) and ensure that the item is collapsed, meaning that the server thinks the item is collapsed.
- In tab A, wait for it to naturally sync. If the item is uncollapsed, the issue reproed. Otherwise repeat from the top until this occurs.
- Inspect the `user_version` in tab A (or any tab for that matter). It is likely the same as in step 3, meaning the version wasn't bumped even after the item was collapsed.
Based on this, I assume the version bump is done on the server side, and since all it saw was that the item was collapsed before and after the client synced it didn't think there was a difference.
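A sketch of the behavior I'm assuming here, to show why tab A never hears about the recollapse. `DiffingServer` is a made-up model, not the real server logic: it bumps the collapse version only when the reported state differs from what it already has stored.

```python
class DiffingServer:
    """Hypothetical server that bumps the collapse version only on a real change."""

    def __init__(self) -> None:
        self.version = 0
        self.collapsed: dict[str, bool] = {}
        self.changed_at: dict[str, int] = {}

    def report(self, node_id: str, collapsed: bool) -> None:
        if self.collapsed.get(node_id, False) == collapsed:
            return                      # same as stored state: no bump
        self.version += 1
        self.collapsed[node_id] = collapsed
        self.changed_at[node_id] = self.version

    def changes_since(self, last_synced: int) -> list[str]:
        return [n for n, v in self.changed_at.items() if v > last_synced]


server = DiffingServer()
server.report("item", True)             # initial collapse, synced to tab A
tab_a_version = server.version

# Tab B deletes the child, adds a new one and recollapses before syncing.
# Assumption: the intermediate uncollapse is never reported, so the server
# only sees "collapsed" again.
server.report("item", True)

pending_for_a = server.changes_since(tab_a_version)
```

In this model the version stays the same after the second report and `pending_for_a` is empty, which matches seeing an unchanged `user_version` in the last repro step.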
Yeah, the version handling is all server side. Will think of a better solution in the coming days!
Will be fixed in the upcoming release on web, with mobile release coming a bit later!
I'm wondering what the rule is here. Is it that there is ONE collapsed state for a node, irrespective of who looks at it?
It's stored on the server side as a per-user/per-document blob, which contains the collapsed state of every node.
Per user but not per device? Okay. Just wondering.
It should be synced across devices, at least on web. Desktop might work a bit differently.
@Shida Have any changes gone out related to this already? I noticed something odd today on mobile I've never seen before (even if mobile code hasn't changed, maybe affected by server side logic change). I had an item with no children, and I'm sure that my mobile client was up to date. I added two child items using Quick Dynalist (API), and when I reopened the android app the item was collapsed. This shouldn't happen since the item was childless (therefore uncollapsed), and then new child items were added. Could recent changes have caused this unexpected behavior?
It definitely looks like this change has started rolling out as the first repro I gave is fixed (thanks for that!). Although I'm seeing some other weird behavior now, likely due to these changes:
On web:
- Create item with one child and collapse.
- Zoom into item and delete the child.
- Zoom out (item will not have any children, so is uncollapsed).
- Zoom into the item again and add a child.
- Zoom out, and observe that the item is collapsed again.
NOTE: If on step 4 you add the child while zoomed out it will properly be uncollapsed. It's only an issue when you add the child while zoomed in to the parent item.