Skip to content
This repository was archived by the owner on Aug 18, 2020. It is now read-only.

Matt noonan/restoration #3470

Merged
merged 4 commits into from
Aug 27, 2018
Merged

Matt noonan/restoration #3470

merged 4 commits into from
Aug 27, 2018

Conversation

matt-noonan
Copy link
Contributor

@matt-noonan matt-noonan commented Aug 24, 2018

This is the WIP wallet restoration. Some things are not working:

  • Not hooked up with the wallet worker yet, so background restoration tasks continue even if switchToFork fails.
  • Transaction history restoration is working, but they are marked "wontApply". Also, the confirmation number stays at 0.
  • There is a bug that is causing the sync progress to get reported as Synched when it is really still restoring. Still trying to figure that one out.
  • After the wallet restores itself, addresses that we found by scanning the node's utxo set at the start of restoration are still marked as "unused".
  • At startup, we need to check for wallets that are partially restored and continue restoration.

Hopefully some of these bugs have obvious root causes to the right person...

@matt-noonan matt-noonan requested a review from edsko August 24, 2018 04:16
@edsko
Copy link
Contributor

edsko commented Aug 24, 2018

Transaction history restoration is working, but the amounts on each tx are 0, and they are marked "wontApply". Also, the confirmation number stays at 0.
After the wallet restores itself, addresses that we found by scanning the node's utxo set at the start of restoration are still marked as "unused".

I'm still making my way through the PR, but these two together suggest to me that the block meta data is not set correctly, because the block metadata tells the wallet when transactions get confirmed (and they are not confirmed and not pending they are marked as WontApply) and also records if addresses are used. That's unrelated to the amount though, that's separate. Anyway, will continue with the review, maybe something will jump out.

@edsko
Copy link
Contributor

edsko commented Aug 24, 2018

Transaction history restoration is working, but the amounts on each tx are 0, and they are marked "wontApply".

Another thought on this "amount" thing: I think the amount depends on recognizing which address are ours. Maybe that's where something is going wrong.

continues reviewing

Copy link
Contributor

@edsko edsko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think the general outline is there, but still quite a few of loose ends. I've identified a few things that I could see that may or may not explain the bugs you identified (sync progress not reported correctly, transactions with incorrect status), but I don't know if it suffices to explain them entirely. In particular, I think there are some boundary conditions near the start of the chain and near the current point (at the start of restoration) -- detaisl below -- that need to be fixed, but whether or not that explains the behaviour you're seeing I'm not sure; if the chain it short, it might, but for longer chains I don't think it could. Let's try to see if we can get this thing fixed and merged together when you start work.

mkUpdate (accId, mPB) = AccountUpdate {
accountUpdateId = accId
, accountUpdateAddrs = pfbAddrs pb
, accountUpdateNew = AccountUpdateNew Map.empty
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that you didn't change the AccountUpdate stuff to be able to create accounts that start in a restoration state, which I presume is done elsewhere. That means that we don't expect this accountUpdateNew value to ever be used right?

hdOutsideKHistorical .= NE.head (getNewestFirst history')
return txIds)
-- Update the account state, if needed.
withZoom $ \acc _zoomTo ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

withZoom where you're not actually using zoomTo seems unnecessary :)


isUpToDate :: HdAccountWithinK -> Bool
isUpToDate st = let cur = st ^. hdWithinKHistorical . currentCheckpoint
tgt = st ^. hdWithinKCurrent . currentCheckpoint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't quite right is it? This would make tgt a moving target :) We want to check the distance from the most recent full checkpoint to the oldest (first) partial checkpoint, not to the most recent partial checkpoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh, yes, of course!


isWithinK :: HdAccountOutsideK -> Bool
isWithinK st = let cur = st ^. hdOutsideKHistorical
tgt = st ^. hdOutsideKCurrent . currentCheckpoint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then same here.

isWithinK st = let cur = st ^. hdOutsideKHistorical
tgt = st ^. hdOutsideKCurrent . currentCheckpoint
SecurityParameter k0 = k
in checkpointDistance cur tgt <= fromIntegral k0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I had to scribble a bit to convince myself of that <= but I think it's correct :)

@@ -47,7 +44,7 @@ bracketPassiveWallet logFunction keystore rocksDB f = do
Kernel.bracketPassiveWallet logFunction keystore rocksDB $ \w -> do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note that we should remember to rename this rocksDB thing to just node or something. This is a remnant from when the NodeStateAdaptor was more limited in scope.

getWal :: WalletId -> n (Either GetWalletError Wallet)
getWal wId = ro (Wallets.getWallet wId) >>= \case
Right v1wal -> Right <$> Wallets.updateSyncState w v1wal
err -> return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not define this logic here, but rather do this in Wallets.getWallet itself; basically, all the logic of the wallet layer is defined in WalletLayer.Kernel.SomeSubModule. That has the additional advantage that in updateSyncState we don't have to do decoding again; that really shouldn't be taking V1 types but rather proper wallet types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it makes for sense to just set the sync state in getWallet. However, getWallet is pure and updateSyncState isn't, because it has to peek into the MVar that holds all of the restoration progress data. I guess I could keep getWallets pure and explicitly pass in the WalletRestorationInfo. Then this getWal will just have to read the MVar and pass the result over to getWallet.

getWallets would need a related change, but I'm less clear about what the right thing to do there is.

=> Kernel.PassiveWallet
-> V1.Wallet
-> m V1.Wallet
updateSyncState wallet v1wal =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments above. This should not be defined in terms of V1 types, and we should not have to decode stuff here. The case for badSync should not be needed.

, _wriThroughput = MeasuredIn 0
, _wriCancel = cancel restoreTask
}
modifyMVar_ (wallet ^. walletRestorationTask) (pure . M.insert wId restoreInfo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You spawn the worker before you make this change to the walletRestorationTask MVar. This means that when the worker does

              pure . M.adjust ( (wriCurrentSlot .~ flat slotId)
                              . (wriTargetSlot  .~ flat tgtSlot)) wId

that will have no effect until the above happens; and when that happens, the sync state will be set to zero again.

Perhaps that explains the bug you were seeing where sync state was not reported correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good point. I think it doesn't explain the sync state bug I was seeing, but we'll see.


-- Synchronously restore the wallet balance, and begin to
-- asynchronously reconstruct the wallet's history.
let prefilter :: Blund -> IO (Map HD.HdAccountId PrefilteredBlock, [TxMeta])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the only reason for that IO is the fallback Timestamp. Would be good if we could avoid that. But that's cleanup only.

@matt-noonan matt-noonan force-pushed the matt-noonan/restoration branch from 2e92160 to db55d24 Compare August 24, 2018 14:36
@edsko
Copy link
Contributor

edsko commented Aug 24, 2018

We should verify that when we try to restore the same wallet twice, we get an error the second time. (This currently does not seem to be the case.)

@edsko edsko force-pushed the matt-noonan/restoration branch 3 times, most recently from 91f57e9 to f8ea3d9 Compare August 26, 2018 12:52
matt-noonan and others added 4 commits August 26, 2018 15:40
This makes a number of important changes to `restoreWallet`:

* Rather than assuming that the genesis UTxO is empty, we filter `genesisUtxo`
  using the wallet's key.
* If there are no blocks on the block chain yet, we still create the new
  wallet, but using normal wallet creation, rather than restoration.
* We lock the node state when we get the tip and the current UTxO, so that
  they must be in sync with each other.

Additionally, this provides additional infrastructure in 'AccountUpdate' so
that we are better able to express what needs to happen during restoration.

This removes the `ChainState` module entirely, we no longer need it.
BlockContext allows us to detect in applyBlock when we are trying to apply a
block that doesn't fit onto our chain. This will be very important to detect
when the wallet behind be behind the node; `applyBlock` can now through an
`ApplyBlockNotSuccessor` failure (we don't yet deal with those errors yet
though).

This has quite a few ramifications throughout the code, which is unfortunate,
but on the positive side, it cleans up a lot of edge cases.

This also makes a number of corrections and improvements to restoration:

* We get rid of the OutsideK, and simply distinguish only between `UpToDate`
  and `Incomplete`. This uses _slightly_ more memory during restoration but
  it's not a huge deal and it simplifies the code.
* We no longer try to change an account's state from incomplete to up-to-date
  in apply block. This logic was tricky, and actually unneeded: we _already_
  have a _separate_ check for completion, in the restoration worker itself, and
  it makes more sense to just have that one check. Instead there is now a
  `RestorationComplete` acid-state transaction.
* We take advantage of the new block context to have a more accurate check
  in `finishRestoration`, which should hopefully also work with accounts
  that we discover during restoration (this was not the case so far).
* We deal correctly with account discovery during discovery.
@edsko
Copy link
Contributor

edsko commented Aug 26, 2018

Rebased.

@edsko edsko force-pushed the matt-noonan/restoration branch from f8ea3d9 to 047bdf1 Compare August 26, 2018 13:40
@adinapoli-iohk
Copy link
Contributor

@parsonsmatt @KtorZ I am going to force the hand and merge this PR anyway despite CI failing for the following reasons:

  1. The AppVeyor failure is the "usual one" (build exceeded time limit)
  2. Hydra one on the integration tests seems (but correct me if I'm wrong) a false positive as I'm not aware of the integration tests using our new layer yet and the failure seems totally unrelated to restoration (being delivered as part of this PR):
Failures:

  integration/Util.hs:118:5: 
  1) Transactions posted transactions appear in the index
       predicate failed on: Left (ClientWalletError (UnknownError "Request error (Cannot send transaction: Transaction creation error: not enough money, need 155382 coin(s) more)"))

  integration/Util.hs:118:5: 
  2) Transactions asset-locked wallets can receive funds and transactions are confirmed in index
       predicate failed on: Left (ClientWalletError (UnknownError "Request error (Cannot send transaction: Transaction creation error: Attempted to send 0 money)"))

Randomized with seed 1078454931

Finished in 272.5982 seconds
19 examples, 2 failures, 1 pending
The build failed with exit code 1
mkdir: created directory '/nix/store/67w9w1sspzah6ynvxpngjjfymrjzcfg8-cardano-wallet-integration-tests'
mkdir: created directory '/nix/store/67w9w1sspzah6ynvxpngjjfymrjzcfg8-cardano-wallet-integration-tests/nix-support'
state-demo/logs/
state-demo/logs/relay1.log
state-demo/logs/core3.json
state-demo/logs/core4.log
state-demo/logs/core2.json
state-demo/logs/node.pub
tar: state-demo/logs/node.pub: file changed as we read it
state-demo/logs/core3.log
state-demo/logs/wallet.log
state-demo/logs/core2.log
state-demo/logs/relay1.json
state-demo/logs/node
tar: state-demo/logs/node: file changed as we read it
state-demo/logs/core1.log
state-demo/logs/core4.json
state-demo/logs/core1.json
builder for '/nix/store/rkc39mndri3m91mhd1k6ajbp1k605sn4-cardano-wallet-integration-tests.drv' failed with exit code 1

Therefore, putting this on your radar as "yet another round of investigation on integration tests" (😁 ) and merging this bad boy.

@adinapoli-iohk adinapoli-iohk merged commit 4818274 into develop Aug 27, 2018
@adinapoli-iohk adinapoli-iohk deleted the matt-noonan/restoration branch August 27, 2018 08:13
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants