Summer of Bitcoin 2022 Mid Review PART 2


On to the second part now. As I mentioned, this is more in the form of a report, so expect a little bit of stating the obvious.

Next I worked on two minor PRs. These were easy. The first was removing all the SPV node related code we have. Once the neutrino node was implemented, this was basically dead weight, and it made changing node related stuff tougher for me, since every change also had to stay compatible with the SPV node, so I took this up. A little work and this was good to go. The second was a simple PR I did over the weekend. Bitcoind v23 had been released, so this was mostly modifying our RPCs to support that and using the new binaries in tests. So let me tell you, if you are still not sure, this is how you do open source: I found the PR that upgraded to v22 and used that as a reference, along with the v23 changelog, and with that, got my first PR on a completely new module working and merged!

With the uninteresting bits out of the way, it's time for some actual hard work.

Find and switch peers

Seems quite simple, right? "Added support to find and switch peers." Well, this was roughly 1500 lines of code modifying core p2p functionality. And let me tell you from experience: writing and debugging code for anything that runs in parallel is hard, and p2p handling, where everything runs in parallel, is even harder.

Let's break it down now, shall we?

First part first: find. To remind you again, I was trying to get a multi-peer neutrino node running. But how can you connect to multiple peers without knowing them? At that time, the node worked by connecting to the one peer that we have in our config file. Well, I could have made it connect to two peers from our config file and skipped the find part, but let's make things useful, shall we, rather than just completing it in the shortest possible way.

Anyway, on to find. On startup the node now finds peers through DNS seeds and a resource file of peers, the same one bitcoin-core uses, named nodes_main. Once we find a peer, we can request the peers it knows by sending it a getaddr message, which it answers with an addr message. A few catches here though, and these might be interesting in the context of the state of the bitcoin network. First, a lot of peers from DNS seeds just don't work. One would expect otherwise, but that's the case. It's not like you wouldn't get anything, but I did find it peculiar. Second, I am using these for a neutrino node, so I want peers that support serving compact filters. Now how do peers indicate support? Well, the version message exchanged in the initial handshake carries a services field, an integer whose bits define which services the peer signals support for. The newer addrv2 messages also send the services field along with the peer list, so that is used too.
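To make the services check concrete, here is a minimal sketch in Python (not our node's actual code). The bit positions are the ones the protocol defines, with bit 6 assigned to compact filters by BIP 157. The helper names `supports` and `filter_candidates` and the example addresses are made up for illustration.

```python
# Service bits from the Bitcoin P2P protocol; BIP 157 assigns bit 6.
NODE_NETWORK = 1 << 0
NODE_WITNESS = 1 << 3
NODE_COMPACT_FILTERS = 1 << 6


def supports(services: int, required: int) -> bool:
    """Return True if every bit in `required` is set in `services`."""
    return (services & required) == required


def filter_candidates(peers, required=NODE_COMPACT_FILTERS):
    """Keep only the addresses whose advertised services include `required`."""
    return [addr for addr, services in peers if supports(services, required)]


# Hypothetical peers; the addresses come from the documentation-only range.
peers = [
    ("192.0.2.1:8333", NODE_NETWORK | NODE_WITNESS),
    ("192.0.2.2:8333", NODE_NETWORK | NODE_WITNESS | NODE_COMPACT_FILTERS),
]
print(filter_candidates(peers))
```

Keep in mind that a set bit is only a claim. As the next part shows, plenty of peers that signal support still turn out to be unusable.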

Ok, so off to connecting to peers. This is where it gets interesting. I have to connect to hundreds of peers every few seconds, initialize with them (exchange version and verack messages), then use them for whatever they are good for, or else disconnect them. Now this part is a bit different from a usual full node, but in my case I found that peers with support for compact filters are rare, with working ones being even rarer. What do I mean by working? Oh, so this is another interesting find related to the bitcoin network: a lot of nodes are online, but a significant percentage of them either disconnect you immediately after initialization or don't initialize at all! There were a ton of bugs and issues that popped up when I first started connecting to so many peers, all running in parallel, any of which can fail for any number of reasons. Until then, the node was not really run against random bitcoin peers; rather, it was designed to be used with the user's own hosted node, which is always online and good.
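The shape of the problem (many handshakes in flight, most of them failing) can be sketched with Python's asyncio. This is only an illustration, not how our node does it. `connect_many`, `fake_handshake`, the concurrency limit and the timeout are all assumptions, and the fake handshake stands in for the real version/verack exchange.

```python
import asyncio


async def connect_many(addrs, handshake, max_parallel=50, timeout=5.0):
    """Try to handshake with every address; return the ones that succeed."""
    gate = asyncio.Semaphore(max_parallel)  # cap the handshakes in flight

    async def attempt(addr):
        async with gate:
            try:
                await asyncio.wait_for(handshake(addr), timeout)
                return addr
            except (asyncio.TimeoutError, OSError):
                return None  # refused, hung up on us, or never answered

    results = await asyncio.gather(*(attempt(a) for a in addrs))
    return [addr for addr in results if addr is not None]


async def fake_handshake(addr):
    # Stand-in for version/verack; behaviour is keyed on the made-up name.
    if addr.startswith("dead"):
        raise ConnectionRefusedError(addr)
    if addr.startswith("slow"):
        await asyncio.sleep(1.0)


good = asyncio.run(
    connect_many(["good1", "dead1", "slow1", "good2"], fake_handshake, timeout=0.05)
)
print(good)
```

The important part is that one failure never takes the others down with it: each attempt swallows its own error and simply reports "no peer here".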

I fixed the issues one by one. It took some time, but I made it. At this point the node can find a specified number of peers on its own. Great! So next was switch. This basically meant switching to a different peer if the current one is failing. You would be surprised how many peers fail midway! They may just outright disconnect, which is by far the most common; the second most common was not responding in time, with the third being peers going offline entirely. One critical thing that was needed was to identify queries. A query is anything that expects a response, like getheaders, which expects a headers message in response. For all such queries I added support for triggering callbacks if the peer fails for any reason, which in turn triggers a peer switch.
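Here is a rough sketch of the query idea in Python. It is simplified and does not mirror our codebase: `QueryTracker` and its method names are hypothetical, and it tracks only one pending response per message type, even though a single getcfilters request is answered with several cfilter messages. The request/response pairs themselves come from the P2P protocol and BIP 157.

```python
import time

# Request -> expected response, per the Bitcoin P2P protocol and BIP 157.
EXPECTED_RESPONSE = {
    "getheaders": "headers",
    "getcfheaders": "cfheaders",
    "getcfilters": "cfilter",
    "getcfcheckpt": "cfcheckpt",
}


class QueryTracker:
    """Tracks outstanding queries for one peer and reports failures."""

    def __init__(self, on_fail, timeout=10.0, clock=time.monotonic):
        self.on_fail = on_fail  # called with (command, reason)
        self.timeout = timeout
        self.clock = clock
        self.pending = {}       # expected response -> (command, deadline)

    def sent(self, command):
        response = EXPECTED_RESPONSE[command]
        self.pending[response] = (command, self.clock() + self.timeout)

    def received(self, response):
        self.pending.pop(response, None)

    def check_timeouts(self):
        now = self.clock()
        for response, (command, deadline) in list(self.pending.items()):
            if now >= deadline:
                del self.pending[response]
                self.on_fail(command, "timeout")

    def disconnected(self):
        # Every query still waiting on this peer has failed.
        pending, self.pending = self.pending, {}
        for command, _deadline in pending.values():
            self.on_fail(command, "disconnect")
```

The `on_fail` callback is where a peer switch would hook in: pick another peer and send the same query again.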

This is just an overview. I can't really go into the code details without you, the reader, understanding our codebase. With this the node can finally be self-sufficient! Well... not so fast.

Since the node was single-peer earlier, there really was not a lot of validation logic, or rather there was not a lot to validate, since the one peer you have is all you have. What are you going to validate it against? But now we have multiple peers, and not all of them may be good. So next was adding validation logic, for header sync first. This means making the headers-first sync identical to bitcoin-core's. What does this mean? This is how the process goes. You first start syncing with any random peer that can serve filters. You sync from just one peer, since headers can't be synced in parallel. Once you are done, you verify that the chain you have is the best among all the peers you are connected to; if not, you sync from whoever is better. Also, you should disconnect any peer that is significantly behind.
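The check after the first sync could look roughly like this in Python. Big caveat: this sketch compares plain header heights, while a real implementation should compare accumulated chain work. The function name `review_peers` and the `max_lag` default of 144 headers are assumptions made for the example.

```python
def review_peers(our_height, peer_heights, max_lag=144):
    """Decide what to do once the initial header sync has finished.

    peer_heights maps a peer id to the height of its best known header.
    Returns (resync_from, to_disconnect).
    """
    best_peer = max(peer_heights, key=peer_heights.get, default=None)
    best_height = peer_heights.get(best_peer, our_height)

    # If someone knows a longer chain than ours, sync again from them.
    resync_from = best_peer if best_height > our_height else None

    # Drop peers that trail the best height we have seen by too much.
    target = max(our_height, best_height)
    to_disconnect = sorted(
        peer for peer, height in peer_heights.items()
        if target - height > max_lag
    )
    return resync_from, to_disconnect


print(review_peers(100, {"a": 100, "b": 120, "c": 10}, max_lag=50))
```

In this made-up run, peer "b" is ahead of us so we would sync from it next, and peer "c" is far enough behind to be dropped.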

So this was the next part for me. I already have an open PR that does this. For the remaining half, I will have to work on parallel sync of filters (yes, they can be synced in parallel) and validation logic for them. Can't have dishonest nodes, now can you :)

Oh, and there was an entire week of intense debugging too. Did I mention working on parallel stuff is hard ;). That was tiresome, but I did end up figuring out some existing issues in our node test environment, so at least something came out of it.

That's all for now. I do hope the remaining part will not be written in retrospect right before the end.