Cardano-node 1.33.0 in P2P mode on mainnet?

Thanks for that reference. I had never really looked into those settings much.
I got these from an IOHK reference somewhere that I can’t recall now.

"MaxConcurrencyBulkSync": 2,
"MaxConcurrencyDeadline": 4,

I have never changed them. If I am understanding things correctly, then once the node is fully synced, “MaxConcurrencyBulkSync” won’t matter much, whether it is 2 or 1?

In regard to the MaxConcurrencyDeadline: The IOHK link you referenced says:

“The MaxConcurrencyDeadline configuration option controls how many attempts the node will run in parallel to fetch the same block. Considering that getting the same block as soon as possible is important for both relay nodes and block producer nodes, we recommend setting the MaxConcurrencyDeadline value to 4.”

I wonder why they suggest 4? Two does sound like it should be enough, especially on the block producer that only connects to your own trusted relays.

Update:
I changed MaxConcurrencyDeadline to 2 on both my block producer (running in P2P mode) and my P2P relay. Now my block producer’s block delay times are only 80-100ms slower than my fastest relay’s. This is a significant improvement from around 150-200ms before the change. Basically half the delay for that extra hop.
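For reference, this is roughly how I applied the change (a sketch; it assumes your config file is mainnet-config.json in the current directory and that the node runs under systemd, so adjust the path and service name to your own setup):

# Set MaxConcurrencyDeadline to 2 in the node config:
jq '.MaxConcurrencyDeadline = 2' mainnet-config.json > mainnet-config.json.tmp \
  && mv mainnet-config.json.tmp mainnet-config.json

# Restart so the new value takes effect:
sudo systemctl restart cardano-node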

I also changed over one of my other relays to run in P2P mode so now I have 2 running P2P mode.


Right, your guess is as good as mine, and yeah, I do think that MaxConcurrencyBulkSync only matters when the node (whether BP or relay) is way behind the tip.

MaxConcurrencyDeadline seems to refer to the number of peers from which the node concurrently tries to fetch the next block. Fetching a block from a peer also seems to be a costly operation.
For the reasons above, it looks like:

  1. for relays, 4 is the magic number; more than 4 will generate CPU overhead and slow down the node
  2. for the BP, it should really match the number of relays. Of course, CPU overhead is the limit, hence it should be the minimum of the number of relays connected to the BP and 4 (e.g. with 3 relays connected, that would be min(3, 4) = 3).

I’m really glad you’re pulling blocks faster. I came here for the P2P and found out about MaxConcurrencyDeadline.


I was wondering that too since I have 3 relays connected to my block producer.

However, if you run your relay with “TargetNumberOfActivePeers”: 20 and “MaxConcurrencyDeadline”: 4, then obviously there are many more active peers than this concurrency deadline value. Also, that IOHK reference says:

“The MaxConcurrencyDeadline configuration option controls how many attempts the node will run in parallel to fetch the same block”

What I don’t understand is why the node needs to fetch from more than 1 peer in parallel.

I believe the nodes gossip about the blocks they have and share the block number and hash value for the blocks. Then a node can request the block and pull (fetch) it from a peer. Obviously, once the block has been fetched, it will check the hash to ensure it received the valid block. If other peers are also confirming that this is the latest block number and hash, then why does the node need to fetch the block from more than 1 peer in parallel?

Maybe you just need 2 for redundancy in case one fetch fails. 4 seems like overkill if I am understanding it correctly?

I currently have 2 relays running P2P and my block producer with:
“MaxConcurrencyDeadline”: 2

Hey, how are things? Still minting away?

Did you try p2p with 1.34.1?

I can’t remember where, but I seem to remember someone saying that P2P is bugged in 1.34.x. I can’t find it anymore.

Yes, P2P works fine with 1.34.1.

You need to compile with the following change:

sed -i 's/tag: 4fac197b6f0d2ff60dc3486c593b68dc00969fbf/tag: 48ff9f3a9876713e87dc302e567f5747f21ad720/g' cabal.project

Otherwise, cardano-cli doesn’t work while running in P2P mode.
See: Cardano-node 1.34.1 and P2P
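If you want to sanity-check that the substitution actually took effect before kicking off the (long) rebuild, something like this should work:

# Should print the updated tag line; no output means the sed didn't match:
grep -n '48ff9f3a9876713e87dc302e567f5747f21ad720' cabal.project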

Do you guys get anything showing up in Bi-Dir or Duplex? They both show 0 for me.

[screenshot]

Also interesting is that the peers page only shows inputs and no outputs, whereas my non-P2P relays show I/O on the peers page.

[screenshot]

I have my block producer running in P2P mode as well as 2 relays. These P2P relays set up duplex connections to the block producer. Other relays connect using unidirectional connections.

Here is output from one of the P2P relays:

curl -s -H 'Accept: application/json' http://localhost:12788 | jq '.cardano.node.metrics.connectionManager'
{
  "incomingConns": {
    "type": "g",
    "val": 19
  },
  "outgoingConns": {
    "type": "g",
    "val": 50
  },
  "duplexConns": {
    "type": "g",
    "val": 2
  },
  "unidirectionalConns": {
    "type": "g",
    "val": 66
  },
  "prunableConns": {
    "type": "g",
    "val": 1
  }
}
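If you want to watch how these counters move over time, a one-liner like the following works (it assumes the same monitoring port 12788 as in the curl above):

# Poll the duplex/unidirectional counters every 10 seconds:
watch -n 10 "curl -s -H 'Accept: application/json' http://localhost:12788 | jq '.cardano.node.metrics.connectionManager | {duplex: .duplexConns.val, unidirectional: .unidirectionalConns.val}'"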

At the same time, the logs on this relay show:

TrConnectionManagerCounters (ConnectionManagerCounters {fullDuplexConns = 1, duplexConns = 3, unidirectionalConns = 66, inboundConns = 20, outboundConns = 50})

I am not sure what the exact difference is between duplexConns and fullDuplexConns. I do know that when I restart the P2P relay and it connects to the P2P block producer, it establishes a duplexConn, as I see this recorded in the logs at both ends when it happens.

Here I captured the transition from fullDuplexConns 0 to 1, and it looks like the connection from an external node 174.89.218.127:42467 caused the change. I don’t know why this would establish a fullDuplexConn when it doesn’t seem to happen between my P2P relay and my P2P block producer.

[relay1:cardano.node.ConnectionManager:Info:245] [2022-03-25 07:38:28.23 UTC] TrConnectionManagerCounters (ConnectionManagerCounters {fullDuplexConns = 0, duplexConns = 1, unidirectionalConns = 63, inboundConns = 17, outboundConns = 47})
[relay1:cardano.node.InboundGovernor:Info:245] [2022-03-25 07:38:28.23 UTC] TrPromotedToWarmRemote (ConnectionId {localAddress = 192.168.27.7:2700, remoteAddress = 174.89.218.127:42467}) (OperationSuccess (InboundIdleSt Unidirectional))
[relay1:cardano.node.InboundGovernor:Info:245] [2022-03-25 07:38:28.23 UTC] TrInboundGovernorCounters (InboundGovernorCounters {coldPeersRemote = 1, idlePeersRemote = 0, warmPeersRemote = 16, hotPeersRemote = 1})                                                              
[relay1:cardano.node.ConnectionManager:Info:245] [2022-03-25 07:38:31.69 UTC] TrConnectionManagerCounters (ConnectionManagerCounters {fullDuplexConns = 1, duplexConns = 1, unidirectionalConns = 63, inboundConns = 18, outboundConns = 47})
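If you want to watch for the same transition on your own relay, following the journal and filtering on the tracer name works (a sketch; it assumes the node runs as a systemd service named cardano-node and logs to stdout/journal, so adjust to however you run yours):

journalctl -fu cardano-node | grep --line-buffered 'ConnectionManagerCounters'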

Thanks for the info.

May I ask for an example of the topology file you are using for P2P on your BP?

I am concerned that I will expose my BP if I have the wrong settings.

Is it as simple as having ‘advertise’ set to false on both local and public roots?

mainnet-topology.json on my BP:

{
  "LocalRoots": {
    "groups": [
      {
        "localRoots": {
          "accessPoints": [
            {
              "address": "relays.mypool.com",
              "port": 3001
            }
          ],
          "advertise": false
        },
        "valency": 4
      }
    ]
  },
  "PublicRoots": []
}

Notes:

  1. I run the “unbound” caching DNS resolver on my network, and relays.mypool.com resolves to 4 IP addresses, one for each of my relays (see the dig check below).
    If you don’t have this setup, then you need to list each IP address and associated port separately as a set of accessPoints.
    Also, valency will equal the number of relays you have.
  2. advertise is false, because you don’t want to advertise your relays or your block producer to other relays.
  3. PublicRoots is empty.
  4. “useLedgerAfterSlot” is not set in the config.
    Leaving this out of the config results in the following log message when started: “Don’t use ledger to get root peers.”
    (I think you can also set it to “-1” to disable getting PublicRoots from the ledger.)

Remember that your block producer should only connect to your own relays and not to any external relays (PublicRoots), hence points 3 and 4 above.
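As a quick check of the DNS setup from note 1, you can confirm that the name resolves to all of your relay IPs (relays.mypool.com is the example name from the topology above):

# Should print one IP per relay; the count should match "valency":
dig +short relays.mypool.com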


This was a good post - good stuff @7.4d4


Did you manage to get around this issue/anomaly in the end? I have experienced a similar issue where my P2P relay establishes a bi-dir connection between my BP and another “normal” relay node. After a few hours of operating normally, my BP loses an outgoing connection to the “normal” relay node. Only by restarting either the P2P relay or the BP (“normal” mode) does the connection re-establish…

Block propagation has definitely improved overall (3-hour window):

[screenshot]

@RickCADA
I wasn’t able to reproduce the problem reliably. I ended up putting that particular node in P2P mode as well. With the BP and relay both running P2P, there is no problem at all.

I do have another relay running in normal mode and it has never had the same problem over months. This other relay is on the other side of the world though.

Strange problem. I can’t understand why a relay running on the same network would be more likely to get the problem. I must admit that I haven’t been that motivated to investigate further because the other relay doesn’t have issues and because P2P is going to become the standard soon.

That is fantastic!!! I will try it out today. Until now, I was running P2P on the relay only and without TestEnableDevelopmentNetworkProtocols. My BP used to have a non-P2P config.

Additionally, I would like to add that valency is the number of IPs your domain name points to, not the number of relays, at least based on this:

https://github.com/input-output-hk/cardano-node/blob/master/doc/getting-started/understanding-config-files.md


I agree that it is clear what valency refers to in the old-style topology file.

However, the new topology layout has the “valency” value one step further out in the JSON, at the same level as the “localRoots” value. Within localRoots there is “accessPoints”, an array which can contain multiple values.

It is a bit confusing, but I think “valency” refers to the entire “localRoots” value, which can contain multiple (domain name, port) combinations in the “accessPoints” array.

This is one of the reasons that I set up a DNS record, “relays.mypool.com”, which returns multiple IP addresses, and used the same port for all my relays. This allowed me to have only one element in the “accessPoints” array. In turn, this made the “valency” value a moot point.

Consider this alternative:

{
  "LocalRoots": {
    "groups": [
      {
        "localRoots": {
          "accessPoints": [
            {
              "address": "relay1.mypool.com",
              "port": 3001
            },
            {
              "address": "relay2.mypool.com",
              "port": 3002
            }
          ],
          "advertise": false
        },
        "valency": 2
      }
    ]
  },
  "PublicRoots": []
}
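Whichever layout you use, it is worth checking that the file parses before restarting the node. A minimal check:

# jq exits non-zero (and prints the parse error) if the JSON is malformed:
jq empty mainnet-topology.json && echo "topology OK"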

By the way, my BP has been running in P2P mode for months without any issues and producing blocks. I still have a relay running in normal mode just in case but my other relays are all running P2P as well.

Hi,

It seems you are correct; I just re-read it and it says:

valency tells the node how many connections your node should try to pick from the given group. If a dns address is given, valency governs to how many resolved ip addresses should we maintain active (hot) connection.

Were you able to push blocks from the BP using TestEnableDevelopmentNetworkProtocols, meaning that the BP has no incoming connections, only outgoing ones?

I have the following settings in mainnet-config.json on my BP:

  "TestEnableDevelopmentNetworkProtocols": true,
  "EnableP2P": true, 
  "MaxConcurrencyBulkSync": 2,
  "MaxConcurrencyDeadline": 2,
  "TargetNumberOfRootPeers": 5,
  "TargetNumberOfKnownPeers": 5,
  "TargetNumberOfEstablishedPeers": 5,
  "TargetNumberOfActivePeers": 5,

I have the TargetNumberOfRootPeers, Known, Established and Active set to 5, but I only have 4 peers in my mainnet-topology.json:

{
  "LocalRoots": {
    "groups": [
      {
        "localRoots": {
          "accessPoints": [
            {
              "address": "relays.mypool.com",
              "port": 3001
            }
          ],
          "advertise": false
        },
        "valency": 4
      }
    ]
  },
  "PublicRoots": []
}

I set the peer targets to 5 in the config because I figured they needed to be equal or higher, and higher gives me the flexibility to add a peer just by adding an IP address to my DNS for “relays.mypool.com”. If I add a 5th IP for “relays.mypool.com”, then presumably the node would cycle through the IPs and select the 4 that provide blocks fastest, based on the P2P algorithms.

For connections between my BP (running P2P) and the relays running P2P:

  • connections are full-duplex BP port 3001 to Relay port 3001

For connections between my BP (running P2P) and relays running in normal mode:

  • connection from BP to relay is BP port 3001 to relay port 3001
  • connection from relay to BP is relay random high port to BP port 3001

In other words: all connections initiated by a node running in P2P mode originate from the cardano-node port (3001), whereas connections initiated by a node running in normal mode originate from a random high port.
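You can verify this pattern yourself by listing the live sockets (a sketch; it assumes port 3001 and the ss tool from iproute2):

# The local/peer port columns show the pattern described above:
ss -tn state established '( sport = :3001 or dport = :3001 )'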

The BP (P2P mode) is producing blocks fine and all relays have no issue pulling its blocks whether they are running P2P or normal mode.

Also running one node in P2P mode on mainnet:

Seems fine.

P2P works very well. I have only one relay left running in normal mode. Everything else, including my block producer, is running in P2P mode.

Are any users of P2P getting a transaction count? I see the screenshot from @weebl2000 has a transaction count of 0 (which is an issue I am getting on my testnet P2P node).

Sorry, I don’t use that liveview script to monitor my nodes.

You can grab data from the Prometheus port directly with:

curl -s -H 'Accept: application/json' http://localhost:12788 | jq

When I do that, it doesn’t give me any details about the transaction count either. But my mainnet-config.json has:

  "TraceLocalTxSubmissionProtocol": false,
  "TraceLocalTxSubmissionServer": false,
  "TraceMempool": false,
  "TraceTxInbound": false,
  "TraceTxOutbound": false,
  "TraceTxSubmissionProtocol": false,

Maybe you need to switch on a certain combination of these to see the transaction count???
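If anyone wants to check whether a transaction metric appears after flipping those traces, listing the metric keys is an easy way to hunt for it (a sketch; same assumed port 12788 as earlier, and the exact key name depends on which traces are enabled):

curl -s -H 'Accept: application/json' http://localhost:12788 | jq '.cardano.node.metrics | keys[]' | grep -i tx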