Policy hash in Python

Hi,

I’m trying to reimplement, in Python, the generation of a policy ID from a policy script file. The ledger specs say that blake2b_224 is used for hashing multi-signature scripts, which I’m trying to replicate. I am using hashlib.blake2b like this:

import hashlib
m=hashlib.blake2b(digest_size=28)
m.update(open("test.script", "rb").read())
m.digest().hex()

but the output I get is never the same as with
cardano-cli transaction policyid --script-file test.script
I also assume that the hash is calculated after the JSON file has been parsed, since adding newlines or whitespace does not change the output of the cardano-cli command. It does, however, change the output of the Python script.

The policy script looks like this btw:

{
   "type":"all",
   "scripts":[
      {
         "keyHash":"37c19eca0d623d804bfeb8951bf1eb0f7fe193deeb71793985dbd25e",
         "type":"sig"
      },
      {
         "type":"before",
         "slot":32241869
      }
   ]
}

So I guess my question boils down to: how do I need to deserialize the policy script before hashing it?
I tried to check the source of cardano-cli and cardano-wallet, but my Haskell skills are not the best, so that wasn’t very successful.

Any help is appreciated :pray:

My guess is that you’d need to serialize it to CBOR according to the CDDL specs for native scripts and then compute the hash.

Hey, thanks for your reply. I think that pushed me in the right direction.

native_script =
  [ script_pubkey
  // script_all
  // script_any
  // script_n_of_k
  // invalid_before
     ; Timelock validity intervals are half-open intervals [a, b).
     ; This field specifies the left (included) endpoint a.
  // invalid_hereafter
     ; Timelock validity intervals are half-open intervals [a, b).
     ; This field specifies the right (excluded) endpoint b.
  ]

script_pubkey = (0, addr_keyhash)
script_all = (1, [ * native_script ])
script_any = (2, [ * native_script ])
script_n_of_k = (3, n: uint, [ * native_script ])
invalid_before = (4, uint)
invalid_hereafter = (5, uint)

CBOR and CDDL are both new to me, so I’m not sure if I am doing it right.

According to the above specification, my script

{
   "type":"all",
   "scripts":[
      {
         "keyHash":"37c19eca0d623d804bfeb8951bf1eb0f7fe193deeb71793985dbd25e",
         "type":"sig"
      },
      {
         "type":"before",
         "slot":32241869
      }
   ]
}

would evaluate to:
native_script = [1, [[0, $hash28], [5, uint]]]

Is this correct? Do I need to use any CBOR tags for serializing?
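
As a rough sketch, the mapping from the JSON script to the CDDL structure could be written in Python as below. The function name to_native_script is made up for illustration. The constructor numbers come from the CDDL quoted above, and the JSON names "sig", "all", and "before" come from the example script; the JSON names "any", "atLeast" (with a "required" field), and "after" are assumptions about the cardano-cli script format.

```python
import json

SCRIPT_JSON = """
{
   "type":"all",
   "scripts":[
      {"keyHash":"37c19eca0d623d804bfeb8951bf1eb0f7fe193deeb71793985dbd25e", "type":"sig"},
      {"type":"before", "slot":32241869}
   ]
}
"""

def to_native_script(node):
    """Convert a parsed JSON script node into the nested-list form of native_script."""
    kind = node["type"]
    if kind == "sig":        # script_pubkey = (0, addr_keyhash)
        return [0, bytes.fromhex(node["keyHash"])]
    if kind == "all":        # script_all = (1, [ * native_script ])
        return [1, [to_native_script(s) for s in node["scripts"]]]
    if kind == "any":        # script_any (JSON name assumed)
        return [2, [to_native_script(s) for s in node["scripts"]]]
    if kind == "atLeast":    # script_n_of_k (JSON names assumed)
        return [3, node["required"], [to_native_script(s) for s in node["scripts"]]]
    if kind == "after":      # invalid_before (JSON name assumed)
        return [4, node["slot"]]
    if kind == "before":     # invalid_hereafter = (5, uint)
        return [5, node["slot"]]
    raise ValueError(f"unknown script type: {kind!r}")

structure = to_native_script(json.loads(SCRIPT_JSON))
print(structure)
```

Because the structure is built from the parsed JSON, extra whitespace or newlines in the file no longer affect the result.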

The way that you’ve encoded the JSON according to the CDDL looks correct. I’m not sure about CBOR tags, however. You might want to try a simple test: encode what you wrote to CBOR (with one of Python’s CBOR libraries) and then hash the result.

A complementary approach would be to use cardano-cli to create and sign a transaction that includes your script, open the file containing the signed transaction in a text editor, and then manually inspect the CBOR to see if the script is encoded in the way that you suspect.

Hey, great idea, thanks for your input!
I have tried that and I am almost certain that I have found the correct CBOR now. Using cbor.me and a signed transaction, I was able to confirm that my serialized bytes were in fact correct:
...[1, [[0, h'37C19ECA0D623D804BFEB8951BF1EB0F7FE193DEEB71793985DBD25E'], [5, 32241869]]]...
The same array is generated when I enter the hex of my serialized cbor policy into that tool:
8201828200581c37c19eca0d623d804bfeb8951bf1eb0f7fe193deeb71793985dbd25e82051a01ebf8cd
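
To see where each byte of that hex string comes from, here is a minimal hand-written encoder that covers only the three CBOR types this script needs (unsigned integers, byte strings, and arrays). It is a sketch for checking the bytes, not a replacement for a real library such as cbor2; cbor_head and cbor_encode are illustrative names.

```python
def cbor_head(major, value):
    """Encode a CBOR head: 3-bit major type plus the shortest form of the argument."""
    if value < 24:
        return bytes([(major << 5) | value])
    # Additional-info values 24..27 announce a 1-, 2-, 4-, or 8-byte big-endian argument.
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([(major << 5) | info]) + value.to_bytes(size, "big")
    raise ValueError("integer too large for a CBOR head")

def cbor_encode(item):
    """Encode unsigned ints (major 0), byte strings (major 2), and lists (major 4)."""
    if isinstance(item, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(item, int):
        if item < 0:
            raise ValueError("negative integers are not handled in this sketch")
        return cbor_head(0, item)
    if isinstance(item, (bytes, bytearray)):
        return cbor_head(2, len(item)) + bytes(item)
    if isinstance(item, list):
        return cbor_head(4, len(item)) + b"".join(cbor_encode(x) for x in item)
    raise TypeError(f"unsupported type: {type(item).__name__}")

key_hash = bytes.fromhex("37c19eca0d623d804bfeb8951bf1eb0f7fe193deeb71793985dbd25e")
encoded = cbor_encode([1, [[0, key_hash], [5, 32241869]]])
print(encoded.hex())
```

Reading the output left to right: 82 opens a two-element array, 01 is the script_all constructor, 58 1c announces a 28-byte byte string, and 1a 01ebf8cd is the slot 32241869 as a 4-byte integer. No CBOR tags appear anywhere in the encoding.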

So I am almost certain that I am on the right track so far. However, I still was not able to calculate the correct hash. Here’s my code:

import hashlib
from cbor2 import dumps, loads

class Sig:
    def __init__(self, hash, slot):
        self.hash = hash
        self.slot = slot

def default_encoder(encoder, value):
    # Encode the whole script as [1, [[0, key_hash], [5, slot]]], following the CDDL
    encoder.encode([1, [[0, bytes.fromhex(value.hash)], [5, value.slot]]])


obj = Sig("37c19eca0d623d804bfeb8951bf1eb0f7fe193deeb71793985dbd25e", 32241869)
serialized = dumps(obj, default=default_encoder, value_sharing=False)

print(serialized)
print(serialized.hex())

# above line prints 8201828200581c37c19eca0d623d804bfeb8951bf1eb0f7fe193deeb71793985dbd25e82051a01ebf8cd

m=hashlib.blake2b(digest_size=28)
m.update(serialized)
print(m.hexdigest())

# prints hash value e99dbb2cb3022fc5955875ccd5feea402daaf431eb061483cbbdee7f
# but expected hash value is: 1337d200f344c546ba1c253e94b37becf405ab5de474edd37789e0ed

result = loads(serialized)
print(result)
# lastly, deserializing and printing gives us
# [1, [[0, b'7\xc1\x9e\xca\rb=\x80K\xfe\xb8\x95\x1b\xf1\xeb\x0f\x7f\xe1\x93\xde\xebqy9\x85\xdb\xd2^'], [5, 32241869]]]
# this matches the output from cbor.me.

So either I am doing something wrong with the hash function, or something else is done when calculating the hash of a script. I have already tried quite a few variations of the above Python script, and this is the one that seems most promising to me. However, I can’t seem to make the hashes match.
Is there anything else I’m not seeing?

The short answer is that you need to prepend a single zero byte (“00”) to the CBOR serialization of the script before you apply blake2b_224 to it. I tried this out on your example and got the hash that cardano-cli transaction policyid computes.

The longer answer is that here and here is the code that calls the hash function:

nativeMultiSigTag :: BS.ByteString
nativeMultiSigTag = "\00"

hashMultiSigScript =
  ScriptHash
    . Hash.castHash
    . Hash.hashWith (\x -> nativeMultiSigTag <> serialize' x)
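
In Python, the same idea might look like the sketch below: prepend the tag byte 0x00 to the CBOR bytes and hash the result with blake2b at a 28-byte digest size. The helper name policy_id is illustrative, and the CBOR hex is the one from the earlier post.

```python
import hashlib

# CBOR serialization of the example script, as posted earlier in the thread.
script_cbor = bytes.fromhex(
    "8201828200581c37c19eca0d623d804bfeb8951bf1eb0f7fe193deeb71793985dbd25e82051a01ebf8cd"
)

NATIVE_SCRIPT_TAG = b"\x00"  # mirrors nativeMultiSigTag in the Haskell snippet

def policy_id(cbor_bytes: bytes) -> str:
    """blake2b-224 over the tag byte followed by the CBOR-encoded script."""
    return hashlib.blake2b(NATIVE_SCRIPT_TAG + cbor_bytes, digest_size=28).hexdigest()

print(policy_id(script_cbor))
# The thread reports cardano-cli giving
# 1337d200f344c546ba1c253e94b37becf405ab5de474edd37789e0ed for this script.
```

Hashing script_cbor without the leading zero byte gives a different digest, which is why the earlier attempt did not match.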

Thank you so much. I didn’t think I’d spend this much time on something that has an open source reference solution. I dug for so long and didn’t even come across the files that you linked. I guess it’s been some time since I took my Haskell classes.
Anyways, I’ve tried it and you’re right, it works! Thanks!! :pray:
