Linux hostid and endianness

The hostid is often used to provide a host identification for a software license. I wanted to use Puppet to deploy a configuration file containing the hostid for a number of Linux hosts. Normally you would write a custom fact to call the hostid command on the host and use the returned value in your Puppet manifest. But I wanted to avoid forking another process, so here is my non-forking implementation.

The hostid on Linux seems to be just the primary IPv4 address formatted in a distinct way. For a host using 127.0.0.1 as its primary IP address I got 00017f00 as the hostid. You can easily spot the 127 = 0x7f and the other three octets in the output.

It got a bit weird, because on another machine I got 007f0100 instead. Then I realized that the first output was coming from Linux on s390x - a big-endian architecture - and the second output was from Linux on x86_64 - a little-endian architecture.

It seems the hostid is generated by taking the four octets in a certain order and interpreting these four bytes as a 32-bit word using the native machine endianness. This is why the output differs on the two architectures.
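To illustrate the byte-order effect on its own, here is a minimal sketch. It takes the four bytes 00 01 7f 00 exactly as they appear in the first hostid above and reads them back once as a big-endian and once as a little-endian 32-bit word. The variable names are only for illustration.

```ruby
# The four bytes as they appear in the big-endian hostid 00017f00.
bytes = [0x00, 0x01, 0x7f, 0x00].pack('C*')

# 'L>' reads a 32-bit unsigned integer as big-endian, 'L<' as little-endian.
big    = bytes.unpack('L>').first
little = bytes.unpack('L<').first

puts "%08x" % big     # 00017f00
puts "%08x" % little  # 007f0100
```

The same four bytes give both of the values I observed, so the difference between the two machines is purely a matter of byte order.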

I am using something like the following Ruby snippet in my custom Puppet function to calculate the hostid from a given IP address.

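# Example address; 127.0.0.1 reproduces the hostid values shown above.
octets = '127.0.0.1'.split('.').map { |x| x.to_i }
# Rotate by two octets, pack as bytes, read back as a native-endian 32-bit word.
hostid = octets.rotate(2).pack('C*').unpack('L')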

puts "%08x" % hostid

The code takes a string with the IPv4 address and extracts the four octets as an array of integers. This array is rotated by two positions and packed byte by byte into a four-byte string, which is then interpreted as a 32-bit integer in the native endianness. The result is returned as a single element in an array and finally formatted as eight hexadecimal digits.

I tested the code on big-endian and little-endian machines and it returns the same output as the hostid command.
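The snippet relies on the endianness of the machine that runs the Ruby code. If that machine might have a different architecture than the host the file is meant for, the byte order can be made explicit. The following is only a sketch under that assumption: hostid_for and its byte_order parameter are hypothetical names, not part of Puppet or Ruby, and 127.0.0.1 is used because it reproduces the two hostid values quoted above.

```ruby
# Hypothetical helper: compute the hostid string for an IPv4 address.
# byte_order is :native (default), :big or :little.
def hostid_for(ip, byte_order = :native)
  # Integer() raises ArgumentError on non-numeric octets.
  octets = ip.split('.').map { |x| Integer(x, 10) }
  unless octets.size == 4 && octets.all? { |o| (0..255).cover?(o) }
    raise ArgumentError, "not an IPv4 address: #{ip}"
  end

  # Map the requested byte order to the matching unpack directive.
  directive = { native: 'L', big: 'L>', little: 'L<' }.fetch(byte_order)
  "%08x" % octets.rotate(2).pack('C*').unpack(directive).first
end

puts hostid_for('127.0.0.1', :big)     # 00017f00
puts hostid_for('127.0.0.1', :little)  # 007f0100
```

With the default :native the helper behaves like the snippet above. The explicit variants have only been checked against the two values from this post.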


