Using PostgreSQL MD5 hash to calculate a 64 bit hash value for advisory lock functions?
I have a particular problem using the PostgreSQL advisory locking functions with their bigint variants.
Basically, I want to create a 64-bit bigint value from the text value that the PostgreSQL md5() function returns for an arbitrary text_input_context input.
The idea is to emulate a table row lock behavior (1), where the text context is built from the full table name (schema + table name) and the table's key field values for a particular selected row.
In our application it is acceptable to lock additional rows because of collisions between these hash values, as long as the selected row itself is protected against updates from other sessions.
By design, lock and unlock would always be straightforward and sequential, to protect against concurrent access to a specific table.
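To make the lock context concrete, here is a minimal sketch of how such a string might be assembled. The function name build_lock_context and the length-prefix format are hypothetical choices for illustration, not something prescribed by PostgreSQL or by the application described here; the only point is that the parts should be joined unambiguously, so that different (schema, table, key) combinations cannot produce the same text.

```python
from typing import Sequence


def build_lock_context(schema: str, table: str, key_values: Sequence[str]) -> str:
    """Join schema, table and key field values into one lock-context string.

    Hypothetical helper: each part is prefixed with its length so that
    ("ab", "c") and ("a", "bc") do not collapse into the same text.
    """
    parts = [schema, table, *key_values]
    return "".join(f"{len(part)}:{part}" for part in parts)


if __name__ == "__main__":
    print(build_lock_context("public", "customer", ["42"]))
    # prints 6:public8:customer2:42
```

Any other unambiguous encoding (for example a delimiter that cannot occur in the values) would serve the same purpose.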
The md5() function returns a 128-bit value, represented as a hexadecimal text value of 32 characters.
I want to convert this to a bigint value that I can use with the PostgreSQL advisory lock functions.
While doing some research, I found a way to simply cut the 128-bit output of the md5() function down to its first 16 hexadecimal characters and convert these to a 64-bit bigint value that can be passed to the PostgreSQL advisory lock functions (2):
SELECT pg_catalog.pg_advisory_lock(
    ('x' || pg_catalog.left(pg_catalog.md5(text_input_context), 16))::bit(64)::bigint);
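For checking keys outside the database, the same conversion can be mirrored on the client. The following is only a sketch: the function name md5_prefix_to_bigint is made up for illustration, and it assumes the text is hashed as UTF-8 bytes (that is, a UTF-8 database encoding). It takes the first 16 hexadecimal characters of the MD5 digest and reinterprets the 64 bits as a signed value, as the cast from bit(64) to bigint does.

```python
import hashlib


def md5_prefix_to_bigint(text_input_context: str) -> int:
    """Client-side mirror of ('x' || left(md5(t), 16))::bit(64)::bigint.

    Assumption: the server hashes the text as UTF-8 bytes.
    """
    hex_digest = hashlib.md5(text_input_context.encode("utf-8")).hexdigest()
    unsigned = int(hex_digest[:16], 16)  # first 16 hex characters = 64 bits
    # bigint is signed: values with the top bit set map to negative numbers.
    return unsigned - (1 << 64) if unsigned >= (1 << 63) else unsigned


if __name__ == "__main__":
    key = md5_prefix_to_bigint("public.customer|42")
    assert -(1 << 63) <= key < (1 << 63)
    print(key)
```

Roughly half of all inputs yield a negative key, which is fine for the advisory lock functions because they accept any bigint.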
I am well aware that the md5() algorithm already loses information about text_input_context, and loses even more when only the first 16 hexadecimal characters of the MD5 hash value are used, but I want to keep that loss to a minimum.
My question is:
Is that approach too naive, or are there better ways to compute a 64-bit value from a 128-bit hash value?
While researching, I also found the idea of XOR-ing the lower 16 hexadecimal characters (the lower 64 bits) with the upper 16. But wouldn't that be even worse in terms of distribution?
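For comparison, the XOR variant can be sketched as follows. The function names are hypothetical and the text is again assumed to be hashed as UTF-8 bytes. The sketch does not settle which variant distributes better; it only shows the folding step, plus the usual birthday approximation n(n-1)/2 / 2^64 for the chance of at least one collision among n keys, which holds under the assumption that the 64-bit result behaves like an ideal, uniformly distributed hash.

```python
import hashlib


def to_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a signed bigint."""
    return value - (1 << 64) if value >= (1 << 63) else value


def md5_xor_fold_to_bigint(text_input_context: str) -> int:
    """XOR the upper and lower 64 bits of the MD5 digest into one bigint."""
    digest = hashlib.md5(text_input_context.encode("utf-8")).digest()  # 16 bytes
    upper = int.from_bytes(digest[:8], "big")
    lower = int.from_bytes(digest[8:], "big")
    return to_signed64(upper ^ lower)


def approx_collision_probability(n_keys: int) -> float:
    """Birthday approximation for an ideal 64-bit hash (an assumption)."""
    return n_keys * (n_keys - 1) / 2 / 2.0**64


if __name__ == "__main__":
    print(md5_xor_fold_to_bigint("public.customer|42"))
    print(approx_collision_probability(1_000_000))
```

Under that uniformity assumption, the estimate is the same for the truncated and the folded variant, since both produce 64 bits.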
Regarding the comments advising to use SELECT ... FOR UPDATE, SELECT FOR UPDATE NOWAIT or SELECT FOR UPDATE SKIP LOCKED: please note that these won't solve my specific problem.
The SELECT commands are issued internally by a DB access component, and I have little to no chance of changing how the component issues these statements against the database.
What I need is to indicate in the GUI that a particular table row is locked, while still showing all of its information in the form.
PS: Maybe this question would fit better on SE Database Administrators, but in the end it's more about the algorithmic approach, IMO.
PPS: In case it matters, the applications are written in Delphi, using the Devart UniDAC components.
1) The problem occurs in a context where we are trying to replace an ISAM (flat file) database system (ADS) with PostgreSQL. Row locking in the original system is implicit: a row is locked while an application process (session) holds a cursor on it for updating.
2) Of course the text context value would be passed as a prepared query parameter, so I am aware of the dangers of SQL injection and other unexpected syntax issues.