Purpose of hashed IP address


  • Purpose of hashed IP address

    What's the purpose of the "Hashed IP Address" that gets appended to every post? It seems to me that this information can be used to connect users of this site to tracker accounts, which would be very bad. A 32-bit IPv4 address is not long enough to resist brute-force attacks unless combined with a unique salt (that isn't public) and/or many iterations. More information about this hash would be appreciated.

  • #2
    It seems you already know why this is used but to be clear...

    The site does not permit users to hide their IP address behind TOR, proxies or other such mechanisms. The hash provides a way to see that the user's IP address is not changing rapidly, cycling on a round-robin or being used by another member at the same time. The hash is used to prevent the direct publication of the IP address.

    A decent hash has a couple of important properties.

    Firstly, it is unique across its data space. That is to say: it is vanishingly unlikely that two IP addresses will create the same hash. For this purpose, a space of about 4 billion hashes would be required for IPv4. For IPv6, there are roughly 3.4 × 10^38 (2^128) possible addresses. Producing a 160-bit hex hash (as this site does) is clearly adequate for this purpose, as IPv6 addresses are 128 bits in length.

    Secondly, it should be impractical to recreate the original data from the hash without specific knowledge of the workings of the original hashing algorithm. In practice, this means it should be computationally expensive (in CPU hours) to get a user's original IP address from the hash. Also, no element of the hash should reveal information about the original data. This is pretty simple to achieve by using open-source algorithms, which receive a lot of attention from security researchers.
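
For a rough sense of scale on the first property, here is a minimal sketch of the birthday bound for hashing every possible IPv4 address into 160 bits. It assumes the hash behaves like a uniformly random function, which is an idealisation; the function name `collision_bound` is hypothetical and nothing here reflects how the site actually computes its hash.

```python
from fractions import Fraction
import math


def collision_bound(num_inputs: int, hash_bits: int) -> Fraction:
    """Birthday (union) bound: at most C(n, 2) / 2^bits chance of any collision.

    Assumes an idealised, uniformly random hash function.
    """
    pairs = num_inputs * (num_inputs - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)


# Every possible IPv4 address hashed into a 160-bit output.
ipv4_bound = collision_bound(2 ** 32, 160)
print(f"log2 of the collision bound: {math.log2(float(ipv4_bound)):.1f}")
```

Under that assumption the bound is below 2^-97, so collisions among IPv4 addresses are not a practical concern. Note that this says nothing about the second property: a hash can be collision-free and still be trivially reversible over a small input space.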

    As long as the hash computation specifics remain unknown, there isn't much side-channel information that is disclosed. If a given site is monitoring this one and they see that your IP address (seen in their own logs) changes at the same time as the hash of a user here then they may well be able to make that correlation. There's nothing stopping you from using TOR or a proxy (or a seedbox) to access those sites thus breaking that chain pretty simply.

    Personally, I'm a LOT more concerned that this site STILL DOESN'T USE HTTPS. If you want to worry about anything, try that for a starter. Simple session replay from captured packets ought to be super simple. So don't put anything too personally identifying here, don't reuse any passwords/email addresses and hope to <deity of choice> that their servers are secured better than the traffic going to/from them...


    • #3
      Originally posted by Clarissa
      It seems you already know why this is used but to be clear...

      The site does not permit users to hide their IP address behind TOR, proxies or other such mechanisms. The hash provides a way to see that the user's IP address is not changing rapidly, cycling on a round-robin or being used by another member at the same time. The hash is used to prevent the direct publication of the IP address.
      Right, but why does the hash need to be made public at the end of every post? For the purpose you're describing, a private log only accessible to admins would do just as well.

      Originally posted by Clarissa

      A decent hash has a couple of important properties.

      Firstly, it is unique across its data space. That is to say: it is vanishingly unlikely that two IP addresses will create the same hash. For this purpose, a space of about 4 billion hashes would be required for IPv4. For IPv6, there are roughly 3.4 × 10^38 (2^128) possible addresses. Producing a 160-bit hex hash (as this site does) is clearly adequate for this purpose, as IPv6 addresses are 128 bits in length.

      Secondly, it should be impractical to recreate the original data from the hash without specific knowledge of the workings of the original hashing algorithm. In practice, this means it should be computationally expensive (in CPU hours) to get a user's original IP address from the hash. Also, no element of the hash should reveal information about the original data. This is pretty simple to achieve by using open-source algorithms, which receive a lot of attention from security researchers.
      That does not answer my concern. First of all, Kerckhoffs's principle states that "a cryptosystem should be secure even if everything about the system, except the key, is public knowledge". So if the security of the hash is dependent on implementation-specific details, then it's not secure. In other words, it should be infeasible to recreate the original data from the hash even if one has full knowledge of every detail of the hash algorithm used, except for user-unique salts (keys), which should be long and random.

      The hash is 160 bits, so it's most likely SHA1. Brute-forcing the SHA1 hashes of all 2^32 (4 billion) IPv4 addresses is nothing, unless there are additional mitigations such as multiple iterations and salting, which is what I'm asking about. Are there any?

      Also for a site operator with a list of its own users' IP addresses who wants to check if anyone is a member here, IPv4 or IPv6 doesn't matter since he only needs to hash the addresses on the list.
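
To make the two attacks described above concrete, here is a minimal sketch. It assumes the worst case being asked about: a single, unsalted SHA-1 over the dotted-decimal ASCII form of the address. That is an assumption for illustration only, not a confirmed detail of this site. The function names are hypothetical and the addresses come from the reserved documentation ranges.

```python
import hashlib
import ipaddress
from typing import Dict, Iterable, Optional, Set


def sha1_hex(ip: str) -> str:
    """Unsalted SHA-1 of the dotted-decimal string (assumed scheme)."""
    return hashlib.sha1(ip.encode("ascii")).hexdigest()


def brute_force(target_hash: str, network: str) -> Optional[str]:
    """Try every address in `network` until one hashes to `target_hash`."""
    for addr in ipaddress.ip_network(network):
        candidate = str(addr)
        if sha1_hex(candidate) == target_hash:
            return candidate
    return None


def match_known_addresses(public_hashes: Set[str],
                          known_ips: Iterable[str]) -> Dict[str, str]:
    """Hash a list of already-known addresses and report which ones appear."""
    matches = {}
    for ip in known_ips:
        digest = sha1_hex(ip)
        if digest in public_hashes:
            matches[digest] = ip
    return matches


if __name__ == "__main__":
    posted = sha1_hex("192.0.2.77")  # stands in for a hash shown under a post
    print(brute_force(posted, "192.0.2.0/24"))
    print(match_known_addresses({posted}, ["203.0.113.5", "192.0.2.77"]))
```

The second function is the reason the size of the address space stops mattering: someone who already holds a list of addresses only hashes that list, whether the entries are IPv4 or IPv6. A secret salt or key would defeat both functions as written.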

      Originally posted by Clarissa

      As long as the hash computation specifics remain unknown, there isn't much side-channel information that is disclosed. If a given site is monitoring this one and they see that your IP address (seen in their own logs) changes at the same time as the hash of a user here then they may well be able to make that correlation. There's nothing stopping you from using TOR or a proxy (or a seedbox) to access those sites thus breaking that chain pretty simply.
      I'm not too concerned about side-channels, but many other sites also don't allow Tor and proxies for the same reason, so yeah, there is something stopping me from doing that.

      Originally posted by Clarissa

      Personally, I'm a LOT more concerned that this site STILL DOESN'T USE HTTPS. If you want to worry about anything, try that for a starter. Simple session replay from captured packets ought to be super simple. So don't put anything too personally identifying here, don't reuse any passwords/email addresses and hope to <deity of choice> that their servers are secured better than the traffic going to/from them...
      Actually, I connect to the site over HTTPS all the time, although some elements are still plain HTTP (mixed content). Still, doesn't help when its database has been hacked multiple times...


      • #4
        You seem to be confusing cryptography and hashing a little, but never mind. If we assume only IPv4 is being hashed, the message space and hash space are sufficiently different in size to pretty much guarantee a lack of collisions from legitimate data. This is the purpose: to identify identical and changing IP addresses without disclosing the address. Job done.

        You assume that the algo is SHA-1. There are plenty that can produce 160-bit hashes. But it most likely is.

        SHA-1 is deprecated for cryptographic use, not hashing. It remains extremely useful (and quick) for the latter. The attacks it is susceptible to are pretty much irrelevant to the data it is used for here.

        If you want to know whether the data is salted, feed your IP address into an online SHA-1 calculator and see if the output matches the hash attributed to your account. Although SHA-1 doesn't directly support a salt (or a key, it's for hashing, remember?) the data space is so small even for IPv6 that you could easily append/prefix the data with the same effect. If this is done, there is no direct way to tell if the salt used is constant for all users or generated individually for each one.

        You may need to try a few different hashes, though: ASCII hash of the address, Unicode, binary, ASCII with no dots, etc. Have fun...
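
The check suggested above can be done locally instead of with an online calculator. This is a minimal sketch, assuming a single round of plain SHA-1; the set of encodings is a guess at what "a few different hashes" might cover, and the function names are hypothetical.

```python
import hashlib
import ipaddress
from typing import Dict, List


def candidate_encodings(ip: str) -> Dict[str, bytes]:
    """Byte representations of an IPv4 address that a site might plausibly hash."""
    addr = ipaddress.ip_address(ip)
    return {
        "ascii dotted": ip.encode("ascii"),
        "utf-16-le dotted": ip.encode("utf-16-le"),
        "packed binary": addr.packed,
        "ascii no dots": ip.replace(".", "").encode("ascii"),
        "ascii integer": str(int(addr)).encode("ascii"),
    }


def find_matching_encodings(ip: str, displayed_hash: str) -> List[str]:
    """Return the names of encodings whose SHA-1 equals the displayed hash."""
    displayed = displayed_hash.strip().lower()
    return [
        name
        for name, data in candidate_encodings(ip).items()
        if hashlib.sha1(data).hexdigest() == displayed
    ]


if __name__ == "__main__":
    # Stand-in for the hash shown under your own posts.
    shown = hashlib.sha1(ipaddress.ip_address("192.0.2.1").packed).hexdigest()
    print(find_matching_encodings("192.0.2.1", shown))
```

An empty result does not prove a salt is in use: it could equally mean a different algorithm, an encoding not listed here, or more than one iteration.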


        • #5
          Originally posted by Clarissa
          You seem to be confusing cryptography and hashing a little, but never mind. If we assume only IPv4 is being hashed, the message space and hash space are sufficiently different in size to pretty much guarantee a lack of collisions from legitimate data. This is the purpose: to identify identical and changing IP addresses without disclosing the address. Job done.

          You assume that the algo is SHA-1. There are plenty that can produce 160-bit hashes. But it most likely is.

          SHA-1 is deprecated for cryptographic use, not hashing. It remains extremely useful (and quick) for the latter. The attacks it is susceptible to are pretty much irrelevant to the data it is used for here.

          If you want to know whether the data is salted, feed your IP address into an online SHA-1 calculator and see if the output matches the hash attributed to your account. Although SHA-1 doesn't directly support a salt (or a key, it's for hashing, remember?) the data space is so small even for IPv6 that you could easily append/prefix the data with the same effect. If this is done, there is no direct way to tell if the salt used is constant for all users or generated individually for each one.

          You may need to try a few different hashes, though: ASCII hash of the address, Unicode, binary, ASCII with no dots, etc. Have fun...
          Excuse me, but you're the one who seems to be confused. Also, I find your tone a bit patronizing, which is ironic since you get several things wrong. My concern here is not with hash collisions. Nor is it with SHA1 being deprecated. My concerns are:

          1. What are the specifics of the hash? Input, salt, pepper, iterations, etc. I'm not saying the hash is vulnerable/bad, I'm saying there are many ways this could be done right as well as done wrong, and I'd like to know which it is. Judging by your answer, you don't know for sure either.

          2. Why is the hash made public when it doesn't need to be? Of course, if the hash is done right it shouldn't matter. But is it?

          Your answer is a perfect example of mansplaining (although I'm not a woman) and does nothing to answer my actual inquiry. And FYI, any hash function "supports" a salt: appending the salt to the data before hashing is the very definition of a salt. We are not talking about HMACs here.

          If you're going to mansplain cryptography to me, at least get it right.
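
For readers following the salt versus HMAC distinction, here is a minimal sketch of the three constructions mentioned in this thread: plain salting by concatenation, a keyed HMAC, and an iterated variant. PBKDF2 is swapped in as one standard way to get "many iterations"; nothing in the thread says the site uses any of these, and the function names and parameters are assumptions for illustration.

```python
import hashlib
import hmac
import secrets


def salted_sha1(ip: str, salt: bytes) -> str:
    """Plain salting: append the salt to the data, then hash once."""
    return hashlib.sha1(ip.encode("ascii") + salt).hexdigest()


def hmac_sha1(ip: str, key: bytes) -> str:
    """Keyed construction (HMAC), which is distinct from simple salting."""
    return hmac.new(key, ip.encode("ascii"), hashlib.sha1).hexdigest()


def stretched_sha1(ip: str, salt: bytes, iterations: int = 100_000) -> str:
    """Salt plus many iterations, using PBKDF2-HMAC-SHA1 from the standard library."""
    return hashlib.pbkdf2_hmac("sha1", ip.encode("ascii"), salt, iterations).hex()


if __name__ == "__main__":
    secret = secrets.token_bytes(16)  # long, random, and kept private
    print(salted_sha1("192.0.2.1", secret))
    print(hmac_sha1("192.0.2.1", secret))
    print(stretched_sha1("192.0.2.1", secret))
```

All three produce 160-bit output, so the length of the displayed hash alone cannot tell them apart. One design tension worth noting: a salt that is unique per user would give two members on the same address different hashes, which would defeat the comparison purpose described in #2, so a single site-wide secret seems the more plausible arrangement if any secret is used at all.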
