D issues are now tracked on GitHub. This Bugzilla instance remains as a read-only archive.
Issue 9771 - Remove toHash from Object
Summary: Remove toHash from Object
Status: NEW
Alias: None
Product: D
Classification: Unclassified
Component: druntime (show other issues)
Version: D2
Hardware: All All
: P4 enhancement
Assignee: No Owner
URL:
Keywords:
Depends on:
Blocks: 1824
  Show dependency treegraph
 
Reported: 2013-03-21 00:03 UTC by Jonathan M Davis
Modified: 2024-12-07 13:32 UTC (History)
3 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description Jonathan M Davis 2013-03-21 00:03:37 UTC
Per this thread http://forum.dlang.org/post/jtlj1k$1fdj$1@digitalmars.com , it was decided to try and go the path of removing opEquals, opCmp, toHash, and toString from Object, as they don't need to be there, and the fact that they're there is causing issues with attributes such as const ( issue# 1824 ).

This issue is specifically for toHash so that any work towards making it unnecessary and removing it can be referenced here, whereas the other 3 have separate bugzilla entries, as the work for them isn't necessarily related to each other. opEquals, opCmp, and toHash in particular may require substantial work to be done on the built-in AAs before they can actually be removed from Object, as the AA implementation is not currently templated like it should be.
Comment 1 Jonathan M Davis 2013-03-21 00:05:52 UTC
Related:

opEquals: issue# 9769
opCmp: issue# 9770
toString: issue# 9772
Comment 2 hsteoh 2013-03-22 18:50:04 UTC
This probably can't be implemented until we replace AA's with a library template, since otherwise it would make it impossible for e.g. classes to be used as AA keys.

Once we have the library type, though, we can probably make use of UFCS to provide default implementations of toHash for all types.
Comment 3 Jonathan M Davis 2013-03-22 19:09:41 UTC
> This probably can't be implemented until we replace AA's with a library
> template

I figured that that was probably the case, and that will certainly slow this down, but it still needs to be done.

> Once we have the library type, though, we can probably make use of UFCS to
> provide default implementations of toHash for all types.

I'd be very much against providing a UFCS toHash function which used compile-time reflection to figure out how to generate a hash for an arbitrary object. That's just begging for trouble, particularly since the behavior of opEquals and toHash need to match. They need to be controlled by the object itself and not 3rd party code. Providing a mixin to which does that would definitely be useful for avoiding boilerplate, but then it's properly under the control of the person writing the class or struct.

But providing toHash functions for built-in types via CTFE would certainly be good.
Comment 4 hsteoh 2013-03-22 19:28:25 UTC
I wouldn't go as far as using introspection to make a default toHash, but I think there's value in providing a default toHash for classes that just hashes the object's address (I think that's what the current toHash does).

But if we're going to be forcing users to manually define opEquals and toHash, then we should make rt.util.hash available to user code. The only thing worse than having a wrong default opEquals/toHash is a morass of duplicated user code implementing poorly-designed hashing algorithms for each user-defined type.
Comment 5 hsteoh 2013-03-22 19:30:38 UTC
Also, if the default behaviour of == for structs is to do bitwise comparison, then I think it justifies defining toHash for structs that just hashes the bit representation of the struct. It certainly beats having to define toHash for every single little helper struct that you write, just so you can use it in an AA. (That's what really turned me off in C++11's unordered_map -- so much verbosity for something that's supposed to be simple.)
Comment 6 hsteoh 2013-03-22 19:32:00 UTC
P.S. I meant *default* toHash for structs, of course. Obviously the user should be able to override the default behaviour (hence the usefulness of using UFCS for toHash, since a member function always takes precedence over a UFCS function).
Comment 7 Jonathan M Davis 2013-03-22 20:29:03 UTC
Whether it makes sense or not to use the bit representation for hashing depends on the struct, as according TDPL, bitwise comparison is _not_ supposed to be the default for structs (though as bug# 3789 indicates, that's currently buggy). However, it wouldn't be all that hard to come up with a default toHash which corresponded to the default opEquals by having it simply use the toHash methods of the member variables to generate an overall hash. So, we could definitely make it so that toHash is generated for structs as long as opEquals hasn't been defined.
Comment 8 Jacob Carlborg 2013-03-23 07:34:51 UTC
(In reply to comment #4)
> I wouldn't go as far as using introspection to make a default toHash, but I
> think there's value in providing a default toHash for classes that just hashes
> the object's address (I think that's what the current toHash does).

I think that will conflict with having a moving garbage collector.
Comment 9 hsteoh 2013-03-23 08:20:16 UTC
(In reply to comment #8)
> I think that will conflict with having a moving garbage collector.

Good point, maybe the correct approach is to hash the respective hashes of each member of the class?

@Jonathan: I agree, the default toHash should not be generated if the class/struct defines opEquals. It's the user's responsibility to keep the two consistent. So basically, the default toHash should correspond to whatever == corresponds with by default. If == is bitwise comparison, then toHash also should be bitwise hash, etc..
Comment 10 Jacob Carlborg 2013-03-23 08:33:35 UTC
(In reply to comment #9)
> Good point, maybe the correct approach is to hash the respective hashes of each
> member of the class?

Jonathan didn't seem to like that idea.
Comment 11 Jonathan M Davis 2013-03-23 12:09:46 UTC
> Jonathan didn't seem to like that idea.

Given that the default opEquals is supposed to compare each member variable using ==, I think that it would make good sense for the default toHash to call toHash on each member variable and generate a new hash from those. What doesn't make sense is blindly using reflection to go over all of member variables and generating hashes for them regardless of how they're implemented. Some folks in the newsgroup liked the idea of using external functions to use reflection to go through a type's member variables recursively to generate a hash, and given how opEquals and toHash are related, I think that that's an extremely bad idea, and that's the main thing that I'm objecting to. But if we're talking about a default-generated toHash that's only generated when opEquals is default-generated, then we know some things about what opEquals looks like and can generate a toHash which correctly corresponds to it, which would generally mean calling toHash on each member variable and generating a new hash from that - though if the compiler decided that it could do a bitwise comparison of the struct (e.g. because all of the member variables were integral types), then generating a hash in a similar manner would make sense.
Comment 12 Jacob Carlborg 2013-03-23 15:29:36 UTC
(In reply to comment #11)

> Given that the default opEquals is supposed to compare each member variable
> using ==, I think that it would make good sense for the default toHash to call
> toHash on each member variable and generate a new hash from those. What doesn't
> make sense is blindly using reflection to go over all of member variables and
> generating hashes for them regardless of how they're implemented. Some folks in
> the newsgroup liked the idea of using external functions to use reflection to
> go through a type's member variables recursively to generate a hash, and given
> how opEquals and toHash are related, I think that that's an extremely bad idea,
> and that's the main thing that I'm objecting to. But if we're talking about a
> default-generated toHash that's only generated when opEquals is
> default-generated, then we know some things about what opEquals looks like and
> can generate a toHash which correctly corresponds to it, which would generally
> mean calling toHash on each member variable and generating a new hash from that
> - though if the compiler decided that it could do a bitwise comparison of the
> struct (e.g. because all of the member variables were integral types), then
> generating a hash in a similar manner would make sense.

I see, so only when opEquals is default generated.
Comment 13 Jonathan M Davis 2013-03-23 19:12:53 UTC
> I see, so only when opEquals is default generated.

Yes. We could provide helper functions or mixins to make it easier to write hash functions if you define opEquals (or on classes which won't have any kind of default opEquals - let alone the useless one that they have now), but as soon as opEquals has been defined by the programmer, it has to be up to the programmer to define toHash correctly, as in order for hashing to work correctly, every object which is equal must have the same hash. And if the programmer defined opEquals, then the compiler will no longer have enough information to define a toHash which guarantees that all equal objects will have the same hash.
Comment 14 Martin Nowak 2013-06-03 08:15:21 UTC
(In reply to comment #2)
> This probably can't be implemented until we replace AA's with a library
> template, since otherwise it would make it impossible for e.g. classes to be
> used as AA keys.

That very simple to solve, let the compiler pass two functions (hash, equals) to the AA implementation instead of RTTI.

> Once we have the library type, though, we can probably make use of UFCS to
> provide default implementations of toHash for all types.

I would not want to bet that this will happen.
The current situation where the AA is partly implemented in the compiler and in the runtime is unfortunate to say the least.
Also performance-wise a generic AA implementation will never be on par with one that's suited for a special purpose. So it makes a lot of sense to have a lean builtin implementation and provide more configurable ones in std.container.
Comment 15 Jonathan M Davis 2013-10-13 16:17:28 UTC
Relevant discussion on how to transition away from having these functions on Object (the thread got broken up a bit - probably by the mailman bug):

http://forum.dlang.org/post/mailman.214.1369617617.13711.digitalmars-d@puremagic.com
http://forum.dlang.org/post/mailman.280.1369712394.13711.digitalmars-d@puremagic.com

The key suggestion is to change the compiler so that those 4 functions can be free functions in object_.d, similar to how the compiler already does special stuff for == rather than simply calling Object's opEquals.
Comment 16 dlangBugzillaToGithub 2024-12-07 13:32:29 UTC
THIS ISSUE HAS BEEN MOVED TO GITHUB

https://github.com/dlang/dmd/issues/17252

DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB