HHSmithy: Datamining is a Tool to Advance Ethical Behavior in Online Poker
Esther Gibbons, Creative Commons Attribution-NoDerivs License

Last month, Paul Hoppe wrote an article condemning the practice of datamining in online poker. Pokerfuse invited Kader Belbina, founder of the notable datamining site HHSmithy, to write a rebuttal. Below is his response.

Since its inception, online poker has endured countless cheating scandals, security vulnerabilities and poker sites that are either ill-equipped to deal with these issues, not interested in protecting their players, or both. Poker datamining, the practice of recording hand histories without actually playing them, has been under a lot of scrutiny over the years, and even more so since April 15.

By importing datamined hands into personal HUDs and then using that information to game select and make better decisions, online poker players have been able to increase their winrates. The hands can also be used to study opponents and improve one’s own game by looking at winning players’ tendencies. The other area in which datamined hands have been used is uncovering large-scale botting and collusion rings. For example, datamined hands played a valuable role in the now infamous Absolute Poker “superuser” scandal.

In a recent article published on pokerfuse.com, Paul Hoppe came out strongly against datamining. I would like to take this opportunity to offer an alternative perspective to some of the points raised in that article.

So why is there so much controversy around datamined hands?

“It’s a bit like the steroid era in Major League Baseball, where players who declined to bend the rules were at a disadvantage to those who used every edge available.”

People often make the claim that by using hand histories, players are gaining an unfair advantage. Those same people then ignore the hundreds of dollars players spend on buying tracking software, game selection tools, equity calculators, video training site memberships and coaching. If you agree that the use of HUDs and tracking software is acceptable, then it is hard to see why hand histories are any different.

Hand histories are one small part of the puzzle for winning players. Just like other services to improve poker players’ edges, they are available to everyone at a small cost. If you disagree with the use of any poker tools or training software then that is an entirely different argument. But if you agree that those tools are fine then fundamentally, how are hand histories any different?

“If the site declares datamining to be against its rules, then datamining is cheating. Those are the house rules of the game. Breaking the rules of a game is cheating and provides those that do it with an unfair advantage.”

Just because a poker site’s terms of service disallow an activity, that doesn’t mean that the activity is necessarily cheating. What it does mean is that disallowing the activity is in the best interest of the poker site, but not necessarily in the best interest of poker players.

The poker sites have their own motivations for not allowing datamining and those motivations are directly opposite to what a winning online poker player’s motivations should be. Even though things such as card counting aren’t always allowed in casinos, is it “cheating”?

What it all boils down to is rake. The main reason poker sites disallow datamining is so they can rake more. For a poker room operator, the most profitable model is a player pool playing as many slightly losing hands of poker as possible. By using hand histories, players can decide whether the game they are about to join is a winning game or a losing one before playing a single hand.

Another reason many poker sites do not allow datamining is that when they catch bots, collusion rings, cheaters and other such ills, they would rather keep it to themselves than publicize it. No site wants tangible evidence available proving that they aren’t sufficiently policing their site. Without this kind of information though, how are players supposed to know that appropriate refunds and measures are being put in place to deal with the situation? In a world without datamined hands and statistics there is not much difference between someone pointing out a suspicious player and one of the typical and unfounded claims you may hear that online poker random number generators are “rigged”.

In some cases, poker operators have even shown a complete disregard for bots on their sites, as they populate the games and produce rake. There are more reasons why the poker sites’ stance on datamining is not in the best interest of poker players, and I will be writing about those in more detail soon.

Hand histories are private property.

No datamining site observes or records any data that isn’t available to anyone in the world by opening up a poker client. There is no server side hacking and no personally identifiable information is ever revealed such as real names, addresses or IP addresses. In Paul’s anti-datamining article, he draws a line between watching a few tables without recording them and watching all the tables and recording them. To aid his argument he writes:

“It’s like if I say that it’s okay for you to use my lawn furniture but not okay to take it. The pro-mining argument here is like saying that it’s okay to take my lawn chair because it’s not inside my house.”

This is a flawed analogy. A hand history is not a unique item with only one copy and one owner like a lawn chair; it is a recording of an event in time with no owner that is infinitely and freely replicable.

A hand history is also different to digital goods such as movies and music in that there is no copyright on an individual hand history and it costs no money to produce. Ownership has never transferred from one person to another and no breaking and entering or trespassing (server side hacking) was used to record the hand history.

It doesn’t matter that anti-datamining rules are hard to enforce, sites need to do a better job.

If datamining disappeared, players who clocked large volumes of hands and shared their databases with one another would gain a huge edge. This would be impossible to detect and stop, and would further widen the gap between groups of highly skilled pro poker players and the rest of the poker community (casual players, weekend warriors, part-time pros etc.). Datamining sites can actually help to level the playing field in this way.

Hand histories have helped catch cheaters but this doesn’t make it acceptable.

Whether buying and using hand histories is fair will end up being a personal decision, similar to the decision about whether HUDs and tracking software are fair. However, unlike other poker software, datamining can and does provide a clear benefit to the community. In this unregulated environment the only way to catch cheaters, botters, colluders, and other problems is by datamining. Is it better to completely stop datamining and run the risk of wide-scale insider cheating, collusion rings, bot teams, unsecured software, and other ills, or to allow datamining to continue and for regulars who utilize these hands to rake less and profit more? What Paul would really like to see instead of datamining sites is:

An independent security and auditing firm.

He suggests, “What if every regulated poker site sent every hand history and all player data to an independent security firm? This company would need to be comprised of extremely intelligent and trustworthy former poker players capable of analyzing reams of data to ferret out cheaters of all types. Information would be kept private and secure, but there would be a level of transparency in investigations.”

This would clearly be a great thing to happen for the industry and especially for poker players, but what Paul fails to address is who is going to pay for this. Within a regulated environment there needs to be legislation which mandates sites have to allow for independent auditing of their hand histories and poker software for irregularities, and also pay for it to be done.

Unfortunately, no poker site is going to want to pay for such services, especially when certain things such as botting might actually be beneficial to a poker operator. Judging by the current legislative talks, it also seems that a service like this isn’t a priority.

It takes a lot of time, money and expertise to datamine and the only way to pay for it in the current environment is through hand history sales. At HHSmithy we have over 500 computers which cost significant amounts of money to maintain and power. We have spent thousands of hours writing software to track and store ten gigabytes of hand histories daily, and constantly have to update our software to keep up with new sites and patches.

There is no regulation in place and no poker companies who are willing to pay the money for a group of independent people to datamine and analyze the data for the greater good; the only way for this service to currently be funded is through hand history sales.

At HHSmithy we believe that dataminers are in a unique situation to really start making some positive changes in the poker industry. Datamining sites could definitely be more proactive in helping the community, and despite a current lack of funding for such activities, we are moving toward an independent security and auditing firm.

We datamine over ten million hands daily and have the expertise to analyze poker clients for potential security vulnerabilities. In fact, we have just partnered with a large security company to begin an independent security and auditing firm, PokerSec.org. We also plan to start an initiative at HHSmithy:

Ethical Datamining

In order to move toward a safer online poker environment, and to continue to show our commitment to online poker players, HHSmithy will begin doing the following:

  • Ongoing poker software and security analysis in order to reveal vulnerabilities and security holes, as in our series of blog posts on Bodog security.
  • Investigation into what the poker sites really do on your computer and the often invasive and unwarranted breaches of your privacy.
  • Developing automated and open-source algorithms to detect botting, collusion, super-users, hacking and other forms of cheating.
  • Automated, daily, and detailed RNG analysis.
  • Detailed statistical analysis on various rake structures and how they impact players and the community.
  • Regular publishing of obfuscated and aggregate poker data so smarter statisticians than us can investigate site fairness.
  • Opening a data platform as an API to the entire community.
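
To give a rough idea of what an automated RNG check from the list above could look like, here is a minimal sketch in Python. It is only an illustration, not HHSmithy's actual tooling. It assumes that one publicly visible card per hand (for example, the first flop card) has already been extracted into a list of two-character codes such as "Ah"; the names `DECK` and `chi_square_uniformity` are made up for this example. It runs a chi-square goodness-of-fit test against a uniform 52-card distribution and approximates the p-value with the Wilson-Hilferty normal approximation so that only the standard library is needed.

```python
import random
from collections import Counter
from statistics import NormalDist

RANKS = "23456789TJQKA"
SUITS = "cdhs"
# Hypothetical card encoding: rank followed by suit, e.g. "Ah" or "7c".
DECK = [rank + suit for rank in RANKS for suit in SUITS]


def chi_square_uniformity(cards):
    """Return (chi-square statistic, approximate p-value) for card frequencies.

    `cards` should hold one observed card per hand so that the observations
    are close to independent (an assumption of this sketch).
    """
    if not cards:
        raise ValueError("need at least one observed card")
    counts = Counter(cards)
    unknown = set(counts) - set(DECK)
    if unknown:
        raise ValueError(f"unrecognised card codes: {sorted(unknown)}")

    expected = len(cards) / len(DECK)  # same expected count for every card
    stat = sum((counts.get(card, 0) - expected) ** 2 / expected for card in DECK)

    # Wilson-Hilferty approximation: (stat / dof) ** (1/3) is roughly normal.
    dof = len(DECK) - 1
    mean = 1 - 2 / (9 * dof)
    sd = (2 / (9 * dof)) ** 0.5
    z = ((stat / dof) ** (1 / 3) - mean) / sd
    p_value = 1 - NormalDist().cdf(z)
    return stat, p_value


if __name__ == "__main__":
    # Simulated data standing in for real datamined hands.
    rng = random.Random(0)
    sample = [rng.choice(DECK) for _ in range(100_000)]
    stat, p_value = chi_square_uniformity(sample)
    print(f"chi-square = {stat:.1f}, approximate p-value = {p_value:.3f}")
```

A very small p-value on a large sample would suggest that card frequencies deviate from uniform and deserve a closer look. It would not prove a site is rigged on its own, and a frequency test like this says nothing about correlations between cards or across hands.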

I’d love to get a constructive discussion going on what more we can do to help the online poker community so please don’t hesitate to contact me.