Sunday, December 22, 2013

The Difficulties of Making It Worth Its Points

Greetings Gentlereaders,

One of the biggest questions in talking about units in 40k, especially in light of our recent podcast, is how to make units "worth their points."  From a design perspective, this is a much more difficult task than most of us would probably think.  Listen to how our discussion grew beyond its stated bounds and began re-designing the whole game from the ground up.  Thinking more about the basic mechanics of the game leads me to think the statistic system of the game makes calculations of value damnably complex.  Hopefully you'll be willing to go through why 40k is so frustrating from a mathematical perspective and how that makes designing 40k such a difficult task.  If you're not up to delving into it with me, let's just say that the core rulebook's rules cause drastic shifts in the value of weapons and armor from target to target.  That means that making a rule for determining what a stat point is worth is possible, but quite complex.

--Beginning of Math--

Okay, so the first thing we need to look at is our assumptions of what makes a unit worth its points.  A unit needs to have a certain amount of strength, toughness, leadership and an appropriate armor save for its points.  When we look at units we can compare the dire avenger to the tactical marine and see a lot of differences in statistics and rules, and that's without looking at the supplementary units that each has access to.  That's a mind-boggling question that I might go into later, but for now let's look just at the units in a vacuum.  Would the dire avenger be 'worth its points' if it went up one point per model and took on the statline of the tactical marine?  The problem with figuring that out is that we are all assuming there is some formula to determine how much each stat point and weapon characteristic is worth in terms of points.  The problem is that that formula, as far as I can tell, doesn't exist, so let's solve that.

The most basic aspect of the game to look at mathematically is damage taken and dealt.  Each of these combines WS/BS, model's/weapon's S and model's/weapon's AP values and compares them to the target's T, armor/invulnerable/cover save and possibly FnP to give us a probability that any given shot will result in a lost wound.  Again, we're leaving out things like multiwound models and instant death for simplicity's sake, so this isn't as complex as it could be.  That being said we've still got an equation with six variables to work with. Let's start building this equation with shooting.

We know that the rule for rolling to hit is that if your d6+BS is at least seven you hit.  That works really well for most models, who have BS 1-5, because you simply have a (BS)/6 probability to hit.  Sadly, we can't just move on to evaluating strength if we want a complete formula.  For simplicity's sake we could lump all BS higher than five into five, but then we're giving some models 'free stats' because, rare as it is, there are times a BS 8 model will hit that a BS 5 model won't.  Alright, let's try to keep this moving.

Looking at strength, we know that a simple fraction won't work because we need to know the target's toughness to know anything about whether our strength value is useful (e.g. S3 shooting at T4 or T7 makes a huge difference).  Okay, we know we need a number over six because we're using a d6 to roll: (X)/6.  Now we need our variables inserted: (S-T)/6.  Well, that won't work because that means centurions can't wound each other.  So we need some constant in the mix to make it work out: (S-T+C)/6.  Being as we know centurions wound each other on 4+, let's add in a three and see if that holds in our one case: (S-T+3)/6.  Let's start with bolters: (4-5+3)/6 = 2/6 = 0.333.  Good, it holds, and does so with strength five through seven weapons, too.  The problem comes in with strength eight and above weapons.  The rules cap the chance to wound at 5/6 (about 0.83, which we reach at strength seven), but our equation gives 6/6 at strength eight, which means a strength eight weapon would always wound and higher strength weapons would generate more than one wound per shot.  Looking the other way, we get that our equation holds for strength three weapons, but for strength two we get (2-5+3)/6 = 0/6 and we know that doesn't work.  We should get 1/6 by game rules.  Strength one weapons should have a 0/6 chance, but our equation gives us a negative chance of wounding and that doesn't work either.

I fall to pieces...
Looking into the 'fancy mathematics' of the problem, we do have a solution.  If you remember questions like 'solve for Y if Y = 8x+7' you know what a function is.  You plug something in, mess with it and get something else out.  Here we're putting in more than one variable (something that can change, e.g. S, T, Sv), and we don't have a single rule that works for each roll we're trying to calculate for any values we want to put in.  So we need to get into another type of function called a piecewise function, which essentially says that we set up different rules for different values of our variables.  Looking at our simplest test, rolling to hit, we can and already did construct a simple rule for when our ballistic skill is 1-5.  Now we need to write one for the rest of the scale (except zero): if BS > 5, the probability is (5 + (BS - 5)/6)/6.  If we combine that with our original rule for ballistic skill we get one function that applies to all values of BS.

Probability of hitting
dependent on firer's BS
if BS = [0, 1, 2, 3, 4, 5], then (BS)/6
if BS = [6, 7, 8, 9, 10], then (5 + (BS - 5)/6)/6

We get the chart to the left to display the probability of hitting based on ballistic skill.  As you can see, we have a steady return from increasing ballistic skill until we hit BS5, after which the slope drops significantly.  This shows us that, even without considering the prevalence of things like overwatch, not all points of ballistic skill are created equally.
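
For anyone who would rather check the two-part hit rule by machine, here is a minimal Python sketch of it.  The function name hit_probability and the use of exact fractions are illustrative choices only; the two branches are the ones listed above.

```python
from fractions import Fraction


def hit_probability(bs: int) -> Fraction:
    """Chance that a single shot hits, using the two-part rule from the post.

    The name and the 0-10 range check are illustrative assumptions.
    """
    if not 0 <= bs <= 10:
        raise ValueError("BS is expected to be between 0 and 10")
    if bs <= 5:
        # BS 0-5: plain BS/6
        return Fraction(bs, 6)
    # BS 6-10: (5 + (BS - 5)/6) / 6
    return (5 + Fraction(bs - 5, 6)) / 6


if __name__ == "__main__":
    previous = Fraction(0)
    for bs in range(11):
        p = hit_probability(bs)
        # The gain column shows what the extra point of BS actually bought.
        print(f"BS {bs:2d}: {float(p):.3f}  (gain {float(p - previous):.3f})")
        previous = p
```

Printing the gain per point makes the drop in slope after BS 5 visible without the chart: each point up to five adds 1/6, and each point after that adds only 1/36.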

Then we do something similar with strength and toughness, but our function needs to become more complex because strength doesn't mean anything without an opposing toughness value.  Because both strength and toughness factor into our probability of wounding, we have two independent variables and one dependent.  Normally this would be illustrated in three dimensions (S, T, Prob) and look like a sheet billowing in the breeze, but for simplicity's sake let's just use our previous example of shooting at toughness five models to see how different strengths measure up.  For clarity, our function would be:

Probability of wounding a T5 model
dependent on weapon's strength
if (S-T+3)/6 is less than 0, then prob = 0
if (S-T+3)/6  equals 0, then prob = 1/6
if (S-T+3)/6 is greater than or equal to 5/6, then prob = 5/6
otherwise prob =  (S-T+3)/6

Just like with ballistic skill we can see that the value of each point of strength isn't created equally.  In our example, each point of strength above seven is valueless as it doesn't increase our chance of wounding.  If we were looking at a multiwound model, then strength ten would have some value, if and only if the model had multiple wounds remaining, but strength eight and nine would still be worthless.
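
The wounding rule can be sketched the same way.  This is only a rough Python version of the four cases listed above; the function name wound_probability is made up for illustration, and letting toughness vary (rather than fixing it at five) is an assumption that the same pieces apply to other toughness values.

```python
from fractions import Fraction


def wound_probability(strength: int, toughness: int) -> Fraction:
    """Chance that a hit wounds, using the piecewise rule from the post.

    Illustrative sketch; ignores instant death and multiwound models.
    """
    raw = Fraction(strength - toughness + 3, 6)
    if raw < 0:
        return Fraction(0)        # too weak to wound at all
    if raw == 0:
        return Fraction(1, 6)     # still wounds on a six
    if raw >= Fraction(5, 6):
        return Fraction(5, 6)     # capped: a one always fails
    return raw


if __name__ == "__main__":
    # Reproduce the toughness five example from the text.
    for s in range(1, 11):
        print(f"S{s:2d} vs T5: {wound_probability(s, 5)}")
```

Against T5 the output stays flat at 5/6 from strength seven upward, which is the "valueless" stretch described above.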

Unlike the ballistic skill question, we can't answer how much a point of weapon's strength (or model's toughness/AV) is worth without knowing the prevalence of its opposite.  In a world of T5, S1 is useless, and S2 and S3 are exactly the same, but that isn't how things are.  There's a wide spectrum of units available, each with their own values, and that's without thinking about how many of each unit would be taken by a player.  For instance, does the existence of howling banshees affect the value of a heavy bolter as much as the existence of dire avengers?  I would argue not, simply because dire avengers are more likely to be taken and are more necessary to winning most games than banshees are.  Without knowing how prevalent units will be in actual games we can't put a prevalence value on stats to weight them and determine the value of a point of the opposing stat.
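
To make the weighting idea concrete, here is a small Python sketch of what the calculation could look like once prevalence data exists.  The toughness shares below are invented purely for illustration; they are not real tournament or army-list data, and the function names are hypothetical.

```python
from fractions import Fraction


def wound_probability(strength, toughness):
    """Piecewise to-wound rule from the post (illustrative sketch)."""
    raw = Fraction(strength - toughness + 3, 6)
    if raw < 0:
        return Fraction(0)
    if raw == 0:
        return Fraction(1, 6)
    return min(raw, Fraction(5, 6))


# HYPOTHETICAL share of targets at each toughness value; made-up numbers.
TOUGHNESS_SHARE = {3: Fraction(3, 10), 4: Fraction(5, 10), 5: Fraction(2, 10)}


def expected_wound_probability(strength, toughness_share):
    """Average chance to wound, weighted by how common each toughness is."""
    return sum(share * wound_probability(strength, t)
               for t, share in toughness_share.items())


def marginal_value(strength, toughness_share):
    """Extra expected chance to wound bought by the last point of strength."""
    return (expected_wound_probability(strength, toughness_share)
            - expected_wound_probability(strength - 1, toughness_share))


if __name__ == "__main__":
    for s in range(2, 11):
        print(f"S{s:2d}: marginal gain {marginal_value(s, TOUGHNESS_SHARE)}")
```

Swapping in a different share table changes every marginal value, which is the whole point: in an all-T5 table, marginal_value(3, {5: 1}) comes out to zero, matching the "S2 and S3 are exactly the same" case.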

--End of Math--

So we need to know what actually happens in a game of 40k: how prevalent charges are (and thus overwatch dropping the value of additional points of BS), how often specific units are taken and how prevalent specific stat values are, so we can price opposing stats appropriately.  But we can't forget that how we price things will change how much people take them, and then we need to start the loop all over again.  There are ways to finish this and find an optimal value for each stat, but that's going to take more math than I can stand right now.


  1. A couple thoughts to consider in trying to evaluate adequate points costing:

    1) The math for high str weapons (6+) also needs to account for its ability to punch AV as well as wounds. Otherwise, none of your high quality, low quantity weapon platforms will appear to ever be worth their points (las cannons, krak, etc).

    2) Any serious attempt cannot evaluate or establish proper costing in a vacuum, precisely because, as you noted, the value of the weapon str is entirely dependent on the target's durability (toughness, armor, # of wounds, fnp, etc). So instead of getting to this point and then throwing up your hands, I suggest taking as a basic assumption that one (or more) of the most basic profiles is precisely worth the points listed, and going from there. Good candidates would be your meq (marine equivalent) for t4 and 3+, and the geq (guard equivalent) for your light infantry and most xenos (t3 and 5+). Those will cover most troops.

  2. "Any serious attempt cannot evaluate or establish proper costing in a vacuum..."

    This pretty much sums up my feelings on the matter.

  3. Here's the problem, the value of having the ability to easily punch through armor value X only adds value if you're up against an army that is running armor value X. The prevalence of any toughness or armor value changes the value added of any weapon's ability to fight it. For example, if we get three more AV 14 vehicles out of a new codex, the value of meltaguns increases relative to the value of plasmaguns.

    The problem with taking as given that current profiles are precisely worth the points listed is that we have multiple statistically identical profiles that have different points levels. Assuming that the existing profiles are worth their points assumes away the problem and the whole point of the article.

    The real heart of the problem is endogeneity. Essentially, the value of a point of strength changes the value of a point of toughness or armor value, which changes the value of a point of strength. This feedback mechanism means that calculated values coming out cleanly, especially to whole numbers, is extremely rare.

  4. Judging a single stat point, or a single situation based on a single possible circumstance, will end with results that do not equate to what could possibly or even probably happen. While AP3 doesn't mean much against guard infantry... it means a whole heck of a lot to Space Marines who then pay a lot for a stat they don't get to utilize. Yes, not having any 3+ saves reduces the efficiency of paying for AP3, but does not render it irrelevant as it still pierces all saves worse than 3+.

    Additionally, if you were to judge every single possibility in its own vacuum, you'd have an unnecessary number of equations, each not holding enough data to really mean anything in and of itself. In your approach, the only way to make any of this worth the work is to put in equations for each S value vs each toughness value. Then you'd have to go back and apply a whole new set of equations applying AP values 1 through - to each of those equations. This process would have to again be repeated for Melee by applying all of the above (S + AP) vs. T and further adding the algorithm for the probability of hitting by challenging WS vs WS.

    This is a massive amount of data that must be considered in your approach. Unfortunately the approach itself appears to be flawed. Your article makes a great point and expresses it clearly (in the graphics) about diminishing returns. That is true, and I doubt anyone would argue that. However, the means of using that information by placing it in a vacuum scenario is flawed because it depends on leaving out things that the game is built around. These include Cover, Line of Sight limiting the number of shots a unit can actually fire, USRs like Stealth/Shrouded/FnP or other save modifying rules, and of course the impact using multiple units against a single enemy unit can have on the outcome, just for starters.

    Not all armies are built to fight in the same way. Most require some level of synergy with other units on the table to actually be useful, or in this case efficient. This effect is compounded further when you test it against MEQ vs testing it against IGEQ.

    Again, my biggest argument here is that attempting to apply equations based on vacuum tests will result in a mountain of data which isn't really useful when you consider the constantly shifting environment of the game's battlefield, and the surplus of differing targets you'll be up against. If you want to search for optimal efficiency or absolute value, denying or ignoring mechanics central to the game's balance is a large oversight in my opinion.

  5. You're getting the idea, but you're not taking it far enough to grasp the whole of the problem.

    To use your example the point of AP from 4 to three has value against models with 3+ armor, but only against them. An AP 3 weapon is just as good as an AP four or two weapon against dire avengers, as it also pierces their armor. The value of the additional (marginal) point from four to three depends on the prevalence of models with 3+, but not the prevalence of any other armor type. Likewise, the point from five to four matters against models with 4+ armor, but no others, as they are either pierced by lesser weapons or would ignore the marginal point of AP from five to four.

    The reason I began in a vacuum is because when you're trying to figure out a rule, you start as simple as you can, then build on so it can stand up to more complexity. I was trying to demonstrate a very simple version of how to find a baseline taking into account as little relating to other models as possible. To expand the model to incorporate close combat we would need to find some way to calculate the probability of getting into CC range and weight the value of CC stats more lightly than shooting stats. CC stats should be cheaper because the probability of getting into CC range is most likely less than getting into shooting range, even for a pistol. But that weighting relies on a lot of factors, so it's got a lot of complexity. I wanted to start with KISS and work from there.

  6. That is why you assume the most commonly prevalent stat lines. The meq is 14 pts, a geq is worth 5. Each encompasses the entirety of the stat line, from ws, ldr, as, and loadout. From there, it is a matter of comparing similar stat lines to evaluate any variations. Reverse engineering from there would produce more meaningful results than trying to snatch formulas from the aether without any assumptions. The point values aren't absolute unto themselves, and only have meaning in relation to one another and what they buy. And since the meq and geq profiles approximate 90% of all the troops you'd ever encounter, they really beg for a very strong reason to not be used. The hardest part is finding a decent baseline for those last odd troops (usually the very hard variety). Variations usually mean 3++ saves, multi wounds, high t, or a combo of them. The th/ss termie would be a decent baseline, but they usually seem a bit overcosted in practice, so there might be a better candidate.

    And the value of a weapon is not based on if your opponent has the target you paid to counter. The value is based on your ability to bring the appropriate tool when your opponent does bring X. If it absolutely seemed necessary, you could always do some data research on the likelihood of an army bringing X, and then base values on the probability to counter X modified by the % of it going unused.

  7. Mojo, I realize you're trying to help, but you're assuming away the problem if you assume some stat lines are correct. If we say that X is correctly costed we are implicitly saying that X conforms to some rules for how much its components are worth in terms of points. If we say something isn't costed correctly, we need to establish some rule that says what part of the model is either too cheap or too expensive for its value. If we start with guardsmen and marines as being correctly costed we are assuming that we have that rule and that those statlines abide by those rules. We don't have that rule, and acting like we do won't help us get some way beyond our personal feelings to justify that something isn't correctly costed.

    You're absolutely right in that points values are relative to each other, as I was hoping to show in the to wound portion. You're right in needing to weight the cost of a thing designed to counter X based on the probability of someone bringing X, which needs real world data. That would be the last step of what I was trying to do, but before we try to jump in the deep end we need to get into the kiddie pool to make sure we don't drown.

    I started with the model only interacting with the BRB rules because those are the only things that every model in every game interacts with. I'm not pulling this out of the aether, but out of the BRB. With the exception of psykers using witchfires and models with weapons that automatically hit, all models follow essentially the same rules for how to shoot. For twin-linked, we can add another step, but I was aiming for the simplest way of looking at the basic rules so we could get a very basic idea of just how complex figuring out any way to establish 'what is worth X points' is.

  8. It doesn't assume away the problem precisely because the points are in relation. For example, if you assume the marine is correctly costed at 14 pts and then during your evaluation all other comparable units end up providing more value per pt than your 14 pt meq, then whether you say x, y, and z are undercosted or the meq is overcosted is all the same. From that initial assumption, it is much easier to derive a set of rules for /why/ the profiles are properly costed instead of trying to find a formula that doesn't exist.

  9. Except that the set of rules you're talking about are an implicit formula for how much certain things are worth. The value of points depends on the other models in existence, but without some metric to determine how much things are worth we can't say whether or not something is over or undercosted. There needs to be some conversion rate that says how many points going from AP 5 to AP 3 is worth. Without some rule, like the formula I started, the only way we have is making things up.

    At the moment, it's assumed by the writers that most stats are worth five points and wounds are worth ten points. The problem is that we can easily see that different points of stats bring different benefits; they are not equal. A wound on an autarch is not worth as much as a wound on a marine captain, because of the toughness difference.

  10. I'd argue this isn't as black and white as you may be considering. Often times it isn't just the stat points that must be considered, but also the means in which you purchase them which may factor into the price point set for them.

    For example, a Space Marine Devastator unit consists of 5 models, each of which can purchase a heavy weapon. They also have a Signum on their sergeant, which increases the unit's ability to hit with one model, but only marginally (increasing the BS 4 to BS 5). Each of these models pays 20 points if they want a Lascannon, for example.

    Compare this now to the Heavy Weapons Team from Imperial Guard. They only pay 15 points for their Lascannon upgrades. This lower price may be because of a lower Ballistic Skill, the fact that the unit is swapping a Mortar for the Lascannon, or that the unit is only capable of gaining 3 weapons instead of 4. However the unit is also a Troop Choice for Codex: Imperial Guard. Also consider the Orders system, which contains the "Bring it Down" command, which is often a crucial element for this unit.

    Does all of that factor into the points cost? Probably, but if you do not consider those outside factors like orders, then it may throw off your read of why the price points are set so differently.

    Now, take a look at the recently updated Death Korps army list, where the same unit is a Heavy Support option, not a troop unit, and contains special rules like "Iron Discipline" and "Death Korps." The unit only pays 10 points for Lascannons, making it 15 points cheaper. There is absolutely no difference in the unit statistically to that of the standard IG unit, but it has rules that standard IG units do not (which pertain to other aspects of the game like morale checks for shooting, and increased Weapon Skill), and is not a scoring unit natively.

    So how do you factor in something like that? Even if the math is the same in terms of their shooting output, it doesn't account for a large number of factors the designers need to account for when pricing a unit.

  11. Your devastator example is actually very easy to portray in this model. You're modifying one of your shooting attacks (the bolter, single shot) for +5 S, -2 AP, almost double range and forfeiting your option to double tap. Modify it further with snapshots after moving and you're basically there. The signum substitutes the sgt's option to shoot for an increase of +1 BS, so you have the choice of which is worth more to you in the situation.

    Things like being a troops unit and having different movement values are more difficult to calculate and honestly I'm not sure how to, but they do need to be taken into consideration.

    Iron Discipline can be represented as a statistical increase in leadership: conditional on being within 6" of an officer and under quarter strength, the unit's chance to regroup goes up from 2.78% to the appropriate percentage based on their unmodified leadership. Death Korps could be represented similarly, based on the expected value of their margin of losing a close combat and the prevalence of powers and rules that would modify their leadership values downward. Likewise, ATSKNF would represent a 100% chance of regrouping and a movement bonus as well as an option to fire as stationary with a heavy weapon if moving during regrouping.

  12. None of that equates to putting value on those rules or situations. How can you base value on one set of rules as compared to another without there being personal bias involved? How can your understanding of mathematical probability be applied to pricing a unit or a unit's wargear when the rules themselves are not capable of being factored numerically?

    How do you equate the movement bonus of ATSKNF? What is the point value of being able to act normally in the following turn? What is the actual numerical price value of being a scoring unit natively vs. Heavy support? What is the price penalty for granting your enemy bonus victory points when your heavy support unit is destroyed in a game of Big Guns Never Tire vs any other mission type?

    If these questions can't be answered, then what is the point of determining the other "simple" equations? It'd be a compiled lot of information that becomes irrelevant the instant an "undetermined scenario" occurs, or at the very least becomes so inconsistent that it's unusable as anything more than a best guess, which is what most veteran players do subconsciously already when judging the viability of a unit, or a course of action.

  13. The value of those rules comes in when you consider that they are stat buffs. We can take personal bias out of the equation by using a sufficiently large sample of games and observing how many morale checks a unit with a given leadership takes or by determining some arbitrary baseline from which to test, say three morale checks in a game. We could go from there to attempt to find a mean value and assume a normal distribution of results.

    As I've said, at this point movement is a factor of the game I haven't figured out how to deal with in this model. The value of being able to act normally is the difference between the range and accuracy that would be reduced by standard regroup rules. For rapid fire weapons this wouldn't be a large difference (3" range), but for heavy weapons it could be a very dramatic difference in accuracy (snapshots).

    You're getting ahead of where I've attempted to go in this article and expecting me to have an answer for you. Right now I don't.

    Compared to some equations, these are rather simple. The point of a model is to help us try to understand a complex system by making a simpler version of it. In the two equations I provided above, we can use mathematics to demonstrate what we've been able to intuit from playing the game: 1. Not all statistic increases are created equal and 2. The value of a weapon is highly dependent on what models you're playing against. These are things we already know, but the mathematics gives us a way to prove it.

    In the end, the point of this article was to take a look at game design from a mathematical perspective and to look at the complexity of the game system as it relates to the pricing of models and stats. I've often heard the complaint that models are 'too cheap' or 'too expensive,' and I knew there must be some baseline of what is 'just right' behind those ideas. In showing how difficult it is to try to determine where that baseline is, or even how to find it, I came to realize how difficult determining the right cost for a model, stat or rule is. In the end I came out with some sympathy for the designers.

  14. Sure, but pointing to problems and calling them impossible doesn't help either...

    ... so thumbs up to this article for at least poking at the surface of the problem!

    However, I believe that the benefits of the proposed "model" (not a plastic one... a conceptual/mathematical model), designed to establish a costing hierarchy for the elements of the 40k-game, are easily overshadowed by the inherent variability of the game. I found it useful even if the reference enemy had to be non-general.

    The models that generate a reliable prognosis are those that rely on large effect sizes... and most of them are stupid-simple, e.g., bring gun X to defeat dude Y, and hide behind structure J to avoid death by K.

    Large effect sizes are found, as we all know, in threshold rules such as armor penetration, and so on. Concluding that a certain weapon is 10% too much is an insignificant finding when compared to the difference between saturating/not saturating one's army with AP3 weapons when playing against MEQ.

  15. The tactical flexibility of the game is one massive factor for which no model can fully account. Glad the model is useful to you, I was considering going into the effects of AP in the next post (it was the original point of the article, but that changed...).

    What I discovered in writing the article was just how many factors go into calculating the value of a model, even without going into the values of different firing modes, movement, etc. It showed me just how complicated game design is, and I gained a bit more appreciation and leniency for how the game designers point-cost models.