Cat dps gemming choices

Topics: Rawr.Cat
Sep 7, 2009 at 5:55 AM

A friend of mine has about 451 armor penetration, and everything he and I have read says that after you reach 250 armor pen you should keep gemming for armor pen. However, since we started using Rawr for him, the optimizer says that he should gem agility instead of armor penetration. Is this the correct choice now, or is there something we are missing? His name is Grozhitchuk on Ghostlands.

Coordinator
Sep 7, 2009 at 6:56 AM

You need to make sure to give the optimizer the option to check ArP gemming. A quick way to do this is to check the Fractured (Red) and Puissant (Purple) gems in the Gem>Normal tab section underneath where it says "comparison" at the top.

Sep 7, 2009 at 7:11 AM
Edited Sep 7, 2009 at 7:19 AM

Yeah, we have it selected. If I have a +20 agility gem in a piece of gear and switch just that gem to +20 armor pen, DPS drops by 3 for each gem we switch to armor pen. Hence the reason I'm confused.


Correction, we just did it again after optimizing, and now it switched it, but we had to equip the banner and get to 430 armor pen rating first for it to switch. Changing an agility gem to an armor pen gem is a net gain of..... 1 dps per gem :p. Thx for the response though.

Coordinator
Sep 7, 2009 at 7:20 AM

This whole "once you reach 250 arpen, you should gem for arpen" (or its variant "once you reach 250 arpen, you should use the Shred idol") is a total fallacy, mostly spread by FeralByNight's terrible 'simulation'. Rawr's calculations are accurate. Gemming ArPen can indeed produce higher DPS than Agi at higher gear levels, but the threshold is significantly higher than 250, and varies quite a lot with your other stats. Regardless, ArPen is being nerfed in 3.2.2 (which is likely to be this Tuesday), making it almost impossible for ArPen to beat Agi in all but the most extreme circumstances, so you'd be wasting a ton of epic red gems if you gemmed ArPen at this point.

Sep 7, 2009 at 7:53 AM

Both Rawr and FBN do come up with similar results as far as ArPen goes. The "250" figure is a rough estimate given by NightCrowler for a BiS gear list at one stage. The figure has been focused on too much, when the reality is that you need to either use Rawr to examine your own gear or use the simulator with the stats from your own gear punched in. Both will give you reasonably similar results, as both model feral combat pretty well. I know that Rawr tells me to use ArPen gems until I hit the soft-cap at the moment and has done so since I hit about 400 ArPen or so (based on the rest of my stats of course, your mileage may vary).

At the end of the day though, it is very difficult for either to truly model things like latency and human error etc and no matter what, the difference between ArPen and Agi is pretty small, especially when you consider the variability caused by the RNG.     If you are comparing a DPS range on a fight of 5000-7000dps to 5050-7050dps then you are not likely to even really be able to tell the difference.
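To put a rough number on that, here is a back-of-the-envelope sketch. It assumes (purely for illustration, not from any measured data) that per-fight DPS is spread uniformly over the quoted 2000-wide range, and asks how many fights per gem setup you would need before a 50 DPS gap in the averages exceeds about two standard errors. The function name and the two-standard-error threshold are my own choices.

```python
import math

def fights_needed(range_width, mean_gap, z=2.0):
    """Fights per setup before `mean_gap` exceeds z standard errors.

    Assumes per-fight DPS is uniform over a range of `range_width`
    (an illustrative assumption, not a measured distribution).
    """
    # Standard deviation of a uniform distribution of this width
    sd = range_width / math.sqrt(12.0)
    # Standard error of the difference of two n-fight averages is
    # sd * sqrt(2 / n); solve z * sd * sqrt(2 / n) = mean_gap for n.
    n = 2.0 * (z * sd / mean_gap) ** 2
    return math.ceil(n)

if __name__ == "__main__":
    # 5000-7000 vs 5050-7050: width 2000, gap 50
    print(fights_needed(2000.0, 50.0))
```

Under those assumptions the answer comes out on the order of a thousand fights per setup, which is why a handful of parses can't settle a 50 DPS question.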

Certainly at the moment, while the changes on the PTR are pending, I wouldn't be spending much gold either way - probably just put blue quality gems in any upgrades until we see which way the wind blows.

 

Sep 7, 2009 at 10:33 AM

I use FbN and Rawr, and with both I calculated the same BiS (ArP hard cap with Executioner), so the calculations aren't that far away.

I hope we can see a "Use 3.2.2 values for Armor Penetration" check box soon in Rawr...

Developer
Sep 7, 2009 at 4:16 PM

The 3.2.2 value is already set in the Unreleased code

Sep 8, 2009 at 9:53 AM
Edited Sep 8, 2009 at 9:56 AM

Hi Astrylian,

I've been using Rawr for quite some time now and have also turned on my friend's hardcore raiding guild to it (they've gotten yogg-saron to <10% on hard mode and have made the program a mandatory download), so thank you.  So ever since I've gotten Rawr, I've basically stopped looking for any additional resources on how to play my character.  Now I want to get more serious about raiding myself and I'd like to know all the stuff like "best way to use OoC" and "ArPen v Agi" and how to use Berserk correctly, etc...  I was pointed by my friend back to Elitist Jerks where FbN seemed to be a major topic of discussion, when I stumbled across this thread and your criticism of the program.

If I could ask:  If FbN is terrible, what simulator (or forums) do you recommend?  Appreciate all your work and any direction you can give me.

Thanks,

MrDruid

P.S. I'm sure it's obvious, but I'm a feral kitty.  I also off-spec tank.  Using Rawr to toy with the idea of a viable hybrid until my guild catches up to me :)

Coordinator
Sep 8, 2009 at 5:16 PM

Rawr is what you should use for gearing/gemming/talenting/glyphing/etc. On skill usage, EJ is still the best resource; you just have to wade through a lot of FbN-inspired stupidity there. Try to stay out of the FbN thread, and double-check that posts you reference there are from reputable sources.

Sep 8, 2009 at 7:45 PM

I've found that FBN tends to produce fairly accurate information when properly used, even if the code makes your eyes bleed and runs like molasses.  The problem is that it's incredibly hard for the average user to properly use (even if they manage to compile it), and Nightcrowler tends to make a lot of sweeping generalizations in mildly broken English.  People latch on to "arpen after 250" without remembering what few qualifications were given with that statement, and it gets perpetuated as the absolute truth.  I wouldn't completely disregard the FBN thread, but don't believe much of anything unless you can figure out how to use the simulator and verify it yourself.

Shameless plug, but I personally use a simulator I wrote myself over FBN (link).  It's designed to be much easier to use and simulate a more Rawr-like high bleed uptime rotation, plus it's around 40 times faster than FBN.  It hasn't gotten much in the way of feedback, so definitely don't take it as infallible, but it works well for me.  I'd stick with Rawr as your primary method for normal gear choices, though.

Sep 9, 2009 at 11:43 PM

Allport,

I wonder if you had the current release of Rawr when you made those calcs? The newest version has the 3.2.2 patch in mind, which includes an ArPen nerf. Before that patch both FBN and Rawr were giving similar numbers. At about 440 passive ArPen, Rawr was nudging it ahead of agi. I believe the 250 ArPen that FBN was promoting was with an ArPen trinket, which would be close.

Sep 14, 2009 at 9:33 PM
Edited Sep 14, 2009 at 9:36 PM

Astrylian, 

As a long-time member of the kitty theorycrafting community, what are your specific criticisms of FBN? Obviously it lacks the user-friendliness of Rawr (a gap that no theorycrafting tool can hope to close) but what spurs you to call it a "terrible 'simulation'"? From what I've gathered, having sifted through the code of both, while its speed and user-friendliness haven't come close to Rawr, I see no reason to doubt the numbers it ends up with (even though I know your personal aversion to using stat weights and the like).

Also, do you harbor similar opinions about SimulationCraft? Both of those simulations ended up with incredibly close formulas the last time a thorough comparison was done. 

I ask these questions because intuitively, simulations should come closer to the "real" data than a calculator can. 

Edit: I can understand the feeling towards the community of FBN users, as a lot of people take declarations there by faith.  But on the other hand, I feel like Rawr asks just as much faith from users as anyone in that community does, as well (in that the only way to understand what is going on is to browse source code). 

Coordinator
Sep 14, 2009 at 10:46 PM
Edited Sep 14, 2009 at 10:48 PM

I've occasionally been asked what specifically is wrong with FBN, and I honestly have trouble pointing out specific issues, exactly. 

Say some 5 year old comes to you and says, "Look! I made a rope bridge to cross that 50yd canyon behind our house! Come try it out!" So you say, "Uhhh... You're 5. But let's see what you've got." So you go out back, along with all your friends, and sure enough, there's something vaguely resembling a rope bridge there. "Uhh... okay..." So a few people try it, and fall to their death. "No way am I trying that", you say. Then he comes back to you a while later, and says "I fixed it!" You say, "I don't believe you, show me your building plans". He balks at first, but eventually shows you them, and you see that he has applied some duct tape to the bridge, on the plans, but there are still several holes to fall through. You point these out, and he responds by applying more duct tape to them. Many people do cross the bridge now, though you're unsure how many actually make it across successfully. But anyone who looks at it for a second can see that it's still a rope bridge made by a 5yr old, held together by duct tape. They don't have to see any obvious holes in it to know that it's a bad idea.

It's the same issue with FBN. Anyone who has a bit of programming experience can look at his code and just shudder, cringe in pain. It looks like it's written by a 5yr old, and is written in such an incomprehensible way as to make it difficult to double-check. Nobody wants to spend the time reading through it to identify specific cases where it's doing things wrong, when it's obvious at first glance that it just needs to be totally thrown away.

 

SimulationCraft, on the other hand, seems to be decently well written (though I have not looked at its code recently), and has been validated by many many people who know what they're talking about.

Sep 15, 2009 at 12:30 AM

I cringe in pain when I read code with no comments in it ;)

Kidding aside, I have several problems with the metaphor-- let's start with scale.  "Fall to their death" is a little extreme here-- it's a purposely misleading metaphor to state that anyone had serious problems with his simulator.  He was initially very wrong on a hit rating detail where he inverted 80% energy refunds and 20% energy refunds, but his work overall has been quite accurate in terms of viability of the simulator, even though there have been differences in terms of exact values of when to switch gems, etc.

Second, I totally agree with you in terms of professionalism of code-- I would not use it as a basis for designing anything, nor would I attempt to put my gear into it to see it work.  However, you're a liar if you say no one is looking through it to identify its flaws-- in fact, I've done it personally as have others, which led to the corrections above.  In terms of accessibility, while it's less readable, it's just as accessible as the code here.  Everything you need to look at is in a single file, and you don't have to learn an entire platform like Rawr to see what's going on.  Since I hadn't checked in on theorycraft for a bit, I took a few minutes to try to understand both his code and Rawr's cat code.  Sadly, it took me more time to understand what was happening in Rawr, even though it is clearly more elegantly designed and easier to maintain for anyone who knows what's going on.  In other words, I think it's easier to understand exactly what's going on in FBN than it is in Rawr, so long as you 1) are unfamiliar with Rawr's code platform and 2) don't mind reading ugly code.

I understand your instant dismissal of his code based on looks alone, but you've definitely implied that it's factually incorrect when you've only provided evidence that it's terrible as an example of code.  It's a terrible ad hominem attack to say that because it's not professional-looking code, it can't possibly be generating a valid output, so I'll say it's definitively wrong.  Especially when you know the problems associated with calculations versus simulations.  Especially when it was outputting results identical to the other well-written simulator as of 3.1, which led me to believe in its initial accuracy. 

 

Coordinator
Sep 15, 2009 at 4:24 AM
Edited Sep 15, 2009 at 4:30 AM

I agree about the "fall to their death" being too extreme. More like 'not end up at the right spot' or something. Sorry, it was the best metaphor I could come up with on a few minutes' notice. :) (EDIT: Oh, or 'take longer to get there', perhaps!)

And I didn't say nobody was reading his code; I said nobody *wants* to read his code. It's so poor, that it definitely impacts the 'accessibility' of it. And it's much harder to find defects in his code by noticing patterns in the outputs, since it's so hard to provide lots of inputs. (ie, notice that adding a certain stat to a certain item reduces its value when it should increase it. Or your DPS doesn't change as you increase/decrease the points in a talent. Or that a set bonus doesn't work when paired with a certain buff. Or that... you get the idea.)
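The kind of output-pattern check described above can be sketched in a few lines. This is only an illustration: `find_monotonicity_violations`, `sane_model` and `buggy_model` are hypothetical names, and the two models are made-up stand-ins, not Rawr's or FBN's actual calculations. The idea is to sweep one stat upward and flag any step where the reported DPS goes down.

```python
def find_monotonicity_violations(model, base_stats, stat, step, count):
    """Return (stat_value, dps_before, dps_after) for every step where
    raising `stat` by `step` lowered the model's output."""
    violations = []
    stats = dict(base_stats)          # copy so the caller's dict is untouched
    previous = model(stats)
    for _ in range(count):
        stats[stat] += step
        current = model(stats)
        if current < previous:
            violations.append((stats[stat], previous, current))
        previous = current
    return violations

# Hypothetical stand-in models, not real game formulas.
def sane_model(stats):
    return 1000.0 + 2.0 * stats["agility"]

def buggy_model(stats):
    # Deliberate bug: the agility coefficient halves above 1100 agility,
    # so the output drops when crossing that point.
    agi = stats["agility"]
    return 1000.0 + (2.0 if agi <= 1100 else 1.0) * agi

if __name__ == "__main__":
    base = {"agility": 1000}
    print(find_monotonicity_violations(sane_model, base, "agility", 20, 10))
    print(find_monotonicity_violations(buggy_model, base, "agility", 20, 10))
```

A check like this only works when the tool accepts lots of inputs cheaply, which is the accessibility point being made.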

My point was only that I didn't know of any examples right now, because I hadn't looked; but that if you look into the past, there has been a steady stream of terrible bugs like that, and the programming 'style' he has only encourages more. Just because the results seem to make sense or agree with other theorycraft doesn't mean that they're done correctly, which means they're less likely to provide correct results in all situations.

Sep 15, 2009 at 5:27 PM

Perhaps my memory is being selective, but I don't have nearly the recollection of "terrible bugs" which you do.  Browsing through a handful of pages, the majority of issues are related to the associated add-on's problems (when developing new features and not syncing it with WoW patches) or platform compiling issues.  I've seen very, very few bugs with "the math is wrong" conclusions, and I think that while his code might be faulty, his record in implementing changes and having it work for his specific cases is very, very good.  Regardless of whether his coding style is prone to making mistakes, he may be making that up in his testing procedures.  

You have a valid point in that because it's not user-friendly, it can make the process less reliable.  However, this certainly doesn't explain the cookie-cutter cases, which is the meat of the broad generalizations you criticized (the "fallacy" in the initial reply, for instance).  And you can say that the broad generalizations aren't explained or qualified enough-- recently my guild had a druid app who didn't know you needed an ArPen trinket for the 250 ArP rule, or whatever it was, for starters-- but it's definitely the range of where you should be looking at your gear and seeing if your gear is high enough... at least when he made the statement back in 3.1, when the range of ArPen gear was relatively limited to the high-end.  At that point, it was merely a discussion of whether the line was at 250 ArPen or 400 ArPen, but regardless, you couldn't get that much without close-to-BiS gear.  And I think those differences were explainable at the time-- for instance, in a 5-minute fight simulator, you will have higher uptime on ArPen trinket procs than an infinite-time calculator.  
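The fight-length effect on proc uptime can be shown with simple arithmetic. The sketch below assumes a best case where the proc fires the instant its internal cooldown is up, starting at the pull; the 10 second duration and 45 second cooldown in the usage lines are placeholder numbers, not the stats of any particular trinket.

```python
def proc_uptime(fight_length, proc_duration, internal_cooldown):
    """Uptime fraction over a finite fight, assuming the proc fires at
    t = 0 and again the instant each internal cooldown ends (best case)."""
    full_cycles, remainder = divmod(fight_length, internal_cooldown)
    # The last, partial cycle still gets a full proc if there is room for it.
    active = full_cycles * proc_duration + min(proc_duration, remainder)
    return active / fight_length

def steady_state_uptime(proc_duration, internal_cooldown):
    """Uptime fraction an infinite-time model would report."""
    return proc_duration / internal_cooldown

if __name__ == "__main__":
    print(proc_uptime(300.0, 10.0, 45.0))       # 5-minute fight
    print(steady_state_uptime(10.0, 45.0))      # infinite fight
```

With those placeholder numbers the 5-minute fight gets 70 active seconds out of 300 (about 23.3%) against about 22.2% in steady state, so the finite fight comes out slightly ahead.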

The thing is, FBN people just weren't conscious that people can get 250 ArPen really easily, without being close to the 3.1 BiS gear level (in other words: too much from 5-man epics).  It's also reasonable to say that your calculator is upper-bounding benefits from crit, given that it is 100% efficient with combo points (or was the last time I checked) whereas the sim and actual players never are (unless they are DPSing) and your DPS method is bleed-heavy (objectively) whereas his is FB-heavy (relatively).

In summary: 

I agree that the FBN tool may be prone to bugs because of the way it is written.  I don't agree that there is any evidence that these bugs exist, have been present in the simulation or the data which he derives, or that there is any evidence which concretely discredits the accuracy of the tool, or a solid basis to be using words like "fallacy", "terrible 'simulation'".  It would be one thing if you made the claim that it's so poorly written that it shouldn't be trusted, but you seem to be claiming that it couldn't possibly be correct at all.  

Coordinator
Sep 15, 2009 at 6:36 PM

Most of what I remember is from before that thread even started, when he built his simulator privately, and specifically didn't release the code, but insisted it was right anyway. After quite a bit of pestering, because it didn't agree with Rawr, he released the source code, and many people pointed out several glaring holes. Energy usage, hit rating conversion, FB damage, using FB at all at low gear levels, double counting expertise, etc.

Would it help if I said "totally untrustworthy" instead of "terrible"?

That whole '250arpen' or '400arpen' rule is one of the big sticking points; it's not only wrong, it shouldn't exist at all, as any decent theorycrafter should know. There's no such 'magic point' where everyone should do something differently. If there's one thing Rawr has taught me, it's that the 'cookie cutter' cases are actually rare; everyone's got a different selection of gear available to them, commonly from a wide variety of instances/sources, so no magic rule works in practice for a significant portion of the user-base. 
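A toy example of why no single threshold can hold for everyone: in any model where the stats multiply each other, the point where one gem overtakes another moves with the rest of your gear. Everything below is invented for illustration (the coefficients, the 1400 cap, the shape of the armor term); it is not Rawr's model or the game's formula, just a multiplicative stand-in.

```python
GEM = 20  # rating per gem, matching the +20 gems discussed in this thread

def toy_dps(agility, arpen):
    """Made-up model: an agility-driven term times an armor multiplier."""
    attack_term = 1000.0 + 2.0 * agility               # invented coefficients
    remaining_armor = max(0.0, 1.0 - arpen / 1400.0)   # invented cap of 1400
    armor_multiplier = 1.0 / (1.0 + 0.5 * remaining_armor)
    return attack_term * armor_multiplier

def break_even_arpen(agility, max_arpen=1400):
    """Lowest arpen (in gem-sized steps) at which one more arpen gem beats
    one more agility gem in the toy model, or None if it never does."""
    for arpen in range(0, max_arpen, GEM):
        base = toy_dps(agility, arpen)
        agi_gain = toy_dps(agility + GEM, arpen) - base
        arpen_gain = toy_dps(agility, arpen + GEM) - base
        if arpen_gain > agi_gain:
            return arpen
    return None

if __name__ == "__main__":
    for agility in (2000, 2500, 3500):
        print(agility, break_even_arpen(agility))
```

Even in this crude stand-in the break-even point swings by roughly a thousand rating between the two higher agility values and never arrives for the lowest one, so a fixed "gem arpen after N" rule can't come out of it.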

(P.S. No, Rawr.Cat isn't 100% efficient with combo points. Rawr.Cat's rotations are DPS-heavy. They may appear bleed-heavy, because bleeds provide the highest DPE. A large variety of rotations are tried, including heavy bite usage, and they simply don't hold up in most situations. In some situations, they do, and Rawr will correctly use them then.)

Sep 16, 2009 at 1:28 AM

Thanks for this little discussion.  I was also wondering at the strength of Astrylian's anti-FBN feelings and it is interesting to get some insight.

I remember the early issues with the simulator. There were a few early issues that were picked up pretty quickly when the code was released. Overall now though, the simulator seems to be pretty accurate and there are several people who do seem to look through the code. Unfortunately, EJ seems to be being taken over by a pack of nub posters and it is harder to find the wisdom amongst the crap. Never thought I'd be suggesting that EJ get even stricter in their policing of the forums lol.

Being totally clueless when it comes to looking through code, I must say that Rawr is much easier for me to use.  If FBN had a more useable front end, or even better, was able to import Rawr character files like the shammy sim then that would be different.

I can also say that there are times when Rawr gives me crazy results. I know and understand many of the reasons behind this (haste etc.) but it can still be confusing and leaves me spending time double-checking whether it is my own stupidity causing it, the anomalies of the calculator (haste etc.) or, more unlikely, a bug.

 

Overall, I find Rawr to be my go-to option, although I would love to see something like the Enhance model's export to EnhSim, but I can't see it happening given the code mess etc. :) EJ is fast becoming less useful now that more people are using the simulator and posting their results - who knows what settings they have used etc. At least with Nightcrowler's original posts you could see he had run the sim many times with many different configs to come up with his results, so the conclusions (with that BiS gear) had some evidence behind them and could be treated as general principles (unfortunately, most people seem to be taking them as hard figures that apply at any gear level). I have no real problem with the simulator, just how people are using it and interpreting the results.

 

Sep 16, 2009 at 11:57 PM

 

Suffice it to say, I can forgive initial-release quality issues from 6 months ago. Given that the largest community of druid theorycraft has generally accepted it, I'm not one to totally dismiss it (despite its clear flaws in usability and coding, and I think Nightcrowler would agree with you on both of those points).  As a theorycrafter, my initial response is to look at it, see how its results are different from mine, and find the source of that difference under the assumption that it's at least partially correct.  Which, I guess, is an unacceptable viewpoint for you, because you can't stand reading his code or trusting its outputs.

The conclusion I'd draw if a significant (but possibly misguided) portion of the population believed some other model than mine was outputting raw numbers would not be to assume every result from it is wrong, but rather, assume that there's some truth to it, and figure out if my model could be doing something better which the other is doing right.  I'd trust someone who approached the problem this way more than someone who dismisses it because of style and six-months-ago release issues (think about every problem there has been with trinkets in Rawr loading from Wowhead over the past year or so, and whether people should trust your trinket rankings now).  There have been enough eyes on it now that for any given gear set, the tool itself has a great chance of being as accurate as any.

Coordinator
Sep 17, 2009 at 1:17 AM

I do indeed look at it from a 'what could it be doing better than Rawr, and how could Rawr improve upon that' view, definitely. It has to be done in a general sense, though, not a technical sense, because actually looking at his code or process is an utter waste of time.

Sep 17, 2009 at 4:03 AM

I'm afraid I have to agree with Astrylian on this one; as a professional programmer the moment I see terribly written code there is really no reason to continue looking at it or even consider anything it produces valid.  I have seen far too many novice programmers create mostly unreadable code that appears to create correct answers...and when you look at the code you can't quite pinpoint anything wrong in the mess...it all LOOKS ok...sorta...but then when you end up taking over the project, reformatting and recoding it...yeah...it was all wrong. 

This isn't a "sometimes" sorta thing.  This is a "damn near always" sorta thing.  Poor coders produce poor code.  Poor code produces incorrect answers.  End of story.  A bunch of extra cooks with their own ladles (many of whom also probably aren't really programmers) doesn't generally improve things. The strangest things hide in bad code.  And you'll never find them until you completely rewrite it.

I can, and have, opened up the Rawr cat code.  I can read and understand it clearly and form intelligent questions about the decisions it's making and either answer them for myself or ask them here.  What it does makes clear logical sense and it follows very standard and correct programming methods.  It's far more important than you think.

Coordinator
Sep 17, 2009 at 4:56 AM

Well said, Khanthal. The bottom line is that with code like that, FBN is anything but trustworthy, regardless of whether its results appear correct.

Sep 17, 2009 at 3:40 PM

As a professional programmer, I can understand that you don't trust it-- but you have to understand that a lot of people do. And a single procedure very frequently does function correctly even when written poorly-- it's just that, as a professional, I'm sure most of the code you need to encounter and rewrite that happens to be poor, is also trying to do something more complex than a single-file simulator. I am by no means saying it has no bugs in its output, but I'd argue it has a lot fewer hurdles to clear than code which you all work with (both professionally and in your spare time). 

Even given all that, here's the list of things I would notice if I were to take a good look at a simulator, and then see where Rawr might be inaccurate:

- 100% Rip uptime although even the simplest begin-of-fight analysis can see that you don't  have rip up for at least 6 GCD's of the fight, even if you play perfectly the rest of the way. 
- You don't achieve 100% rake uptime, realistically, in any scenario when you need to maintain mangle, SR, and Rip.  If you disagree, show me evidence that this is the case (any parse from in-game would be fine, even from a training dummy.) 
- You don't calculate misses into your bleed uptimes. 
- You don't  have any correction to realize that you typically don't have the energy to mangle twice, rake twice, and shred three times in order to maintain uptime on all three when they all need to occur (occasionally) within a typical cycle.  So you can't have 100% uptime on mangle and rake while extending rip every cycle.  Also, you'd need to use another finisher because you'd have too many CP.
-  You don't sacrifice rip uptime for FB, ever, even a single tick. I can guarantee you that you get more damage from sacrificing a tick of Rip for an FB.
- You calculate your number of FBs by taking the overall CP available, then calculating the energy cost, and dividing it.  In other words, you assume that all attacks that generate CP will be used efficiently.  I'm amazed someone who has played the game thinks that no CP get wasted (for instance, shreds on OoC procs while already at 5 CP and waiting for the next finisher).  If you tried to play this way, you'd also find that you have decent odds of not having enough CP when it's time for your next finisher. 

These are inaccuracies that I can find in your code in 30 minutes of looking that I KNOW the feral simulators would handle properly/better.  Most of these are things you could realize simply by looking at SimulationCraft, which you've already admitted is well-coded and thoroughly looked over by people who know what they're doing, if you prefer not to deal with FBN.

It's fine if you say "these things can't be fixed", and I'll chalk that up to this being a calculator, and knowing that your answers are an upper-bound that tends to be bleed heavy (since no one ever achieves 100% bleed uptime).  But these are things that you're missing that you might learn from either simulator's outputs, without even looking at its code. 

 

Coordinator
Sep 17, 2009 at 9:11 PM
Allev wrote:

As a professional programmer, I can understand that you don't trust it-- but you have to understand that a lot of people do.

And they shouldn't. That's our point. People who don't know any better are easily deceived, and we hate it when people are deceived.

 

Allev wrote:

And a single procedure very frequently does function correctly even when written poorly-- it's just that, as a professional, I'm sure most of the code you need to encounter and rewrite that happens to be poor, is also trying to do something more complex than a single-file simulator. I am by no means saying it has no bugs in its output, but I'd argue it has a lot fewer hurdles to clear than code which you all work with (both professionally and in your spare time).

No, it really doesn't. I understand how it may seem like it to someone not experienced with this sort of thing, but I promise you; it's really that bad. A huge part of my professional job is maintaining ancient legacy code, written by horrible developers; it's some of the worst designed/written code I've ever seen. And none of it even holds a candle to FBN.

 

Allev wrote:
Even given all that, here's the list of things I would notice if I were to take a good look at a simulator, and then see where Rawr might be inaccurate:

<snip>

These are inaccuracies that I can find in your code in 30 minutes of looking that I KNOW the feral simulators would handle properly/better.  Most of these are things you could realize simply by looking at SimulationCraft, which you've already admitted is well-coded and thoroughly looked over by people who know what they're doing, if you prefer not to deal with FBN.

It's fine if you say "these things can't be fixed", and I'll chalk that up to this being a calculator, and knowing that your answers are an upper-bound that tends to be bleed heavy (since no one ever achieves 100% bleed uptime).  But these are things that you're missing that you might learn from either simulator's outputs, without even looking at its code. 

Rawr.Cat is in no way perfect, nor do I claim that it is. Several of the things you list are specifically improvements that I'd love to do, given the time. If I said "these things can't be fixed", I'd be ashamed. To respond specifically:

- 100% Rip uptime although even the simplest begin-of-fight analysis can see that you don't  have rip up for at least 6 GCD's of the fight, even if you play perfectly the rest of the way. 

I'd like to add calculation for the initial build-up, yes.
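For a sense of scale, the opening ramp can be turned into an upper bound on Rip uptime with one line of arithmetic. This is only a sketch: the function name is made up, the 6 GCDs come from the point quoted above, and the 1.0 second GCD is an assumption about cat form that you should adjust if it doesn't match.

```python
def rip_uptime_cap(fight_length, opening_gcds=6, gcd=1.0):
    """Best possible Rip uptime if the first `opening_gcds` global cooldowns
    of the fight pass before Rip is applied and it never drops afterwards.

    `gcd` is an assumed global cooldown length in seconds.
    """
    lost = opening_gcds * gcd
    return max(0.0, 1.0 - lost / fight_length)

if __name__ == "__main__":
    for seconds in (60.0, 180.0, 300.0):
        print(seconds, rip_uptime_cap(seconds))
```

Under those assumptions a 5-minute fight caps at 98% and a 1-minute fight at 90%, so the correction matters most on short fights.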


- You don't achieve 100% rake uptime, realistically, in any scenario when you need to maintain mangle, SR, and Rip.  If you disagree, show me evidence that this is the case (any parse from in-game would be fine, even from a training dummy.) 

There's definitely a point in that 100% isn't achievable due to lag variance, and that affects Rake a lot more than Rip. Same with simply screwing up. Beyond that, there are collisions on multiple skills coming off at the same time. I'd like to add reductions to the effective duration of Mangle and SR to account for having to refresh them early, due to collisions. I'd also like to add an option for lag variance.


- You don't calculate misses into your bleed uptimes. 

Simply untrue. They are indeed calculated. They're applied as an effective increase to the duration of the bleed, but not to its total damage.


- You don't  have any correction to realize that you typically don't have the energy to mangle twice, rake twice, and shred three times in order to maintain uptime on all three when they all need to occur (occasionally) within a typical cycle.  So you can't have 100% uptime on mangle and rake while extending rip every cycle.  Also, you'd need to use another finisher because you'd have too many CP.

I do it all the time; it's totally possible (short of the rake uptime lost to lag variance, of course). 

 

-  You don't sacrifice rip uptime for FB, ever, even a single tick. I can guarantee you that you get more damage from sacrificing a tick of Rip for an FB.

Another point I'd love to implement, given time.


- You calculate your number of FBs by taking the overall CP available, then calculating the energy cost, and dividing it.  In other words, you assume that all attacks that generate CP will be used efficiently.  I'm amazed someone who has played the game thinks that no CP get wasted (for instance, shreds on OoC procs while already at 5 CP and waiting for the next finisher).  If you tried to play this way, you'd also find that you have decent odds of not having enough CP when it's time for your next finisher.

And yet another, kind of. Proper energy pooling solves this to a huge extent (as well as just not using FB), but I realize not entirely. This is definitely one problem that a simulator would easily handle, and is difficult to handle in a closed form model. However difficult, I do want to attempt to account for this.
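A small sketch of the combo point waste in question, under simplifying assumptions of my own: every builder gives 1 CP, or 2 on a crit, the finisher is only used at exactly 5 CP starting from 0, and anything generated above the cap is lost. The function names are hypothetical and this is not how Rawr or either simulator actually models it.

```python
def builders_per_finisher(crit_chance, cap=5):
    """Expected builders needed to go from 0 to `cap` combo points when
    each builder gives 1 CP (2 on a crit) and overflow above `cap` is lost."""
    # expected[n] = expected builders still needed at n CP; 0 at or above cap.
    expected = [0.0] * (cap + 2)
    for cp in range(cap - 1, -1, -1):
        expected[cp] = (1.0
                        + (1.0 - crit_chance) * expected[cp + 1]
                        + crit_chance * expected[cp + 2])
    return expected[0]

def ideal_builders_per_finisher(crit_chance, cap=5):
    """What 'total CP generated divided by cap' implies, with no waste."""
    return cap / (1.0 + crit_chance)

if __name__ == "__main__":
    crit = 0.5
    real = builders_per_finisher(crit)
    ideal = ideal_builders_per_finisher(crit)
    print(real, ideal, ideal / real)
```

At a 50% crit chance this gives 3.5625 builders per finisher against an idealized 3.33, so the no-waste division overstates finisher count by roughly 6 to 7% in this toy setup, before OoC procs or timing collisions are even considered.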

 

I strongly believe that Rawr should encourage you to play better, and optimize your performance while doing so, rather than optimizing your performance while playing poorly.

Sep 18, 2009 at 8:52 PM
Edited Sep 18, 2009 at 8:53 PM

And yet, those legacy developers kept their jobs for many years with code that served its purpose well enough while they were maintaining it-- otherwise there wouldn't be so much horrible code for you to fix.  Since I'm not going to convince you that it's possible his code is right, let's look at some outputs: 

- 6.5% damage from FBs is very high relative to realistic in-game scenarios with low Rip uptime, let alone high rip uptime-- Rawr has it this high loading my profile.  For third-party theory reference, SC can only maintain ~80% uptime on Rip with 6% FB damage at the T8 gear level. The WMO Ignis and Jaraxxus leaderboards have no one approaching 30% total damage from Rip+FB (The highest I found was roughly 27%.)
- FBN has less than 100% bleeds when hit capped-- (thank you for explaining where the misses/dodges come in); it's rare to find parses with over 90% uptime, especially among top DPSers.  The top bleed percentage I could find from US players in the top 20 on Jaraxxus was 92% (99 ticks in 3 minutes, 36 seconds.) 
- Similarly, I can't find a single parse on the Jaraxxus leaderboards with rake damage over 10.6% of total damage-- and he only had 92% uptime, and his rakes crit 15% more than his other attacks, and that parse didn't FB; for the gear levels of those players, FBN lands near 9% and SC lands at 10%, while I've seen few configurations of gear where Rawr is under 11% (on my character for instance, Rake is 12.5%, although I'm not in endgame gear).  Most players on the leaderboard are near 8-9%.

- Both tools overestimate DPS relative to in-game, but FBN is ~500 DPS closer to reality than Rawr.
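For anyone who wants to check uptime figures like the 92% above against their own parses, a minimal sketch follows. It assumes a 2-second tick interval for the bleed in question (an assumption on my part, though it is consistent with the quoted 99 ticks over 3:36); the function name is hypothetical.

```python
def bleed_uptime(ticks, fight_seconds, tick_interval=2.0):
    """Fraction of the fight covered by a periodic effect, from its tick count.

    tick_interval is an assumed value; change it for bleeds that tick slower.
    """
    possible_ticks = fight_seconds / tick_interval
    return ticks / possible_ticks


fight_seconds = 3 * 60 + 36          # 3 minutes, 36 seconds
uptime = bleed_uptime(99, fight_seconds)
print(f"{uptime:.1%}")
```

Under that assumption, 99 ticks out of a possible 108 works out to just under 92%.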

You say FBN's are probably wrong based on a glance at his code; I say Rawr seems more wrong based on the fact its outputs are further away from in-game results, and a lot more people can see the second and confirm that.  For every user where the tool is a black box, they have to assume Rawr is doing a worse job at the cycle theorycraft than FBN, so they'll trust FBN more.  The issues I outlined in the last post are very directly causing the differences between Rawr and reality.  And I contend that they're bigger flaws than anything happening in the FBN code, based on outputs.

I wish they'd use SC instead of FBN as its outputs are similarly accurate, fairly customizable (including decision-making), better written, and more easily repeatable, but that's beside the point.

I'm not advocating accommodating poor play.  I'm advocating a model of the best possible play as opposed to the impossibly good, which is what Rawr.Cat is right now.  It's clearly beyond the upper bound of possible-- and not in an "impossible for a person to actually execute" kind of way, but in an "impossible for the mechanics to occur like that" kind of way.  And the numbers are coming out wrong.  And FBN's, if incorrect, are at least closer to right. 

To address the individual point which I may have miscommunicated-- I didn't mean to say that it's impossible to mangle twice, rake twice, shred three times, and cast three finishers in a single rip period-- you can, with enough OoC procs or TF or berserk.  My point was that the scenario exists that you can't maintain all three if you don't get OoC procs over that period and your TF/Berserk is on cooldown.  Even if you rip at 100 energy (and not when the timer expires).    It happened to me just last night, after a rip with high energy and expiring mangle/rake right afterwards.  I'm not saying it's a major issue; I'm just saying that it's the kind of edge-scenario which will always exist in calculators that average-case things that happen sporadically, like averaging the energy you get from OoC.  And it can make a difference.  
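A rough sketch of why averaging a sporadic proc can hide this kind of dry spell. The swing count and per-swing proc chance below are hypothetical placeholders, not the game's actual OoC numbers, and the function names are my own.

```python
def expected_procs(swings, proc_chance_per_swing):
    """Average number of procs over a window, which is what an averaging model sees."""
    return swings * proc_chance_per_swing


def chance_of_no_procs(swings, proc_chance_per_swing):
    """Probability that a window of independent swings produces zero procs."""
    return (1.0 - proc_chance_per_swing) ** swings


# Hypothetical window: 12 swings in one Rip period, 6% proc chance per swing.
average = expected_procs(12, 0.06)
dry_spell = chance_of_no_procs(12, 0.06)
print(average, dry_spell)
```

With these placeholder values the average model credits a fraction of a proc to every Rip period, while a bit under half of the periods actually get none at all. That is the gap between an average-case calculator and what happens in a single cycle.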

 

Coordinator
Sep 18, 2009 at 10:03 PM

No, those developers only kept their jobs as long as people who didn't know anything about writing code were managing them. I'm not debating that it's possible that his code is right. It totally is possible. It's just not likely, or trustworthy.

First, regarding a lot of the rest of your post, I must again point out: There are a ton of bad cats out there. Of course it's rare to find good parses; Cats are (IMO) the most intense timing-based (ie, skill-based) DPS spec in the game. FBN is only 'closer to reality' than Rawr for most users, because most users are bad.

Gwindori and I spent 2 hours on a dummy last night, to try to get practice in with new rotation feeling, as we've both recently dropped 4T8, and he just got 2T9. With no raid buffs, we were able to maintain ~90-93% uptime on both bleeds, with him Mangling, and me PvP spec'd (-4% crit). I'd attribute 3-5% of that to lag variance (ie, just not stacking them tight enough, without overlapping), and the other 2-7% due to simply screwing up. That will go up with the increased combo point generation and energy generation of being in a raid, but down due to fight mechanics. And in the interest of full disclosure, it also helps that he plays on the same subnet as our server, and I play only 2 hops away. Yay for 5ms and 13ms pings. Too bad the server doesn't process things that discretely.

As I've said, I'd like to move the upper bound down from 100% to [100%-lag variance], since that's what's actually possible. And, as you point out, that affects Rake more than Rip.
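A minimal sketch of one way to read that, assuming the lag shows up as a fixed delay before each reapplication. The durations (9 seconds for Rake, 12 for an unglyphed Rip) and the 0.5 second delay are assumptions for illustration, and the function name is hypothetical.

```python
def uptime_with_delay(duration, reapply_delay):
    """Uptime of a DoT that is reapplied reapply_delay seconds after it expires."""
    return duration / (duration + reapply_delay)


# Assumed durations in seconds; the same delay is applied to both bleeds.
rake_uptime = uptime_with_delay(9.0, 0.5)
rip_uptime = uptime_with_delay(12.0, 0.5)
print(f"Rake {rake_uptime:.1%}, Rip {rip_uptime:.1%}")
```

The same delay costs the shorter bleed more uptime, since it has to be reapplied more often over a fight.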

Sep 19, 2009 at 8:35 PM

There are a ton of bad cats out there.  I actually first looked at the "bad" cats-- in particular, the cats asking for help in the WWS thread on EJ.  These people actually had numbers that came very close to Rawr outputs!

What I'm saying is, the best cats out there don't have damage percentage breakdowns like Rawr.Cat-- the worst cats do.  Except, of course, much lower DPS numbers.  

On your test dummy, how much of your damage came from FB, compared to what Rawr says for an unbuffed cat? It is possible to maintain those bleed uptimes.  But you're handwaving into bad play what I'm arguing as neither possible (even with perfect play) nor ideal, and I have two simulators to back me up.  You can argue that it's possible, but I'd love to see it happen.  Show me any parse by a top-end cat. 

I checked the WMO leaderboards.  I've checked Premonition's WOL parses (one of few top guilds who make things like that available). Remember that Premonition was the guild which caused the calls for feral nerfs because their feral was occasionally leading some of that world-first guild's fights in DPS.  At this point this conversation really needs some hard evidence that a druid playing optimally comes out with the numbers which Rawr shows.  All of the good players I've found don't show that.

Coordinator
Sep 19, 2009 at 9:23 PM

I'm not handwaving here, I'm outright saying, Rawr isn't perfect. Rawr overstated how much FB damage I should deal on a dummy. It pretty much always overstates how much FB damage you do, if you play near-optimally. Mostly because of combo points. As I've said, that's something I want to work on in Rawr. See http://www.worldoflogs.com/guilds/467/ for myself and Gwindori. 

Sep 20, 2009 at 6:55 AM

I looked through those looking at bleed damage on possible stand-and-DPS fights (Ignis, Jaraxxus, etc), and saw more support for my point-- the best DPS fights by skilled players, especially by mangle-bots, fit the simulators' output damage profiles closer than Rawr's (although with much less FBs in your captures!).   

Anyway-- I've reached the point where I don't think I'll ever convince you that, perhaps, FBN's margin of error is smaller than Rawr's, despite its weak points, and it deserves more credibility than you give it. 

Coordinator
Sep 20, 2009 at 6:58 AM

And again, I'm not suggesting that the results for one or the other are necessarily better, only that FBN shouldn't be trusted, it's *so* poorly coded.

Sep 20, 2009 at 6:45 PM
Allev wrote:

I looked through those looking at bleed damage on possible stand-and-DPS fights (Ignis, Jaraxxus, etc), and saw more support for my point-- the best DPS fights by skilled players, especially by mangle-bots, fit the simulators' output damage profiles closer than Rawr's (although with much less FBs in your captures!).   

Anyway-- I've reached the point where I don't think I'll ever convince you that, perhaps, FBN's margin of error is smaller than Rawr's, despite its weak points, and it deserves more credibility than you give it. 

One of your prior points, looking at Premonition cats, is flawed in an important way:  Being in a top raid guild does not mean you are skilled.  Go read Tun's guide to feral dps on the Ensidia site.  Your face will hit your palm repeatedly.

Your other failing in logic is that correlation is not causation.  Yes, you can look at A and B, and one looks like it produces numbers close to "reality".  The sun also looks like it rotates around the earth.  Logic dictates it's better to look at the source than the output.  One tool is a very old mathematical model, well coded and documented, with known limitations, that has stood the test of time.  The other is a poorly coded mess that no one can fully understand due to said poor code, but the numbers it kicks out "look" good compared to other random numbers on the internet from people doing fights that are not modeled in their entirety in the FbN tool.

And thus the argument concludes.  FbN, without a complete rewrite, is simply not worth trusting.  The chances it will advise you to make bad decisions are just as high as good decisions (or higher).  So if you don't like Rawr's limitations then just use a Pawn system or eyeball it.  You are just as likely to be right.

Coordinator
Sep 21, 2009 at 12:34 AM

Just checked in a bunch of changes to Rawr.Cat. They don't change the results much, but do remove a good bit of the 'impossibility'. I've also created a thread (http://rawr.codeplex.com/Thread/View.aspx?ThreadId=69548) to discuss Rawr.Cat Rotation Logic. Everyone reading this thread should probably check that thread out. 

That said, please leave/continue the 'Rawr vs FBN vs SimCraft' discussion here; I'd like that thread to remain Rawr-specific.

Sep 21, 2009 at 3:09 PM

khanthal, I looked at Premonition not only because they are an elite guild, but because their parses have been the focal point of cat nerf discussions in the past.  The guild is well-known and it should be well-known that Darthn is a good player-- good enough to frequently lead the most successful guild in the US.  I also looked at the WMO leaderboards-- and again, you can say that those players aren't playing well, but they are the highest recorded damage that I can easily find collected.   I did not select parses based purely upon celebrity!  I'm going to refrain from commenting on Astrylian's parses because I don't think that moves the conversation anywhere-- other than being Rip-focused, I didn't come across anything to shift my thinking about good cat play. 

Astryl, I want to thank you for continuing work on the cat model-- I certainly hope you haven't taken anything in this thread personally, I certainly haven't meant it that way.  I think the best outcome for all involved is a better Rawr.Cat, so thank you for moving the conversation towards that point. 

Coordinator
Sep 21, 2009 at 4:04 PM

Not at all. And I'm pretty sloppy (and have been PvP spec'd the last few weeks) because I almost exclusively tank lately. Gwin's the one to look at mostly.

Sep 21, 2009 at 7:08 PM

My second point was actually the important one.  That the numbers look good and compare well to reality is not actually very relevant.  Correlation is not causation; though I suppose that is sort of a stretched metaphor.  When deciding if a tool is good one must start at the source, not at the output.  I could code you something that produces numbers that look right very quickly.  It doesn't mean it's right.

Moreover if the project is coded by someone that doesn't code very well it's extremely likely that most of the changes are made to make the output look right rather than to make the model/simulator more correct.  This is one of the most common faults in new programmers.

Sep 21, 2009 at 10:29 PM

That's a fine argument khanthal, except Rawr.Cat's entire closed-form model is simply an approximation of what "looks right." Approaching 100% rip uptime "looks right." Approaching 100% rake uptime "looks right." Getting all three shreds every cycle "looks right."   I've ACTUALLY STARTED at FbN's source, and Rawr's source.  I'm one of the people who found original bugs in his code, because I wasn't afraid of its organization (it's a straight-ahead procedural program, which I know doesn't make sense to coders who need everything to be object-oriented) or the author's weaknesses in English (he's Italian, but has done his best to spell things properly).  

Your entire argument is "I don't like his code" and not at all "he's wrong here, here, here, and here." It's like I'm talking to a Republican! You have so much reason to believe he's made mistakes, and now you're even trying to describe the types of mistakes he's making without even finding them!  Never mind the fact that he's got a pretty good record of seeking out, finding, and explaining somewhat unconventional methods and finding them to be true. 

The fact that the numbers are closer to the game reality is entirely relevant, because the one and only reason these tools exist and are useful is to give insight into what happens in the game reality.  What is your yardstick for correctness?  Because right now the only benchmark you're proposing is code, and we already know Rawr has flaws. 

Sep 22, 2009 at 1:20 AM
Allev wrote:

...It's like I'm talking to a Republican! ..

While it amuses me that you think calling someone a Republican is an ad hominem argument, neither politics nor said ad hominems are needed in this discussion.

Correctness starts and ends at the source, not the output.  You create a model or tool that attempts to reproduce reality.  You do so in a way that others can clearly understand, comment on and contribute to.  You document weaknesses and postulate as to their effects on the output.

The output can tell you when you are wrong.  The output can never tell you if you are right.

Where we are left with FbN is that it MIGHT be right.  It doesn't look like it is returning things wrong.  That means very little, however, if you can't go back to the source and show how it's doing everything right to get there, or at least show the shortcuts and weaknesses of the algorithm and document them.  A tool that MIGHT be right but where there is no way to have any confidence in it other than "well the output looks ok?" is not a useful tool.

Sep 22, 2009 at 4:18 PM

If the output can never tell you if you are right, are you saying that the only people who can judge theorycraft are coders? If someone can't understand code, can they ever trust software? 

Also-- please point me to the full list of flaws in Rawr.Cat? Until this thread, a lot of weaknesses which Astrylian has known about (things he's said he'd like to work on above) weren't obvious anywhere on the site-- they're not in the issue tracker (I searched for "Cat" and didn't see anything resembling things I've mentioned here).  Given how this thread isn't at all the original topic anymore, I know I can't trust this discussion forum to provide me useful info... is there a big list somewhere? 

Coordinator
Sep 22, 2009 at 5:06 PM

No, there's no big list.

And yeah, the only people who can judge the accuracy of a theorycrafting tool are those who can fully understand how it works. Most people can use it, but they're not able to judge the accuracy of it.

Sep 29, 2009 at 4:35 PM

The accuracy of "absolute" DPS values is not nearly as interesting as the accuracy of "relative" DPS values.  Inaccuracies in formulation are meaningless if they do not prevent you from making the right decisions.

Simulation may provide better absolute DPS, but verified formulation will provide more trustworthy relative DPS which is crucial in making gearing decisions.

Ideally you have both.  You periodically run simulation to audit your formulation...... and then put the simulator back in the closet.  Once it has been shown that the formulation is "close enough" there is no reason to run anything else.  The runtime and functional benefits of formulation should be patently obvious.
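A small sketch of the distinction, using made-up DPS values and hypothetical gem names: a model that overestimates everything by the same factor still picks the same gem, while a model whose error differs per option can flip the decision.

```python
def best_option(dps_by_option):
    """Return the name of the option with the highest modeled DPS."""
    return max(dps_by_option, key=dps_by_option.get)


# Hypothetical "true" values for two gemming choices.
reference = {"agi_gem": 5000.0, "arp_gem": 4997.0}

# Uniform 10% overestimate: absolute numbers are wrong, ranking is intact.
uniform_bias = {name: dps * 1.10 for name, dps in reference.items()}

# Error that depends on the option: the ranking itself changes.
skewed_bias = {"agi_gem": 5000.0 * 1.05, "arp_gem": 4997.0 * 1.10}

print(best_option(reference), best_option(uniform_bias), best_option(skewed_bias))
```

This is why auditing against a simulator is about checking whether the errors are even across options, not about matching the headline DPS number.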

I have spent countless hours on SimulationCraft.... and I will continue to do so because I still believe it to be one of the best measuring-sticks out there. However, there is a reason I have invested such a small percentage of my coding efforts on the interface: For many (not all, unfortunately) class/specs there are far better tools for day-to-day usage.

 

Sep 29, 2009 at 11:39 PM
I can wholeheartedly agree with the premise that if a formulation is properly verified, then it can be more useful than a simulation, because of functional benefits-- then the sticking point becomes, what counts as a verified formulation.   It's difficult to assume that, in the general case, theorycrafting model X necessarily both uses and trusts simulation Y (see the majority of this topic).  It's very hard to buy into the use-the-formulation ideal when Rawr.Cat doesn't make the same assumptions as SC's simulations (or any public simulations for that matter).

Oct 6, 2009 at 8:52 AM
Edited Oct 6, 2009 at 9:01 AM

I just checked out WOL-logs like this one: http://www.worldoflogs.com/reports/rt-kMiHPrPb7BfSLYs4/sum/damageDone/?s=12577&e=13139 and I can say, it's pretty low (DPS-wise) for such an encounter (no offense). Compare it, e.g., with this one:  http://www.worldoflogs.com/reports/rt-ZxLdJQgx0z2y22uZ/sum/damageDone/?s=15441&e=15919  (ArP gemmed, my nickname is on top) or  http://www.worldoflogs.com/reports/rt-W4SQxrR2fIzbDmZF/sum/damageDone/?s=2687&e=3149

I think FbN's calculations are more correct.

/peace

Coordinator
Oct 6, 2009 at 9:03 AM
Edited Oct 6, 2009 at 4:47 PM

Clausm, I don't understand your response. Gwin and you are both top of the raid, doing very good DPS for your raids. Gwin's is of course lower than yours, due to the lower gear, lower raid support, and longer fight time. In fact, FbN says your DPS should be way lower than Rawr does, and you're saying it should be higher, but FbN is better?

EDIT: Err, whoops, this is the correct thread, sorry, heh.

Oct 6, 2009 at 2:59 PM

Actually, this is the "gem choices" thread that quickly evolved into the FbN thread-- you split off the rotation discussion into a separate thread. A little bit of a necro :)

I think what Clausm is showing is that he likes the FB-heavy attack priorities more than Rawr's "damage-heavy" estimates, and focusing on the bleeds doesn't improve his situation much.  And it's a little bit of a self-fulfilling prophecy: focus on armor-negated attacks, stack ArPen, all of a sudden your damage shifts to that direction.

The big thing that's pretty important is that despite vastly different playstyles, the difference-maker in DPS isn't playstyle, it's gear/raid.  Gwin is focusing on bleeds to the largest degree possible while Clausm lets the bleeds fall. While various claims can be made that one is "right" and the other is "wrong", ultimately, their DPS ends up being similar despite significantly different damage profiles: one which benefits more from Agi/AP, and one which benefits more from ArP. 

The recommendation from my end is to guarantee that the gear matches the playstyle, which matches the theorycraft.  Rawr.Cat is still in a bit of flux in that matter until the FB problem and the collision problem get solved as it's doing both at the same time right now: lots of bleeds while simultaneously, lots of bites.   So it might be a middle-ground estimate right now, but it's still a little skewed.
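As a rough illustration of why the damage profile decides the gemming, here is a sketch that assumes bleeds are unaffected by armor, so only the armor-mitigated share of damage gains from ArP. The shares and the 2% gain are hypothetical, as is the function name.

```python
def total_gain_from_arp(armor_affected_share, gain_on_affected):
    """Overall damage gain when only the armor-mitigated share of damage improves.

    armor_affected_share -- fraction of total damage that armor reduces (assumed)
    gain_on_affected     -- relative gain ArP gives to that fraction (assumed)
    """
    return armor_affected_share * gain_on_affected


# Hypothetical profiles: a bleed-focused cat and a bite-focused cat.
bleed_heavy = total_gain_from_arp(armor_affected_share=0.50, gain_on_affected=0.02)
bite_heavy = total_gain_from_arp(armor_affected_share=0.70, gain_on_affected=0.02)
print(f"bleed-heavy {bleed_heavy:.2%}, bite-heavy {bite_heavy:.2%}")
```

The same ArP gem is worth more to the bite-heavy profile, which is the self-fulfilling effect described above.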