#effectsize


#statstab #260 Effect size measures in a two-independent-samples case with nonnormal and nonhomogeneous data

Thoughts: "A_w and d_r were generally robust to these violations"

#robust #effectsize #ttest #2groups #metaanalysis #assumptions #cohend

link.springer.com/article/10.3

SpringerLink: Effect size measures in a two-independent-samples case with nonnormal and nonhomogeneous data - Behavior Research Methods

In psychological science, the “new statistics” refer to the new statistical practices that focus on effect size (ES) evaluation instead of conventional null-hypothesis significance testing (Cumming, Psychological Science, 25, 7–29, 2014). In a two-independent-samples scenario, Cohen’s (1988) standardized mean difference (d) is the most popular ES, but its accuracy relies on two assumptions: normality and homogeneity of variances. Five other ESs—the unscaled robust d (d_r*; Hogarty & Kromrey, 2001), scaled robust d (d_r; Algina, Keselman, & Penfield, Psychological Methods, 10, 317–328, 2005), point-biserial correlation (r_pb; McGrath & Meyer, Psychological Methods, 11, 386–401, 2006), common-language ES (CL; Cliff, Psychological Bulletin, 114, 494–509, 1993), and the nonparametric estimator for CL (A_w; Ruscio, Psychological Methods, 13, 19–30, 2008)—may be robust to violations of these assumptions, but no study has systematically evaluated their performance. Thus, in this simulation study the performance of these six ESs was examined across five factors: data distribution, sample, base rate, variance ratio, and sample size. The results showed that A_w and d_r were generally robust to these violations, and A_w slightly outperformed d_r. Implications for the use of A_w and d_r in real-world research are discussed.
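
To make the contrast concrete, here is a minimal base-R sketch comparing Cohen's d with a simple pairwise probability-of-superiority estimate (the quantity the common-language ES and A_w get at) on skewed, heteroscedastic data. The estimator below is the plain P(X > Y) + 0.5 * P(X = Y) version, not a reimplementation of Ruscio's weighted A_w, and the simulated data are my own example, not the paper's.

```r
# Minimal sketch (base R, not from the paper): Cohen's d vs. a pairwise
# probability-of-superiority estimate for two independent samples.

cohens_d <- function(x, y) {
  nx <- length(x); ny <- length(y)
  sp <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))  # pooled SD
  (mean(x) - mean(y)) / sp
}

prob_superiority <- function(x, y) {
  # Compare every x with every y; ties count as half a "win"
  d <- outer(x, y, "-")
  mean((d > 0) + 0.5 * (d == 0))
}

set.seed(1)
g1 <- rlnorm(40, meanlog = 0.4)  # skewed group with the larger spread
g2 <- rlnorm(60, meanlog = 0.0)

cohens_d(g1, g2)
prob_superiority(g1, g2)  # 0.5 = no separation, 1 = complete separation
```

On data like these, d is pulled around by the heavy tail and unequal variances, while the probability-of-superiority estimate depends only on how the observations are ordered.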

#statstab #254 Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations

Thoughts: I share tutorial papers, as people resonate with different writing styles and explanations.

#statistics #guide #tutorial #effectsize

link.springer.com/article/10.1

SpringerLink: Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations - European Journal of Epidemiology

Misinterpretation and abuse of statistical tests, confidence intervals, and statistical power have been decried for decades, yet remain rampant. A key problem is that there are no interpretations of these concepts that are at once simple, intuitive, correct, and foolproof. Instead, correct use and interpretation of these statistics requires an attention to detail which seems to tax the patience of working scientists. This high cognitive demand has led to an epidemic of shortcut definitions and interpretations that are simply wrong, sometimes disastrously so—and yet these misinterpretations dominate much of the scientific literature. In light of this problem, we provide definitions and a discussion of basic statistics that are more general and critical than typically found in traditional introductory expositions. Our goal is to provide a resource for instructors, researchers, and consumers of statistics whose knowledge of statistical theory and technique may be limited but who wish to avoid and spot misinterpretations. We emphasize how violation of often unstated analysis protocols (such as selecting analyses for presentation based on the P values they produce) can lead to small P values even if the declared test hypothesis is correct, and can lead to large P values even if that hypothesis is incorrect. We then provide an explanatory list of 25 misinterpretations of P values, confidence intervals, and power. We conclude with guidelines for improving statistical interpretation and reporting.
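
One of the paper's points, that selecting which analysis to present based on the p value it produces can yield small p values even when the test hypothesis is correct, is easy to see by simulation. The sketch below is my own illustration (choosing between a raw-scale and a log-scale t test on null data), not an example taken from the paper.

```r
# Simulate two groups drawn from the same distribution (so the null is true),
# then compare the error rate of a single pre-specified t test with the rate
# you get by reporting whichever of two analyses gives the smaller p value.

set.seed(42)
reps <- 5000
p_prespecified <- numeric(reps)
p_best_of_two  <- numeric(reps)

for (i in seq_len(reps)) {
  x <- rlnorm(30)
  y <- rlnorm(30)
  p_raw <- t.test(x, y)$p.value
  p_log <- t.test(log(x), log(y))$p.value
  p_prespecified[i] <- p_raw
  p_best_of_two[i]  <- min(p_raw, p_log)   # "report whichever analysis worked"
}

mean(p_prespecified < 0.05)  # roughly the nominal 0.05
mean(p_best_of_two < 0.05)   # inflated above 0.05
```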

#malcolmGladwell has another book out, I guess trying to rescue his much-nitpicked #TippingPoint.

IDK if he's a net positive force in the world or not. As a #psychologist I've occasionally looked up the original #research he cites. He tends to portray findings in black-and-white terms, like "People do X in Y situation!" when, most often, I've found the research best supports something like "In some studies 12% of people did X in Y situation despite previous #models predicting it should only be 7%" or "The mean of the P group was 0.3 standard deviations higher than the mean of the Q group".

I see many of his grand arguments as built more or less on a house of cards. Or rather, built on a house of semi-firm jell-o that he treats as if it were solid bricks.

I'm not knocking (most of) the #behavioralScience he cites; hell, I'm a behavioral scientist and I think this meta-field has a ton to offer. I just think it's important to keep #EffectSize and #PracticalSignificance built into any more complex theories or models that rely on the relevant research, instead of assuming that #StatisticalSignificance means "Everything, at 100%". I'm sure there's some more concise way to say this.

Overall, I think he plays fast and loose with a lot of scientific facts, stacking them up as if they were all Absolutely Yes when they're actually Kinda Maybe or Probably Sort Of. I don't think the stack can bear that weight, given the accumulated uncertainty and partial applicability of the component research.

So I take everything he says with huge grains of salt, and sometimes grimaces, even though he does sometimes identify really interesting perspectives or trends.

But is it, on balance, good to have someone presenting behavioral research heavily oversimplified to fit his pet theory? It gets behavioral science in the public eye. It helps many people with no connection to behavioral science understand the potential usefulness, and perhaps the scale, of the field. It also sets everyone--especially behavioral scientists--up for a fall. After each of his books, it's only a matter of time before people who understand the research far better than he does show up to set the record straight, and then what happens to public confidence in behavioral science?

Meh.

#statstab #157 Effect size measures in a two-independent-samples case with nonnormal and nonhomogeneous data

Thoughts: You can never have enough (confusing) effect size measures. At least make them appropriate for your data.

#effectsize #statistics #nonparametric #es #estimation

link.springer.com/article/10.3

SpringerLink: Effect size measures in a two-independent-samples case with nonnormal and nonhomogeneous data - Behavior Research Methods
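
If you want the d-like option from this paper, here is a rough base-R sketch of a scaled robust d in the spirit of Algina, Keselman, and Penfield (2005): 20% trimmed means, a pooled 20% winsorized variance, and a 0.642 rescaling so the estimate is comparable to Cohen's d under normality. This is my reading of the method rather than code from the paper; for serious use, check the original article or Wilcox's WRS2 package.

```r
# Sketch of a scaled robust d: trimmed mean difference over a pooled winsorized SD,
# rescaled by .642 (the constant for 20% trimming). Details here are assumptions.

winsorize <- function(x, tr = 0.2) {
  xs <- sort(x)
  g  <- floor(tr * length(x))
  lo <- xs[g + 1]
  hi <- xs[length(x) - g]
  pmin(pmax(x, lo), hi)   # pull the tails in to the trimming boundaries
}

robust_d <- function(x, y, tr = 0.2) {
  nx <- length(x); ny <- length(y)
  wx <- winsorize(x, tr); wy <- winsorize(y, tr)
  s2w <- ((nx - 1) * var(wx) + (ny - 1) * var(wy)) / (nx + ny - 2)  # pooled winsorized variance
  0.642 * (mean(x, trim = tr) - mean(y, trim = tr)) / sqrt(s2w)
}

set.seed(1)
robust_d(rlnorm(40, meanlog = 0.4), rlnorm(60, meanlog = 0))
```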

#statstab #65 MOTE effect size calculator/convertor

Thoughts: The #r package is great, but this is a quick app for reporting various effects. It also provides handy interpretations, useful when teaching/learning.

#shinyapp #rstats #effectsize

shiny.posit.co/r/gallery/life-

Shiny - MOTE: An Effect Size Calculator. Shiny is a package that makes it easy to create interactive web apps using R and Python.
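
For the code-inclined, the same numbers should be reproducible from the MOTE R package the app is built on. I'm writing the call below from memory of the package docs, so treat the function name, arguments, and example values as assumptions and check ?d.ind.t before relying on it.

```r
# Hypothetical example: Cohen's d with a confidence interval from summary
# statistics via MOTE (signature assumed from memory of the docs).
# install.packages("MOTE")
library(MOTE)

res <- d.ind.t(m1 = 5.2, sd1 = 1.1, n1 = 40,
               m2 = 4.6, sd2 = 1.3, n2 = 45,
               a = 0.05)
res$d      # point estimate
str(res)   # inspect the returned list for the CI bounds and t-test output
```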