
Variability: Human nature and its impact on measurement and statistical analysis

Journal of Sport and Health Science, 2019, Issue 6

Heng Li, Zezho Chen, Weimo Zhu*

a Department of Physical Education, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
b Department of Kinesiology & Community Health, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Received 21 January 2019; revised 3 February 2019; accepted 5 June 2019
Available online 18 June 2019

The world is colorful, different, and diverse! So are humans. Human variability refers to the variability among individuals, which could be the variability of a human trait (e.g., body fat percent) or the difference in the response of a trait to a stimulus (e.g., losing or not losing weight when facing the same intervention). In research, the most commonly studied types of variability are between-individual variability and within-individual variability. Between-individual variability is the difference among individuals (e.g., the differences in height among individuals). Within-individual variability refers to the variability of an individual at different times (e.g., the differences in one's weight, performance, and mood at different times). Variability is also a common phenomenon in human performance.1,2

Is a large variability bad? Should an outlier in a data set be considered part of variability? What is the impact of variability on commonly used measurement and statistical methods? Variability has long been of interest in human-movement research. A set of measurement indexes (e.g., reliability coefficients) and statistical indexes (e.g., standard deviation (SD) and variance) has been developed to measure and analyze variability. However, due to the complex nature of variability and the lack of advanced measurement techniques and statistical training for researchers, misunderstandings of variability often occur. As a result, variability in research has often been analyzed incorrectly, leading to findings being interpreted erroneously. Here, we summarize common errors related to variability and how to address them. We hope that this discussion helps researchers understand variability better and thus contributes to its proper use.

1. Common errors in measuring variability

A common error in measuring variability is ignoring the sensitivity of the measurement. Sensitivity, in this context, is defined as the ability to discriminate differences. In practice, sensitivity is the ability to measure variability in stimuli or responses, detect a change, or classify a status. Without appropriate sensitivity, a difference, a change, or a different status may not be detected. For example, if meaningful changes in a child's height occur in centimeter (cm) units, but the test administrators use a ruler whose smallest unit is the inch (2.54 cm), the measurement tool may not have the needed sensitivity. However, greater sensitivity is not always better. For example, using a ruler with millimeter (mm) units to measure height may not be appropriate, because a change of a few mm in height may reflect nothing more than natural fluctuations within a day, and thus may convey a false sense of accuracy. Therefore, it is important to understand the degree of variability to be measured, what constitutes a meaningful variability, and whether the measure is sensitive enough to detect it.
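As a rough illustration, the sketch below uses hypothetical heights to show how a tool that only resolves whole inches can miss a 1 cm change that a centimeter ruler would detect. The numbers are invented for the example, not taken from any study.

```python
# Minimal sketch (hypothetical heights): a 1 cm growth between two visits is
# visible with a centimeter ruler but disappears when the tool only resolves
# whole inches (1 inch = 2.54 cm).

def to_nearest_inch(height_cm: float) -> float:
    """Round a height to the resolution of a whole-inch ruler, returned in cm."""
    inch = 2.54
    return round(height_cm / inch) * inch

before_cm, after_cm = 119.0, 120.0            # true heights at two visits
change_cm = after_cm - before_cm              # 1.0 cm: a meaningful change

change_inch_ruler = to_nearest_inch(after_cm) - to_nearest_inch(before_cm)
print(f"Change seen with a cm ruler:    {change_cm:.1f} cm")
print(f"Change seen with an inch ruler: {change_inch_ruler:.1f} cm")  # 0.0 cm: change is missed
```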

Another “error”, also related to the measures used, involves mixing the variability of humans and measures. For example, researchers have reported the “reliability” of physical activity measurement devices using the following design: ask a group of subjects to wear a device for 3 days, 7 days, or more, and use the data collected to draw conclusions about the reliability of the device. Obviously, this analysis cannot be used to evaluate the day-by-day variability of the measurement device, because variability in the subjects' daily physical activity behavior is (likely a large) part of the variability being measured. To measure the reliability of a device, a different repeated-measurement design should be used (e.g., ask participants to repeat walking the same distance, or exercising for the same duration, in the same environment at a single point in time3). To distinguish different types of variability, there has been a call to eliminate the term “reliability” and replace it with terms such as “score reliability” (when all variabilities are mixed together), “personal stability” (when measuring intraindividual variability), and “instrument reliability” (when measuring intra-instrument variability). The last type of variability can be further broken down into “location invariance” (when variability in the location where the device is worn is being investigated) and “device equivalence” (when between-device variability is being studied3,4).

Table 1 Data for examples 1, 2, and 3.

Failing to recognize the potential impact of variability on measurement coefficients is another common error. Using the Pearson correlation coefficient as a measure of reliability illustrates this point. Data 1 in Table 1 represents a hypothetical test-retest data set. By glancing at the data, one can easily detect that the test and retest are not consistent. Furthermore, it can be seen that the inconsistency is systematic, because higher test scores seem to be associated with larger differences between the test and retest scores. Using the Pearson correlation coefficient for this data set gives a result of r = 0.99, an almost perfect correlation! However, does this strong correlation also mean strong reliability? The answer, of course, is no! The incorrect estimation of the reliability or variability arises because the Pearson correlation is driven by the rank order of the 2 data sets. As long as the ordering of the paired scores remains the same or similar, the correlation will be high, even if there is a large absolute difference between the pairs. This limitation of the Pearson correlation coefficient can be overcome by applying a regression analysis in which both slope and intercept are examined simultaneously: a slope of 1.0 or near 1.0 and an intercept of 0.0 or near 0.0 indicate a high test-retest reliability; a slope of 1.0 or near 1.0 but an intercept far from 0.0 indicates a poor test-retest reliability, likely caused by a systematic error; and finally, a slope far from 1.0 and an intercept far from 0.0 indicate a poor test-retest reliability. Another commonly used approach to overcome these limitations is to use an intraclass correlation coefficient (ICC). The relationship among reliability (R), variability, and the ICC can be explained using Eq. (1), in which reliability is defined as the ratio of the variability between subjects' true scores (V_T) to the variability between subjects' obtained scores (V_B), which includes V_T and an error:

R = V_T / V_B = V_T / (V_T + V_error)    (1)

According to Eq. (1), when there is no error (error = 0), reliability is perfect (= 1). In contrast, when everything observed is an error, V_T becomes 0, and reliability will be equal to 0, too. V_T can be considered the variability among subjects, which is expressed as MS_between - MS_within in the context of a two-way analysis of variance (ANOVA; see Refs. 5 and 6 for a variety of ICCs and their applications), whereas V_B can be considered as subject variability plus error, which can be represented by MS_between in ANOVA testing. In ANOVA terms, Eq. (1) becomes:

R = (MS_between - MS_within) / MS_between    (2)

By applying Eq. (2) to Data 1, the systematic error is detected and taken into consideration, and the new reliability coefficient becomes 0.31.
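The sketch below walks through the same argument with hypothetical test-retest scores (the original Data 1 is not reproduced here): the retest is systematically 25 points higher than the test, so the Pearson r is perfect while the regression intercept and the ratio in Eq. (2) both expose the systematic error. For brevity it uses a simple one-way (subjects) decomposition of the mean squares rather than the full two-way ANOVA; see Refs. 5 and 6 for the complete family of ICC models.

```python
# Minimal sketch with hypothetical test-retest scores (not Data 1).
import numpy as np
from scipy import stats

test   = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
retest = test + 25.0                      # large, purely systematic shift on the retest

r, _ = stats.pearsonr(test, retest)
fit = stats.linregress(test, retest)

# ICC in the spirit of Eq. (2), from a subjects x trials matrix
# using a simplified one-way (subjects) decomposition of the mean squares.
scores = np.column_stack([test, retest])
n, k = scores.shape
subject_means = scores.mean(axis=1)
ms_between = k * np.sum((subject_means - scores.mean()) ** 2) / (n - 1)
ms_within = np.sum((scores - subject_means[:, None]) ** 2) / (n * (k - 1))
icc = (ms_between - ms_within) / ms_between

print(f"Pearson r = {r:.2f}")                                       # 1.00: looks perfectly reliable
print(f"slope = {fit.slope:.2f}, intercept = {fit.intercept:.2f}")  # slope 1.00, intercept 25 -> systematic error
print(f"ICC (Eq. 2) = {icc:.2f}")                                   # about 0.38: reliability is actually poor
```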

However, the ICC does not take care of all reliability problems caused by variability. Consider Data 2 in Table 1, which is a small sample from a real study in which the reliability of a pedometer instrument was evaluated.3 Specifically, subjects were asked to wear 10 pedometers and walk 100 steps 10 times in a row. Data 2 is a sample from 5 subjects. In contrast with Data 1, the variability among trials was small in this experiment, and most results were close to the correct value of 100. Using Eq. (2) for this data set, one obtains a low ICC coefficient of 0.34!

What went wrong? Again, variability is the problem! But in this case, it is the small variability. More specifically, because everyone was asked to walk the same 100 steps, the between-subject variability was small. As a result, the within-subject and between-subject variabilities became similar, so R, the ratio in Eq. (2), became small. As illustrated in Table 2, Pearson correlations also failed this time because of the lack of variability, and most of the computed between-trial correlations were low. Thus, the pedometers were so reliable (or the variability between trials was so small) that they caused 2 commonly used reliability coefficients to fail! These 2 opposite impacts of variability, one from large variability and the other from small variability, indicate that when applying measurement coefficients, the degree of variability for all variables, as well as its potential impact on a specific coefficient, should be carefully examined.
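A companion sketch with simulated step counts (not the actual Data 2) makes the same point: when every subject walks the same 100 steps, between-subject variability nearly vanishes and the ratio in Eq. (2) collapses, even though the device error is only about 1 step. The same simplified one-way mean-square decomposition as above is assumed.

```python
# Minimal sketch: an accurate pedometer can still yield a low ICC when all
# subjects perform the identical task. Step counts are simulated.
import numpy as np

rng = np.random.default_rng(0)
true_steps = np.full(5, 100.0)                                     # every subject walks 100 steps
scores = true_steps[:, None] + rng.normal(0, 1.0, size=(5, 10))    # 10 trials, ~1-step device error

n, k = scores.shape
subject_means = scores.mean(axis=1)
ms_between = k * np.sum((subject_means - scores.mean()) ** 2) / (n - 1)
ms_within = np.sum((scores - subject_means[:, None]) ** 2) / (n * (k - 1))

icc = (ms_between - ms_within) / ms_between
print(f"ICC (Eq. 2) = {icc:.2f}")   # close to zero (can even be negative), despite an accurate device
```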

Another common error in human performance research involves failing to understand the measures of variability and applying them incorrectly. Table 3 summarizes a set of commonly used variability measures, including their advantages and limitations. The SD is probably the most commonly used variability measure. However, the SD is sometimes applied to a skewed data distribution where an interpercentile range should be used instead. If, as described earlier, the point of interest is to understand the impact of a variety of variabilities in measurement practice, such as personal stability, instrument reliability, location invariance, and device equivalence, generalizability theory is the most appropriate method. Specifically, in generalizability theory, variance or variability is broken down using a carefully designed study and ANOVA-based analysis. Unfortunately, only a few studies in human performance have taken advantage of this powerful approach, even though generalizability theory was introduced in the physical education literature7,8 more than 40 years ago.
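The sketch below illustrates the SD-versus-interpercentile-range point with simulated, right-skewed daily step counts (the values and the 10th-90th percentile choice are illustrative assumptions, not from the studies cited above).

```python
# Minimal sketch: on a skewed distribution (e.g., daily step counts with a few
# very active individuals), the SD is inflated by the long right tail, whereas a
# percentile-based spread describes the bulk of the data more faithfully.
import numpy as np

rng = np.random.default_rng(1)
steps = rng.lognormal(mean=9.0, sigma=0.6, size=1000)   # simulated right-skewed daily steps

sd = steps.std(ddof=1)
p10, p90 = np.percentile(steps, [10, 90])

print(f"Mean = {steps.mean():.0f} steps, SD = {sd:.0f} steps")
print(f"10th-90th interpercentile range = {p10:.0f} to {p90:.0f} steps")
```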

Table 2 Correlations among trials (Ts).

Table 3 Commonly used measures of variability.

Fig. 1. Using a scatter plot to help identify the outlier. VO2max = maximal oxygen uptake.

Fig. 2. Same treatment effect but with different variabilities (A, smaller; B, larger) in the control (left) and treatment (right) groups.

2. Common errors caused by variability when analyzing statistical data

As is the case when measuring variability, failure to recognize the impact of variability when analyzing data can also lead to errors. For a small data set, a single outlier may lead to a false conclusion. Let us use Data 3 in Table 1, another small data set randomly selected from a real study, in which the researchers were interested in determining whether the 1.5-mile running time is valid for predicting maximal oxygen uptake (VO2max). To examine the validity of the 1.5-mile running time for predicting VO2max, the Pearson correlation was used, resulting in a coefficient of r = -0.17. Based on this weak correlation, one might conclude that the 1.5-mile running time is not a valid predictor of VO2max. Although the negative correlation indicates (correctly) that a low running time is associated with a high VO2max, the correlation seems too low. Inspecting the data (Fig. 1), we detected an outlier (Subject 10). This subject had one of the highest VO2max scores, but the slowest running time. If we were able to contact this subject, we could ask what had occurred during the running test and decide whether a retest was warranted. From a data analysis standpoint, we judge Subject 10 to be a clear outlier. Removing Subject 10 from the analysis gives a Pearson correlation of r = -0.82. Based on this analysis, we reach the conclusion that the 1.5-mile running time may indeed be a valid measure for predicting VO2max.
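A minimal sketch with made-up values (Data 3 is not reproduced here) shows the same mechanism: one subject with a slow run time but a high VO2max pulls the Pearson correlation toward zero, and removing that outlier restores the strong negative relationship.

```python
# Minimal sketch with hypothetical 1.5-mile run times (min) and VO2max values.
import numpy as np
from scipy import stats

run_time_min = np.array([9.5, 10.2, 10.8, 11.5, 12.1, 12.8, 13.4, 14.0, 14.6, 17.5])
vo2max       = np.array([58.0, 55.0, 53.0, 50.0, 48.0, 45.0, 43.0, 41.0, 39.0, 59.0])
# The last pair plays the role of Subject 10: the slowest run time, yet one of the highest VO2max values.

r_all, _ = stats.pearsonr(run_time_min, vo2max)
r_clean, _ = stats.pearsonr(run_time_min[:-1], vo2max[:-1])

print(f"r with the outlier:    {r_all:.2f}")    # weak
print(f"r without the outlier: {r_clean:.2f}")  # strongly negative
```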

The impact of variability on parametric statistical analysis or null hypothesis testing can also be significant. Recall that the probability of rejecting a null hypothesis when it is false (or of detecting a difference when the treatment really works) is called “(statistical) power”. There are 4 factors that affect power: the α (type I error) level, the choice of a one- or two-tailed test, sample size, and effect size (ES). In practice, the α level is commonly set at 0.05 or 0.01. Also, most studies use a two-tailed testing approach. Regarding sample size, the chance of detecting a true difference becomes greater as sample size increases. Note, however, that a large sample size may result in the rejection of a null hypothesis even when the difference between the treatment and control groups is small and has no practical meaning.9,10 The last factor affecting power is the ES: the larger the ES, the higher the power. The ES can be expressed using Eq. (3):11

ES = (M_treatment - M_control) / SD_pooled    (3)

where M_treatment = the mean of the treatment group, M_control = the mean of the control group, and SD_pooled = the pooled SD of the treatment and control groups. From Eq. (3), we see that the ES becomes large when the treatment effect is strong and the variability within the treatment and control groups is small. For a given treatment effect, the ES increases when variability decreases, and therefore it is easier to obtain p < 0.05 and reject the null hypothesis (Fig. 2).
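The sketch below applies Eq. (3) to simulated groups: the same 5-unit treatment effect produces a much larger ES when within-group variability is small (as in panel A of Fig. 2) than when it is large (panel B). The group means, SDs, and sample sizes are assumptions chosen only for illustration.

```python
# Minimal sketch of Eq. (3): same mean difference, different within-group variability.
import numpy as np

def effect_size(treatment: np.ndarray, control: np.ndarray) -> float:
    """Eq. (3): (M_treatment - M_control) / pooled SD (Cohen's d)."""
    n1, n2 = len(treatment), len(control)
    v1, v2 = treatment.var(ddof=1), control.var(ddof=1)
    sd_pooled = np.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (treatment.mean() - control.mean()) / sd_pooled

rng = np.random.default_rng(2)
for label, sd in [("small variability (A)", 2.0), ("large variability (B)", 10.0)]:
    control = rng.normal(50.0, sd, size=30)
    treatment = rng.normal(55.0, sd, size=30)     # same 5-unit true treatment effect
    print(f"{label}: ES = {effect_size(treatment, control):.2f}")
```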

3. Conclusion

Variability is a natural part of the human condition and human performance. Understanding variability in all its forms, and selecting appropriate measurement techniques and statistical methods to best measure and analyze it, is essential in scientific research related to human performance. Otherwise, the measurement tools used may fail to detect the true variability of human performance, rendering commonly used measurement and statistical indexes useless or misleading and leading to research findings being interpreted erroneously. Improving measurement techniques and providing better statistical training at the graduate level are thus urgently needed in human performance research.

Authors' contributions

HL developed the idea and assisted with the initial manuscript preparation; ZC contributed to the early draft of this manuscript; and WZ was involved in the study idea and design, final manuscript preparation, and revisions based on the feedback of reviewers and editors. All authors have read and approved the final version of the manuscript, and agree with the order of presentation of the authors.

Competing interests

The authors declare that they have no competing interests.
